Identification of pseudogenes in the Drosophila melanogaster genome.

Abstract:

Pseudogenes are copies of genes that cannot produce a protein. They can be detected from disruptions to their apparent coding sequence, caused by frameshifts and premature stop codons. They are classed as either processed pseudogenes (made by reverse transcription from an mRNA) or duplicated pseudogenes, arising from duplication in the genomic DNA and subsequent disablement. Historically, there is anecdotal evidence that the fruit fly (Drosophila melanogaster) has few pseudogenes. Investigators have linked this to a high deletion rate of genomic DNA, for which there is evidence from genetic experiments on genome size. Here, we apply a homology-based pipeline that was developed previously to identify pseudogenes in other eukaryotic genomes, to the fruit fly, so as to derive the first complete survey of its pseudogene population. We find approximately 100 pseudogenes, with at least a sixth of these as candidate processed pseudogenes. This gives a much lower proportion of pseudogenes (compared with the size of the proteome) than in the genomes of other eukaryotes for which data are available (human, nematode and budding yeast). Closest matching proteins to Drosophila pseudogenes are significantly longer than the average protein in its proteome (up to approximately 60% more than the average protein's length), in contrast to the situation in the three other eukaryotic genomes. This may be due to the persistence of fragments of longer genes. In the fly pseudogene population, we found most pseudogenes for serine proteases (which are more abundant in the Drosophila lineage compared with the other eukaryotes), immunoglobulin-motif-containing proteins and cytochromes P450. Data on the sequences and positions of the putative pseudogenes are available at: http://www.pseudogene.org/fly.
The detection of a small number of pseudogenes in the Drosophila genome and the higher mean length for the closest matching proteins to pseudogenes (possibly because remnants of genes encoding longer proteins are more likely to persist) are further evidence for a high deletion rate of genomic DNA in the fruit fly. The data are useful for molecular evolution study in Drosophila.

journal_name

Nucleic Acids Res

journal_title

Nucleic acids research

authors

Harrison PM,Milburn D,Zhang Z,Bertone P,Gerstein M

doi

10.1093/nar/gkg169

pub_date

2003-02-01 00:00:00

pages

1033-7

issue

3

eissn

1362-4962

issn

0305-1048

journal_volume

31

pub_type

Journal Article

  • The use of beta-galactosidase as a marker gene to define the regulatory sequences of the herpes simplex virus type 1 glycoprotein C gene in recombinant herpesviruses.

    abstract::The expression of Herpes Simplex Virus 1 (HSV-1) glycoprotein C (gC), a well defined herpesvirus late gene, was studied by linking the promoter-regulatory region of this gene to the coding sequences for the bacterial enzyme, beta-galactosidase (beta-gal). A chimeric gene, containing the beta-gal gene under the control...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/16.21.10267

    authors: Weir JP,Narayanan PR

    update_date: 1988-11-11 00:00:00

  • Evolution of the casein multigene family: conserved sequences in the 5' flanking and exon regions.

    abstract::The rat alpha- and bovine alpha s1-casein genes have been isolated and their 5' sequences determined. The rat alpha-, beta-, gamma- and bovine alpha s1-casein genes contain similar 5' exon arrangements in which the 5' noncoding, signal peptide and casein kinase phosphorylation sequences are each encoded by separate ex...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/14.4.1883

    authors: Yu-Lee LY,Richter-Mann L,Couch CH,Stewart AF,Mackinlay AG,Rosen JM

    update_date: 1986-02-25 00:00:00

  • ModBase, a database of annotated comparative protein structure models, and associated resources.

    abstract::ModBase (http://salilab.org/modbase) is a database of annotated comparative protein structure models. The models are calculated by ModPipe, an automated modeling pipeline that relies primarily on Modeller for fold assignment, sequence-structure alignment, model building and model assessment (http://salilab.org/modelle...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkq1091

    authors: Pieper U,Webb BM,Barkan DT,Schneidman-Duhovny D,Schlessinger A,Braberg H,Yang Z,Meng EC,Pettersen EF,Huang CC,Datta RS,Sampathkumar P,Madhusudhan MS,Sjölander K,Ferrin TE,Burley SK,Sali A

    update_date: 2011-01-01 00:00:00

  • Differential involvement of E2A-corepressor interactions in distinct leukemogenic pathways.

    abstract::E2A is a member of the E-protein family of transcription factors. Previous studies have reported context-dependent regulation of E2A-dependent transcription. For example, whereas the E2A portion of the E2A-Pbx1 leukemia fusion protein mediates robust transcriptional activation in t(1;19) acute lymphoblastic leukemia, ...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkt855

    authors: Gow CH,Guo C,Wang D,Hu Q,Zhang J

    update_date: 2014-01-01 00:00:00

  • A bidirectionally active signal for termination of transcription is located between tetA and orfL on transposon Tn10.

    abstract::A terminator of transcription with bidirectional activity has been located between the translation termination codons of the genes tetA and orfL on Tn10. These genes are transcribed towards each other. Each orientation of the intervening sequence is shown to reduce the expression of the lacZ and galK genes when cloned...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/13.12.4227

    authors: Schollmeier K,Gärtner D,Hillen W

    update_date: 1985-06-25 00:00:00

  • 2'-O-methyl-modified phosphorothioate antisense oligonucleotides have reduced non-specific effects in vitro.

    abstract::Antisense oligodeoxynucleotides (ODNs) have biological activity in treating various forms of cancer. The antisense effects of two types of 20mer ODNs, phosphorothioate-modified ODNs (S-ODNs) and S-ODNs with 12 2'-O-methyl groups (Me-S-ODNs), targeted to sites 109 and 277 of bcl-2 mRNA, were compared. Both types were a...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkh516

    authors: Yoo BH,Bochkareva E,Bochkarev A,Mou TC,Gray DM

    update_date: 2004-04-02 00:00:00

  • DNA topoisomerases from rat liver: physiological variations.

    abstract::Besides the nicking-closing (topoisomerase I) activity, an ATP-dependent DNA topoisomerase is present in rat liver nuclei. The enzyme, partially purified, is able to catenate in vitro closed DNA circles in a magnesium-dependent, ATP-dependent, histone H1-dependent reaction, and to decatenate in vitro kinetoplast DNA n...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/11.4.1059

    authors: Duguet M,Lavenot C,Harper F,Mirambeau G,De Recondo AM

    update_date: 1983-02-25 00:00:00

  • Transcript quantitation in total yeast cellular RNA using kinetic PCR.

    abstract::Kinetically monitored, reverse transcriptase-initiated PCR (kinetic RT-PCR, kRT-PCR) is a novel application of kinetic PCR for high throughput transcript quantitation in total cellular RNA. The assay offers the simplicity and flexibility of an enzyme assay with distinct advantages over DNA microarray hybridization and...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/28.2.e2

    authors: Kang JJ,Watson RM,Fisher ME,Higuchi R,Gelfand DH,Holland MJ

    update_date: 2000-01-15 00:00:00

  • MMDB: Entrez's 3D structure database.

    abstract::The three dimensional structures for representatives of nearly half of all protein families are now available in public databases. Thus, no matter which protein one investigates, it is increasingly likely that the 3D structure of a homolog will be known and may reveal unsuspected structure-function relationships. The ...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/27.1.240

    authors: Marchler-Bauer A,Addess KJ,Chappey C,Geer L,Madej T,Matsuo Y,Wang Y,Bryant SH

    update_date: 1999-01-01 00:00:00

  • HOMCOS: a server to predict interacting protein pairs and interacting sites by homology modeling of complex structures.

    abstract::As protein-protein interactions are crucial in most biological processes, it is valuable to understand how and where protein pairs interact. We developed a web server HOMCOS (Homology Modeling of Complex Structure, http://biunit.naist.jp/homcos) to predict interacting protein pairs and interacting sites by homology mo...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkn218

    authors: Fukuhara N,Kawabata T

    update_date: 2008-07-01 00:00:00

  • Structural analysis of the genetic switch that regulates the expression of restriction-modification genes.

    abstract::Controller (C) proteins regulate the timing of the expression of restriction and modification (R-M) genes through a combination of positive and negative feedback circuits. A single dimer bound to the operator switches on transcription of the C-gene and the endonuclease gene; at higher concentrations, a second dimer bo...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkn448

    authors: McGeehan JE,Streeter SD,Thresh SJ,Ball N,Ravelli RB,Kneale GG

    update_date: 2008-08-01 00:00:00

  • Identification of cis-acting elements in the SUC2 promoter of Saccharomyces cerevisiae required for activation of transcription.

    abstract::We analyzed the effects of site-directed mutations in the SUC2 promoter of Saccharomyces cerevisiae. Analyses were performed in wild-type as well as mig1 and tup1 mutant strains after the promoter mutants were reintroduced into the native SUC2 locus on the left arm of chromosome IX. Mutation of the two GC boxes reveal...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/26.4.1002

    authors: Bu Y,Schmidt MC

    update_date: 1998-02-15 00:00:00

  • The effects of polymorphisms on human gene targeting.

    abstract::DNA mismatches that occur between vector homology arms and chromosomal target sequences reduce gene targeting frequencies in several species; however, this has not been reported in human cells. Here we demonstrate that even a single mismatched base pair can significantly decrease human gene targeting frequencies. In a...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkt1303

    authors: Deyle DR,Li LB,Ren G,Russell DW

    update_date: 2014-03-01 00:00:00

  • A chicken middle-repetitive DNA sequence which shares homology with mammalian ubiquitous repeats.

    abstract::We have identified and sequenced two members of a chicken middle repetitive DNA sequence family. By reassociation kinetics, members of this family (termed CRl) are estimated to be present in 1500-7000 copies per chicken haploid genome. The first family member sequenced (CRlUla) is located approximately 2 kb upstream f...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/9.20.5383

    authors: Stumph WE,Kristo P,Tsai MJ,O'Malley BW

    update_date: 1981-10-24 00:00:00

  • Improved detection of small deletions in complex pools of DNA.

    abstract::About 40% of the genes in the nematode Caenorhabditis elegans have homologs in humans. Based on the history of this model system, it is clear that the application of genetic methods to the study of this set of genes would provide important clues to their function in humans. To facilitate such genetic studies, we are e...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gnf051

    authors: Edgley M,D'Souza A,Moulder G,McKay S,Shen B,Gilchrist E,Moerman D,Barstead R

    update_date: 2002-06-15 00:00:00

  • The International Gene Trap Consortium Website: a portal to all publicly available gene trap cell lines in mouse.

    abstract::Gene trapping is a method of generating murine embryonic stem (ES) cell lines containing insertional mutations in known and novel genes. A number of international groups have used this approach to create sizeable public cell line repositories available to the scientific community for the generation of mutant mouse str...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkj097

    authors: Nord AS,Chang PJ,Conklin BR,Cox AV,Harper CA,Hicks GG,Huang CC,Johns SJ,Kawamoto M,Liu S,Meng EC,Morris JH,Rossant J,Ruiz P,Skarnes WC,Soriano P,Stanford WL,Stryke D,von Melchner H,Wurst W,Yamamura K,Young SG,

    update_date: 2006-01-01 00:00:00

  • EcoGene 3.0.

    abstract::EcoGene (http://ecogene.org) is a database and website devoted to continuously improving the structural and functional annotation of Escherichia coli K-12, one of the most well understood model organisms, represented by the MG1655(Seq) genome sequence and annotations. Major improvements to EcoGene in the past decade i...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gks1235

    authors: Zhou J,Rudd KE

    update_date: 2013-01-01 00:00:00

  • A novel RNA molecular signature for activation of 2'-5' oligoadenylate synthetase-1.

    abstract::Human 2'-5' oligoadenylate synthetase-1 (OAS1) is central in innate immune system detection of cytoplasmic double-stranded RNA (dsRNA) and promotion of host antiviral responses. However, the molecular signatures that promote OAS1 activation are currently poorly defined. We show that the 3'-end polyuridine sequence of ...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gku1289

    authors: Vachon VK,Calderon BM,Conn GL

    update_date: 2015-01-01 00:00:00

  • The Genomes On Line Database (GOLD) v.2: a monitor of genome projects worldwide.

    abstract::The Genomes On Line Database (GOLD) is a web resource for comprehensive access to information regarding complete and ongoing genome sequencing projects worldwide. The database currently incorporates information on over 1500 sequencing projects, of which 294 have been completed and the data deposited in the public data...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkj145

    authors: Liolios K,Tavernarakis N,Hugenholtz P,Kyrpides NC

    update_date: 2006-01-01 00:00:00

  • High-resolution profiling of the LEDGF/p75 chromatin interaction in the ENCODE region.

    abstract::Lens epithelium-derived growth factor/p75 (LEDGF/p75) is a transcriptional coactivator involved in stress response, autoimmune disease, cancer and HIV replication. A fusion between the nuclear pore protein NUP98 and LEDGF/p75 has been found in human acute and chronic myeloid leukemia and association of LEDGF/p75 with ...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkq410

    authors: De Rijck J,Bartholomeeusen K,Ceulemans H,Debyser Z,Gijsbers R

    update_date: 2010-10-01 00:00:00

  • High-throughput chromatin information enables accurate tissue-specific prediction of transcription factor binding sites.

    abstract::In silico prediction of transcription factor binding sites (TFBSs) is central to the task of gene regulatory network elucidation. Genomic DNA sequence information provides a basis for these predictions, due to the sequence specificity of TF-binding events. However, DNA sequence alone is an impoverished source of infor...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkn866

    authors: Whitington T,Perkins AC,Bailey TL

    update_date: 2009-01-01 00:00:00

  • Comparative sequence and structure of viroid-like RNAs of two plant viruses.

    abstract::A newly discovered group of spherical plant viruses contains a bipartite genome consisting of a single-strand linear RNA molecule (RNA 1, Mr 1.5 x 10^6), and a single-strand, covalently closed circular viroid-like RNA molecule (RNA 2, Mr approximately 125,000). The nucleotide sequences of the RNA 2 of two of these, ...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/10.12.3681

    authors: Haseloff J,Symons RH

    update_date: 1982-06-25 00:00:00

  • A cDNA clone encoding a glycinin A1a subunit precursor of soybean.

    abstract::A cDNA clone covering the whole coding region for a glycinin subunit precursor containing the A1a acidic subunit, one of the A2 family, has been identified from a library of soybean cotyledonary cDNA clones using a mixed oligonucleotide probe. Analysis of the cDNA insert revealed that it contained 1746 nucleotides of ...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/13.18.6719

    authors: Negoro T,Momma T,Fukazawa C

    update_date: 1985-09-25 00:00:00

  • PatScanUI: an intuitive web interface for searching patterns in DNA and protein data.

    abstract::Patterns in biological sequences frequently signify interesting features in the underlying molecule. Many tools exist to search for well-known patterns. Less support is available for exploratory analysis, where no well-defined patterns are known yet. PatScanUI (https://patscan.secondarymetabolites.org/) provides a hig...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gky321

    authors: Blin K,Wohlleben W,Weber T

    update_date: 2018-07-02 00:00:00

  • Rapid assay for detection of Escherichia coli xanthine-guanine phosphoribosyltransferase activity in transduced cells.

    abstract::Cultured mammalian cells transduced with the Escherichia coli gene, Ecogpt, synthesize the bacterial enzyme xanthine-guanine phosphoribosyl transferase (XGPT) (1). This paper describes a method for measuring XGPT activity in crude cell extracts by following the conversion of 14C-xanthine (X) to 14C-xanthine monophosph...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/13.8.2921

    authors: Chu G,Berg P

    update_date: 1985-04-25 00:00:00

  • ATPase activity of the UvrA and UvrAB protein complexes of the Escherichia coli UvrABC endonuclease.

    abstract::We have analyzed the ATPase activity exhibited by the UvrABC DNA repair complex. The UvrA protein is an ATPase whose lack of DNA dependence may be related to the ATP induced monomer-dimer transitions. ATP induced dimerization may be responsible for the enhanced DNA binding activity observed in the presence of ATP. Alt...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/17.11.4145

    authors: Oh EY,Claassen L,Thiagalingam S,Mazur S,Grossman L

    update_date: 1989-06-12 00:00:00

  • Transient expression directed by homologous and heterologous promoter and enhancer sequences in fish cells.

    abstract::In order to construct fish specific expression vectors for studies on gene regulation in vitro and in vivo a variety of heterologous enhancers and promoters from mammals and from viruses of higher vertebrate cells were tested for expression of the bacterial chloramphenicol acetyl transferase reporter gene in three tel...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/18.11.3299

    authors: Friedenreich H,Schartl M

    update_date: 1990-06-11 00:00:00

  • Systematic analysis of enzymatic DNA polymerization using oligo-DNA templates and triphosphate analogs involving 2',4'-bridged nucleosides.

    abstract::In order to systematically analyze the effects of nucleoside modification of sugar moieties in DNA polymerase reactions, we synthesized 16 modified templates containing 2',4'-bridged nucleotides and three types of 2',4'-bridged nucleoside-5'-triphosphates with different bridging structures. Among the five types of ther...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkn404

    authors: Kuwahara M,Obika S,Nagashima J,Ohta Y,Suto Y,Ozaki H,Sawai H,Imanishi T

    update_date: 2008-08-01 00:00:00

  • A pyrosequencing-tailored nucleotide barcode design unveils opportunities for large-scale sample multiplexing.

    abstract::Multiplexed high-throughput pyrosequencing is currently limited in complexity (number of samples sequenced in parallel), and in capacity (number of sequences obtained per sample). Physical-space segregation of the sequencing platform into a fixed number of channels allows limited multiplexing, but obscures available s...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkm760

    authors: Parameswaran P,Jalili R,Tao L,Shokralla S,Gharizadeh B,Ronaghi M,Fire AZ

    update_date: 2007-01-01 00:00:00

  • Processing of complementary sense RNAs of Digitaria streak virus in its host and in transgenic tobacco.

    abstract::We have used a polymerase chain reaction (PCR) procedure to analyse low abundance complementary sense RNAs of Digitaria streak virus (DSV) from infected leaves of Digitaria setigera. This study has confirmed that both spliced and unspliced RNAs are synthesised by the same transcription unit. The position of the intron...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/18.24.7259

    authors: Mullineaux PM,Guerineau F,Accotto GP

    update_date: 1990-12-25 00:00:00