Homology-based annotation yields 1,042 new candidate genes in the Drosophila melanogaster genome.

Abstract:

The approach to annotating a genome critically affects the number and accuracy of genes identified in the genome sequence. Genome annotation based on stringent gene identification is prone to underestimate the complement of genes encoded in a genome. In contrast, over-prediction of putative genes followed by exhaustive computational sequence, motif and structural homology search will find rarely expressed, possibly unique, new genes at the risk of including non-functional genes. We developed a two-stage approach that combines the merits of stringent genome annotation with the benefits of over-prediction. First, we identify plausible genes regardless of matches with EST, cDNA or protein sequences from the organism (stage 1). In the second stage, proteins predicted from the plausible genes are compared at the protein level with EST, cDNA and protein sequences, and protein structures from other organisms (stage 2). Remote but biologically meaningful protein sequence or structure homologies provide supporting evidence for genuine genes. The method, applied to the Drosophila melanogaster genome, validated 1,042 novel candidate genes after filtering 19,410 plausible genes, of which 12,124 matched the original 13,601 annotated genes. This annotation strategy is applicable to genomes of all organisms, including human.
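
The two-stage filter described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' pipeline: the `PlausibleGene` record, the `novel_candidates` function and the E-value cutoff are all hypothetical, since the abstract does not state the actual homology criteria. Only the four counts at the end come from the abstract.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PlausibleGene:
    """A stage-1 prediction (hypothetical record layout, not from the paper)."""
    gene_id: str
    matches_annotated: bool       # coincides with a previously annotated gene
    best_evalue: Optional[float]  # best cross-species homology hit, None if no hit


def novel_candidates(genes: List[PlausibleGene],
                     evalue_cutoff: float = 1e-5) -> List[PlausibleGene]:
    """Stage 2: keep unannotated predictions that have homology support.

    The E-value cutoff is an illustrative assumption, not the paper's criterion.
    """
    return [
        g for g in genes
        if not g.matches_annotated
        and g.best_evalue is not None
        and g.best_evalue <= evalue_cutoff
    ]


# Toy input: one known gene, one supported novel gene, two unsupported predictions.
toy = [
    PlausibleGene("g1", True, 1e-40),
    PlausibleGene("g2", False, 1e-12),
    PlausibleGene("g3", False, 0.5),
    PlausibleGene("g4", False, None),
]
novel_ids = [g.gene_id for g in novel_candidates(toy)]

# Counts reported in the abstract.
plausible, matched, annotated, novel = 19_410, 12_124, 13_601, 1_042
unmatched = plausible - matched         # plausible genes outside the original annotation
supported_fraction = novel / unmatched  # share of those retained as novel candidates
```

On the toy input only `g2` survives, because it is both unannotated and supported by a homology hit. Applied to the abstract's counts, the same bookkeeping shows how many plausible genes fell outside the original annotation and what share of them was retained as novel candidates.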

journal_name

Nat Genet

journal_title

Nature genetics

authors

Gopal S,Schroeder M,Pieper U,Sczyrba A,Aytekin-Kurban G,Bekiranov S,Fajardo JE,Eswar N,Sanchez R,Sali A,Gaasterland T

doi

10.1038/85922

keywords:

subject

Has Abstract

pub_date

2001-03-01 00:00:00

pages

337-40

issue

3

eissn

1546-1718

issn

1061-4036

journal_volume

27

pub_type

Journal Article
  • Distinct interactions of PML-RARalpha and PLZF-RARalpha with co-repressors determine differential responses to RA in APL.

    abstract::Acute promyelocytic leukaemia (APL), associated with chromosomal translocations involving the retinoic acid receptor alpha gene (RARA) and the PML gene, is sensitive to retinoic acid (RA) treatment, while APL patients harbouring translocations between RARA and the PLZF gene do not respond to RA. We have generated PML-...

    journal_title:Nature genetics

    pub_type: Journal Article

    doi:10.1038/ng0298-126

    authors: He LZ,Guidez F,Tribioli C,Peruzzi D,Ruthardt M,Zelent A,Pandolfi PP

    update_date:1998-02-01 00:00:00

  • The genetics of plant metabolism.

    abstract::Variation for metabolite composition and content is often observed in plants. However, it is poorly understood to what extent this variation has a genetic basis. Here, we describe the genetic analysis of natural variation in the metabolite composition in Arabidopsis thaliana. Instead of focusing on specific metabolite...

    journal_title:Nature genetics

    pub_type: Journal Article

    doi:10.1038/ng1815

    authors: Keurentjes JJ,Fu J,de Vos CH,Lommen A,Hall RD,Bino RJ,van der Plas LH,Jansen RC,Vreugdenhil D,Koornneef M

    update_date:2006-07-01 00:00:00

  • A putative pheromone receptor gene expressed in human olfactory mucosa.

    abstract::Pheromones elicit specific behavioural responses and physiological alterations in recipients of the same species. In mammals, these chemical signals are recognized within the nasal cavity by sensory neurons that express pheromone receptors. In rodents, these receptors are thought to be represented by two large multige...

    journal_title:Nature genetics

    pub_type: Journal Article

    doi:10.1038/79124

    authors: Rodriguez I,Greer CA,Mok MY,Mombaerts P

    update_date:2000-09-01 00:00:00

  • A map of constrained coding regions in the human genome.

    abstract::Deep catalogs of genetic variation from thousands of humans enable the detection of intraspecies constraint by identifying coding regions with a scarcity of variation. While existing techniques summarize constraint for entire genes, single gene-wide metrics conceal regional constraint variability within each gene. The...

    journal_title:Nature genetics

    pub_type: Journal Article

    doi:10.1038/s41588-018-0294-6

    authors: Havrilla JM,Pedersen BS,Layer RM,Quinlan AR

    update_date:2019-01-01 00:00:00

  • DNA methylation dynamics during B cell maturation underlie a continuum of disease phenotypes in chronic lymphocytic leukemia.

    abstract::Charting differences between tumors and normal tissue is a mainstay of cancer research. However, clonal tumor expansion from complex normal tissue architectures potentially obscures cancer-specific events, including divergent epigenetic patterns. Using whole-genome bisulfite sequencing of normal B cell subsets, we obs...

    journal_title:Nature genetics

    pub_type: Journal Article

    doi:10.1038/ng.3488

    authors: Oakes CC,Seifert M,Assenov Y,Gu L,Przekopowitz M,Ruppert AS,Wang Q,Imbusch CD,Serva A,Koser SD,Brocks D,Lipka DB,Bogatyrova O,Weichenhan D,Brors B,Rassenti L,Kipps TJ,Mertens D,Zapatka M,Lichter P,Döhner H,Küppe

    update_date:2016-03-01 00:00:00

  • The peripheral myelin protein gene PMP-22 is contained within the Charcot-Marie-Tooth disease type 1A duplication.

    abstract::Charcot-Marie-Tooth disease (CMT1) is the most common form of inherited peripheral neuropathy. Although the disease is genetically heterogeneous, it has been demonstrated that the gene defect in the most frequent type (CMT1A) is the result of a partial duplication of band 17p11.2. Recent studies suggested that the per...

    journal_title:Nature genetics

    pub_type: Journal Article

    doi:10.1038/ng0692-171

    authors: Timmerman V,Nelis E,Van Hul W,Nieuwenhuijsen BW,Chen KL,Wang S,Ben Othman K,Cullen B,Leach RJ,Hanemann CO

    update_date:1992-06-01 00:00:00

  • Disruption of the uncoupling protein-2 gene in mice reveals a role in immunity and reactive oxygen species production.

    abstract::The gene Ucp2 is a member of a family of genes found in animals and plants, encoding a protein homologous to the brown fat uncoupling protein Ucp1 (refs 1-3). As Ucp2 is widely expressed in mammalian tissues, uncouples respiration and resides within a region of genetic linkage to obesity, a role in energy dissipation ...

    journal_title:Nature genetics

    pub_type: Journal Article

    doi:10.1038/82565

    authors: Arsenijevic D,Onuma H,Pecqueur C,Raimbault S,Manning BS,Miroux B,Couplan E,Alves-Guerra MC,Goubern M,Surwit R,Bouillaud F,Richard D,Collins S,Ricquier D

    update_date:2000-12-01 00:00:00

  • Enabling transparent and collaborative computational analysis of 12 tumor types within The Cancer Genome Atlas.

    abstract::The Cancer Genome Atlas Pan-Cancer Analysis Working Group collaborated on the Synapse software platform to share and evolve data, results and methodologies while performing integrative analysis of molecular profiling data from 12 tumor types. The group's work serves as a pilot case study that provides (i) a template f...

    journal_title:Nature genetics

    pub_type: Journal Article

    doi:10.1038/ng.2761

    authors: Omberg L,Ellrott K,Yuan Y,Kandoth C,Wong C,Kellen MR,Friend SH,Stuart J,Liang H,Margolin AA

    update_date:2013-10-01 00:00:00

  • Toward genome-wide SNP genotyping.

    abstract::Genome-wide association studies with SNP markers are expected to allow identification of genes that underlie complex disorders. Hundreds of thousands of SNP markers will be required for comprehensive genome-wide association studies. The development of microarray-based methods for SNP genotyping on this scale remains a...

    journal_title:Nature genetics

    pub_type: Journal Article

    doi:10.1038/ng1558

    authors: Syvänen AC

    update_date:2005-06-01 00:00:00

  • The gene encoding phosphodiesterase 4D confers risk of ischemic stroke.

    abstract::We previously mapped susceptibility to stroke to chromosome 5q12. Here we finely mapped this locus and tested it for association with stroke. We found the strongest association in the gene encoding phosphodiesterase 4D (PDE4D), especially for carotid and cardiogenic stroke, the forms of stroke related to atheroscleros...

    journal_title:Nature genetics

    pub_type: Journal Article

    doi:10.1038/ng1245

    authors: Gretarsdottir S,Thorleifsson G,Reynisdottir ST,Manolescu A,Jonsdottir S,Jonsdottir T,Gudmundsdottir T,Bjarnadottir SM,Einarsson OB,Gudjonsdottir HM,Hawkins M,Gudmundsson G,Gudmundsdottir H,Andrason H,Gudmundsdottir AS,Sigur

    update_date:2003-10-01 00:00:00

  • A recombination hotspot responsible for two inherited peripheral neuropathies is located near a mariner transposon-like element.

    abstract::The Charcot-Marie-Tooth disease type 1A (CMT1A) duplication and hereditary neuropathy with liability to pressure palsies (HNPP) deletion are reciprocal products of an unequal crossing-over event between misaligned flanking CMT1A-REP repeats. The molecular aetiology of this apparently homologous recombination event was...

    journal_title:Nature genetics

    pub_type: Journal Article

    doi:10.1038/ng0396-288

    authors: Reiter LT,Murakami T,Koeuth T,Pentao L,Muzny DM,Gibbs RA,Lupski JR

    update_date:1996-03-01 00:00:00

  • Isolation of chromosome 21-specific yeast artificial chromosomes from a total human genome library.

    abstract::A new approach for the isolation of chromosome-specific subsets from a human genomic yeast artificial chromosome (YAC) library is described. It is based on the hybridization with an Alu polymerase chain reaction (PCR) probe. We screened a 1.5 genome equivalent YAC library of megabase insert size with Alu PCR products ...

    journal_title:Nature genetics

    pub_type: Journal Article

    doi:10.1038/ng0692-222

    authors: Chumakov IM,Le Gall I,Billault A,Ougen P,Soularue P,Guillou S,Rigault P,Bui H,De Tand MF,Barillot E

    update_date:1992-06-01 00:00:00

  • Identification of enterotoxigenic Escherichia coli (ETEC) clades with long-term global distribution.

    abstract::Enterotoxigenic Escherichia coli (ETEC), a major cause of infectious diarrhea, produce heat-stable and/or heat-labile enterotoxins and at least 25 different colonization factors that target the intestinal mucosa. The genes encoding the enterotoxins and most of the colonization factors are located on plasmids found acr...

    journal_title:Nature genetics

    pub_type: Journal Article

    doi:10.1038/ng.3145

    authors: von Mentzer A,Connor TR,Wieler LH,Semmler T,Iguchi A,Thomson NR,Rasko DA,Joffre E,Corander J,Pickard D,Wiklund G,Svennerholm AM,Sjöling Å,Dougan G

    update_date:2014-12-01 00:00:00

  • DNA methylation loss in late-replicating domains is linked to mitotic cell division.

    abstract::DNA methylation loss occurs frequently in cancer genomes, primarily within lamina-associated, late-replicating regions termed partially methylated domains (PMDs). We profiled 39 diverse primary tumors and 8 matched adjacent tissues using whole-genome bisulfite sequencing (WGBS) and analyzed them alongside 343 addition...

    journal_title:Nature genetics

    pub_type: Journal Article

    doi:10.1038/s41588-018-0073-4

    authors: Zhou W,Dinh HQ,Ramjan Z,Weisenberger DJ,Nicolet CM,Shen H,Laird PW,Berman BP

    update_date:2018-04-01 00:00:00

  • Mutation of DNASE1 in people with systemic lupus erythematosus.

    abstract::Systemic lupus erythematosus (SLE) is a highly prevalent human autoimmune disease that causes progressive glomerulonephritis, arthritis and an erythematoid rash. Mice deficient in deoxyribonuclease I (Dnase1) develop an SLE-like syndrome. Here we describe two patients with a heterozygous nonsense mutation in exon 2 o...

    journal_title:Nature genetics

    pub_type: Journal Article

    doi:10.1038/91070

    authors: Yasutomo K,Horiuchi T,Kagami S,Tsukamoto H,Hashimura C,Urushihara M,Kuroda Y

    update_date:2001-08-01 00:00:00

  • Triplication of a 21q22 region contributes to B cell transformation through HMGN1 overexpression and loss of histone H3 Lys27 trimethylation.

    abstract::Down syndrome confers a 20-fold increased risk of B cell acute lymphoblastic leukemia (B-ALL), and polysomy 21 is the most frequent somatic aneuploidy among all B-ALLs. Yet the mechanistic links between chromosome 21 triplication and B-ALL remain undefined. Here we show that germline triplication of only 31 genes orth...

    journal_title:Nature genetics

    pub_type: Journal Article

    doi:10.1038/ng.2949

    authors: Lane AA,Chapuy B,Lin CY,Tivey T,Li H,Townsend EC,van Bodegom D,Day TA,Wu SC,Liu H,Yoda A,Alexe G,Schinzel AC,Sullivan TJ,Malinge S,Taylor JE,Stegmaier K,Jaffe JD,Bustin M,te Kronnie G,Izraeli S,Harris MH,Steve

    update_date:2014-06-01 00:00:00

  • A literature network of human genes for high-throughput analysis of gene expression.

    abstract::We have carried out automated extraction of explicit and implicit biomedical knowledge from publicly available gene and text databases to create a gene-to-gene co-citation network for 13,712 named human genes by automated analysis of titles and abstracts in over 10 million MEDLINE records. The associations between gen...

    journal_title:Nature genetics

    pub_type: Journal Article

    doi:10.1038/ng0501-21

    authors: Jenssen TK,Laegreid A,Komorowski J,Hovig E

    update_date:2001-05-01 00:00:00

  • Micro RNAs are complementary to 3' UTR sequence motifs that mediate negative post-transcriptional regulation.

    abstract::Micro RNAs are a large family of noncoding RNAs of 21-22 nucleotides whose functions are generally unknown. Here a large subset of Drosophila micro RNAs is shown to be perfectly complementary to several classes of sequence motif previously demonstrated to mediate negative post-transcriptional regulation. These finding...

    journal_title:Nature genetics

    pub_type: Journal Article

    doi:10.1038/ng865

    authors: Lai EC

    update_date:2002-04-01 00:00:00

  • RPA regulates telomerase action by providing Est1p access to chromosome ends.

    abstract::Replication protein A (RPA) is a highly conserved single-stranded DNA-binding protein involved in DNA replication, recombination and repair. We show here that RPA is present at the telomeres of the budding yeast Saccharomyces cerevisiae, with a maximal association in S phase. A truncation of the N-terminal region of R...

    journal_title:Nature genetics

    pub_type: Journal Article

    doi:10.1038/ng1284

    authors: Schramke V,Luciano P,Brevet V,Guillot S,Corda Y,Longhese MP,Gilson E,Géli V

    update_date:2004-01-01 00:00:00

  • The Capsella rubella genome and the genomic consequences of rapid mating system evolution.

    abstract::The shift from outcrossing to selfing is common in flowering plants, but the genomic consequences and the speed at which they emerge remain poorly understood. An excellent model for understanding the evolution of self fertilization is provided by Capsella rubella, which became self compatible <200,000 years ago. We re...

    journal_title:Nature genetics

    pub_type: Journal Article

    doi:10.1038/ng.2669

    authors: Slotte T,Hazzouri KM,Ågren JA,Koenig D,Maumus F,Guo YL,Steige K,Platts AE,Escobar JS,Newman LK,Wang W,Mandáková T,Vello E,Smith LM,Henz SR,Steffen J,Takuno S,Brandvain Y,Coop G,Andolfatto P,Hu TT,Blanchette M,

    update_date:2013-07-01 00:00:00

  • Soluble epoxide hydrolase is a susceptibility factor for heart failure in a rat model of human disease.

    abstract::We aimed to identify genetic variants associated with heart failure by using a rat model of the human disease. We performed invasive cardiac hemodynamic measurements in F2 crosses between spontaneously hypertensive heart failure (SHHF) rats and reference strains. We combined linkage analyses with genome-wide expressio...

    journal_title:Nature genetics

    pub_type: Journal Article

    doi:10.1038/ng.129

    authors: Monti J,Fischer J,Paskas S,Heinig M,Schulz H,Gösele C,Heuser A,Fischer R,Schmidt C,Schirdewan A,Gross V,Hummel O,Maatz H,Patone G,Saar K,Vingron M,Weldon SM,Lindpaintner K,Hammock BD,Rohde K,Dietz R,Cook SA,Sc

    update_date:2008-05-01 00:00:00

  • Patterns of single-nucleotide polymorphisms in candidate genes for blood-pressure homeostasis.

    abstract::Sequence variation in human genes is largely confined to single-nucleotide polymorphisms (SNPs) and is valuable in tests of association with common diseases and pharmacogenetic traits. We performed a systematic and comprehensive survey of molecular variation to assess the nature, pattern and frequency of SNPs in 75 ca...

    journal_title:Nature genetics

    pub_type: Journal Article

    doi:10.1038/10297

    authors: Halushka MK,Fan JB,Bentley K,Hsie L,Shen N,Weder A,Cooper R,Lipshutz R,Chakravarti A

    update_date:1999-07-01 00:00:00

  • Mutations in the gene encoding the lamin B receptor produce an altered nuclear morphology in granulocytes (Pelger-Huët anomaly).

    abstract::Pelger-Huët anomaly (PHA; OMIM *169400) is an autosomal dominant disorder characterized by abnormal nuclear shape and chromatin organization in blood granulocytes. Affected individuals show hypolobulated neutrophil nuclei with coarse chromatin. Presumed homozygous individuals have ovoid neutrophil nuclei, as well as v...

    journal_title:Nature genetics

    pub_type: Journal Article

    doi:10.1038/ng925

    authors: Hoffmann K,Dreger CK,Olins AL,Olins DE,Shultz LD,Lucke B,Karl H,Kaps R,Müller D,Vayá A,Aznar J,Ware RE,Sotelo Cruz N,Lindner TH,Herrmann H,Reis A,Sperling K

    update_date:2002-08-01 00:00:00

  • Ocular albinism: evidence for a defect in an intracellular signal transduction system.

    abstract::G protein-coupled receptors (GPCRs) participate in the most common signal transduction system at the plasma membrane. The wide distribution of heterotrimeric G proteins in the internal membranes suggests that a similar signalling mechanism might also be used at intracellular locations. We provide here structural evide...

    journal_title:Nature genetics

    pub_type: Journal Article

    doi:10.1038/12715

    authors: Schiaffino MV,d'Addio M,Alloni A,Baschirotto C,Valetti C,Cortese K,Puri C,Bassi MT,Colla C,De Luca M,Tacchetti C,Ballabio A

    update_date:1999-09-01 00:00:00

  • Cryptorchidism in mice mutant for Insl3.

    abstract::Impaired testicular descent (cryptorchidism) is one of the most frequent congenital abnormalities in humans, involving 2% of male births. Cryptorchidism can result in infertility and increases risk for development of germ-cell tumours. Testicular descent from abdomen to scrotum occurs in two distinct phases: the trans...

    journal_title:Nature genetics

    pub_type: Journal Article

    doi:10.1038/10364

    authors: Nef S,Parada LF

    update_date:1999-07-01 00:00:00

  • Mouse model of X-linked chronic granulomatous disease, an inherited defect in phagocyte superoxide production.

    abstract::Chronic granulomatous disease (CGD) is a recessive disorder characterized by a defective phagocyte respiratory burst oxidase, life-threatening pyogenic infections and inflammatory granulomas. Gene targeting was used to generate mice with a null allele of the gene involved in X-linked CGD, which encodes the 91 kD subun...

    journal_title:Nature genetics

    pub_type: Journal Article

    doi:10.1038/ng0295-202

    authors: Pollock JD,Williams DA,Gifford MA,Li LL,Du X,Fisherman J,Orkin SH,Doerschuk CM,Dinauer MC

    update_date:1995-02-01 00:00:00

  • A double-stranded RNA binding protein required for activation of repressed messages in mammalian germ cells.

    abstract::Chromatin packaging in mammalian spermatozoa requires an ordered replacement of the somatic histones by two classes of spermatid-specific basic proteins, the transition proteins and the protamines. Temporal expression of transition proteins and protamines during spermatid differentiation is under translational control...

    journal_title:Nature genetics

    pub_type: Journal Article

    doi:10.1038/9684

    authors: Zhong J,Peters AH,Lee K,Braun RE

    update_date:1999-06-01 00:00:00

  • Genome-wide meta-analyses identify multiple loci associated with smoking behavior.

    abstract::Consistent but indirect evidence has implicated genetic factors in smoking behavior. We report meta-analyses of several smoking phenotypes within cohorts of the Tobacco and Genetics Consortium (n = 74,053). We also partnered with the European Network of Genetic and Genomic Epidemiology (ENGAGE) and Oxford-GlaxoSmithKl...

    journal_title:Nature genetics

    pub_type: Journal Article, Meta-Analysis

    doi:10.1038/ng.571

    authors: Tobacco and Genetics Consortium.

    update_date:2010-05-01 00:00:00

  • Extensive allelic variation and ultrashort telomeres in senescent human cells.

    abstract::By imposing a limit on the proliferative lifespan of most somatic cells, telomere erosion represents an innate mechanism for tumor suppression and may contribute to age-related disease. A detailed understanding of the pathways that link shortened telomeres to replicative senescence has been severely hindered by the in...

    journal_title:Nature genetics

    pub_type: Journal Article

    doi:10.1038/ng1084

    authors: Baird DM,Rowson J,Wynford-Thomas D,Kipling D

    update_date:2003-02-01 00:00:00

  • Lineage-dependent gene expression programs influence the immune landscape of colorectal cancer.

    abstract::Immunotherapy for metastatic colorectal cancer is effective only for mismatch repair-deficient tumors with high microsatellite instability that demonstrate immune infiltration, suggesting that tumor cells can determine their immune microenvironment. To understand this cross-talk, we analyzed the transcriptome of 91,10...

    journal_title:Nature genetics

    pub_type: Journal Article, Multicenter Study

    doi:10.1038/s41588-020-0636-z

    authors: Lee HO,Hong Y,Etlioglu HE,Cho YB,Pomella V,Van den Bosch B,Vanhecke J,Verbandt S,Hong H,Min JW,Kim N,Eum HH,Qian J,Boeckx B,Lambrechts D,Tsantoulis P,De Hertogh G,Chung W,Lee T,An M,Shin HT,Joung JG,Jung MH,

    update_date:2020-06-01 00:00:00