Accurate gene-tree reconstruction by learning gene- and species-specific substitution rates across multiple complete genomes.

Abstract:

Comparative genomics provides a general methodology for discovering functional DNA elements and understanding their evolution. The availability of many related genomes enables more powerful analyses, but requires rigorous phylogenetic methods to resolve orthologous genes and regions. Here, we use 12 recently sequenced Drosophila genomes and nine fungal genomes to address the problem of accurate gene-tree reconstruction across many complete genomes. We show that existing phylogenetic methods that treat each gene tree in isolation show large-scale inaccuracies, largely due to insufficient phylogenetic information in individual genes. However, we find that gene trees exhibit common properties that can be exploited for evolutionary studies and accurate phylogenetic reconstruction. Evolutionary rates can be decoupled into gene-specific and species-specific components, which can be learned across complete genomes. We develop a phylogenetic reconstruction methodology that exploits these properties and achieves significantly higher accuracy, addressing the species-level heterotachy and enabling studies of gene evolution in the context of species evolution.

journal_name

Genome Res

journal_title

Genome research

authors

Rasmussen MD,Kellis M

doi

10.1101/gr.7105007

subject

Has Abstract

pub_date

2007-12-01 00:00:00

pages

1932-42

issue

12

eissn

1549-5469

issn

1088-9051

pii

gr.7105007

journal_volume

17

pub_type

Journal Article

  • Whole-genome sequence assembly for mammalian genomes: Arachne 2.

    abstract::We previously described the whole-genome assembly program Arachne, presenting assemblies of simulated data for small to mid-sized genomes. Here we describe algorithmic adaptations to the program, allowing for assembly of mammalian-size genomes, and also improving the assembly of smaller genomes. Three principal change...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.828403

    authors: Jaffe DB,Butler J,Gnerre S,Mauceli E,Lindblad-Toh K,Mesirov JP,Zody MC,Lander ES

    update_date:2003-01-01 00:00:00

  • Immune signatures correlate with L1 retrotransposition in gastrointestinal cancers.

    abstract::Long interspersed nuclear element-1 (LINE-1 or L1) retrotransposons are normally suppressed in somatic tissues mainly due to DNA methylation and antiviral defense. However, the mechanism to suppress L1s may be disrupted in cancers, thus allowing L1s to act as insertional mutagens and cause genomic rearrangement and in...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.231837.117

    authors: Jung H,Choi JK,Lee EA

    update_date:2018-08-01 00:00:00

  • Dissecting transcription regulatory pathways through a new bacterial one-hybrid reporter system.

    abstract::Sequence-specific DNA-binding transcription factors have widespread biological significance in the regulation of gene expression. However, in lower prokaryotes and eukaryotic metazoans, it is usually difficult to find transcription regulatory factors that recognize specific target promoters. To address this, we have d...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.086595.108

    authors: Guo M,Feng H,Zhang J,Wang W,Wang Y,Li Y,Gao C,Chen H,Feng Y,He ZG

    update_date:2009-07-01 00:00:00

  • HERC2 rs12913832 modulates human pigmentation by attenuating chromatin-loop formation between a long-range enhancer and the OCA2 promoter.

    abstract::Pigmentation of skin, eye, and hair reflects some of the most evident common phenotypes in humans. Several candidate genes for human pigmentation are identified. The SNP rs12913832 has strong statistical association with human pigmentation. It is located within an intron of the nonpigment gene HERC2, 21 kb upstream of...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.128652.111

    authors: Visser M,Kayser M,Palstra RJ

    update_date:2012-03-01 00:00:00

  • A platform for curated products from novel open reading frames prompts reinterpretation of disease variants.

    abstract::Recent evidence from proteomics and deep massively parallel sequencing studies have revealed that eukaryotic genomes contain substantial numbers of as-yet-uncharacterized open reading frames (ORFs). We define these uncharacterized ORFs as novel ORFs (nORFs). nORFs in humans are mostly under 100 codons and are found in...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.263202.120

    authors: Neville MDC,Kohze R,Erady C,Meena N,Hayden M,Cooper DN,Mort M,Prabakaran S

    update_date:2021-01-19 00:00:00

  • Delineation of key regulatory elements identifies points of vulnerability in the mitogen-activated signaling network.

    abstract::Drug development efforts against cancer are often hampered by the complex properties of signaling networks. Here we combined the results of an RNAi screen targeting the cellular signaling machinery, with graph theoretical analysis to extract the core modules that process both mitogenic and oncogenic signals to drive c...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.116145.110

    authors: Jailkhani N,Ravichandran S,Hegde SR,Siddiqui Z,Mande SC,Rao KV

    update_date:2011-12-01 00:00:00

  • Improved discovery of genetic interactions using CRISPRiSeq across multiple environments.

    abstract::Large-scale genetic interaction (GI) screens in yeast have been invaluable for our understanding of molecular systems biology and for characterizing novel gene function. Owing in part to the high costs and long experiment times required, a preponderance of GI data has been generated in a single environmental condition...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.246603.118

    authors: Jaffe M,Dziulko A,Smith JD,St Onge RP,Levy SF,Sherlock G

    update_date:2019-04-01 00:00:00

  • Genetic analysis of complex traits in the emerging Collaborative Cross.

    abstract::The Collaborative Cross (CC) is a mouse recombinant inbred strain panel that is being developed as a resource for mammalian systems genetics. Here we describe an experiment that uses partially inbred CC lines to evaluate the genetic properties and utility of this emerging resource. Genome-wide analysis of the incipien...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.111310.110

    authors: Aylor DL,Valdar W,Foulds-Mathes W,Buus RJ,Verdugo RA,Baric RS,Ferris MT,Frelinger JA,Heise M,Frieman MB,Gralinski LE,Bell TA,Didion JD,Hua K,Nehrenberg DL,Powell CL,Steigerwalt J,Xie Y,Kelada SN,Collins FS,Yang IV

    update_date:2011-08-01 00:00:00

  • A GC-rich sequence feature in the 3' UTR directs UPF1-dependent mRNA decay in mammalian cells.

    abstract::Up-frameshift protein 1 (UPF1) is an ATP-dependent RNA helicase that has essential roles in RNA surveillance and in post-transcriptional gene regulation by promoting the degradation of mRNAs. Previous studies revealed that UPF1 is associated with the 3' untranslated region (UTR) of target mRNAs via as-yet-unknown sequ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.206060.116

    authors: Imamachi N,Salam KA,Suzuki Y,Akimitsu N

    update_date:2017-03-01 00:00:00

  • An EST-enriched comparative map of Brassica oleracea and Arabidopsis thaliana.

    abstract::A detailed comparative map of Brassica oleracea and Arabidopsis thaliana has been established based largely on mapping of Arabidopsis ESTs in two Arabidopsis and four Brassica populations. Based on conservative criteria for inferring synteny, "one to one correspondence" between Brassica and Arabidopsis chromosomes acc...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.10.6.776

    authors: Lan TH,DelMonte TA,Reischmann KP,Hyman J,Kowalski SP,McFerson J,Kresovich S,Paterson AH

    update_date:2000-06-01 00:00:00

  • Natural genetic variation in C. elegans identified genomic loci controlling metabolite levels.

    abstract::Metabolic homeostasis is sustained by complex biological networks that respond to nutrient availability. Genetic and environmental factors may disrupt this equilibrium, leading to metabolic disorders, including obesity and type 2 diabetes. To identify the genetic factors controlling metabolism, we performed quantitati...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.232322.117

    authors: Gao AW,Sterken MG,Uit de Bos J,van Creij J,Kamble R,Snoek BL,Kammenga JE,Houtkooper RH

    update_date:2018-09-01 00:00:00

  • Genome-scale cloning and expression of individual open reading frames using topoisomerase I-mediated ligation.

    abstract::The in vitro cloning of DNA molecules traditionally uses PCR amplification or site-specific restriction endonucleases to generate linear DNA inserts with defined termini and requires DNA ligase to covalently join those inserts to vectors with the corresponding ends. We have used the properties of Vaccinia DNA topoisom...

    journal_title:Genome research

    pub_type: Journal Article

    doi:

    authors: Heyman JA,Cornthwaite J,Foncerrada L,Gilmore JR,Gontang E,Hartman KJ,Hernandez CL,Hood R,Hull HM,Lee WY,Marcil R,Marsh EJ,Mudd KM,Patino MJ,Purcell TJ,Rowland JJ,Sindici ML,Hoeffler JP

    update_date:1999-04-01 00:00:00

  • Centromere repositioning.

    abstract::Primate pericentromeric regions recently have been shown to exhibit extraordinary evolutionary plasticity. In this paper we report an additional peculiar feature of these regions that we discovered while analyzing, by FISH, the evolutionary conservation of primate phylogenetic chromosome IX. If the position of the cen...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.9.12.1184

    authors: Montefalcone G,Tempesta S,Rocchi M,Archidiacono N

    update_date:1999-12-01 00:00:00

  • Mutation detection using mass spectrometric separation of tiny oligonucleotide fragments.

    abstract::A DNA mutation detection protocol able to identify and characterize a previously unknown change in a given sequence in a rapid, efficient, sensitive, and inexpensive manner is required to take advantage of the resources now available to researchers through the genome sequencing projects. We have developed a method bas...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.gr-1578r

    authors: Elso C,Toohey B,Reid GE,Poetter K,Simpson RJ,Foote SJ

    update_date:2002-09-01 00:00:00

  • Annotated expressed sequence tags and cDNA microarrays for studies of brain and behavior in the honey bee.

    abstract::To accelerate the molecular analysis of behavior in the honey bee (Apis mellifera), we created expressed sequence tag (EST) and cDNA microarray resources for the bee brain. Over 20,000 cDNA clones were partially sequenced from a normalized (and subsequently subtracted) library generated from adult A. mellifera brains....

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.5302

    authors: Whitfield CW,Band MR,Bonaldo MF,Kumar CG,Liu L,Pardinas JR,Robertson HM,Soares MB,Robinson GE

    update_date:2002-04-01 00:00:00

  • A pooling-based approach to mapping genetic variants associated with DNA methylation.

    abstract::DNA methylation is an epigenetic modification that plays a key role in gene regulation. Previous studies have investigated its genetic basis by mapping genetic variants that are associated with DNA methylation at specific sites, but these have been limited to microarrays that cover <2% of the genome and cannot account...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.183749.114

    authors: Kaplow IM,MacIsaac JL,Mah SM,McEwen LM,Kobor MS,Fraser HB

    update_date:2015-06-01 00:00:00

  • lobSTR: A short tandem repeat profiler for personal genomes.

    abstract::Short tandem repeats (STRs) have a wide range of applications, including medical genetics, forensics, and genetic genealogy. High-throughput sequencing (HTS) has the potential to profile hundreds of thousands of STR loci. However, mainstream bioinformatics pipelines are inadequate for the task. These pipelines treat S...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.135780.111

    authors: Gymrek M,Golan D,Rosset S,Erlich Y

    update_date:2012-06-01 00:00:00

  • Time course regulatory analysis based on paired expression and chromatin accessibility data.

    abstract::A time course experiment is a widely used design in the study of cellular processes such as differentiation or response to stimuli. In this paper, we propose time course regulatory analysis (TimeReg) as a method for the analysis of gene regulatory networks based on paired gene expression and chromatin accessibility da...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.257063.119

    authors: Duren Z,Chen X,Xin J,Wang Y,Wong WH

    update_date:2020-04-01 00:00:00

  • Nonrandom domain organization of the Arabidopsis genome at the nuclear periphery.

    abstract::The nuclear space is not a homogeneous biochemical environment. Many studies have demonstrated that the transcriptional activity of a gene is linked to its positioning within the nuclear space. Following the discovery of lamin-associated domains (LADs), which are transcriptionally repressed chromatin regions, the nonr...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.215186.116

    authors: Bi X,Cheng YJ,Hu B,Ma X,Wu R,Wang JW,Liu C

    update_date:2017-07-01 00:00:00

  • Detection of numerous Y chromosome biallelic polymorphisms by denaturing high-performance liquid chromatography.

    abstract::Y chromosome haplotypes are particularly useful in deciphering human evolutionary history because they accentuate the effects of drift, migration, and range expansion. Significant acceleration of Y biallelic marker discovery and subsequent typing involving heteroduplex detection has been achieved by implementing an in...

    journal_title:Genome research

    pub_type: Letter

    doi:10.1101/gr.7.10.996

    authors: Underhill PA,Jin L,Lin AA,Mehdi SQ,Jenkins T,Vollrath D,Davis RW,Cavalli-Sforza LL,Oefner PJ

    update_date:1997-10-01 00:00:00

  • Evaluation of predicted network modules in yeast metabolism using NMR-based metabolite profiling.

    abstract::Genome-scale metabolic models promise important insights into cell function. However, the definition of pathways and functional network modules within these models, and in the biochemical literature in general, is often based on intuitive reasoning. Although mathematical methods have been proposed to identify modules,...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.5662207

    authors: Bundy JG,Papp B,Harmston R,Browne RA,Clayson EM,Burton N,Reece RJ,Oliver SG,Brindle KM

    update_date:2007-04-01 00:00:00

  • Retroelement distributions in the human genome: variations associated with age and proximity to genes.

    abstract::Remnants of more than 3 million transposable elements, primarily retroelements, comprise nearly half of the human genome and have generated much speculation concerning their evolutionary significance. We have exploited the draft human genome sequence to examine the distributions of retroelements on a genome-wide scale...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.388902

    authors: Medstrand P,van de Lagemaat LN,Mager DL

    update_date:2002-10-01 00:00:00

  • Natural genetic variation in yeast longevity.

    abstract::The genetics of aging in the yeast Saccharomyces cerevisiae has involved the manipulation of individual genes in laboratory strains. We have instituted a quantitative genetic analysis of the yeast replicative lifespan by sampling the natural genetic variation in a wild yeast isolate. Haploid segregants from a cross be...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.136549.111

    authors: Stumpferl SW,Brand SE,Jiang JC,Korona B,Tiwari A,Dai J,Seo JG,Jazwinski SM

    update_date:2012-10-01 00:00:00

  • A virome-wide clonal integration analysis platform for discovering cancer viral etiology.

    abstract::Oncoviral infection is responsible for 12%-15% of cancer in humans. Convergent evidence from epidemiology, pathology, and oncology suggests that new viral etiologies for cancers remain to be discovered. Oncoviral profiles can be obtained from cancer genome sequencing data; however, widespread viral sequence contaminat...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.242529.118

    authors: Chen X,Kost J,Sulovari A,Wong N,Liang WS,Cao J,Li D

    update_date:2019-05-01 00:00:00

  • Screening of gene-associated polymorphisms by use of in-gel competitive reassociation and EST (cDNA) array hybridization.

    abstract::In-gel competitive reassociation (IGCR) is a method of differential subtraction to enrich polymorphic DNA restriction fragments between two DNA samples without probes or specific sequence information. Here, we show that by combining IGCR and expressed sequence tags (EST) array hybridization, polymorphic DNA fragments ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.434103

    authors: Gotoh K,Oishi M

    update_date:2003-03-01 00:00:00

  • Spotted long oligonucleotide arrays for human gene expression analysis.

    abstract::DNA microarrays produced by deposition (or 'spotting') of a single long oligonucleotide probe for each gene may be an attractive alternative to other types of arrays. We produced spotted oligonucleotide arrays using two large collections of approximately 70-mer probes, and used these arrays to analyze gene expression i...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.1048803

    authors: Barczak A,Rodriguez MW,Hanspers K,Koth LL,Tai YC,Bolstad BM,Speed TP,Erle DJ

    update_date:2003-07-01 00:00:00

  • Genome-wide identification of conserved regulatory function in diverged sequences.

    abstract::Plasticity of gene regulatory encryption can permit DNA sequence divergence without loss of function. Functional information is preserved through conservation of the composition of transcription factor binding sites (TFBS) in a regulatory element. We have developed a method that can accurately identify pairs of functi...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.119016.110

    authors: Taher L,McGaughey DM,Maragh S,Aneas I,Bessling SL,Miller W,Nobrega MA,McCallion AS,Ovcharenko I

    update_date:2011-07-01 00:00:00

  • Relationship between histone modifications and transcription factor binding is protein family specific.

    abstract::The very small fraction of putative binding sites (BSs) that are occupied by transcription factors (TFs) in vivo can be highly variable across different cell types. This observation has been partly attributed to changes in chromatin accessibility and histone modification (HM) patterns surrounding BSs. Previous studies...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.220079.116

    authors: Xin B,Rohs R

    update_date:2018-01-11 00:00:00

  • Determinants of CpG islands: expression in early embryo and isochore structure.

    abstract::In an attempt to understand the origin of CpG islands (CGIs) in mammalian genomes, we have studied their location and structure according to the expression pattern of genes and to the G + C content of isochores in which they are embedded. We show that CGIs located over the transcription start site (named start CGIs) a...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.174501

    authors: Ponger L,Duret L,Mouchiroud D

    update_date:2001-11-01 00:00:00

  • TATA is a modular component of synthetic promoters.

    abstract::The expression of most genes is regulated by multiple transcription factors. The interactions between transcription factors produce complex patterns of gene expression that are not always obvious from the arrangement of cis-regulatory elements in a promoter. One critical element of promoters is the TATA box, the docki...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.106732.110

    authors: Mogno I,Vallania F,Mitra RD,Cohen BA

    update_date:2010-10-01 00:00:00