ModelTeller: Model Selection for Optimal Phylogenetic Reconstruction Using Machine Learning.

Abstract:

Statistical criteria have long been the standard for selecting the best model for phylogenetic reconstruction and downstream statistical inference. Although model selection is regarded as a fundamental step in phylogenetics, existing methods for this task consume computational resources for long processing time, they are not always feasible, and sometimes depend on preliminary assumptions which do not hold for sequence data. Moreover, although these methods are dedicated to revealing the processes that underlie the sequence data, they do not always produce the most accurate trees. Notably, phylogeny reconstruction consists of two related tasks, topology reconstruction and branch-length estimation. It was previously shown that in many cases the most complex model, GTR+I+G, leads to topologies that are as accurate as using existing model selection criteria, but overestimates branch lengths. Here, we present ModelTeller, a computational methodology for phylogenetic model selection, devised within the machine-learning framework, optimized to predict the most accurate nucleotide substitution model for branch-length estimation. We demonstrate that ModelTeller leads to more accurate branch-length inference than current model selection criteria on data sets simulated under realistic processes. ModelTeller relies on a readily implemented machine-learning model and thus the prediction according to features extracted from the sequence data results in a substantial decrease in running time compared with existing strategies. By harnessing the machine-learning framework, we distinguish between features that mostly contribute to branch-length optimization, concerning the extent of sequence divergence, and features that are related to estimates of the model parameters that are important for the selection made by current criteria.

journal_name

Mol Biol Evol

authors

Abadi S,Avram O,Rosset S,Pupko T,Mayrose I

doi

10.1093/molbev/msaa154

subject

Has Abstract

pub_date

2020-11-01 00:00:00

pages

3338-3352

issue

11

eissn

1537-1719

issn

0737-4038

pii

5862639

journal_volume

37

pub_type

Journal Article
  • Bayesian Phylogeography and Pathogenic Characterization of Smallpox Based on HA, ATI, and CrmB Genes.

    abstract::Variola virus is at risk of re-emergence either through accidental release, bioterrorism, or synthetic biology. The use of phylogenetics and phylogeography to support epidemic field response is expected to grow as sequencing technology becomes miniaturized, cheap, and ubiquitous. In this study, we aimed to explore the...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msy153

    authors: Adam DC,Scotch M,MacIntyre CR

    update_date:2018-11-01 00:00:00

  • Comparative Transcriptomics Analyses across Species, Organs, and Developmental Stages Reveal Functionally Constrained lncRNAs.

    abstract::The functionality of long noncoding RNAs (lncRNAs) is disputed. In general, lncRNAs are under weak selective pressures, suggesting that the majority of lncRNAs may be nonfunctional. However, although some surveys showed negligible phenotypic effects upon lncRNA perturbation, key biological roles were demonstrated for ...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msz212

    authors: Darbellay F,Necsulea A

    update_date:2020-01-01 00:00:00

  • Evolution of clonality and polyploidy in a weevil system.

    abstract::The increased interest in asexual organisms calls for in-depth studies of asexual complexes that actively give rise to new clones. We present an extensive molecular study of the Otiorhynchus scaber (Coleoptera, Curculionidae) weevil system. Three forms have traditionally been recognized: diploid sexuals, triploid, and...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msg180

    authors: Stenberg P,Lundmark M,Knutelski S,Saura A

    update_date:2003-10-01 00:00:00

  • Bayesian estimation of past population dynamics in BEAST 1.10 using the Skygrid coalescent model.

    abstract::Inferring past population dynamics over time from heterochronous molecular sequence data is often achieved using the Bayesian Skygrid model, a non-parametric coalescent model that estimates the effective population size over time. Available in BEAST, a cross-platform program for Bayesian analysis of molecular sequence...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msz172

    authors: Hill V,Baele G

    update_date:2019-07-31 00:00:00

  • Extreme reconfiguration of plastid genomes in the angiosperm family Geraniaceae: rearrangements, repeats, and codon usage.

    abstract::Geraniaceae plastid genomes (plastomes) have experienced a remarkable number of genomic changes. The plastomes of Erodium texanum, Geranium palmatum, and Monsonia speciosa were sequenced and compared with other rosids and the previously published Pelargonium hortorum plastome. Geraniaceae plastomes were found to be hi...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msq229

    authors: Guisinger MM,Kuehl JV,Boore JL,Jansen RK

    update_date:2011-01-01 00:00:00

  • Molecular Mechanism of the Two-Component Suicidal Weapon of Neocapritermes taracua Old Workers.

    abstract::In termites, as in many social insects, some individuals specialize in colony defense, developing diverse weaponry. As workers of the termite Neocapritermes taracua (Termitidae: Termitinae) age, their efficiency to perform general tasks decreases, while they accumulate defensive secretions and increase their readiness...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msv273

    authors: Bourguignon T,Šobotník J,Brabcová J,Sillam-Dussès D,Buček A,Krasulová J,Vytisková B,Demianová Z,Mareš M,Roisin Y,Vogel H

    update_date:2016-03-01 00:00:00

  • Glyceraldehyde-3-phosphate dehydrogenase gene diversity in eubacteria and eukaryotes: evidence for intra- and inter-kingdom gene transfer.

    abstract::Cyanobacteria contain up to three highly divergent glyceraldehyde-3-phosphate dehydrogenase (GAPDH) genes: gap1, gap2, and gap3. Genes gap1 and gap2 are closely related at the sequence level to the nuclear genes encoding cytosolic and chloroplast GAPDH of higher plants and have recently been shown to play distinct key...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/oxfordjournals.molbev.a026125

    authors: Figge RM,Schubert M,Brinkmann H,Cerff R

    update_date:1999-04-01 00:00:00

  • The dJ/dS Ratio Test Reveals Hundreds of Novel Putative Cancer Drivers.

    abstract::Computational tools with a balanced sensitivity and specificity in identification of candidate cancer drivers are highly desired. In this study, we propose a new statistical test, namely the dJ/dS ratio test, to compute the relative mutation rate of exon/intron junction sites (dJ) to synonymous sites (dS); observation...

    journal_title:Molecular biology and evolution

    pub_type: Letter

    doi:10.1093/molbev/msv083

    authors: Chen H,Xing K,He X

    update_date:2015-08-01 00:00:00

  • Evolution of trypsinogen activation peptides.

    abstract::The activation peptide of mammalian trypsinogens contains a highly conserved tetra-aspartate sequence (D19-D20-D21-D22) preceding the K23-I24 scissile peptide bond, which is hydrolyzed as the first step in the activation process. Here, we examined the evolution and function of trypsinogen activation peptides through i...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msg183

    authors: Chen JM,Kukor Z,Le Maréchal C,Tóth M,Tsakiris L,Raguénès O,Férec C,Sahin-Tóth M

    update_date:2003-11-01 00:00:00

  • Population growth of human Y chromosomes: a study of Y chromosome microsatellites.

    abstract::We use variation at a set of eight human Y chromosome microsatellite loci to investigate the demographic history of the Y chromosome. Instead of assuming a population of constant size, as in most of the previous work on the Y chromosome, we consider a model which permits a period of recent population growth. We show t...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/oxfordjournals.molbev.a026091

    authors: Pritchard JK,Seielstad MT,Perez-Lezaun A,Feldman MW

    update_date:1999-12-01 00:00:00

  • Lactate dehydrogenase-B cDNA from the teleost Fundulus heteroclitus: evolutionary implications.

    abstract::A cDNA that encodes the heart-type lactate dehydrogenase (LDH-B) from the teleost fish Fundulus heteroclitus was cloned and sequenced. The protein encoded by the cDNA was analyzed in relation to 13 LDH proteins from a variety of taxa. One of the deductions from this analysis is that LDH-B proteins have residues in the...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/oxfordjournals.molbev.a040559

    authors: Crawford DL,Constantino HR,Powers DA

    update_date:1989-07-01 00:00:00

  • Ancestral sequence reconstruction in primate mitochondrial DNA: compositional bias and effect on functional inference.

    abstract::Reconstruction of ancestral DNA and amino acid sequences is an important means of inferring information about past evolutionary events. Such reconstructions suggest changes in molecular function and evolutionary processes over the course of evolution and are used to infer adaptation and convergence. Maximum likelihood...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msh198

    authors: Krishnan NM,Seligmann H,Stewart CB,De Koning AP,Pollock DD

    update_date:2004-10-01 00:00:00

  • A Common Genetic Origin for Early Farmers from Mediterranean Cardial and Central European LBK Cultures.

    abstract::The spread of farming out of the Balkans and into the rest of Europe followed two distinct routes: An initial expansion represented by the Impressa and Cardial traditions, which followed the Northern Mediterranean coastline; and another expansion represented by the LBK (Linearbandkeramik) tradition, which followed the...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msv181

    authors: Olalde I,Schroeder H,Sandoval-Velasco M,Vinner L,Lobón I,Ramirez O,Civit S,García Borja P,Salazar-García DC,Talamo S,María Fullola J,Xavier Oms F,Pedro M,Martínez P,Sanz M,Daura J,Zilhão J,Marquès-Bonet T,Gilbert MT,

    update_date:2015-12-01 00:00:00

  • Accelerated evolution of sites undergoing mRNA editing in plant mitochondria and chloroplasts.

    abstract::The selective constraints influencing mRNA editing in plant organelles are largely unknown. To investigate these, we compared patterns of editing between monocot and dicot mitochondrial mRNA. On average, 24% of sites that are edited from C to U in one species have been substituted during evolution by a genomic T in th...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/oxfordjournals.molbev.a025768

    authors: Shields DC,Wolfe KH

    update_date:1997-03-01 00:00:00

  • Overdispersion of the molecular clock: temporal variation of gene-specific substitution rates in Drosophila.

    abstract::Simple models of molecular evolution assume that sequences evolve by a Poisson process in which nucleotide or amino acid substitutions occur as rare independent events. In these models, the expected ratio of the variance to the mean of substitution counts equals 1, and substitution processes with a ratio greater than ...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msn112

    authors: Bedford T,Hartl DL

    update_date:2008-08-01 00:00:00

  • Distinct roles for SOS1 in the convergent evolution of salt tolerance in Eutrema salsugineum and Schrenkiella parvula.

    abstract::Eutrema salsugineum and Schrenkiella parvula are salt-tolerant relatives of the salt-sensitive species Arabidopsis thaliana. An important component of salt tolerance is the regulation of Na(+) ion homeostasis, which occurs in part through proteins encoded by the Cation/Proton Antiporter-1 (CPA1) gene family. We used a...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msu152

    authors: Jarvis DE,Ryu CH,Beilstein MA,Schumaker KS

    update_date:2014-08-01 00:00:00

  • Small RNAs Reflect Grandparental Environments in Apomictic Dandelion.

    abstract::Plants can show long-term effects of environmental stresses and in some cases a stress "memory" has been reported to persist across generations, potentially mediated by epigenetic mechanisms. However, few documented cases exist of transgenerational effects that persist for multiple generations and it remains unclear i...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msx150

    authors: Morgado L,Preite V,Oplaat C,Anava S,Ferreira de Carvalho J,Rechavi O,Johannes F,Verhoeven KJF

    update_date:2017-08-01 00:00:00

  • Inferring parameters shaping amino acid usage in prokaryotic genomes via Bayesian MCMC methods.

    abstract::Molar content of guanine plus cytosine (G + C) and optimal growth temperature (OGT) are main factors characterizing the frequency distribution of amino acids in prokaryotes. Previous work, using multivariate exploratory methods, has emphasized ascertainment of biological factors underlying variability between genomes,...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msj023

    authors: Naya H,Gianola D,Romero H,Urioste JI,Musto H

    update_date:2006-01-01 00:00:00

  • Target-Driven Positive Selection at Hot Spots of Scorpion Toxins Uncovers Their Potential in Design of Insecticides.

    abstract::Positive selection sites (PSSs), a class of amino acid sites with an excess of nonsynonymous to synonymous substitutions, are indicators of adaptive molecular evolution and have been detected in many protein families involved in a diversity of biological processes by statistical approaches. However, few studies are co...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msw065

    authors: Zhu L,Peigneur S,Gao B,Zhang S,Tytgat J,Zhu S

    update_date:2016-08-01 00:00:00

  • Altered trans-regulatory control of gene expression in multiple anthocyanin genes contributes to adaptive flower color evolution in Mimulus aurantiacus.

    abstract::A fundamental goal in evolutionary biology is to identify the molecular changes responsible for adaptive evolution. In this study, we describe a genetic analysis to determine whether the molecular changes contributing to adaptive flower color divergence in Mimulus aurantiacus affect gene expression or enzymatic activi...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msn268

    authors: Streisfeld MA,Rausher MD

    update_date:2009-02-01 00:00:00

  • Molecular evolution of two paralogous tandemly repeated heterochromatic gene clusters linked to the X and Y chromosomes of Drosophila melanogaster.

    abstract::Here we report the peculiarities of molecular evolution and divergence of paralogous heterochromatic clusters of the testis-expressed X-linked Stellate and Y-linked Su(Ste) tandem repeats. It was suggested that Stellate and Su(Ste) clusters affecting male fertility are the amplified derivatives of the unique euchroma...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/oxfordjournals.molbev.a026348

    authors: Kogan GL,Epstein VN,Aravin AA,Gvozdev VA

    update_date:2000-05-01 00:00:00

  • Population structure in the American oyster as inferred by nuclear gene genealogies.

    abstract::Multiple haplotypes from each of three nuclear loci were isolated and sequenced from geographic populations of the American oyster, Crassostrea virginica. In tests of alternative phylogeographic hypotheses for this species, nuclear gene genealogies constructed for these haplotypes were compared to one another, to a mi...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/oxfordjournals.molbev.a025908

    authors: Hare MP,Avise JC

    update_date:1998-02-01 00:00:00

  • Rapid Viral Symbiogenesis via Changes in Parasitoid Wasp Genome Architecture.

    abstract::Viral genome integration provides a complex route to biological innovation that has rarely but repeatedly occurred in one of the most diverse lineages of organisms on the planet, parasitoid wasps. We describe a novel endogenous virus in braconid wasps derived from pathogenic alphanudiviruses. Limited to a subset of th...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msy148

    authors: Burke GR,Simmonds TJ,Sharanowski BJ,Geib SM

    update_date:2018-10-01 00:00:00

  • Parallel Domestication of the Heading Date 1 Gene in Cereals.

    abstract::Flowering time is one of the key determinants of crop adaptation to local environments during domestication. However, the genetic basis underlying flowering time is yet to be elucidated in most cereals. Although staple cereals, such as rice, maize, wheat, barley, and sorghum, have spread and adapted to a wide range of...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msv148

    authors: Liu H,Liu H,Zhou L,Zhang Z,Zhang X,Wang M,Li H,Lin Z

    update_date:2015-10-01 00:00:00

  • Observations of amino acid gain and loss during protein evolution are explained by statistical bias.

    abstract::The authors of a recent manuscript in "Nature" claim to have discovered "universal trends" of amino acid gain and loss in protein evolution. Here, we show that this universal trend can be simply explained by a bias that is unavoidable with the 3-taxon trees used in the original analysis. We demonstrate that a rigorous...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msl010

    authors: Goldstein RA,Pollock DD

    update_date:2006-07-01 00:00:00

  • Emergent complexity in Myosin V-based organelle inheritance.

    abstract::How is adaptability generated in a system composed of interacting cellular machineries, each with a separate and functionally critical job to perform? The machinery for organelle inheritance is precisely one such system, requiring coordination between robust and ancient cellular modules, including the cell cycle, cyto...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msr264

    authors: Mast FD,Rachubinski RA,Dacks JB

    update_date:2012-03-01 00:00:00

  • Tracing the Archaeal Origins of Eukaryotic Membrane-Trafficking System Building Blocks.

    abstract::In contrast to prokaryotes, eukaryotic cells are characterized by a complex set of internal membrane-bound compartments. A subset of these, and the protein machineries that move material between them, define the membrane-trafficking system (MTS), the emergence of which represents a landmark in eukaryotic evolution. Un...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msw034

    authors: Klinger CM,Spang A,Dacks JB,Ettema TJ

    update_date:2016-06-01 00:00:00

  • Male-driven evolution of mitochondrial and chloroplastidial DNA sequences in plants.

    abstract::Although there is substantial evidence that, in animals, male-inherited neutral DNA evolves at a higher rate than female-inherited DNA, the relative evolutionary rate of male- versus female-inherited DNA has not been investigated in plants. We compared the substitution rates at neutral sites of maternally and paternal...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/oxfordjournals.molbev.a004151

    authors: Whittle CA,Johnston MO

    update_date:2002-06-01 00:00:00

  • Human L1 Transposition Dynamics Unraveled with Functional Data Analysis.

    abstract::Long INterspersed Elements-1 (L1s) constitute >17% of the human genome and still actively transpose in it. Characterizing L1 transposition across the genome is critical for understanding genome evolution and somatic mutations. However, to date, L1 insertion and fixation patterns have not been studied comprehensively. ...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msaa194

    authors: Chen D,Cremona MA,Qi Z,Mitra RD,Chiaromonte F,Makova KD

    update_date:2020-12-16 00:00:00

  • Multiple nuclear insertions of mitochondrial cytochrome b sequences in callitrichine primates.

    abstract::We report the presence of four nuclear paralogs of a 380-bp segment of cytochrome b in callitrichine primates (marmosets and tamarins). The mitochondrial cytochrome b sequence and each nuclear paralog were obtained from several species, allowing multiple comparisons of rates and patterns of substitution both between m...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/oxfordjournals.molbev.a026388

    authors: Mundy NI,Pissinatti A,Woodruff DS

    update_date:2000-07-01 00:00:00