Estimating gene gain and loss rates in the presence of error in genome assembly and annotation using CAFE 3.

Abstract:

Current sequencing methods produce large amounts of data, but genome assemblies constructed from these data are often fragmented and incomplete. Incomplete and error-filled assemblies result in many annotation errors, especially in the number of genes present in a genome. This means that methods attempting to estimate rates of gene duplication and loss often will be misled by such errors and that rates of gene family evolution will be consistently overestimated. Here, we present a method that takes these errors into account, allowing one to accurately infer rates of gene gain and loss among genomes even with low assembly and annotation quality. The method is implemented in the newest version of the software package CAFE, along with several other novel features. We demonstrate the accuracy of the method with extensive simulations and reanalyze several previously published data sets. Our results show that errors in genome annotation do lead to higher inferred rates of gene gain and loss but that CAFE 3 sufficiently accounts for these errors to provide accurate estimates of important evolutionary parameters.

journal_name

Mol Biol Evol

authors

Han MV,Thomas GW,Lugo-Martinez J,Hahn MW

doi

10.1093/molbev/mst100

subject

Has Abstract

pub_date

2013-08-01 00:00:00

pages

1987-97

issue

8

eissn

1537-1719

issn

0737-4038

pii

mst100

journal_volume

30

pub_type

Journal Article
  • Low nucleotide diversity for the expanded organelle and nuclear genomes of Volvox carteri supports the mutational-hazard hypothesis.

    abstract::The noncoding-DNA content of organelle and nuclear genomes can vary immensely. Both adaptive and nonadaptive explanations for this variation have been proposed. This study addresses a nonadaptive explanation called the mutational-hazard hypothesis and applies it to the mitochondrial, plastid, and nuclear genomes of th...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msq110

    authors: Smith DR,Lee RW

    update_date:2010-10-01 00:00:00

  • A test for heterotachy using multiple pairs of sequences.

    abstract::Heterotachy is a general term to describe positions that evolve at different rates in different lineages. Heterotachy also can generally be viewed as multivariate rates-across-sites variation, which can be described as randomly drawing rates (or branch lengths) from a multivariate distribution for each branch at each ...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msq346

    authors: Wu J,Susko E

    update_date:2011-05-01 00:00:00

  • Comparison of site-specific rate-inference methods for protein sequences: empirical Bayesian methods are superior.

    abstract::The degree to which an amino acid site is free to vary is strongly dependent on its structural and functional importance. An amino acid that plays an essential role is unlikely to change over evolutionary time. Hence, the evolutionary rate at an amino acid site is indicative of how conserved this site is and, in turn,...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msh194

    authors: Mayrose I,Graur D,Ben-Tal N,Pupko T

    update_date:2004-09-01 00:00:00

  • Evolution of primate ABO blood group genes and their homologous genes.

    abstract::There are three common alleles (A, B, and O) at the human ABO blood group locus. We compared nucleotide sequences of these alleles, and relatively large numbers of nucleotide differences were found among them. These differences correspond to the divergence time of at least a few million years, which is unusually large...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/oxfordjournals.molbev.a025776

    authors: Saitou N,Yamamoto F

    update_date:1997-04-01 00:00:00

  • Specificity of the DNA Mismatch Repair System (MMR) and Mutagenesis Bias in Bacteria.

    abstract::The mutation rate of an organism is influenced by the interaction of evolutionary forces such as natural selection and genetic drift. However, the mutation spectrum (i.e., the frequency distribution of different types of mutations) can be heavily influenced by DNA repair. Using mutation-accumulation lines of the extre...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msy134

    authors: Long H,Miller SF,Williams E,Lynch M

    update_date:2018-10-01 00:00:00

  • Microsatellites as targets of natural selection.

    abstract::The ability to survey polymorphism on a genomic scale has enabled genome-wide scans for the targets of natural selection. Theory that connects patterns of genetic variation to evidence of natural selection most often assumes a diallelic locus and no recurrent mutation. Although these assumptions are suitable to select...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/mss247

    authors: Haasl RJ,Payseur BA

    update_date:2013-02-01 00:00:00

  • Functional evolution of the yeast protein interaction network.

    abstract::Protein interactions are central to most biological processes. We investigated the dynamics of emergence of the protein interaction network of Saccharomyces cerevisiae by mapping origins of proteins on an evolutionary tree. We demonstrate that evolutionary periods are characterized by distinct connectivity levels of t...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msh085

    authors: Kunin V,Pereira-Leal JB,Ouzounis CA

    update_date:2004-07-01 00:00:00

  • Heads or tails: a simple reliability check for multiple sequence alignments.

    abstract::The question of multiple sequence alignment quality has received much attention from developers of alignment methods. Less forthcoming, however, are practical measures for addressing alignment quality issues in real life settings. Here, we present a simple methodology to help identify and quantify the uncertainties in...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msm060

    authors: Landan G,Graur D

    update_date:2007-06-01 00:00:00

  • Tracing the Evolutionary History of Inositol, 1, 4, 5-Trisphosphate Receptor: Insights from Analyses of Capsaspora owczarzaki Ca2+ Release Channel Orthologs.

    abstract::Cellular Ca(2+) homeostasis is tightly regulated and is pivotal to life. Inositol 1,4,5-trisphosphate receptors (IP3Rs) and ryanodine receptors (RyRs) are the major ion channels that regulate Ca(2+) release from intracellular stores. Although these channels have been extensively investigated in multicellular organisms...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msv098

    authors: Alzayady KJ,Sebé-Pedrós A,Chandrasekhar R,Wang L,Ruiz-Trillo I,Yule DI

    update_date:2015-09-01 00:00:00

  • An unusual form of purifying selection in a sperm protein.

    abstract::Protamines are small, highly basic DNA-binding proteins found in the sperm of animals. Interestingly, the proportion of arginine residues in one type of protamine, protamine P1, is about 50% in mammals. Upon closer examination, it was found that both the total number of amino acids and the positions of arginine residu...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/oxfordjournals.molbev.a026307

    authors: Rooney AP,Zhang J,Nei M

    update_date:2000-02-01 00:00:00

  • Variations in ribosomal DNA and mitochondrial DNA among chromosomal species of subterranean mole rats.

    abstract::Restriction site variations in nuclear ribosomal DNA (rDNA) spacers and mitochondrial DNA (mtDNA) were examined in several populations of mole rats with variable numbers of chromosomes, which represented the two superspecies Spalax leucodon (2n = 38, 54, or 62) and Spalax ehrenbergi (2n = 52, 54, 58, or 60). Sequence ...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/oxfordjournals.molbev.a025574

    authors: Suzuki H,Wakana S,Yonekawa H,Moriwaki K,Sakurai S,Nevo E

    update_date:1996-01-01 00:00:00

  • Conditional gene genealogies under strong purifying selection.

    abstract::The ancestral selection graph, conditioned on the allelic types in the sample, is used to obtain a limiting gene genealogical process under strong selection. In an equilibrium, two-allele system with strong selection, neutral gene genealogies are predicted for random samples and for samples containing at most one unfa...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msn209

    authors: Wakeley J

    update_date:2008-12-01 00:00:00

  • Evolution and phylogenetic utility of alignment gaps within intron sequences of three nuclear genes in bumble bees (Bombus).

    abstract::To test whether gaps resulting from sequence alignment contain phylogenetic signal concordant with those of base substitutions, we analyzed the occurrence of indel mutations upon a well-resolved, substitution-based tree for three nuclear genes in bumble bees (Bombus, Apidae: Bombini). The regions analyzed were exon an...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msg007

    authors: Kawakita A,Sota T,Ascher JS,Ito M,Tanaka H,Kato M

    update_date:2003-01-01 00:00:00

  • Functional regulatory divergence of the innate immune system in interspecific Drosophila hybrids.

    abstract::In order to investigate divergence of immune regulation among Drosophila species, we have engaged in a study of innate immune function in F1 hybrids of Drosophila melanogaster and D. simulans. If pathways have diverged between the species such that incompatibilities have arisen between interacting components of the im...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msq146

    authors: Hill-Burns EM,Clark AG

    update_date:2010-11-01 00:00:00

  • A novel abundant family of retroposed elements (DAS-SINEs) in the nine-banded armadillo (Dasypus novemcinctus).

    abstract::About half of the mammalian genome is composed of retroposons. Long interspersed elements (LINEs) and short interspersed elements (SINEs) are the most abundant repetitive elements and account for about 21% and 13% of the human genome, respectively. SINEs have been detected in all major mammalian lineages, except for t...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msi071

    authors: Churakov G,Smit AF,Brosius J,Schmitz J

    update_date:2005-04-01 00:00:00

  • How meaningful are Bayesian support values?

    abstract::In this study, we used an empirical example based on 100 mitochondrial genomes from higher teleost fishes to compare the accuracy of parsimony-based jackknife values with Bayesian support values. Phylogenetic analyses of 366 partitions, using differential taxon and character sampling from the entire data matrix of 100...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msh014

    authors: Simmons MP,Pickett KM,Miya M

    update_date:2004-01-01 00:00:00

  • A scan for human-specific relaxation of negative selection reveals unexpected polymorphism in proteasome genes.

    abstract::Environmental or genomic changes during evolution can relax negative selection pressure on specific loci, permitting high frequency polymorphisms at previously conserved sites. Here, we jointly analyze population genomic and comparative genomic data to search for functional processes showing relaxed negative selection...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/mst098

    authors: Somel M,Wilson Sayres MA,Jordan G,Huerta-Sanchez E,Fumagalli M,Ferrer-Admetlla A,Nielsen R

    update_date:2013-08-01 00:00:00

  • Selection of laboratory wild-type phenotype from natural isolates of Escherichia coli in chemostats.

    abstract::We have followed, in glucose-limited chemostats, the evolution of natural isolates of Escherichia coli possessing maximal growth rates of 0.48-1.43 doublings/h. Under these conditions a rapid-growth phenotype similar to that of standard laboratory wild-type strains was selected so that after 280 generations all of the...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/oxfordjournals.molbev.a040731

    authors: Mikkola R,Kurland CG

    update_date:1992-05-01 00:00:00

  • Multiple nuclear insertions of mitochondrial cytochrome b sequences in callitrichine primates.

    abstract::We report the presence of four nuclear paralogs of a 380-bp segment of cytochrome b in callitrichine primates (marmosets and tamarins). The mitochondrial cytochrome b sequence and each nuclear paralog were obtained from several species, allowing multiple comparisons of rates and patterns of substitution both between m...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/oxfordjournals.molbev.a026388

    authors: Mundy NI,Pissinatti A,Woodruff DS

    update_date:2000-07-01 00:00:00

  • Molecular mechanisms of colicin evolution.

    abstract::This review explores features of the origin and evolution of colicins in Escherichia coli. First, the evolutionary relationships of 16 colicin and colicin-related proteins are inferred from amino acid and DNA sequence comparisons. These comparisons are employed to detail the evolutionary mechanisms involved in the ori...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/oxfordjournals.molbev.a040081

    authors: Riley MA

    update_date:1993-11-01 00:00:00

  • Molecular remodeling of members of the relaxin family during primate evolution.

    abstract::Employing comparative analysis of the cDNA-coding sequences of the unique preprorelaxin of the Afro-lorisiform Galago crassicaudatus and the Malagasy lemur Varecia variegata and the relaxin-like factor (RLF) of G. crassicaudatus, we demonstrated distinct differences in the dynamics of molecular remodeling of both horm...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/oxfordjournals.molbev.a003815

    authors: Klonisch T,Froehlich C,Tetens F,Fischer B,Hombach-Klonisch S

    update_date:2001-03-01 00:00:00

  • The legacy of domestication: accumulation of deleterious mutations in the dog genome.

    abstract::Dogs exhibit more phenotypic variation than any other mammal and are affected by a wide variety of genetic diseases. However, the origin and genetic basis of this variation is still poorly understood. We examined the effect of domestication on the dog genome by comparison with its wild ancestor, the gray wolf. We comp...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msn177

    authors: Cruz F,Vilà C,Webster MT

    update_date:2008-11-01 00:00:00

  • Evaluation of methods for determination of a reconstructed history of gene sequence evolution.

    abstract::With whole-genome sequences being completed at an increasing rate, it is important to develop and assess tools to analyze them. Following annotation of the protein content of a genome, one can compare sequences with previously characterized homologous genes to detect novel functions within specific proteins in the evo...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/oxfordjournals.molbev.a003745

    authors: Liberles DA

    update_date:2001-11-01 00:00:00

  • Bayesian phylogenetic model selection using reversible jump Markov chain Monte Carlo.

    abstract::A common problem in molecular phylogenetics is choosing a model of DNA substitution that does a good job of explaining the DNA sequence alignment without introducing superfluous parameters. A number of methods have been used to choose among a small set of candidate substitution models, such as the likelihood ratio tes...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msh123

    authors: Huelsenbeck JP,Larget B,Alfaro ME

    update_date:2004-06-01 00:00:00

  • Phylogenomic analysis of the uracil-DNA glycosylase superfamily.

    abstract::The spontaneous deamination of cytosine produces uracil mispaired with guanine in DNA, which will produce a mutation, unless repaired. In all domains of life, uracil-DNA glycosylases (UDGs) are responsible for the elimination of uracil from DNA. Thus, UDGs contribute to the integrity of the genetic information and the...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msq318

    authors: Lucas-Lledó JI,Maddamsetti R,Lynch M

    update_date:2011-03-01 00:00:00

  • Life History of the Oldest Lentivirus: Characterization of ELVgv Integrations in the Dermopteran Genome.

    abstract::Endogenous retroviruses are genomic elements formed by germline infiltration by originally exogenous viruses. These molecular fossils provide valuable information about the evolution of the retroviral family. Lentiviruses are an extensively studied genus of retroviruses infecting a broad range of mammals. Despite a we...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msw149

    authors: Hron T,Farkašová H,Padhi A,Pačes J,Elleder D

    update_date:2016-10-01 00:00:00

  • Alu and LINE1 distributions in the human chromosomes: evidence of global genomic organization expressed in the form of power laws.

    abstract::Spatial distribution and clustering of repetitive elements are extensively studied during the last years, as well as their colocalization with other genomic components. Here we investigate the large-scale features of Alu and LINE1 spatial arrangement in the human genome by studying the size distribution of interrepeat...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msm181

    authors: Sellis D,Provata A,Almirantis Y

    update_date:2007-11-01 00:00:00

  • Odorant Receptors for Detecting Flowering Plant Cues Are Functionally Conserved across Moths and Butterflies.

    abstract::Odorant receptors (ORs) are essential for plant-insect interactions. However, despite the global impacts of Lepidoptera (moths and butterflies) as major herbivores and pollinators, little functional data is available about Lepidoptera ORs involved in plant volatile detection. Here, we initially characterized the plant...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msaa300

    authors: Guo M,Du L,Chen Q,Feng Y,Zhang J,Zhang X,Tian K,Cao S,Huang T,Jacquin-Joly E,Wang G,Liu Y

    update_date:2020-11-24 00:00:00

  • First-order correct bootstrap support adjustments for splits that allow hypothesis testing when using maximum likelihood estimation.

    abstract::The most frequent measure of phylogenetic uncertainty for splits is bootstrap support. Although large bootstrap support intuitively suggests that a split in a tree is well supported, it has not been clear how large bootstrap support needs to be to conclude that there is significant evidence that a hypothesized split i...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msq048

    authors: Susko E

    update_date:2010-07-01 00:00:00

  • Rapid Viral Symbiogenesis via Changes in Parasitoid Wasp Genome Architecture.

    abstract::Viral genome integration provides a complex route to biological innovation that has rarely but repeatedly occurred in one of the most diverse lineages of organisms on the planet, parasitoid wasps. We describe a novel endogenous virus in braconid wasps derived from pathogenic alphanudiviruses. Limited to a subset of th...

    journal_title:Molecular biology and evolution

    pub_type: Journal Article

    doi:10.1093/molbev/msy148

    authors: Burke GR,Simmonds TJ,Sharanowski BJ,Geib SM

    update_date:2018-10-01 00:00:00