How Should Genes and Taxa be Sampled for Phylogenomic Analyses with Missing Data? An Empirical Study in Iguanian Lizards.

Abstract:

Targeted sequence capture is becoming a widespread tool for generating large phylogenomic data sets to address difficult phylogenetic problems. However, this methodology often generates data sets in which increasing the number of taxa and loci increases amounts of missing data. Thus, a fundamental (but still unresolved) question is whether sampling should be designed to maximize sampling of taxa or genes, or to minimize the inclusion of missing data cells. Here, we explore this question for an ancient, rapid radiation of lizards, the pleurodont iguanians. Pleurodonts include many well-known clades (e.g., anoles, basilisks, iguanas, and spiny lizards) but relationships among families have proven difficult to resolve strongly and consistently using traditional sequencing approaches. We generated up to 4921 ultraconserved elements with sampling strategies including 16, 29, and 44 taxa, from 1179 to approximately 2.4 million characters per matrix and approximately 30% to 60% total missing data. We then compared mean branch support for interfamilial relationships under these 15 different sampling strategies for both concatenated (maximum likelihood) and species tree (NJst) approaches (after showing that mean branch support appears to be related to accuracy). We found that both approaches had the highest support when including loci with up to 50% missing taxa (matrices with ~40-55% missing data overall). Thus, our results show that simply excluding all missing data may be highly problematic as the primary guiding principle for the inclusion or exclusion of taxa and genes. The optimal strategy was somewhat different for each approach, a pattern that has not been shown previously. For concatenated analyses, branch support was maximized when including many taxa (44) but fewer characters (1.1 million). For species-tree analyses, branch support was maximized with minimal taxon sampling (16) but many loci (4789 of 4921). We also show that the choice of these sampling strategies can be critically important for phylogenomic analyses, since some strategies lead to demonstrably incorrect inferences (using the same method) that have strong statistical support. Our preferred estimate provides strong support for most interfamilial relationships in this important but phylogenetically challenging group.
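
The idea of including loci only up to a chosen proportion of missing taxa can be illustrated with a small sketch. The Python below is not the authors' pipeline. It assumes a hypothetical mapping from locus name to the set of taxa sequenced for that locus, keeps loci whose proportion of missing taxa is at or below a cutoff (0.5 would correspond to the "up to 50% missing taxa" case), and then estimates the share of missing cells in the resulting concatenated matrix. Only wholly absent taxa are counted as missing; gaps inside alignments are ignored. All function and variable names are illustrative.

```python
from typing import Dict, Iterable, List, Set


def filter_loci_by_missing_taxa(
    locus_taxa: Dict[str, Set[str]],
    all_taxa: Set[str],
    max_missing_fraction: float,
) -> List[str]:
    """Return loci whose fraction of missing taxa is <= max_missing_fraction."""
    if not all_taxa:
        raise ValueError("all_taxa must contain at least one taxon")
    n_taxa = len(all_taxa)
    kept = []
    for locus, present in locus_taxa.items():
        # Count only taxa that belong to the chosen taxon sample.
        missing_fraction = 1 - len(present & all_taxa) / n_taxa
        if missing_fraction <= max_missing_fraction:
            kept.append(locus)
    return kept


def missing_cell_fraction(
    locus_taxa: Dict[str, Set[str]],
    locus_lengths: Dict[str, int],
    all_taxa: Set[str],
    loci: Iterable[str],
) -> float:
    """Fraction of empty cells in a concatenated matrix built from `loci`.

    Assumption: a taxon absent from a locus contributes a full row of
    missing characters for that locus; within-alignment gaps are ignored.
    """
    loci = list(loci)
    total_cells = sum(locus_lengths[l] for l in loci) * len(all_taxa)
    if total_cells == 0:
        return 0.0
    filled_cells = sum(
        locus_lengths[l] * len(locus_taxa[l] & all_taxa) for l in loci
    )
    return 1 - filled_cells / total_cells
```

Running the filter at several cutoffs and several taxon samples, and recording the missing-cell fraction for each resulting matrix, would give a grid of sampling strategies comparable in spirit to the ones compared in the abstract.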

journal_name

Syst Biol

journal_title

Systematic biology

authors

Streicher JW,Schulte JA 2nd,Wiens JJ

doi

10.1093/sysbio/syv058

subject

Has Abstract

pub_date

2016-01-01 00:00:00

pages

128-45

issue

1

eissn

1076-836X

issn

1063-5157

pii

syv058

journal_volume

65

pub_type

Journal Article
  • Multistate characters and diet shifts: evolution of Erotylidae (Coleoptera).

    abstract::The dominance of angiosperms has played a direct role in the diversification of insects, especially Coleoptera. The shift to angiosperm feeding from other diets is likely to have increased the rate of speciation in Phytophaga. However, Phytophaga is only one of many hyperdiverse lineages of beetles and studies of host...

    journal_title:Systematic biology

    pub_type: Journal Article

    doi:10.1080/10635150701211844

    authors: Leschen RA,Buckley TR

    update_date:2007-02-01 00:00:00

  • A Framework for Resolving Cryptic Species: A Case Study from the Lizards of the Australian Wet Tropics.

    abstract::As we collect range-wide genetic data for morphologically-defined species, we increasingly unearth evidence for cryptic diversity. Delimiting this cryptic diversity is challenging, both because the divergences span a continuum and because the lack of overt morphological differentiation suggests divergence has proceede...

    journal_title:Systematic biology

    pub_type: Journal Article

    doi:10.1093/sysbio/syy026

    authors: Singhal S,Hoskin CJ,Couper P,Potter S,Moritz C

    update_date:2018-11-01 00:00:00

  • Exploration of Plastid Phylogenomic Conflict Yields New Insights into the Deep Relationships of Leguminosae.

    abstract::Phylogenomic analyses have helped resolve many recalcitrant relationships in the angiosperm tree of life, yet phylogenetic resolution of the backbone of the Leguminosae, one of the largest and most economically and ecologically important families, remains poor due to generally limited molecular data and incomplete tax...

    journal_title:Systematic biology

    pub_type: Journal Article

    doi:10.1093/sysbio/syaa013

    authors: Zhang R,Wang YH,Jin JJ,Stull GW,Bruneau A,Cardoso D,De Queiroz LP,Moore MJ,Zhang SD,Chen SY,Wang J,Li DZ,Yi TS

    update_date:2020-07-01 00:00:00

  • Partial sequence homogenization in the 5S multigene families may generate sequence chimeras and spurious results in phylogenetic reconstructions.

    abstract::Multigene families have provided opportunities for evolutionary biologists to assess molecular evolution processes and phylogenetic reconstructions at deep and shallow systematic levels. However, the use of these markers is not free of technical and analytical challenges. Many evolutionary studies that used the nuclea...

    journal_title:Systematic biology

    pub_type: Journal Article

    doi:10.1093/sysbio/syt101

    authors: Galián JA,Rosato M,Rosselló JA

    update_date:2014-03-01 00:00:00

  • Why the phylogenetic regression appears robust to tree misspecification.

    abstract::The phylogenetic comparative method uses estimates of evolutionary relationships to explicitly model the covariance structure of interspecific data. By accounting for common ancestry, the coevolution between 2 or more traits, as a response to one another or to environmental variables, can be studied without confoundin...

    journal_title:Systematic biology

    pub_type: Journal Article

    doi:10.1093/sysbio/syq098

    authors: Stone EA

    update_date:2011-05-01 00:00:00

  • Phylogenetic Trees and Networks Can Serve as Powerful and Complementary Approaches for Analysis of Genomic Data.

    abstract::Genomic data have had a profound impact on nearly every biological discipline. In systematics and phylogenetics, the thousands of loci that are now being sequenced can be analyzed under the multispecies coalescent model (MSC) to explicitly account for gene tree discordance due to incomplete lineage sorting (ILS). Howe...

    journal_title:Systematic biology

    pub_type: Journal Article

    doi:10.1093/sysbio/syz056

    authors: Blair C,Ané C

    update_date:2020-05-01 00:00:00

  • Distribution and phylogeny of Penelope-like elements in eukaryotes.

    abstract::Penelope-like elements (PLEs) are a relatively little studied class of eukaryotic retroelements, distinguished by the presence of the GIY-YIG endonuclease domain, the ability of some representatives to retain introns, and the similarity of PLE-encoded reverse transcriptases to telomerases. Although these retrotranspos...

    journal_title:Systematic biology

    pub_type: Journal Article

    doi:10.1080/10635150601077683

    authors: Arkhipova IR

    update_date:2006-12-01 00:00:00

  • More characters or more taxa for a robust phylogeny--case study from the coffee family (Rubiaceae).

    abstract::Using different data sets mainly from the plant family Rubiaceae, but in parts also from the Apocynaceae, Asteraceae, Lardizabalaceae, Saxifragaceae, and Solanaceae, we have investigated the effect of number of characters, number of taxa, and kind of data on bootstrap values within phylogenetic trees. The percentage o...

    journal_title:Systematic biology

    pub_type: Journal Article

    doi:10.1080/106351599260085

    authors: Bremer B,Jansen RK,Oxelman B,Backlund M,Lantz H,Kim KJ

    update_date:1999-09-01 00:00:00

  • Multiple data sets, high homoplasy, and the phylogeny of softshell turtles (Testudines: Trionychidae).

    abstract::We present a phylogenetic hypothesis and novel, rank-free classification for all extant species of softshell turtles (Testudines: Trionychidae). Our data set included DNA sequence data from two mitochondrial protein-coding genes and an approximately 1-kb nuclear intron for 23 of 26 recognized species, and 59 previously ...

    journal_title:Systematic biology

    pub_type: Journal Article

    doi:10.1080/10635150490503053

    authors: Engstrom TN,Shaffer HB,McCord WP

    update_date:2004-10-01 00:00:00

  • Integration of Anatomy Ontologies and Evo-Devo Using Structured Markov Models Suggests a New Framework for Modeling Discrete Phenotypic Traits.

    abstract::Modeling discrete phenotypic traits for either ancestral character state reconstruction or morphology-based phylogenetic inference suffers from ambiguities of character coding, homology assessment, dependencies, and selection of adequate models. These drawbacks occur because trait evolution is driven by two key proces...

    journal_title:Systematic biology

    pub_type: Journal Article

    doi:10.1093/sysbio/syz005

    authors: Tarasov S

    update_date:2019-09-01 00:00:00

  • Phylogeny and biogeography of dolichoderine ants: effects of data partitioning and relict taxa on historical inference.

    abstract::Ants (Hymenoptera: Formicidae) are conspicuous organisms in most terrestrial ecosystems, often attaining high levels of abundance and diversity. In this study, we investigate the evolutionary history of a major clade of ants, the subfamily Dolichoderinae, whose species frequently achieve ecological dominance in ant co...

    journal_title:Systematic biology

    pub_type: Journal Article

    doi:10.1093/sysbio/syq012

    authors: Ward PS,Brady SG,Fisher BL,Schultz TR

    update_date:2010-05-01 00:00:00

  • New heuristic methods for joint species delimitation and species tree inference.

    abstract::Species delimitation and species tree inference are difficult problems in cases of recent divergence, especially when different loci have different histories. This paper quantifies the difficulty of jointly finding the division of samples to species and estimating a species tree without constraining the possible assig...

    journal_title:Systematic biology

    pub_type: Journal Article

    doi:10.1093/sysbio/syp077

    authors: O'Meara BC

    update_date:2010-01-01 00:00:00

  • 18S ribosomal RNA and tetrapod phylogeny.

    abstract::Previous phylogenetic analyses of tetrapod 18S ribosomal RNA (rRNA) sequences support the grouping of birds with mammals, whereas other molecular data, and morphological and paleontological data favor the grouping of birds with crocodiles. The 18S rRNA gene has consequently been considered odd, serving as "definitive ...

    journal_title:Systematic biology

    pub_type: Journal Article

    doi:10.1080/10635150390196948

    authors: Xia X,Xie Z,Kjer KM

    update_date:2003-06-01 00:00:00

  • The evolutionary root of flowering plants.

    abstract::Correct rooting of the angiosperm radiation is both challenging and necessary for understanding the origins and evolution of physiological and phenotypic traits in flowering plants. The problem is known to be difficult due to the large genetic distance separating flowering plants from other seed plants and the sparse ...

    journal_title:Systematic biology

    pub_type: Journal Article

    doi:10.1093/sysbio/sys070

    authors: Goremykin VV,Nikiforova SV,Biggs PJ,Zhong B,Delange P,Martin W,Woetzel S,Atherton RA,McLenachan PA,Lockhart PJ

    update_date:2013-01-01 00:00:00

  • Bias in tree searches and its consequences for measuring group supports.

    abstract::When doing a bootstrap analysis with a single tree saved per pseudoreplicate, biased search algorithms may influence support values more than actual properties of the data set. Two methods commonly used for finding phylogenetic trees consist of randomizing the input order of species in multiple addition sequences foll...

    journal_title:Systematic biology

    pub_type: Journal Article

    doi:10.1093/sysbio/syu051

    authors: Goloboff PA,Simmons MP

    update_date:2014-11-01 00:00:00

  • Efficient exploration of the space of reconciled gene trees.

    abstract::Gene trees record the combination of gene-level events, such as duplication, transfer and loss (DTL), and species-level events, such as speciation and extinction. Gene tree-species tree reconciliation methods model these processes by drawing gene trees into the species tree using a series of gene and species-level eve...

    journal_title:Systematic biology

    pub_type: Journal Article

    doi:10.1093/sysbio/syt054

    authors: Szöllõsi GJ,Rosikiewicz W,Boussau B,Tannier E,Daubin V

    update_date:2013-11-01 00:00:00

  • Phylogeny of Eunicida (Annelida) and exploring data congruence using a partition addition bootstrap alteration (PABA) approach.

    abstract::Even though relationships within Annelida are poorly understood, Eunicida is one of only a few major annelid lineages well supported by morphology. The seven recognized eunicid families possess sclerotized jaws that include mandibles and a maxillary apparatus. The maxillary apparatuses vary in shape and number of elem...

    journal_title:Systematic biology

    pub_type: Journal Article

    doi:10.1080/10635150500354910

    authors: Struck TH,Purschke G,Halanych KM

    update_date:2006-02-01 00:00:00

  • A species tree for the Australo-Papuan Fairy-wrens and allies (Aves: Maluridae).

    abstract::We explored the efficacy of species tree methods at the family level in birds, using the Australo-Papuan Fairy-wrens (Passeriformes: Maluridae) as a model system. Fairy-wrens of the genus Malurus are known for high intensities of sexual selection, resulting in some cases in rapid speciation. This history suggests that...

    journal_title:Systematic biology

    pub_type: Journal Article

    doi:10.1093/sysbio/syr101

    authors: Lee JY,Joseph L,Edwards SV

    update_date:2012-03-01 00:00:00

  • The comparative method is not macroevolution: across-species evidence for within-species process.

    abstract::It is common for studies that employ the comparative method for the study of adaptation, i.e. documentation of potentially adaptive across-species patterns of trait-environment or trait-trait correlation, to be designated as "macroevolutionary." Authors are justified in using "macroevolution" in this way by appeal to ...

    journal_title:Systematic biology

    pub_type: Journal Article

    doi:10.1093/sysbio/syaa086

    authors: Olson ME

    update_date:2021-01-07 00:00:00

  • A novel test for host-symbiont codivergence indicates ancient origin of fungal endophytes in grasses.

    abstract::Significant phylogenetic codivergence between plant or animal hosts (H) and their symbionts or parasites (P) indicates the importance of their interactions on evolutionary time scales. However, valid and realistic methods to test for codivergence are not fully developed. One of the systems where possible codivergence ...

    journal_title:Systematic biology

    pub_type: Journal Article

    doi:10.1080/10635150802172184

    authors: Schardl CL,Craven KD,Speakman S,Stromberg A,Lindstrom A,Yoshida R

    update_date:2008-06-01 00:00:00

  • The Phylogeny of Rickettsia Using Different Evolutionary Signatures: How Tree-Like is Bacterial Evolution?

    abstract::Rickettsia is a genus of intracellular bacteria whose hosts and transmission strategies are both impressively diverse, and this is reflected in a highly dynamic genome. Some previous studies have described the evolutionary history of Rickettsia as non-tree-like, due to incongruity between phylogenetic reconstructions ...

    journal_title:Systematic biology

    pub_type: Journal Article

    doi:10.1093/sysbio/syv084

    authors: Murray GG,Weinert LA,Rhule EL,Welch JJ

    update_date:2016-03-01 00:00:00

  • Radiation of extant cetaceans driven by restructuring of the oceans.

    abstract::The remarkable fossil record of whales and dolphins (Cetacea) has made them an exemplar of macroevolution. Although their overall adaptive transition from terrestrial to fully aquatic organisms is well known, this is not true for the radiation of modern whales. Here, we explore the diversification of extant cetaceans ...

    journal_title:Systematic biology

    pub_type: Journal Article

    doi:10.1093/sysbio/syp060

    authors: Steeman ME,Hebsgaard MB,Fordyce RE,Ho SY,Rabosky DL,Nielsen R,Rahbek C,Glenner H,Sørensen MV,Willerslev E

    update_date:2009-12-01 00:00:00

  • Diversification, Introgression, and Rampant Cytonuclear Discordance in Rocky Mountains Chipmunks (Sciuridae: Tamias).

    abstract::Evidence from natural systems suggests that hybridization between animal species is more common than traditionally thought, but the overall contribution of introgression to standing genetic variation within species remains unclear for most animal systems. Here, we use targeted exon-capture to sequence thousands of nuc...

    journal_title:Systematic biology

    pub_type: Journal Article

    doi:10.1093/sysbio/syaa085

    authors: Sarver BAJ,Herrera ND,Sneddon D,Hunter SS,Settles ML,Kronenberg Z,Demboski JR,Good JM,Sullivan J

    update_date:2021-01-07 00:00:00

  • Independent contrasts and PGLS regression estimators are equivalent.

    abstract::We prove that the slope parameter of the ordinary least squares regression of phylogenetically independent contrasts (PICs) conducted through the origin is identical to the slope parameter of the method of generalized least squares (GLSs) regression under a Brownian motion model of evolution. This equivalence has seve...

    journal_title:Systematic biology

    pub_type: Journal Article

    doi:10.1093/sysbio/syr118

    authors: Blomberg SP,Lefevre JG,Wells JA,Waterhouse M

    update_date:2012-05-01 00:00:00

  • Maximum Likelihood Implementation of an Isolation-with-Migration Model for Three Species.

    abstract::We develop a maximum likelihood (ML) method for estimating migration rates between species using genomic sequence data. A species tree is used to accommodate the phylogenetic relationships among three species, allowing for migration between the two sister species, while the third species is used as an out-group. A Mar...

    journal_title:Systematic biology

    pub_type: Journal Article

    doi:10.1093/sysbio/syw063

    authors: Dalquen DA,Zhu T,Yang Z

    update_date:2017-05-01 00:00:00

  • Rumbling Orchids: How To Assess Divergent Evolution Between Chloroplast Endosymbionts and the Nuclear Host.

    abstract::Phylogenetic relationships inferred from multilocus organellar and nuclear DNA data are often difficult to resolve because of evolutionary conflicts among gene trees. However, conflicting or "outlier" associations (i.e., linked pairs of "operational terminal units" in two phylogenies) among these data sets often provi...

    journal_title:Systematic biology

    pub_type: Journal Article

    doi:10.1093/sysbio/syv070

    authors: Pérez-Escobar OA,Balbuena JA,Gottschling M

    update_date:2016-01-01 00:00:00

  • Phylogenomics, Origin and Diversification of Anthozoans (Phylum Cnidaria).

    abstract::Anthozoan cnidarians (corals and sea anemones) include some of the world's most important foundation species, capable of building massive reef complexes that support entire ecosystems. Although previous molecular phylogenetic analyses have revealed widespread homoplasy of the morphological characters traditionally use...

    journal_title:Systematic biology

    pub_type: Journal Article

    doi:10.1093/sysbio/syaa103

    authors: McFadden CS,Quattrini AM,Brugler MR,Cowman PF,Dueñas LF,Kitahara MV,Paz-García DA,Reimer JD,Rodríguez E

    update_date:2021-01-28 00:00:00

  • Whole Genome Shotgun Phylogenomics Resolves the Pattern and Timing of Swallowtail Butterfly Evolution.

    abstract::Evolutionary relationships have remained unresolved in many well-studied groups, even though advances in next-generation sequencing and analysis, using approaches such as transcriptomics, anchored hybrid enrichment, or ultraconserved elements, have brought systematics to the brink of whole genome phylogenomics. Recent...

    journal_title:Systematic biology

    pub_type: Journal Article

    doi:10.1093/sysbio/syz030

    authors: Allio R,Scornavacca C,Nabholz B,Clamens AL,Sperling FA,Condamine FL

    update_date:2020-01-01 00:00:00

  • Novel versus unsupported clades: assessing the qualitative support for clades in MRP supertrees.

    abstract::Matrix representation with parsimony (MRP) supertree construction has been criticized because the supertree may specify clades that are contradicted by every source tree contributing to it. Such unsupported clades may also occur using other supertree methods; however, their incidence is largely unknown. In this study,...

    journal_title:Systematic biology

    pub_type: Journal Article

    doi:

    authors: Bininda-Emonds OR

    update_date:2003-12-01 00:00:00

  • Evolution of a RNA polymerase gene family in Silene (Caryophyllaceae)-incomplete concerted evolution and topological congruence among paralogues.

    abstract::Four low-copy nuclear DNA intron regions from the second largest subunits of the RNA polymerase gene family (RPA2, RPB2, RPD2a, and RPD2b), the internal transcribed spacers (ITSs) from the nuclear ribosomal regions, and the rps16 intron from the chloroplast were sequenced and used in a phylogenetic analysis of 29 spec...

    journal_title:Systematic biology

    pub_type: Journal Article

    doi:10.1080/10635150490888840

    authors: Popp M,Oxelman B

    update_date:2004-12-01 00:00:00