Abstract:
Accurate gene tree-species tree reconciliation is fundamental to inferring the evolutionary history of a gene family. However, although it has long been appreciated that population-related effects such as incomplete lineage sorting (ILS) can dramatically affect the gene tree, many of the most popular reconciliation methods consider discordance only due to gene duplication and loss (and sometimes horizontal gene transfer). Methods that do model ILS are either highly parameterized or consider a restricted set of histories, thus limiting their applicability and accuracy. To address these challenges, we present a novel algorithm, DLCpar, for inferring a most parsimonious (MP) history of a gene family in the presence of duplications, losses, and ILS. Our algorithm relies on a new reconciliation structure, the labeled coalescent tree (LCT), that simultaneously describes coalescent and duplication-loss history. We show that the LCT representation enables an exhaustive and efficient search over the space of reconciliations and that, for most gene families, the least common ancestor (LCA) mapping is an optimal solution for the species mapping between the gene tree and species tree in an MP LCT. Applying our algorithm to a variety of clades, including flies, fungi, and primates, as well as to simulated phylogenies, we achieve high accuracy, comparable to sophisticated probabilistic reconciliation methods, at reduced run time and with far fewer parameters. These properties enable inferences of the complex evolution of gene families across a broad range of species and large data sets.
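The abstract refers to the least common ancestor (LCA) mapping between a gene tree and a species tree. As a rough illustration only, the sketch below computes a plain LCA mapping on a toy example. It is not DLCpar and does not build a labeled coalescent tree; the function names, the nested-tuple gene tree, the child-to-parent dictionary for the species tree, and the toy species and gene labels are all assumptions made for this example.

```python
def ancestors(node, parent):
    """Return the path from node up to the root, inclusive."""
    path = [node]
    while parent[node] is not None:
        node = parent[node]
        path.append(node)
    return path


def lca(a, b, parent):
    """Least common ancestor of species nodes a and b."""
    seen = set(ancestors(a, parent))
    for node in ancestors(b, parent):
        if node in seen:
            return node
    raise ValueError("nodes are not in the same tree")


def lca_mapping(gene_tree, parent, leaf_species, mapping=None):
    """Map every gene-tree node to a species-tree node (post-order).

    gene_tree is a leaf name (str) or a pair (left, right) of subtrees.
    """
    if mapping is None:
        mapping = {}
    if isinstance(gene_tree, str):
        # A gene leaf maps to the species it was sampled from.
        mapping[gene_tree] = leaf_species[gene_tree]
    else:
        left, right = gene_tree
        lca_mapping(left, parent, leaf_species, mapping)
        lca_mapping(right, parent, leaf_species, mapping)
        # An internal gene node maps to the LCA of its children's species.
        mapping[gene_tree] = lca(mapping[left], mapping[right], parent)
    return mapping


# Hypothetical species tree ((human, chimp)hc, mouse)root as child -> parent.
species_parent = {
    "root": None, "hc": "root", "mouse": "root",
    "human": "hc", "chimp": "hc",
}
leaf_species = {"h1": "human", "c1": "chimp", "m1": "mouse"}

congruent = (("h1", "c1"), "m1")   # gene tree matching the species tree
discordant = (("h1", "m1"), "c1")  # gene tree that disagrees with it

congruent_map = lca_mapping(congruent, species_parent, leaf_species)
discordant_map = lca_mapping(discordant, species_parent, leaf_species)
```

In the congruent tree the inner gene node maps to the human-chimp ancestor, while in the discordant tree both internal gene nodes map to the species root. Explaining that kind of discordance (duplication and loss, ILS, or both) is the task of the reconciliation method itself, which this sketch does not attempt.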
journal_name: Genome Res
journal_title: Genome research
authors: Wu YC, Rasmussen MD, Bansal MS, Kellis M
doi: 10.1101/gr.161968.113
subject: Has Abstract
pub_date: 2014-03-01 00:00:00
pages: 475-86
issue: 3
eissn: 1088-9051
issn: 1549-5469
pii: gr.161968.113
journal_volume: 24
pub_type: Journal article
Related literature
GENOME RESEARCH literature collection
abstract::The Polycomb group (PcG) and Trithorax group (TrxG) of proteins are required for stable and heritable maintenance of repressed and active gene expression states. Their antagonistic function on gene control, repression for PcG and activity for TrxG, is mediated by binding to chromatin and subsequent epigenetic modifica...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.114348.110
update_date:2011-02-01 00:00:00
abstract::It is widely accepted that newly arisen duplicate gene pairs experience an altered selective regime that is often manifested as an increase in the rate of protein sequence evolution. Many details about the nature of the rate acceleration remain unknown, however, including its typical magnitude and duration, and whethe...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.6341207
update_date:2008-01-01 00:00:00
abstract::Clusters of functionally related genes can be disrupted by a single copy number variant (CNV). We demonstrate that the simultaneous disruption of multiple functionally related genes is a frequent and significant characteristic of de novo CNVs in patients with developmental disorders (P = 1 × 10^-3). Using three diffe...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.184325.114
update_date:2015-06-01 00:00:00
abstract::Through comparative studies of the model organism Arabidopsis thaliana and its close relative Brassica oleracea, we have identified conserved regions that represent potentially functional sequences overlooked by previous Arabidopsis genome annotation methods. A total of 454,274 whole genome shotgun sequences covering ...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.3176505
update_date:2005-04-01 00:00:00
abstract::Facio-scapulo-humeral dystrophy (FSHD), a muscular hereditary disease with a prevalence of 1 in 20,000, is caused by a partial deletion of a subtelomeric repeat array on chromosome 4q. Earlier, we demonstrated the existence in the vicinity of the D4Z4 repeat of a nuclear matrix attachment site, FR-MAR, efficient in no...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.6620908
update_date:2008-01-01 00:00:00
abstract::We have performed detrended DNA walks on whole prokaryotic genomes, on noncoding sequences and, separately, on each position in codons of coding sequences. Our method enables us to distinguish between the mutational pressure associated with replication and the mutational pressure associated with transcription and othe...
journal_title:Genome research
pub_type: Journal article
doi:
update_date:1999-05-01 00:00:00
abstract::Large scale gene perturbation experiments generate information about the number of genes whose activity is directly or indirectly affected by a gene perturbation. From this information, one can numerically estimate coarse structural network features such as the total number of direct regulatory interactions and the nu...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.193902
update_date:2002-02-01 00:00:00
abstract::By analyzing 1,780,295 5'-end sequences of human full-length cDNAs derived from 164 kinds of oligo-cap cDNA libraries, we identified 269,774 independent positions of transcriptional start sites (TSSs) for 14,628 human RefSeq genes. These TSSs were clustered into 30,964 clusters that were separated from each other by m...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.4039406
update_date:2006-01-01 00:00:00
abstract::The rate of transcription elongation plays an important role in the timing of expression of full-length transcripts as well as in the regulation of alternative splicing. In this study, we coupled Bru-seq technology with 5,6-dichlorobenzimidazole 1-β-D-ribofuranoside (DRB) to estimate the elongation rates of over 2000 ...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.171405.113
update_date:2014-06-01 00:00:00
abstract::Mosaic mutations present in the germline have important implications for reproductive risk and disease transmission. We previously demonstrated a phenomenon occurring in the male germline, whereby specific mutations arising spontaneously in stem cells (spermatogonia) lead to clonal expansion, resulting in elevated mut...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.239186.118
update_date:2018-12-01 00:00:00
abstract::Little is known about the rate of emergence of de novo genes, what their initial properties are, and how they spread in populations. We examined wild yeast populations (Saccharomyces paradoxus) to characterize the diversity and turnover of intergenic ORFs over short evolutionary timescales. We find that hundreds of in...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.239822.118
update_date:2019-06-01 00:00:00
abstract::A DNA mutation detection protocol able to identify and characterize a previously unknown change in a given sequence in a rapid, efficient, sensitive, and inexpensive manner is required to take advantage of the resources now available to researchers through the genome sequencing projects. We have developed a method bas...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.gr-1578r
update_date:2002-09-01 00:00:00
abstract::The higher-order structural organization and dynamics of the chromosomes play a central role in gene regulation. To explore this structure-function relationship, it is necessary to directly visualize genomic elements in living cells. Genome imaging based on the CRISPR system is a powerful approach but has limited appl...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.260018.119
update_date:2020-09-01 00:00:00
abstract::Developments in microarray and high-throughput sequencing (HTS) technologies have resulted in a rapid expansion of research into epigenomic changes that occur in normal development and in the progression of disease, such as cancer. Not surprisingly, copy number variation (CNV) has a direct effect on HTS read densities...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.139055.112
update_date:2012-12-01 00:00:00
abstract::To elucidate the role of exon shuffling in shaping the complexity of the human genome/proteome, we have systematically analyzed intron phase distributions in the coding sequence of human protein domains. We found that introns at the boundaries of domains show high excess of symmetrical phase combinations (i.e., 0-0, 1...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.520702
update_date:2002-11-01 00:00:00
abstract::We report the validation of a new assay for typing single nucleotide polymorphisms (SNPs) that takes advantage of the 3'-to-5' exonuclease proofreading activity of many DNA polymerases. The assay uses one or more primers labeled on the 3' nucleotide base, and can be implemented in a variety of formats including a one-...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.939903
update_date:2003-05-01 00:00:00
abstract::Eukaryotic translation initiation involves preinitiation ribosomal complex 5'-to-3' directional probing of mRNA for codons suitable for starting protein synthesis. The recognition of codons as starts depends on the codon identity and on its immediate nucleotide context known as Kozak context. When the context is weak ...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.257352.119
update_date:2020-07-01 00:00:00
abstract::Gene duplication and alternative splicing are important sources of proteomic diversity. Despite research indicating that gene duplication and alternative splicing are negatively correlated, the evolutionary relationship between the two remains unclear. One manner in which alternative splicing and gene duplication may ...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.184473.114
update_date:2015-05-01 00:00:00
abstract::Somatic L1 retrotransposition events have been shown to occur in epithelial cancers. Here, we attempted to determine how early somatic L1 insertions occurred during the development of gastrointestinal (GI) cancers. Using L1-targeted resequencing (L1-seq), we studied different stages of four colorectal cancers arising ...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.196238.115
update_date:2015-10-01 00:00:00
abstract::WebLogo generates sequence logos, graphical representations of the patterns within a multiple sequence alignment. Sequence logos provide a richer and more precise description of sequence similarity than consensus sequences and can rapidly reveal significant features of the alignment otherwise difficult to perceive. Ea...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.849004
update_date:2004-06-01 00:00:00
abstract::Mammalian genomes are partitioned into domains that replicate in a defined temporal order. These domains can replicate at similar times in all cell types (constitutive) or at cell type-specific times (developmental). Genome-wide chromatin conformation capture (Hi-C) has revealed sub-megabase topologically associating ...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.183699.114
update_date:2015-08-01 00:00:00
abstract::The need for expeditious and inexpensive methods for high-throughput DNA sequencing has been highlighted by the accelerated pace of genome DNA sequencing over the past year. At the Joint Genome Institute, the throughput in terms of high-quality bases per day has increased over 20-fold during the past 18 mo, reaching a...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.167801
update_date:2001-07-01 00:00:00
abstract::Rhesus macaque is an Old World monkey that shared a common ancestor with human ∼25 Myr ago and is an important animal model for human disease studies. A deep understanding of its genetics is therefore required for both biomedical and evolutionary studies. Among structural variants, inversions represent a driving force...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.265322.120
update_date:2020-11-01 00:00:00
abstract::Variation in the composition of the human oral microbiome in health and disease has been observed. We have characterized inter- and intra-individual variation of microbial communities of 107 individuals in one of the largest cohorts to date (264 saliva samples), using culture-independent 16S rRNA pyrosequencing. We ex...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.140608.112
update_date:2012-11-01 00:00:00
abstract::Previous approaches to mutation detection in mRNA from the neurofibromatosis 1 (NF1) locus have required the PCR amplification of five or more overlapping cDNA segments to screen the entire 8.5-kb open reading frame (ORF). Systematically, these assays do not detect deletions that span the region of overlap (usually 1-...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.6.1.58
update_date:1996-01-01 00:00:00
abstract::The Y centromere sequence of house mouse, Mus musculus, remains unknown despite our otherwise significant knowledge of the genome sequence of this important mammalian model organism. Here, we report the complete molecular characterization of the C57BL/6J chromosome Y centromere, which comprises a highly diverged minor...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.092080.109
update_date:2009-12-01 00:00:00
abstract::Drosophila melanogaster plays an important role in molecular, genetic, and genomic studies of heredity, development, metabolism, behavior, and human disease. The initial reference genome sequence reported more than a decade ago had a profound impact on progress in Drosophila research, and improving the accuracy and co...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.185579.114
update_date:2015-03-01 00:00:00
abstract::Alternative splicing is a major mechanism for gene product regulation in many multicellular organisms. By using different exon combinations, some coding regions can encode amino acids in multiple reading frames in different transcripts. Here we performed a systematic search through a set of high-quality human transcri...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.4246506
update_date:2006-02-01 00:00:00
abstract::Up-frameshift protein 1 (UPF1) is an ATP-dependent RNA helicase that has essential roles in RNA surveillance and in post-transcriptional gene regulation by promoting the degradation of mRNAs. Previous studies revealed that UPF1 is associated with the 3' untranslated region (UTR) of target mRNAs via as-yet-unknown sequ...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.206060.116
update_date:2017-03-01 00:00:00
abstract::CRISPR/Cas9-mediated targeted mutagenesis allows efficient generation of loss-of-function alleles in zebrafish. To date, this technology has been primarily used to generate genetic knockout animals. Nevertheless, the study of the function of certain loci might require tight spatiotemporal control of gene inactivation....
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.196170.115
update_date:2016-05-01 00:00:00