Abstract:
Identifying genes in the genomic context is central to a cell's ability to interpret the genome. Yet, in general, the signals used to define eukaryotic genes are poorly described. Here, we derived simple classifiers that identify where transcription will initiate and terminate using nucleic acid sequence features detectable by the yeast cell, which we integrate into a Unified Model (UM) that models transcription as a whole. The cis-elements that denote where transcription initiates function primarily through nucleosome depletion, and, using a synthetic promoter system, we show that most of these elements are sufficient to initiate transcription in vivo. Hrp1 binding sites are the major characteristic of terminators; these binding sites are often clustered in terminator regions and can terminate transcription bidirectionally. The UM predicts global transcript structure by modeling transcription of the genome using a hidden Markov model whose emissions are the outputs of the initiation and termination classifiers. We validated the novel predictions of the UM with available RNA-seq data and tested it further by directly comparing the transcript structure predicted by the model to the transcription generated by the cell for synthetic DNA segments of random design. We show that the UM identifies transcription start sites more accurately than the initiation classifier alone, indicating that the relative arrangement of promoter and terminator elements influences their function. Our model presents a concrete description of how the cell defines transcript units, explains the existence of nongenic transcripts, and provides insight into genome evolution.
journal_name: Genome Res
journal_title: Genome research
authors: de Boer CG, van Bakel H, Tsui K, Li J, Morris QD, Nislow C, Greenblatt JF, Hughes TR
doi: 10.1101/gr.164327.113
subject: Has Abstract
pub_date: 2014-01-01 00:00:00
pages: 154-66
issue: 1
eissn: 1088-9051
issn: 1549-5469
pii: gr.164327.113
journal_volume: 24
pub_type: journal article

Related literature
GENOME RESEARCH literature collection
abstract::Here we describe software tools for the automated detection of DNA restriction fragments resolved on agarose fingerprinting gels. We present a mathematical model for the location and shape of the restriction fragments as a function of fragment size, with model parameters determined empirically from "marker" lanes cont...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.904303
updated:2003-05-01 00:00:00
abstract::The human pseudoautosomal region 1 (PAR1) is essential for meiotic pairing and recombination, and its deletion causes male sterility. Comparative studies of human and mouse pseudoautosomal genes are valuable in charting the evolution of this interesting region, but have been limited by the paucity of genes conserved b...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.197001
updated:2001-12-01 00:00:00
abstract::Theory is developed for the process of sequencing randomly selected large-insert clones. Genome size, library depth, clone size, and clone distribution are considered relevant properties and perfect overlap detection for contig assembly is assumed. Genome-specific and nonrandom effects are neglected. Order of magnitud...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.gr-1339r
updated:2001-02-01 00:00:00
abstract::The most polymorphic part of the human genome, the MHC, encodes over 160 proteins of diverse function. Half of them, including the HLA class I and II genes, are directly involved in immune responses. Consequently, the MHC region strongly associates with numerous diseases and clinical therapies. Notoriously, the MHC re...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.213538.116
updated:2017-05-01 00:00:00
abstract::Differences in gene repertoire among bacterial genomes are usually ascribed to gene loss or to lateral gene transfer from unrelated cellular organisms. However, most bacteria contain large numbers of ORFans, that is, annotated genes that are restricted to a particular genome and that possess no known homologs. The uni...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.2231904
updated:2004-06-01 00:00:00
abstract::Next-generation sequencing is a powerful approach for discovering genetic variation. Sensitive variant calling and haplotype inference from population sequencing data remain challenging. We describe methods for high-quality discovery, genotyping, and phasing of SNPs for low-coverage (approximately 5×) sequencing of po...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.146084.112
updated:2013-05-01 00:00:00
abstract::We assessed the content, structure, and distribution of segmental duplications (≥90% sequence identity, ≥5 kb length) within the published version of the Rattus norvegicus genome assembly (v.3.1). The overall fraction of duplicated sequence within the rat assembly (2.92%) is greater than that of the mouse (1...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.1907504
updated:2004-04-01 00:00:00
abstract::When transcription is to the right of the promoter, the "top," mRNA-synonymous strand of DNA tends to be purine-rich. When transcription is to the left of the promoter, the top, mRNA-template strand tends to be pyrimidine-rich. This transcription-direction rule suggests that there has been an evolutionary selection pr...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.10.2.228
updated:2000-02-01 00:00:00
abstract::Double minutes (dmin) and homogeneously staining regions (hsr) are the cytogenetic hallmarks of genomic amplification in cancer. Different mechanisms have been proposed to explain their genesis. Recently, our group showed that the MYC-containing dmin in leukemia cases arise by excision and amplification (episome model...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.106252.110
updated:2010-09-01 00:00:00
abstract::DNA methylation plays key roles in diverse biological processes such as X chromosome inactivation, transposable element repression, genomic imprinting, and tissue-specific gene expression. Sequencing-based DNA methylation profiling provides an unprecedented opportunity to map and compare complete DNA methylomes. This ...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.156539.113
updated:2013-09-01 00:00:00
abstract::Interindividual variability in response to chemicals and drugs is a common regulatory concern. It is assumed that xenobiotic-induced adverse reactions have a strong genetic basis, but many mechanism-based investigations have not been successful in identifying susceptible individuals. While recent advances in pharmacog...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.090241.108
updated:2009-09-01 00:00:00
abstract::The apicomplexan Cryptosporidium parvum is one of the most prevalent protozoan parasites of humans. We report the physical mapping of the genome of the Iowa isolate, sequencing and analysis of chromosome 6, and approximately 0.9 Mbp of sequence sampled from the remainder of the genome. To construct a robust physical m...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.1555203
updated:2003-08-01 00:00:00
abstract::By analyzing 1,780,295 5'-end sequences of human full-length cDNAs derived from 164 kinds of oligo-cap cDNA libraries, we identified 269,774 independent positions of transcriptional start sites (TSSs) for 14,628 human RefSeq genes. These TSSs were clustered into 30,964 clusters that were separated from each other by m...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.4039406
updated:2006-01-01 00:00:00
abstract::The exponential growth of pathogen nucleic acid sequences available in public domain databases has invited their direct use in pathogen detection, identification, and surveillance strategies. DNA microarray technology has offered the potential for the direct DNA sequence analysis of a broad spectrum of pathogens of in...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.4337206
updated:2006-04-01 00:00:00
abstract::Chronic bacterial infections of the lung are the leading cause of morbidity and mortality in cystic fibrosis patients. Tracking bacterial evolution during chronic infections can provide insights into how host selection pressures, including immune responses and therapeutic interventions, shape bacterial genomes. We carri...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.213363.116
updated:2017-04-01 00:00:00
abstract::Transposable elements (TEs) proliferate within the genome of their host, which responds by silencing them epigenetically. Much is known about the mechanisms of silencing in plants, particularly the role of siRNAs in guiding DNA methylation. In contrast, little is known about siRNA targeting patterns along the length o...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.193763.115
updated:2016-02-01 00:00:00
abstract::The current Caenorhabditis elegans genomic annotation has many genes organized in operons. Using directionally stitched promoterGFP methodology, we have conducted the largest survey to date on the regulatory regions of annotated C. elegans operons and identified 65, over 25% of those studied, with internal promoters. ...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.6824707
updated:2007-10-01 00:00:00
abstract::We have determined the complete sequence of 951,695 bp from the class I region of H2, the mouse major histocompatibility complex (Mhc) from strain 129/Sv (haplotype bc). The sequence contains 26 genes. The sequence spans from the last 50 kb of the H2-T region, including 2 class I genes and 3 class I pseudogenes, and i...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.975303
updated:2003-04-01 00:00:00
abstract::The functional classification of genes on a genome-wide scale is now in its infancy, and we make a first attempt to assess existing methods and identify sources of error. To this end, we compared two independent efforts for associating proteins with functions, one implemented by FlyBase and the other by PANTHER at Cel...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.771603
updated:2003-09-01 00:00:00
abstract::It is widely accepted that newly arisen duplicate gene pairs experience an altered selective regime that is often manifested as an increase in the rate of protein sequence evolution. Many details about the nature of the rate acceleration remain unknown, however, including its typical magnitude and duration, and whethe...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.6341207
updated:2008-01-01 00:00:00
abstract::We compared the genome of the nematode Caenorhabditis elegans to 13% of that of Caenorhabditis briggsae, identifying 252 conserved segments along their chromosomes. We detected 517 chromosomal rearrangements, with the ratio of translocations to inversions to transpositions being approximately 1:1:2. We estimate that t...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.172702
updated:2002-06-01 00:00:00
abstract::Interferons and interleukin-10 are involved in key aspects of the host defence mechanisms. Human chromosome 21 harbors the interferon/interleukin-10 receptor gene cluster linked to the GART gene. This cluster includes both components of the interferon alpha/beta-receptor (IFNAR1 and IFNAR2) and the second components o...
journal_title:Genome research
pub_type: journal article
doi:
updated:1999-03-01 00:00:00
abstract::Noncoding RNA (ncRNA) constitutes a significant portion of the mammalian transcriptome. Emerging evidence suggests that it regulates gene expression in cis or trans by modulating the chromatin structure. To uncover the functional role of ncRNA in chromatin organization, we deep sequenced chromatin-associated RNAs (CAR...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.103473.109
updated:2010-07-01 00:00:00
abstract::Candida albicans is a commensal fungus of the human gastrointestinal tract and a prevalent opportunistic pathogen. To examine diversity within this species, extensive genomic and phenotypic analyses were performed on 21 clinical C. albicans isolates. Genomic variation was evident in the form of polymorphisms, copy num...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.174623.114
updated:2015-03-01 00:00:00
abstract::A time course experiment is a widely used design in the study of cellular processes such as differentiation or response to stimuli. In this paper, we propose time course regulatory analysis (TimeReg) as a method for the analysis of gene regulatory networks based on paired gene expression and chromatin accessibility da...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.257063.119
updated:2020-04-01 00:00:00
abstract::Gene duplication and alternative splicing are important sources of proteomic diversity. Despite research indicating that gene duplication and alternative splicing are negatively correlated, the evolutionary relationship between the two remains unclear. One manner in which alternative splicing and gene duplication may ...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.184473.114
updated:2015-05-01 00:00:00
abstract::In the search for common genetic variants that contribute to prevalent human diseases, patterns of linkage disequilibrium (LD) among linked markers should be considered when selecting SNPs. Genotyping efficiency can be increased by choosing tagging SNPs (tagSNPs) in LD with other SNPs. However, it remains to be seen w...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.4138406
updated:2006-03-01 00:00:00
abstract::Many missense substitutions are identified in single nucleotide polymorphism (SNP) data and large-scale random mutagenesis projects. Each amino acid substitution potentially affects protein function. We have constructed a tool that uses sequence homology to predict whether a substitution affects protein function. SIFT...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.176601
updated:2001-05-01 00:00:00
abstract::Elucidating the pathophysiology and molecular attributes of common disorders as well as developing targeted and effective treatments hinges on the study of the relevant cell type and tissues. Pancreatic beta cells within the islets of Langerhans are centrally involved in the pathogenesis of both type 1 and type 2 diab...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.150706.112
updated:2013-09-01 00:00:00
abstract::Little is known about the rate of emergence of de novo genes, what their initial properties are, and how they spread in populations. We examined wild yeast populations (Saccharomyces paradoxus) to characterize the diversity and turnover of intergenic ORFs over short evolutionary timescales. We find that hundreds of in...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.239822.118
updated:2019-06-01 00:00:00