A unified model for yeast transcript definition.

Abstract:

Identifying genes in the genomic context is central to a cell's ability to interpret the genome. Yet, in general, the signals used to define eukaryotic genes are poorly described. Here, we derived simple classifiers that identify where transcription will initiate and terminate using nucleic acid sequence features detectable by the yeast cell, which we integrate into a Unified Model (UM) that models transcription as a whole. The cis-elements that denote where transcription initiates function primarily through nucleosome depletion, and, using a synthetic promoter system, we show that most of these elements are sufficient to initiate transcription in vivo. Hrp1 binding sites are the major characteristic of terminators; these binding sites are often clustered in terminator regions and can terminate transcription bidirectionally. The UM predicts global transcript structure by modeling transcription of the genome using a hidden Markov model whose emissions are the outputs of the initiation and termination classifiers. We validated the novel predictions of the UM with available RNA-seq data and tested it further by directly comparing the transcript structure predicted by the model to the transcription generated by the cell for synthetic DNA segments of random design. We show that the UM identifies transcription start sites more accurately than the initiation classifier alone, indicating that the relative arrangement of promoter and terminator elements influences their function. Our model presents a concrete description of how the cell defines transcript units, explains the existence of nongenic transcripts, and provides insight into genome evolution.

journal_name

Genome Res

journal_title

Genome research

authors

de Boer CG,van Bakel H,Tsui K,Li J,Morris QD,Nislow C,Greenblatt JF,Hughes TR

doi

10.1101/gr.164327.113

subject

Has Abstract

pub_date

2014-01-01 00:00:00

pages

154-66

issue

1

eissn

1549-5469

issn

1088-9051

pii

gr.164327.113

journal_volume

24

pub_type

Journal article
  • Software for automated analysis of DNA fingerprinting gels.

    abstract::Here we describe software tools for the automated detection of DNA restriction fragments resolved on agarose fingerprinting gels. We present a mathematical model for the location and shape of the restriction fragments as a function of fragment size, with model parameters determined empirically from "marker" lanes cont...

    journal_title:Genome research

    pub_type: Journal article

    doi:10.1101/gr.904303

    authors: Fuhrmann DR,Krzywinski MI,Chiu R,Saeedi P,Schein JE,Bosdet IE,Chinwalla A,Hillier LW,Waterston RH,McPherson JD,Jones SJ,Marra MA

    update_date:2003-05-01 00:00:00

  • Differential divergence of three human pseudoautosomal genes and their mouse homologs: implications for sex chromosome evolution.

    abstract::The human pseudoautosomal region 1 (PAR1) is essential for meiotic pairing and recombination, and its deletion causes male sterility. Comparative studies of human and mouse pseudoautosomal genes are valuable in charting the evolution of this interesting region, but have been limited by the paucity of genes conserved b...

    journal_title:Genome research

    pub_type: Journal article

    doi:10.1101/gr.197001

    authors: Gianfrancesco F,Sanges R,Esposito T,Tempesta S,Rao E,Rappold G,Archidiacono N,Graves JA,Forabosco A,D'Urso M

    update_date:2001-12-01 00:00:00

  • Theories and applications for sequencing randomly selected clones.

    abstract::Theory is developed for the process of sequencing randomly selected large-insert clones. Genome size, library depth, clone size, and clone distribution are considered relevant properties and perfect overlap detection for contig assembly is assumed. Genome-specific and nonrandom effects are neglected. Order of magnitud...

    journal_title:Genome research

    pub_type: Journal article

    doi:10.1101/gr.gr-1339r

    authors: Wendl MC,Marra MA,Hillier LW,Chinwalla AT,Wilson RK,Waterston RH

    update_date:2001-02-01 00:00:00

  • Sequences of 95 human MHC haplotypes reveal extreme coding variation in genes other than highly polymorphic HLA class I and II.

    abstract::The most polymorphic part of the human genome, the MHC, encodes over 160 proteins of diverse function. Half of them, including the HLA class I and II genes, are directly involved in immune responses. Consequently, the MHC region strongly associates with numerous diseases and clinical therapies. Notoriously, the MHC re...

    journal_title:Genome research

    pub_type: Journal article

    doi:10.1101/gr.213538.116

    authors: Norman PJ,Norberg SJ,Guethlein LA,Nemat-Gorgani N,Royce T,Wroblewski EE,Dunn T,Mann T,Alicata C,Hollenbach JA,Chang W,Shults Won M,Gunderson KL,Abi-Rached L,Ronaghi M,Parham P

    update_date:2017-05-01 00:00:00

  • Bacterial genomes as new gene homes: the genealogy of ORFans in E. coli.

    abstract::Differences in gene repertoire among bacterial genomes are usually ascribed to gene loss or to lateral gene transfer from unrelated cellular organisms. However, most bacteria contain large numbers of ORFans, that is, annotated genes that are restricted to a particular genome and that possess no known homologs. The uni...

    journal_title:Genome research

    pub_type: Journal article

    doi:10.1101/gr.2231904

    authors: Daubin V,Ochman H

    update_date:2004-06-01 00:00:00

  • An integrative variant analysis pipeline for accurate genotype/haplotype inference in population NGS data.

    abstract::Next-generation sequencing is a powerful approach for discovering genetic variation. Sensitive variant calling and haplotype inference from population sequencing data remain challenging. We describe methods for high-quality discovery, genotyping, and phasing of SNPs for low-coverage (approximately 5×) sequencing of po...

    journal_title:Genome research

    pub_type: Journal article

    doi:10.1101/gr.146084.112

    authors: Wang Y,Lu J,Yu J,Gibbs RA,Yu F

    update_date:2013-05-01 00:00:00

  • Recent segmental duplications in the working draft assembly of the brown Norway rat.

    abstract::We assessed the content, structure, and distribution of segmental duplications (≥90% sequence identity, ≥5 kb length) within the published version of the Rattus norvegicus genome assembly (v.3.1). The overall fraction of duplicated sequence within the rat assembly (2.92%) is greater than that of the mouse (1...

    journal_title:Genome research

    pub_type: Journal article

    doi:10.1101/gr.1907504

    authors: Tuzun E,Bailey JA,Eichler EE

    update_date:2004-04-01 00:00:00

  • Thermophilic bacteria strictly obey Szybalski's transcription direction rule and politely purine-load RNAs with both adenine and guanine.

    abstract::When transcription is to the right of the promoter, the "top," mRNA-synonymous strand of DNA tends to be purine-rich. When transcription is to the left of the promoter, the top, mRNA-template strand tends to be pyrimidine-rich. This transcription-direction rule suggests that there has been an evolutionary selection pr...

    journal_title:Genome research

    pub_type: Journal article

    doi:10.1101/gr.10.2.228

    authors: Lao PJ,Forsdyke DR

    update_date:2000-02-01 00:00:00

  • Gene amplification as double minutes or homogeneously staining regions in solid tumors: origin and structure.

    abstract::Double minutes (dmin) and homogeneously staining regions (hsr) are the cytogenetic hallmarks of genomic amplification in cancer. Different mechanisms have been proposed to explain their genesis. Recently, our group showed that the MYC-containing dmin in leukemia cases arise by excision and amplification (episome model...

    journal_title:Genome research

    pub_type: Journal article

    doi:10.1101/gr.106252.110

    authors: Storlazzi CT,Lonoce A,Guastadisegni MC,Trombetta D,D'Addabbo P,Daniele G,L'Abbate A,Macchia G,Surace C,Kok K,Ullmann R,Purgato S,Palumbo O,Carella M,Ambros PF,Rocchi M

    update_date:2010-09-01 00:00:00

  • Functional DNA methylation differences between tissues, cell types, and across individuals discovered using the M&M algorithm.

    abstract::DNA methylation plays key roles in diverse biological processes such as X chromosome inactivation, transposable element repression, genomic imprinting, and tissue-specific gene expression. Sequencing-based DNA methylation profiling provides an unprecedented opportunity to map and compare complete DNA methylomes. This ...

    journal_title:Genome research

    pub_type: Journal article

    doi:10.1101/gr.156539.113

    authors: Zhang B,Zhou Y,Lin N,Lowdon RF,Hong C,Nagarajan RP,Cheng JB,Li D,Stevens M,Lee HJ,Xing X,Zhou J,Sundaram V,Elliott G,Gu J,Shi T,Gascard P,Sigaroudinia M,Tlsty TD,Kadlecek T,Weiss A,O'Geen H,Farnham PJ,Maire

    update_date:2013-09-01 00:00:00

  • Mouse population-guided resequencing reveals that variants in CD44 contribute to acetaminophen-induced liver injury in humans.

    abstract::Interindividual variability in response to chemicals and drugs is a common regulatory concern. It is assumed that xenobiotic-induced adverse reactions have a strong genetic basis, but many mechanism-based investigations have not been successful in identifying susceptible individuals. While recent advances in pharmacog...

    journal_title:Genome research

    pub_type: Journal article

    doi:10.1101/gr.090241.108

    authors: Harrill AH,Watkins PB,Su S,Ross PK,Harbourt DE,Stylianou IM,Boorman GA,Russo MW,Sackler RS,Harris SC,Smith PC,Tennant R,Bogue M,Paigen K,Harris C,Contractor T,Wiltshire T,Rusyn I,Threadgill DW

    update_date:2009-09-01 00:00:00

  • Integrated mapping, chromosomal sequencing and sequence analysis of Cryptosporidium parvum.

    abstract::The apicomplexan Cryptosporidium parvum is one of the most prevalent protozoan parasites of humans. We report the physical mapping of the genome of the Iowa isolate, sequencing and analysis of chromosome 6, and approximately 0.9 Mbp of sequence sampled from the remainder of the genome. To construct a robust physical m...

    journal_title:Genome research

    pub_type: Journal article

    doi:10.1101/gr.1555203

    authors: Bankier AT,Spriggs HF,Fartmann B,Konfortov BA,Madera M,Vogel C,Teichmann SA,Ivens A,Dear PH

    update_date:2003-08-01 00:00:00

  • Diversification of transcriptional modulation: large-scale identification and characterization of putative alternative promoters of human genes.

    abstract::By analyzing 1,780,295 5'-end sequences of human full-length cDNAs derived from 164 kinds of oligo-cap cDNA libraries, we identified 269,774 independent positions of transcriptional start sites (TSSs) for 14,628 human RefSeq genes. These TSSs were clustered into 30,964 clusters that were separated from each other by m...

    journal_title:Genome research

    pub_type: Journal article

    doi:10.1101/gr.4039406

    authors: Kimura K,Wakamatsu A,Suzuki Y,Ota T,Nishikawa T,Yamashita R,Yamamoto J,Sekine M,Tsuritani K,Wakaguri H,Ishii S,Sugiyama T,Saito K,Isono Y,Irie R,Kushida N,Yoneyama T,Otsuka R,Kanda K,Yokoi T,Kondo H,Wagatsuma M

    update_date:2006-01-01 00:00:00

  • Broad-spectrum respiratory tract pathogen identification using resequencing DNA microarrays.

    abstract::The exponential growth of pathogen nucleic acid sequences available in public domain databases has invited their direct use in pathogen detection, identification, and surveillance strategies. DNA microarray technology has offered the potential for the direct DNA sequence analysis of a broad spectrum of pathogens of in...

    journal_title:Genome research

    pub_type: Journal article

    doi:10.1101/gr.4337206

    authors: Lin B,Wang Z,Vora GJ,Thornton JA,Schnur JM,Thach DC,Blaney KM,Ligler AG,Malanoski AP,Santiago J,Walter EA,Agan BK,Metzgar D,Seto D,Daum LT,Kruzelock R,Rowley RK,Hanson EH,Tibbetts C,Stenger DA

    update_date:2006-04-01 00:00:00

  • Phenotypic diversity and genotypic flexibility of Burkholderia cenocepacia during long-term chronic infection of cystic fibrosis lungs.

    abstract::Chronic bacterial infections of the lung are the leading cause of morbidity and mortality in cystic fibrosis patients. Tracking bacterial evolution during chronic infections can provide insights into how host selection pressures, including immune responses and therapeutic interventions, shape bacterial genomes. We carri...

    journal_title:Genome research

    pub_type: Journal article

    doi:10.1101/gr.213363.116

    authors: Lee AH,Flibotte S,Sinha S,Paiero A,Ehrlich RL,Balashov S,Ehrlich GD,Zlosnik JE,Mell JC,Nislow C

    update_date:2017-04-01 00:00:00

  • A role for palindromic structures in the cis-region of maize Sirevirus LTRs in transposable element evolution and host epigenetic response.

    abstract::Transposable elements (TEs) proliferate within the genome of their host, which responds by silencing them epigenetically. Much is known about the mechanisms of silencing in plants, particularly the role of siRNAs in guiding DNA methylation. In contrast, little is known about siRNA targeting patterns along the length o...

    journal_title:Genome research

    pub_type: Journal article

    doi:10.1101/gr.193763.115

    authors: Bousios A,Diez CM,Takuno S,Bystry V,Darzentas N,Gaut BS

    update_date:2016-02-01 00:00:00

  • Identification and analysis of internal promoters in Caenorhabditis elegans operons.

    abstract::The current Caenorhabditis elegans genomic annotation has many genes organized in operons. Using directionally stitched promoterGFP methodology, we have conducted the largest survey to date on the regulatory regions of annotated C. elegans operons and identified 65, over 25% of those studied, with internal promoters. ...

    journal_title:Genome research

    pub_type: Journal article

    doi:10.1101/gr.6824707

    authors: Huang P,Pleasance ED,Maydan JS,Hunt-Newbury R,O'Neil NJ,Mah A,Baillie DL,Marra MA,Moerman DG,Jones SJ

    update_date:2007-10-01 00:00:00

  • Species-specific class I gene expansions formed the telomeric 1 mb of the mouse major histocompatibility complex.

    abstract::We have determined the complete sequence of 951,695 bp from the class I region of H2, the mouse major histocompatibility complex (Mhc) from strain 129/Sv (haplotype bc). The sequence contains 26 genes. The sequence spans from the last 50 kb of the H2-T region, including 2 class I genes and 3 class I pseudogenes, and i...

    journal_title:Genome research

    pub_type: Journal article

    doi:10.1101/gr.975303

    authors: Takada T,Kumánovics A,Amadou C,Yoshino M,Jones EP,Athanasiou M,Evans GA,Fischer Lindahl K

    update_date:2003-04-01 00:00:00

  • Assessment of genome-wide protein function classification for Drosophila melanogaster.

    abstract::The functional classification of genes on a genome-wide scale is now in its infancy, and we make a first attempt to assess existing methods and identify sources of error. To this end, we compared two independent efforts for associating proteins with functions, one implemented by FlyBase and the other by PANTHER at Cel...

    journal_title:Genome research

    pub_type: Journal article

    doi:10.1101/gr.771603

    authors: Mi H,Vandergriff J,Campbell M,Narechania A,Majoros W,Lewis S,Thomas PD,Ashburner M

    update_date:2003-09-01 00:00:00

  • A burst of protein sequence evolution and a prolonged period of asymmetric evolution follow gene duplication in yeast.

    abstract::It is widely accepted that newly arisen duplicate gene pairs experience an altered selective regime that is often manifested as an increase in the rate of protein sequence evolution. Many details about the nature of the rate acceleration remain unknown, however, including its typical magnitude and duration, and whethe...

    journal_title:Genome research

    pub_type: Journal article

    doi:10.1101/gr.6341207

    authors: Scannell DR,Wolfe KH

    update_date:2008-01-01 00:00:00

  • Fourfold faster rate of genome rearrangement in nematodes than in Drosophila.

    abstract::We compared the genome of the nematode Caenorhabditis elegans to 13% of that of Caenorhabditis briggsae, identifying 252 conserved segments along their chromosomes. We detected 517 chromosomal rearrangements, with the ratio of translocations to inversions to transpositions being approximately 1:1:2. We estimate that t...

    journal_title:Genome research

    pub_type: Journal article

    doi:10.1101/gr.172702

    authors: Coghlan A,Wolfe KH

    update_date:2002-06-01 00:00:00

  • Comparative genomic analysis of the interferon/interleukin-10 receptor gene cluster.

    abstract::Interferons and interleukin-10 are involved in key aspects of the host defence mechanisms. Human chromosome 21 harbors the interferon/interleukin-10 receptor gene cluster linked to the GART gene. This cluster includes both components of the interferon alpha/beta-receptor (IFNAR1 and IFNAR2) and the second components o...

    journal_title:Genome research

    pub_type: Journal article

    doi:

    authors: Reboul J,Gardiner K,Monneron D,Uzé G,Lutfalla G

    update_date:1999-03-01 00:00:00

  • Characterization of the RNA content of chromatin.

    abstract::Noncoding RNA (ncRNA) constitutes a significant portion of the mammalian transcriptome. Emerging evidence suggests that it regulates gene expression in cis or trans by modulating the chromatin structure. To uncover the functional role of ncRNA in chromatin organization, we deep sequenced chromatin-associated RNAs (CAR...

    journal_title:Genome research

    pub_type: Journal article

    doi:10.1101/gr.103473.109

    authors: Mondal T,Rasmussen M,Pandey GK,Isaksson A,Kanduri C

    update_date:2010-07-01 00:00:00

  • Genetic and phenotypic intra-species variation in Candida albicans.

    abstract::Candida albicans is a commensal fungus of the human gastrointestinal tract and a prevalent opportunistic pathogen. To examine diversity within this species, extensive genomic and phenotypic analyses were performed on 21 clinical C. albicans isolates. Genomic variation was evident in the form of polymorphisms, copy num...

    journal_title:Genome research

    pub_type: Journal article

    doi:10.1101/gr.174623.114

    authors: Hirakawa MP,Martinez DA,Sakthikumar S,Anderson MZ,Berlin A,Gujja S,Zeng Q,Zisson E,Wang JM,Greenberg JM,Berman J,Bennett RJ,Cuomo CA

    update_date:2015-03-01 00:00:00

  • Time course regulatory analysis based on paired expression and chromatin accessibility data.

    abstract::A time course experiment is a widely used design in the study of cellular processes such as differentiation or response to stimuli. In this paper, we propose time course regulatory analysis (TimeReg) as a method for the analysis of gene regulatory networks based on paired gene expression and chromatin accessibility da...

    journal_title:Genome research

    pub_type: Journal article

    doi:10.1101/gr.257063.119

    authors: Duren Z,Chen X,Xin J,Wang Y,Wong WH

    update_date:2020-04-01 00:00:00

  • Evidence for widespread subfunctionalization of splice forms in vertebrate genomes.

    abstract::Gene duplication and alternative splicing are important sources of proteomic diversity. Despite research indicating that gene duplication and alternative splicing are negatively correlated, the evolutionary relationship between the two remains unclear. One manner in which alternative splicing and gene duplication may ...

    journal_title:Genome research

    pub_type: Journal article

    doi:10.1101/gr.184473.114

    authors: Lambert MJ,Cochran WO,Wilde BM,Olsen KG,Cooper CD

    update_date:2015-05-01 00:00:00

  • The portability of tagSNPs across populations: a worldwide survey.

    abstract::In the search for common genetic variants that contribute to prevalent human diseases, patterns of linkage disequilibrium (LD) among linked markers should be considered when selecting SNPs. Genotyping efficiency can be increased by choosing tagging SNPs (tagSNPs) in LD with other SNPs. However, it remains to be seen w...

    journal_title:Genome research

    pub_type: Journal article

    doi:10.1101/gr.4138406

    authors: González-Neira A,Ke X,Lao O,Calafell F,Navarro A,Comas D,Cann H,Bumpstead S,Ghori J,Hunt S,Deloukas P,Dunham I,Cardon LR,Bertranpetit J

    update_date:2006-03-01 00:00:00

  • Predicting deleterious amino acid substitutions.

    abstract::Many missense substitutions are identified in single nucleotide polymorphism (SNP) data and large-scale random mutagenesis projects. Each amino acid substitution potentially affects protein function. We have constructed a tool that uses sequence homology to predict whether a substitution affects protein function. SIFT...

    journal_title:Genome research

    pub_type: Journal article

    doi:10.1101/gr.176601

    authors: Ng PC,Henikoff S

    update_date:2001-05-01 00:00:00

  • Cell-type, allelic, and genetic signatures in the human pancreatic beta cell transcriptome.

    abstract::Elucidating the pathophysiology and molecular attributes of common disorders as well as developing targeted and effective treatments hinges on the study of the relevant cell type and tissues. Pancreatic beta cells within the islets of Langerhans are centrally involved in the pathogenesis of both type 1 and type 2 diab...

    journal_title:Genome research

    pub_type: Journal article

    doi:10.1101/gr.150706.112

    authors: Nica AC,Ongen H,Irminger JC,Bosco D,Berney T,Antonarakis SE,Halban PA,Dermitzakis ET

    update_date:2013-09-01 00:00:00

  • Turnover of ribosome-associated transcripts from de novo ORFs produces gene-like characteristics available for de novo gene emergence in wild yeast populations.

    abstract::Little is known about the rate of emergence of de novo genes, what their initial properties are, and how they spread in populations. We examined wild yeast populations (Saccharomyces paradoxus) to characterize the diversity and turnover of intergenic ORFs over short evolutionary timescales. We find that hundreds of in...

    journal_title:Genome research

    pub_type: Journal article

    doi:10.1101/gr.239822.118

    authors: Durand É,Gagnon-Arsenault I,Hallin J,Hatin I,Dubé AK,Nielly-Thibault L,Namy O,Landry CR

    update_date:2019-06-01 00:00:00