Characterizing the genetic basis of transcriptome diversity through RNA-sequencing of 922 individuals.

Abstract:

Understanding the consequences of regulatory variation in the human genome remains a major challenge, with important implications for understanding gene regulation and interpreting the many disease-risk variants that fall outside of protein-coding regions. Here, we provide a direct window into the regulatory consequences of genetic variation by sequencing RNA from 922 genotyped individuals. We present a comprehensive description of the distribution of regulatory variation--by the specific expression phenotypes altered, the properties of affected genes, and the genomic characteristics of regulatory variants. We detect variants influencing expression of over ten thousand genes, and through the enhanced resolution offered by RNA-sequencing, for the first time we identify thousands of variants associated with specific phenotypes including splicing and allelic expression. Evaluating the effects of both long-range intra-chromosomal and trans (cross-chromosomal) regulation, we observe modularity in the regulatory network, with three-dimensional chromosomal configuration playing a particular role in regulatory modules within each chromosome. We also observe a significant depletion of regulatory variants affecting central and critical genes, along with a trend of reduced effect sizes as variant frequency increases, providing evidence that purifying selection and buffering have limited the deleterious impact of regulatory variation on the cell. Further, generalizing beyond observed variants, we have analyzed the genomic properties of variants associated with expression and splicing and developed a Bayesian model to predict regulatory consequences of genetic variants, applicable to the interpretation of individual genomes and disease studies. Together, these results represent a critical step toward characterizing the complete landscape of human regulatory variation.

journal_name

Genome Res

journal_title

Genome research

authors

Battle A,Mostafavi S,Zhu X,Potash JB,Weissman MM,McCormick C,Haudenschild CD,Beckman KB,Shi J,Mei R,Urban AE,Montgomery SB,Levinson DF,Koller D

doi

10.1101/gr.155192.113

subject

Has Abstract

pub_date

2014-01-01 00:00:00

pages

14-24

issue

1

eissn

1549-5469

issn

1088-9051

pii

gr.155192.113

journal_volume

24

pub_type

Journal Article
  • New class of microRNA targets containing simultaneous 5'-UTR and 3'-UTR interaction sites.

    abstract::MicroRNAs (miRNAs) are known to post-transcriptionally regulate target mRNAs through the 3'-UTR, which interacts mainly with the 5'-end of miRNA in animals. Here we identify many endogenous motifs within human 5'-UTRs specific to the 3'-ends of miRNAs. The 3'-end of conserved miRNAs in particular has significant inter...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.089367.108

    authors: Lee I,Ajay SS,Yook JI,Kim HS,Hong SH,Kim NH,Dhanasekaran SM,Chinnaiyan AM,Athey BD

    update_date:2009-07-01 00:00:00

  • Prokaryotic phylogenies inferred from protein structural domains.

    abstract::The determination of the phylogenetic relationships among microorganisms has long relied primarily on gene sequence information. Given that prokaryotic organisms often lack morphological characteristics amenable to phylogenetic analysis, prokaryotic phylogenies, in particular, are often based on sequence data. In this...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.3033805

    authors: Deeds EJ,Hennessey H,Shakhnovich EI

    update_date:2005-03-01 00:00:00

  • Chromosomal instability mediated by non-B DNA: cruciform conformation and not DNA sequence is responsible for recurrent translocation in humans.

    abstract::Chromosomal aberrations have been thought to be random events. However, recent findings introduce a new paradigm in which certain DNA segments have the potential to adopt unusual conformations that lead to genomic instability and nonrandom chromosomal rearrangement. One of the best-studied examples is the palindromic ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.079244.108

    authors: Inagaki H,Ohye T,Kogo H,Kato T,Bolor H,Taniguchi M,Shaikh TH,Emanuel BS,Kurahashi H

    update_date:2009-02-01 00:00:00

  • Accurate gene-tree reconstruction by learning gene- and species-specific substitution rates across multiple complete genomes.

    abstract::Comparative genomics provides a general methodology for discovering functional DNA elements and understanding their evolution. The availability of many related genomes enables more powerful analyses, but requires rigorous phylogenetic methods to resolve orthologous genes and regions. Here, we use 12 recently sequenced...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.7105007

    authors: Rasmussen MD,Kellis M

    update_date:2007-12-01 00:00:00

  • Cytosine modifications modulate the chromatin architecture of transcriptional enhancers.

    abstract::Epigenetic mechanisms are believed to play key roles in the establishment of cell-specific transcription programs. Accordingly, the modified bases 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC) have been observed in DNA of genomic regulatory regions such as enhancers, and oxidation of 5mC into 5hmC by Ten-e...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.211466.116

    authors: Mahé EA,Madigou T,Sérandour AA,Bizot M,Avner S,Chalmel F,Palierne G,Métivier R,Salbert G

    update_date:2017-06-01 00:00:00

  • Enzymatic regional methylation assay: a novel method to quantify regional CpG methylation density.

    abstract::We have developed a novel quantitative method for rapidly assessing the CpG methylation density of a DNA region in mammalian cells. After bisulfite modification of genomic DNA, the region of interest is PCR amplified with primers containing two dam sites (GATC). The purified PCR products are then incubated with 14C-la...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.202501

    authors: Galm O,Rountree MR,Bachman KE,Jair KW,Baylin SB,Herman JG

    update_date:2002-01-01 00:00:00

  • Asymmetric nucleosomes flank promoters in the budding yeast genome.

    abstract::Nucleosomes in active chromatin are dynamic, but whether they have distinct structural conformations is unknown. To identify nucleosomes with alternative structures genome-wide, we used H4S47C-anchored cleavage mapping, which revealed that 5% of budding yeast (Saccharomyces cerevisiae) nucleosome positions have asymme...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.182618.114

    authors: Ramachandran S,Zentner GE,Henikoff S

    update_date:2015-03-01 00:00:00

  • Antisense transcripts with FANTOM2 clone set and their implications for gene regulation.

    abstract::We have used the FANTOM2 mouse cDNA set (60,770 clones), public mRNA data, and mouse genome sequence data to identify 2481 pairs of sense-antisense transcripts and 899 further pairs of nonantisense bidirectional transcription based upon genomic mapping. The analysis greatly expands the number of known examples of sens...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.982903

    authors: Kiyosawa H,Yamanaka I,Osato N,Kondo S,Hayashizaki Y,RIKEN GER Group.,GSL Members.

    update_date:2003-06-01 00:00:00

  • Genome-wide analyses of alternative splicing in plants: opportunities and challenges.

    abstract::Alternative splicing (AS) creates multiple mRNA transcripts from a single gene. While AS is known to contribute to gene regulation and proteome diversity in animals, the study of its importance in plants is in its early stages. However, recently available plant genome and transcript sequence data sets are enabling a g...

    journal_title:Genome research

    pub_type: Journal Article, Review

    doi:10.1101/gr.053678.106

    authors: Barbazuk WB,Fu Y,McGinnis KM

    update_date:2008-09-01 00:00:00

  • Genomic evolution, patterns of global dissemination, and interspecies transmission of human and simian T-cell leukemia/lymphotropic viruses.

    abstract::Using both env and long terminal repeat (LTR) sequences, with maximal representation of genetic diversity within primate strains, we revise and expand the unique evolutionary history of human and simian T-cell leukemia/lymphotropic viruses (HTLV/STLV). Based on the robust application of three different phylogenetic al...

    journal_title:Genome research

    pub_type: Journal Article, Review

    doi:

    authors: Slattery JP,Franchini G,Gessain A

    update_date:1999-06-01 00:00:00

  • Function and evolution of a gene family encoding odorant binding-like proteins in a social insect, the honey bee (Apis mellifera).

    abstract::The remarkable olfactory power of insect species is thought to be generated by a combinatorial action of two large protein families, G protein-coupled olfactory receptors (ORs) and odorant binding proteins (OBPs). In olfactory sensilla, OBPs deliver hydrophobic airborne molecules to ORs, but their expression in nonolf...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.5075706

    authors: Forêt S,Maleszka R

    update_date:2006-11-01 00:00:00

  • Extensive variation and low heritability of DNA methylation identified in a twin study.

    abstract::Disturbance of DNA methylation leading to aberrant gene expression has been implicated in the etiology of many diseases. Whereas variation at the genetic level has been studied extensively, less is known about the extent and function of epigenetic variation. To explore variation and heritability of DNA methylation, we...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.119685.110

    authors: Gervin K,Hammerø M,Akselsen HE,Moe R,Nygård H,Brandt I,Gjessing HK,Harris JR,Undlien DE,Lyle R

    update_date:2011-11-01 00:00:00

  • Computational identification of operons in microbial genomes.

    abstract::By applying graph representations to biochemical pathways, a new computational pipeline is proposed to find potential operons in microbial genomes. The algorithm relies on the fact that enzyme genes in operons tend to catalyze successive reactions in metabolic pathways. We applied this algorithm to 42 microbial genome...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.200602

    authors: Zheng Y,Szustakowski JD,Fortnow L,Roberts RJ,Kasif S

    update_date:2002-08-01 00:00:00

  • Nonrandom domain organization of the Arabidopsis genome at the nuclear periphery.

    abstract::The nuclear space is not a homogeneous biochemical environment. Many studies have demonstrated that the transcriptional activity of a gene is linked to its positioning within the nuclear space. Following the discovery of lamin-associated domains (LADs), which are transcriptionally repressed chromatin regions, the nonr...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.215186.116

    authors: Bi X,Cheng YJ,Hu B,Ma X,Wu R,Wang JW,Liu C

    update_date:2017-07-01 00:00:00

  • The human homolog T of the mouse T(Brachyury) gene; gene structure, cDNA sequence, and assignment to chromosome 6q27.

    abstract::We have cloned the human gene encoding the transcription factor T. T protein is vital for the formation of posterior mesoderm and axial development in all vertebrates. Brachyury mutant mice, which lack T protein, die in utero with abnormal notochord, posterior somites, and allantois. We have identified human T genomic...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.6.3.226

    authors: Edwards YH,Putt W,Lekoape KM,Stott D,Fox M,Hopkinson DA,Sowden J

    update_date:1996-03-01 00:00:00

  • Assessing clusters and motifs from gene expression data.

    abstract::Large-scale gene expression studies and genomic sequencing projects are providing vast amounts of information that can be used to identify or predict cellular regulatory processes. Genes can be clustered on the basis of the similarity of their expression profiles or function and these clusters are likely to contain ge...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.148301

    authors: Jakt LM,Cao L,Cheah KS,Smith DK

    update_date:2001-01-01 00:00:00

  • A novel testis ubiquitin-binding protein gene arose by exon shuffling in hominoids.

    abstract::Most new genes arise by duplication of existing gene structures, after which relaxed selection on the new copy frequently leads to mutational inactivation of the duplicate; only rarely will a new gene with modified function emerge. Here we describe a unique mechanism of gene creation, whereby new combinations of funct...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.6252107

    authors: Babushok DV,Ohshima K,Ostertag EM,Chen X,Wang Y,Mandal PK,Okada N,Abrams CS,Kazazian HH Jr

    update_date:2007-08-01 00:00:00

  • A-to-I RNA editing promotes developmental stage-specific gene and lncRNA expression.

    abstract::A-to-I RNA editing is a conserved widespread phenomenon in which adenosine (A) is converted to inosine (I) by adenosine deaminases (ADARs) in double-stranded RNA regions, mainly noncoding. Mutations in ADAR enzymes in Caenorhabditis elegans cause defects in normal development but are not lethal as in human and mouse. ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.211169.116

    authors: Goldstein B,Agranat-Tamir L,Light D,Ben-Naim Zgayer O,Fishman A,Lamm AT

    update_date:2017-03-01 00:00:00

  • The identification and functional annotation of RNA structures conserved in vertebrates.

    abstract::Structured elements of RNA molecules are essential in, e.g., RNA stabilization, localization, and protein interaction, and their conservation across species suggests a common functional role. We computationally screened vertebrate genomes for conserved RNA structures (CRSs), leveraging structure-based, rather than seq...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.208652.116

    authors: Seemann SE,Mirza AH,Hansen C,Bang-Berthelsen CH,Garde C,Christensen-Dalsgaard M,Torarinsson E,Yao Z,Workman CT,Pociot F,Nielsen H,Tommerup N,Ruzzo WL,Gorodkin J

    update_date:2017-08-01 00:00:00

  • The evolution of evolvability in microRNA target sites in vertebrates.

    abstract::The lack of long-term evolutionary conservation of microRNA (miRNA) target sites appears to contradict many analyses of their functions. Several hypotheses have been offered, but an attractive one-that the conservation may be a function of taxonomic hierarchy (vertebrates, mammals, primates, etc.)-has rarely been disc...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.148916.112

    authors: Xu J,Zhang R,Shen Y,Liu G,Lu X,Wu CI

    update_date:2013-11-01 00:00:00

  • High-throughput plasmid purification for capillary sequencing.

    abstract::The need for expeditious and inexpensive methods for high-throughput DNA sequencing has been highlighted by the accelerated pace of genome DNA sequencing over the past year. At the Joint Genome Institute, the throughput in terms of high-quality bases per day has increased over 20-fold during the past 18 mo, reaching a...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.167801

    authors: Elkin CJ,Richardson PM,Fourcade HM,Hammon NM,Pollard MJ,Predki PF,Glavina T,Hawkins TL

    update_date:2001-07-01 00:00:00

  • Two contrasting classes of nucleolus-associated domains in mouse fibroblast heterochromatin.

    abstract::In interphase eukaryotic cells, almost all heterochromatin is located adjacent to the nucleolus or to the nuclear lamina, thus defining nucleolus-associated domains (NADs) and lamina-associated domains (LADs), respectively. Here, we determined the first genome-scale map of murine NADs in mouse embryonic fibroblasts (M...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.247072.118

    authors: Vertii A,Ou J,Yu J,Yan A,Pagès H,Liu H,Zhu LJ,Kaufman PD

    update_date:2019-08-01 00:00:00

  • Nature and structure of human genes that generate retropseudogenes.

    abstract::The human genome is estimated to contain 23,000 to 33,000 retropseudogenes. To study the properties of genes giving rise to these retroelements, we compared the structure and expression of genes with or without known retropseudogenes. Four main features have emerged from the analysis of 181 genes associated to retrops...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.10.5.672

    authors: Gonçalves I,Duret L,Mouchiroud D

    update_date:2000-05-01 00:00:00

  • Whole population, genome-wide mapping of hidden relatedness.

    abstract::We present GERMLINE, a robust algorithm for identifying segmental sharing indicative of recent common ancestry between pairs of individuals. Unlike methods with comparable objectives, GERMLINE scales linearly with the number of samples, enabling analysis of whole-genome data in large cohorts. Our approach is based on ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.081398.108

    authors: Gusev A,Lowe JK,Stoffel M,Daly MJ,Altshuler D,Breslow JL,Friedman JM,Pe'er I

    update_date:2009-02-01 00:00:00

  • The origins and evolution of chromosomes, dosage compensation, and mechanisms underlying venom regulation in snakes.

    abstract::Here we use a chromosome-level genome assembly of a prairie rattlesnake (Crotalus viridis), together with Hi-C, RNA-seq, and whole-genome resequencing data, to study key features of genome biology and evolution in reptiles. We identify the rattlesnake Z Chromosome, including the recombining pseudoautosomal region, and...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.240952.118

    authors: Schield DR,Card DC,Hales NR,Perry BW,Pasquesi GM,Blackmon H,Adams RH,Corbin AB,Smith CF,Ramesh B,Demuth JP,Betrán E,Tollis M,Meik JM,Mackessy SP,Castoe TA

    update_date:2019-04-01 00:00:00

  • Hybrid assembly of the large and highly repetitive genome of Aegilops tauschii, a progenitor of bread wheat, with the MaSuRCA mega-reads algorithm.

    abstract::Long sequencing reads generated by single-molecule sequencing technology offer the possibility of dramatically improving the contiguity of genome assemblies. The biggest challenge today is that long reads have relatively high error rates, currently around 15%. The high error rates make it difficult to use this data al...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.213405.116

    authors: Zimin AV,Puiu D,Luo MC,Zhu T,Koren S,Marçais G,Yorke JA,Dvořák J,Salzberg SL

    update_date:2017-05-01 00:00:00

  • Gene and alternative splicing annotation with AIR.

    abstract::Designing effective and accurate tools for identifying the functional and structural elements in a genome remains at the frontier of genome annotation owing to incompleteness and inaccuracy of the data, limitations in the computational models, and shifting paradigms in genomics, such as alternative splicing. We presen...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.2889405

    authors: Florea L,Di Francesco V,Miller J,Turner R,Yao A,Harris M,Walenz B,Mobarry C,Merkulov GV,Charlab R,Dew I,Deng Z,Istrail S,Li P,Sutton G

    update_date:2005-01-01 00:00:00

  • GeneID in Drosophila.

    abstract::GeneID is a program to predict genes in anonymous genomic sequences designed with a hierarchical structure. In the first step, splice sites, and start and stop codons are predicted and scored along the sequence using position weight matrices (PWMs). In the second step, exons are built from the sites. Exons are scored ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.10.4.511

    authors: Parra G,Blanco E,Guigó R

    update_date:2000-04-01 00:00:00

  • A burst of protein sequence evolution and a prolonged period of asymmetric evolution follow gene duplication in yeast.

    abstract::It is widely accepted that newly arisen duplicate gene pairs experience an altered selective regime that is often manifested as an increase in the rate of protein sequence evolution. Many details about the nature of the rate acceleration remain unknown, however, including its typical magnitude and duration, and whethe...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.6341207

    authors: Scannell DR,Wolfe KH

    update_date:2008-01-01 00:00:00

  • Comparative sequence analyses reveal rapid and divergent evolutionary changes of the WFDC locus in the primate lineage.

    abstract::The initial comparison of the human and chimpanzee genome sequences revealed 16 genomic regions with an unusually high density of rapidly evolving genes. One such region is the whey acidic protein (WAP) four-disulfide core domain locus (or WFDC locus), which contains 14 WFDC genes organized in two subloci on human chr...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.6004607

    authors: Hurle B,Swanson W,NISC Comparative Sequencing Program.,Green ED

    update_date:2007-03-01 00:00:00