The origin, evolution, and functional impact of short insertion-deletion variants identified in 179 human genomes.

Abstract:

Short insertions and deletions (indels) are the second most abundant form of human genetic variation, but our understanding of their origins and functional effects lags behind that of other types of variants. Using population-scale sequencing, we have identified a high-quality set of 1.6 million indels from 179 individuals representing three diverse human populations. We show that rates of indel mutagenesis are highly heterogeneous, with 43%-48% of indels occurring in 4.03% of the genome, whereas in the remaining 96% their prevalence is 16 times lower than SNPs. Polymerase slippage can explain upwards of three-fourths of all indels, with the remainder being mostly simple deletions in complex sequence. However, insertions do occur and are significantly associated with pseudo-palindromic sequence features compatible with the fork stalling and template switching (FoSTeS) mechanism more commonly associated with large structural variations. We introduce a quantitative model of polymerase slippage, which enables us to identify indel-hypermutagenic protein-coding genes, some of which are associated with recurrent mutations leading to disease. Accounting for mutational rate heterogeneity due to sequence context, we find that indels across functional sequence are generally subject to stronger purifying selection than SNPs. We find that indel length modulates selection strength, and that indels affecting multiple functionally constrained nucleotides undergo stronger purifying selection. We further find that indels are enriched in associations with gene expression and find evidence for a contribution of nonsense-mediated decay. Finally, we show that indels can be integrated in existing genome-wide association studies (GWAS); although we do not find direct evidence that potentially causal protein-coding indels are enriched with associations to known disease-associated SNPs, our findings suggest that the causal variant underlying some of these associations may be indels.

journal_name

Genome Res

journal_title

Genome research

authors

Montgomery SB,Goode DL,Kvikstad E,Albers CA,Zhang ZD,Mu XJ,Ananda G,Howie B,Karczewski KJ,Smith KS,Anaya V,Richardson R,Davis J,1000 Genomes Project Consortium.,MacArthur DG,Sidow A,Duret L,Gerstein M,Makova KD,Marc

doi

10.1101/gr.148718.112

subject

Has Abstract

pub_date

2013-05-01 00:00:00

pages

749-61

issue

5

eissn

1549-5469

issn

1088-9051

pii

gr.148718.112

journal_volume

23

pub_type

Journal Article
  • Long-read single-molecule maps of the functional methylome.

    abstract::We report on the development of a methylation analysis workflow for optical detection of fluorescent methylation profiles along chromosomal DNA molecules. In combination with Bionano Genomics genome mapping technology, these profiles provide a hybrid genetic/epigenetic genome-wide map composed of DNA molecules spannin...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.240739.118

    authors: Sharim H,Grunwald A,Gabrieli T,Michaeli Y,Margalit S,Torchinsky D,Arielly R,Nifker G,Juhasz M,Gularek F,Almalvez M,Dufault B,Chandra SS,Liu A,Bhattacharya S,Chen YW,Vilain E,Wagner KR,Pevsner J,Reifenberger J,Lam

    update_date:2019-04-01 00:00:00

  • A dynamic H3K27ac signature identifies VEGFA-stimulated endothelial enhancers and requires EP300 activity.

    abstract::Histone modifications are now well-established mediators of transcriptional programs that distinguish cell states. However, the kinetics of histone modification and their role in mediating rapid, signal-responsive gene expression changes has been little studied on a genome-wide scale. Vascular endothelial growth facto...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.149674.112

    authors: Zhang B,Day DS,Ho JW,Song L,Cao J,Christodoulou D,Seidman JG,Crawford GE,Park PJ,Pu WT

    update_date:2013-06-01 00:00:00

  • The use of exome capture RNA-seq for highly degraded RNA with application to clinical cancer sequencing.

    abstract::RNA-seq by poly(A) selection is currently the most common protocol for whole transcriptome sequencing as it provides a broad, detailed, and accurate view of the RNA landscape. Unfortunately, the utility of poly(A) libraries is greatly limited when the input RNA is degraded, which is the norm for research tissues and c...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.189621.115

    authors: Cieslik M,Chugh R,Wu YM,Wu M,Brennan C,Lonigro R,Su F,Wang R,Siddiqui J,Mehra R,Cao X,Lucas D,Chinnaiyan AM,Robinson D

    update_date:2015-09-01 00:00:00

  • Widespread plasticity in CTCF occupancy linked to DNA methylation.

    abstract::CTCF is a ubiquitously expressed regulator of fundamental genomic processes including transcription, intra- and interchromosomal interactions, and chromatin structure. Because of its critical role in genome function, CTCF binding patterns have long been assumed to be largely invariant across different cellular environ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.136101.111

    authors: Wang H,Maurano MT,Qu H,Varley KE,Gertz J,Pauli F,Lee K,Canfield T,Weaver M,Sandstrom R,Thurman RE,Kaul R,Myers RM,Stamatoyannopoulos JA

    update_date:2012-09-01 00:00:00

  • Next-generation sequencing identifies the natural killer cell microRNA transcriptome.

    abstract::Natural killer (NK) cells are innate lymphocytes important for early host defense against infectious pathogens and surveillance against malignant transformation. Resting murine NK cells regulate the translation of effector molecule mRNAs (e.g., granzyme B, GzmB) through unclear molecular mechanisms. MicroRNAs (miRNAs)...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.107995.110

    authors: Fehniger TA,Wylie T,Germino E,Leong JW,Magrini VJ,Koul S,Keppel CR,Schneider SE,Koboldt DC,Sullivan RP,Heinz ME,Crosby SD,Nagarajan R,Ramsingh G,Link DC,Ley TJ,Mardis ER

    update_date:2010-11-01 00:00:00

  • Improved discovery of genetic interactions using CRISPRiSeq across multiple environments.

    abstract::Large-scale genetic interaction (GI) screens in yeast have been invaluable for our understanding of molecular systems biology and for characterizing novel gene function. Owing in part to the high costs and long experiment times required, a preponderance of GI data has been generated in a single environmental condition...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.246603.118

    authors: Jaffe M,Dziulko A,Smith JD,St Onge RP,Levy SF,Sherlock G

    update_date:2019-04-01 00:00:00

  • A first version of the Caenorhabditis elegans Promoterome.

    abstract::An important aspect of the development of systems biology approaches in metazoans is the characterization of expression patterns of nearly all genes predicted from genome sequences. Such "localizome" maps should provide information on where (in what cells or tissues) and when (at what stage of development or under wha...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.2497604

    authors: Dupuy D,Li QR,Deplancke B,Boxem M,Hao T,Lamesch P,Sequerra R,Bosak S,Doucette-Stamm L,Hope IA,Hill DE,Walhout AJ,Vidal M

    update_date:2004-10-01 00:00:00

  • Ancestral grass karyotype reconstruction unravels new mechanisms of genome shuffling as a source of plant evolution.

    abstract::The comparison of the chromosome numbers of today's species with common reconstructed paleo-ancestors has led to intense speculation of how chromosomes have been rearranged over time in mammals. However, similar studies in plants with respect to genome evolution as well as molecular mechanisms leading to mosaic synten...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.109744.110

    authors: Murat F,Xu JH,Tannier E,Abrouk M,Guilhot N,Pont C,Messing J,Salse J

    update_date:2010-11-01 00:00:00

  • Global analysis of Drosophila Cys₂-His₂ zinc finger proteins reveals a multitude of novel recognition motifs and binding determinants.

    abstract::Cys2-His2 zinc finger proteins (ZFPs) are the largest group of transcription factors in higher metazoans. A complete characterization of these ZFPs and their associated target sequences is pivotal to fully annotate transcriptional regulatory networks in metazoan genomes. As a first step in this process, we have charac...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.151472.112

    authors: Enuameh MS,Asriyan Y,Richards A,Christensen RG,Hall VL,Kazemian M,Zhu C,Pham H,Cheng Q,Blatti C,Brasefield JA,Basciotta MD,Ou J,McNulty JC,Zhu LJ,Celniker SE,Sinha S,Stormo GD,Brodsky MH,Wolfe SA

    update_date:2013-06-01 00:00:00

  • Massive turnover of functional sequence in human and other mammalian genomes.

    abstract::Despite the availability of dozens of animal genome sequences, two key questions remain unanswered: First, what fraction of any species' genome confers biological function, and second, are apparent differences in organismal complexity reflected in an objective measure of genomic complexity? Here, we address both quest...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.108795.110

    authors: Meader S,Ponting CP,Lunter G

    update_date:2010-10-01 00:00:00

  • Antisense transcripts with FANTOM2 clone set and their implications for gene regulation.

    abstract::We have used the FANTOM2 mouse cDNA set (60,770 clones), public mRNA data, and mouse genome sequence data to identify 2481 pairs of sense-antisense transcripts and 899 further pairs of nonantisense bidirectional transcription based upon genomic mapping. The analysis greatly expands the number of known examples of sens...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.982903

    authors: Kiyosawa H,Yamanaka I,Osato N,Kondo S,Hayashizaki Y,RIKEN GER Group.,GSL Members.

    update_date:2003-06-01 00:00:00

  • A cross-platform analysis of 14,177 expression quantitative trait loci derived from lymphoblastoid cell lines.

    abstract::Gene expression levels can be an important link between DNA variation and phenotypic manifestations. Our previous map of global gene expression, based on ~400K single nucleotide polymorphisms (SNPs) and 50K transcripts in 400 sib pairs from the MRCA family panel, has been widely used to interpret the results of genome...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.142521.112

    authors: Liang L,Morar N,Dixon AL,Lathrop GM,Abecasis GR,Moffatt MF,Cookson WO

    update_date:2013-04-01 00:00:00

  • The mouse Aire gene: comparative genomic sequencing, gene organization, and expression.

    abstract::Mutations in the human AIRE gene (hAIRE) result in the development of an autoimmune disease named APECED (autoimmune polyendocrinopathy candidiasis ectodermal dystrophy; OMIM 240300). Previously, we have cloned hAIRE and shown that it codes for a putative transcription-associated factor. Here we report the cloning and...

    journal_title:Genome research

    pub_type: Journal Article

    doi:

    authors: Blechschmidt K,Schweiger M,Wertz K,Poulson R,Christensen HM,Rosenthal A,Lehrach H,Yaspo ML

    update_date:1999-02-01 00:00:00

  • Differential divergence of three human pseudoautosomal genes and their mouse homologs: implications for sex chromosome evolution.

    abstract::The human pseudoautosomal region 1 (PAR1) is essential for meiotic pairing and recombination, and its deletion causes male sterility. Comparative studies of human and mouse pseudoautosomal genes are valuable in charting the evolution of this interesting region, but have been limited by the paucity of genes conserved b...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.197001

    authors: Gianfrancesco F,Sanges R,Esposito T,Tempesta S,Rao E,Rappold G,Archidiacono N,Graves JA,Forabosco A,D'Urso M

    update_date:2001-12-01 00:00:00

  • Genomics and hearing impairment.

    abstract::Hearing impairment is clinically and genetically heterogeneous. There are >400 disorders in which hearing impairment is a characteristic of the syndrome, and family studies demonstrate that there are at least 30 autosomal loci for nonsyndromic hearing impairment. The genes that have been identified encode diaphanous (...

    journal_title:Genome research

    pub_type: Historical Article, Journal Article, Review

    doi:

    authors: Keats BJ,Berlin CI

    update_date:1999-01-01 00:00:00

  • Comparative analysis of gene-expression patterns in human and African great ape cultured fibroblasts.

    abstract::Although much is known about genetic variation in human and African great ape (chimpanzee, bonobo, and gorilla) genomes, substantially less is known about variation in gene-expression profiles within and among these species. This information is necessary for defining transcriptional regulatory networks that contribute...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.1289803

    authors: Karaman MW,Houck ML,Chemnick LG,Nagpal S,Chawannakul D,Sudano D,Pike BL,Ho VV,Ryder OA,Hacia JG

    update_date:2003-07-01 00:00:00

  • Principled multi-omic analysis reveals gene regulatory mechanisms of phenotype variation.

    abstract::Recent studies have analyzed large-scale data sets of gene expression to identify genes associated with interindividual variation in phenotypes ranging from cancer subtypes to drug sensitivity, promising new avenues of research in personalized medicine. However, gene expression data alone is limited in its ability to ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.227066.117

    authors: Hanson C,Cairns J,Wang L,Sinha S

    update_date:2018-08-01 00:00:00

  • An EST-enriched comparative map of Brassica oleracea and Arabidopsis thaliana.

    abstract::A detailed comparative map of Brassica oleracea and Arabidopsis thaliana has been established based largely on mapping of Arabidopsis ESTs in two Arabidopsis and four Brassica populations. Based on conservative criteria for inferring synteny, "one to one correspondence" between Brassica and Arabidopsis chromosomes acc...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.10.6.776

    authors: Lan TH,DelMonte TA,Reischmann KP,Hyman J,Kowalski SP,McFerson J,Kresovich S,Paterson AH

    update_date:2000-06-01 00:00:00

  • Dynamic changes in replication timing and gene expression during lineage specification of human pluripotent stem cells.

    abstract::Duplication of the genome in mammalian cells occurs in a defined temporal order referred to as its replication-timing (RT) program. RT changes dynamically during development, regulated in units of 400-800 kb referred to as replication domains (RDs). Changes in RT are generally coordinated with transcriptional competen...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.187989.114

    authors: Rivera-Mulia JC,Buckley Q,Sasaki T,Zimmerman J,Didier RA,Nazor K,Loring JF,Lian Z,Weissman S,Robins AJ,Schulz TC,Menendez L,Kulik MJ,Dalton S,Gabr H,Kahveci T,Gilbert DM

    update_date:2015-08-01 00:00:00

  • CBX3 regulates efficient RNA processing genome-wide.

    abstract::CBX5, CBX1, and CBX3 (HP1α, β, and γ, respectively) play an evolutionarily conserved role in the formation and maintenance of heterochromatin. In addition, CBX5, CBX1, and CBX3 may also participate in transcriptional regulation of genes. Recently, CBX3 binding to the bodies of a subset of genes has been observed in hu...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.124818.111

    authors: Smallwood A,Hon GC,Jin F,Henry RE,Espinosa JM,Ren B

    update_date:2012-08-01 00:00:00

  • Accumulation of RNA on chromatin disrupts heterochromatic silencing.

    abstract::Long noncoding RNAs (lncRNAs) play a conserved role in regulating gene expression, chromatin dynamics, and cell differentiation. They serve as a platform for RNA interference (RNAi)-mediated heterochromatin formation or DNA methylation in many eukaryotic organisms. We found in Schizosaccharomyces pombe that heterochro...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.216986.116

    authors: Brönner C,Salvi L,Zocco M,Ugolini I,Halic M

    update_date:2017-07-01 00:00:00

  • Telomeric organization of a variable and inducible toxin gene family in the ancient eukaryote Giardia duodenalis.

    abstract::Giardia duodenalis is the best-characterized example of the most ancient eukaryotes, which are primitively amitochondrial and anaerobic. The surface of Giardia is coated with cysteine-rich proteins. One family of these proteins, CRP136, varies among isolates and upon environmental stress. A repeat region within the CR...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.7.1.37

    authors: Upcroft P,Chen N,Upcroft JA

    update_date:1997-01-01 00:00:00

  • Gene amplification as double minutes or homogeneously staining regions in solid tumors: origin and structure.

    abstract::Double minutes (dmin) and homogeneously staining regions (hsr) are the cytogenetic hallmarks of genomic amplification in cancer. Different mechanisms have been proposed to explain their genesis. Recently, our group showed that the MYC-containing dmin in leukemia cases arise by excision and amplification (episome model...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.106252.110

    authors: Storlazzi CT,Lonoce A,Guastadisegni MC,Trombetta D,D'Addabbo P,Daniele G,L'Abbate A,Macchia G,Surace C,Kok K,Ullmann R,Purgato S,Palumbo O,Carella M,Ambros PF,Rocchi M

    update_date:2010-09-01 00:00:00

  • Genome-wide parent-of-origin DNA methylation analysis reveals the intricacies of human imprinting and suggests a germline methylation-independent mechanism of establishment.

    abstract::Differential methylation between the two alleles of a gene has been observed in imprinted regions, where the methylation of one allele occurs on a parent-of-origin basis, the inactive X-chromosome in females, and at those loci whose methylation is driven by genetic variants. We have extensively characterized imprinted...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.164913.113

    authors: Court F,Tayama C,Romanelli V,Martin-Trujillo A,Iglesias-Platas I,Okamura K,Sugahara N,Simón C,Moore H,Harness JV,Keirstead H,Sanchez-Mut JV,Kaneki E,Lapunzina P,Soejima H,Wake N,Esteller M,Ogata T,Hata K,Nakabayashi

    update_date:2014-04-01 00:00:00

  • Schizosaccharomyces pombe essential genes: a pilot study.

    abstract::After completion of the Schizosaccharomyces pombe genome sequence, we have carried out a pilot gene deletion project to assess the feasibility of a genome-wide deletion project and to estimate the percentage of essential genes. Using a PCR-based gene deletion procedure, we investigated 100 genes within a 253-kb region...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.636103

    authors: Decottignies A,Sanchez-Perez I,Nurse P

    update_date:2003-03-01 00:00:00

  • Global analysis of protein homomerization in Saccharomyces cerevisiae.

    abstract::In vivo analyses of the occurrence, subcellular localization, and dynamics of protein-protein interactions (PPIs) are important issues in functional proteomic studies. The bimolecular fluorescence complementation (BiFC) assay has many advantages in that it provides a reliable way to detect PPIs in living cells with mi...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.231860.117

    authors: Kim Y,Jung JP,Pack CG,Huh WK

    update_date:2019-01-01 00:00:00

  • Meiotic recombination generates rich diversity in NK cell receptor genes, alleles, and haplotypes.

    abstract::Natural killer (NK) cells contribute to the essential functions of innate immunity and reproduction. Various genes encode NK cell receptors that recognize the major histocompatibility complex (MHC) Class I molecules expressed by other cells. For primate NK cells, the killer-cell immunoglobulin-like receptors (KIR) are...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.085738.108

    authors: Norman PJ,Abi-Rached L,Gendzekhadze K,Hammond JA,Moesta AK,Sharma D,Graef T,McQueen KL,Guethlein LA,Carrington CV,Chandanayingyong D,Chang YH,Crespí C,Saruhan-Direskeneli G,Hameed K,Kamkamidze G,Koram KA,Layrisse Z,Ma

    update_date:2009-05-01 00:00:00

  • Efficient identification of Y chromosome sequences in the human and Drosophila genomes.

    abstract::Notwithstanding their biological importance, Y chromosomes remain poorly known in most species. A major obstacle to their study is the identification of Y chromosome sequences; due to its high content of repetitive DNA, in most genome projects, the Y chromosome sequence is fragmented into a large number of small, unma...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.156034.113

    authors: Carvalho AB,Clark AG

    update_date:2013-11-01 00:00:00

  • DNA profiling of B chromosomes from the yellow-necked mouse Apodemus flavicollis (Rodentia, Mammalia).

    abstract::Using AP-PCR-based DNA profiling we examined some structural features of B chromosomes from yellow-necked mice Apodemus flavicollis. Mice harboring one, two, or three or lacking B chromosomes were examined. Chromosomal structure was scanned for variant bands by using a series of arbitrary primers and from these, infor...

    journal_title:Genome research

    pub_type: Journal Article

    doi:

    authors: Tanic N,Dedovic N,Vujosevic M,Dimitrijevic B

    update_date:2000-01-01 00:00:00

  • Unamplified cap analysis of gene expression on a single-molecule sequencer.

    abstract::We report the development of a simplified cap analysis of gene expression (CAGE) protocol adapted for single-molecule sequencers that avoids second strand synthesis, ligation, digestion, and PCR. HeliScopeCAGE directly sequences the 3' end of cap trapped first-strand cDNAs. As with previous versions of CAGE, we better...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.115469.110

    authors: Kanamori-Katayama M,Itoh M,Kawaji H,Lassmann T,Katayama S,Kojima M,Bertin N,Kaiho A,Ninomiya N,Daub CO,Carninci P,Forrest AR,Hayashizaki Y

    update_date:2011-07-01 00:00:00