Modeling kinetic rate variation in third generation DNA sequencing data to detect putative modifications to DNA bases.

Abstract:

Current generation DNA sequencing instruments are moving closer to seamlessly sequencing genomes of entire populations as a routine part of scientific investigation. However, while significant inroads have been made identifying small nucleotide variation and structural variations in DNA that impact phenotypes of interest, progress has not been as dramatic regarding epigenetic changes and base-level damage to DNA, largely due to technological limitations in assaying all known and unknown types of modifications at genome scale. Recently, single-molecule real time (SMRT) sequencing has been reported to identify kinetic variation (KV) events that have been demonstrated to reflect epigenetic changes of every known type, providing a path forward for detecting base modifications as a routine part of sequencing. However, to date no statistical framework has been proposed to enhance the power to detect these events while also controlling for false-positive events. By modeling enzyme kinetics in the neighborhood of an arbitrary location in a genomic region of interest as a conditional random field, we provide a statistical framework for incorporating kinetic information at a test position of interest as well as at neighboring sites that help enhance the power to detect KV events. The performance of this and related models is explored, with the best-performing model applied to plasmid DNA isolated from Escherichia coli and mitochondrial DNA isolated from human brain tissue. We highlight widespread kinetic variation events, some of which strongly associate with known modification events, while others represent putative chemically modified sites of unknown types.

journal_name

Genome Res

journal_title

Genome research

authors

Schadt EE,Banerjee O,Fang G,Feng Z,Wong WH,Zhang X,Kislyuk A,Clark TA,Luong K,Keren-Paz A,Chess A,Kumar V,Chen-Plotkin A,Sondheimer N,Korlach J,Kasarskis A

doi

10.1101/gr.136739.111

subject

Has Abstract

pub_date

2013-01-01 00:00:00

pages

129-41

issue

1

eissn

1549-5469

issn

1088-9051

pii

gr.136739.111

journal_volume

23

pub_type

Journal Article
  • WebLogo: a sequence logo generator.

    abstract::WebLogo generates sequence logos, graphical representations of the patterns within a multiple sequence alignment. Sequence logos provide a richer and more precise description of sequence similarity than consensus sequences and can rapidly reveal significant features of the alignment otherwise difficult to perceive. Ea...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.849004

    authors: Crooks GE,Hon G,Chandonia JM,Brenner SE

    update_date:2004-06-01 00:00:00

  • Deterministic protein inference for shotgun proteomics data provides new insights into Arabidopsis pollen development and function.

    abstract::Pollen, the male gametophyte of flowering plants, represents an ideal biological system to study developmental processes, such as cell polarity, tip growth, and morphogenesis. Upon hydration, the metabolically quiescent pollen rapidly switches to an active state, exhibiting extremely fast growth. This rapid switch req...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.089060.108

    authors: Grobei MA,Qeli E,Brunner E,Rehrauer H,Zhang R,Roschitzki B,Basler K,Ahrens CH,Grossniklaus U

    update_date:2009-10-01 00:00:00

  • Yeast genetic interaction screen of human genes associated with amyotrophic lateral sclerosis: identification of MAP2K5 kinase as a potential drug target.

    abstract::To understand disease mechanisms, a large-scale analysis of human-yeast genetic interactions was performed. Of 1305 human disease genes assayed, 20 genes exhibited strong toxicity in yeast. Human-yeast genetic interactions were identified by en masse transformation of the human disease genes into a pool of 4653 homozy...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.211649.116

    authors: Jo M,Chung AY,Yachie N,Seo M,Jeon H,Nam Y,Seo Y,Kim E,Zhong Q,Vidal M,Park HC,Roth FP,Suk K

    update_date:2017-09-01 00:00:00

  • A GC-rich sequence feature in the 3' UTR directs UPF1-dependent mRNA decay in mammalian cells.

    abstract::Up-frameshift protein 1 (UPF1) is an ATP-dependent RNA helicase that has essential roles in RNA surveillance and in post-transcriptional gene regulation by promoting the degradation of mRNAs. Previous studies revealed that UPF1 is associated with the 3' untranslated region (UTR) of target mRNAs via as-yet-unknown sequ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.206060.116

    authors: Imamachi N,Salam KA,Suzuki Y,Akimitsu N

    update_date:2017-03-01 00:00:00

  • Dynamic building of a BAC clone tiling path for the Rat Genome Sequencing Project.

    abstract::CLONEPICKER is a software pipeline that integrates sequence data with BAC clone fingerprints to dynamically select a minimal overlapping clone set covering the whole genome. In the Rat Genome Sequencing Project (RGSP), a hybrid strategy of "clone by clone" and "whole genome shotgun" approaches was used to maximize the...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.2171704

    authors: Chen R,Sodergren E,Weinstock GM,Gibbs RA

    update_date:2004-04-01 00:00:00

  • Evolution of a genomic regulatory domain: the role of gene co-option and gene duplication in the Enhancer of split complex.

    abstract::The Drosophila Enhancer of split complex [E(spl)-C] is a remarkable complex of genes many of which are effectors or modulators of Notch signaling. The complex contains different classes of genes including four bearded genes and seven basic helix-loop-helix (bHLH) genes. We examined the evolution of this unusual comple...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.104794.109

    authors: Duncan EJ,Dearden PK

    update_date:2010-07-01 00:00:00

  • Rapid molecular assays to study human centromere genomics.

    abstract::The centromere is the structural unit responsible for the faithful segregation of chromosomes. Although regulation of centromeric function by epigenetic factors has been well-studied, the contributions of the underlying DNA sequences have been much less well defined, and existing methodologies for studying centromere ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.219709.116

    authors: Contreras-Galindo R,Fischer S,Saha AK,Lundy JD,Cervantes PW,Mourad M,Wang C,Qian B,Dai M,Meng F,Chinnaiyan A,Omenn GS,Kaplan MH,Markovitz DM

    update_date:2017-12-01 00:00:00

  • Gene-specific vulnerability to imprinting variability in human embryonic stem cell lines.

    abstract::Disregulation of imprinted genes can be associated with tumorigenesis and altered cell differentiation capacity and so could provide adverse outcomes for stem cell applications. Although the maintenance of mouse and primate embryonic stem cells in a pluripotent state has been reported to disrupt the monoallelic expres...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.6609207

    authors: Kim KP,Thurston A,Mummery C,Ward-van Oostwaard D,Priddle H,Allegrucci C,Denning C,Young L

    update_date:2007-12-01 00:00:00

  • From first base: the sequence of the tip of the X chromosome of Drosophila melanogaster, a comparison of two sequencing strategies.

    abstract::We present the sequence of a contiguous 2.63 Mb of DNA extending from the tip of the X chromosome of Drosophila melanogaster. Within this sequence, we predict 277 protein coding genes, of which 94 had been sequenced already in the course of studying the biology of their gene products, and examples of 12 different tran...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.173801

    authors: Benos PV,Gatt MK,Murphy L,Harris D,Barrell B,Ferraz C,Vidal S,Brun C,Demaille J,Cadieu E,Dreano S,Gloux S,Lelaure V,Mottier S,Galibert F,Borkova D,Miñana B,Kafatos FC,Bolshakov S,Sidén-Kiamos I,Papagiannakis G,S

    update_date:2001-05-01 00:00:00

  • Systematic recovery and analysis of full-ORF human cDNA clones.

    abstract::The Mammalian Gene Collection (MGC) consortium (http://mgc.nci.nih.gov) seeks to establish publicly available collections of full-ORF cDNAs for several organisms of significance to biomedical research, including human. To date over 15,200 human cDNA clones containing full-length open reading frames (ORFs) have been id...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.2473704

    authors: Baross A,Butterfield YS,Coughlin SM,Zeng T,Griffith M,Griffith OL,Petrescu AS,Smailus DE,Khattra J,McDonald HL,McKay SJ,Moksa M,Holt RA,Marra MA

    update_date:2004-10-01 00:00:00

  • Annotated expressed sequence tags and cDNA microarrays for studies of brain and behavior in the honey bee.

    abstract::To accelerate the molecular analysis of behavior in the honey bee (Apis mellifera), we created expressed sequence tag (EST) and cDNA microarray resources for the bee brain. Over 20,000 cDNA clones were partially sequenced from a normalized (and subsequently subtracted) library generated from adult A. mellifera brains....

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.5302

    authors: Whitfield CW,Band MR,Bonaldo MF,Kumar CG,Liu L,Pardinas JR,Robertson HM,Soares MB,Robinson GE

    update_date:2002-04-01 00:00:00

  • Integrative functional genomics identifies an enhancer looping to the SOX9 gene disrupted by the 17q24.3 prostate cancer risk locus.

    abstract::Genome-wide association studies (GWAS) are identifying genetic predisposition to various diseases. The 17q24.3 locus harbors the single nucleotide polymorphism (SNP) rs1859962 that is statistically associated with prostate cancer (PCa). It defines a 130-kb linkage disequilibrium (LD) block that lies in an ∼2-Mb gene d...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.135665.111

    authors: Zhang X,Cowper-Sal lari R,Bailey SD,Moore JH,Lupien M

    update_date:2012-08-01 00:00:00

  • Two large families of chemoreceptor genes in the nematodes Caenorhabditis elegans and Caenorhabditis briggsae reveal extensive gene duplication, diversification, movement, and intron loss.

    abstract::The str family of genes encoding seven-transmembrane G-protein-coupled or serpentine receptors related to the ODR-10 diacetyl chemoreceptor is very large, with at least 197 members in the Caenorhabditis elegans genome. The closely related stl family has 43 genes, and both families are distantly related to the srd fami...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.8.5.449

    authors: Robertson HM

    update_date:1998-05-01 00:00:00

  • Distal CpG islands can serve as alternative promoters to transcribe genes with silenced proximal promoters.

    abstract::DNA methylation at the promoter of a gene is presumed to render it silent, yet a sizable fraction of genes with methylated proximal promoters exhibit elevated expression. Here, we show, through extensive analysis of the methylome and transcriptome in 34 tissues, that in many such cases, transcription is initiated by a...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.212050.116

    authors: Sarda S,Das A,Vinson C,Hannenhalli S

    update_date:2017-04-01 00:00:00

  • Functional DNA methylation differences between tissues, cell types, and across individuals discovered using the M&M algorithm.

    abstract::DNA methylation plays key roles in diverse biological processes such as X chromosome inactivation, transposable element repression, genomic imprinting, and tissue-specific gene expression. Sequencing-based DNA methylation profiling provides an unprecedented opportunity to map and compare complete DNA methylomes. This ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.156539.113

    authors: Zhang B,Zhou Y,Lin N,Lowdon RF,Hong C,Nagarajan RP,Cheng JB,Li D,Stevens M,Lee HJ,Xing X,Zhou J,Sundaram V,Elliott G,Gu J,Shi T,Gascard P,Sigaroudinia M,Tlsty TD,Kadlecek T,Weiss A,O'Geen H,Farnham PJ,Maire

    update_date:2013-09-01 00:00:00

  • Profiling patterned transcripts in Drosophila embryos.

    abstract::Here we describe a high-throughput screen to isolate transcripts with spatially restricted patterns of expression in early embryos. Our approach utilizes robotic automation for rapid analysis of sequence-selected cDNAs in a whole-mount in situ hybridization assay. We determined the spatial distribution of a random col...

    journal_title:Genome research

    pub_type: Letter

    doi:10.1101/gr.84402

    authors: Simin K,Scuderi A,Reamey J,Dunn D,Weiss R,Metherall JE,Letsou A

    update_date:2002-07-01 00:00:00

  • Transcriptional fates of human-specific segmental duplications in brain.

    abstract::Despite the importance of duplicate genes for evolutionary adaptation, accurate gene annotation is often incomplete, incorrect, or lacking in regions of segmental duplication. We developed an approach combining long-read sequencing and hybridization capture to yield full-length transcript information and confidently d...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.237610.118

    authors: Dougherty ML,Underwood JG,Nelson BJ,Tseng E,Munson KM,Penn O,Nowakowski TJ,Pollen AA,Eichler EE

    update_date:2018-10-01 00:00:00

  • Centromere repositioning.

    abstract::Primate pericentromeric regions recently have been shown to exhibit extraordinary evolutionary plasticity. In this paper we report an additional peculiar feature of these regions that we discovered while analyzing, by FISH, the evolutionary conservation of primate phylogenetic chromosome IX. If the position of the cen...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.9.12.1184

    authors: Montefalcone G,Tempesta S,Rocchi M,Archidiacono N

    update_date:1999-12-01 00:00:00

  • Alterations in TCF7L2 expression define its role as a key regulator of glucose metabolism.

    abstract::Genome-wide association studies (GWAS) have consistently implicated noncoding variation within the TCF7L2 locus with type 2 diabetes (T2D) risk. While this locus represents the strongest genetic determinant for T2D risk in humans, it remains unclear how these noncoding variants affect disease etiology. To test the hyp...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.123745.111

    authors: Savic D,Ye H,Aneas I,Park SY,Bell GI,Nobrega MA

    update_date:2011-09-01 00:00:00

  • Exploring expression data: identification and analysis of coexpressed genes.

    abstract::Analysis procedures are needed to extract useful information from the large amount of gene expression data that is becoming available. This work describes a set of analytical tools and their application to yeast cell cycle data. The components of our approach are (1) a similarity measure that reduces the number of fal...

    journal_title:Genome research

    pub_type: Journal Article, Review

    doi:10.1101/gr.9.11.1106

    authors: Heyer LJ,Kruglyak S,Yooseph S

    update_date:1999-11-01 00:00:00

  • Inference of population genetic parameters in metagenomics: a clean look at messy data.

    abstract::Metagenomic projects generate short, overlapping fragments of DNA sequence, each deriving from a different individual. We report a new method for inferring the scaled mutation rate, theta = 2Neu, and the scaled exponential growth rate, R = Ner, from the site-frequency spectrum of these data while accounting for sequen...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.5431206

    authors: Johnson PL,Slatkin M

    update_date:2006-10-01 00:00:00

  • The effect of genotype and in utero environment on interindividual variation in neonate DNA methylomes.

    abstract::Integrating the genotype with epigenetic marks holds the promise of better understanding the biology that underlies the complex interactions of inherited and environmental components that define the developmental origins of a range of disorders. The quality of the in utero environment significantly influences health o...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.171439.113

    authors: Teh AL,Pan H,Chen L,Ong ML,Dogra S,Wong J,MacIsaac JL,Mah SM,McEwen LM,Saw SM,Godfrey KM,Chong YS,Kwek K,Kwoh CK,Soh SE,Chong MF,Barton S,Karnani N,Cheong CY,Buschdorf JP,Stünkel W,Kobor MS,Meaney MJ,Gluckma

    update_date:2014-07-01 00:00:00

  • Widespread plasticity in CTCF occupancy linked to DNA methylation.

    abstract::CTCF is a ubiquitously expressed regulator of fundamental genomic processes including transcription, intra- and interchromosomal interactions, and chromatin structure. Because of its critical role in genome function, CTCF binding patterns have long been assumed to be largely invariant across different cellular environ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.136101.111

    authors: Wang H,Maurano MT,Qu H,Varley KE,Gertz J,Pauli F,Lee K,Canfield T,Weaver M,Sandstrom R,Thurman RE,Kaul R,Myers RM,Stamatoyannopoulos JA

    update_date:2012-09-01 00:00:00

  • A platform for curated products from novel open reading frames prompts reinterpretation of disease variants.

    abstract::Recent evidence from proteomics and deep massively parallel sequencing studies have revealed that eukaryotic genomes contain substantial numbers of as-yet-uncharacterized open reading frames (ORFs). We define these uncharacterized ORFs as novel ORFs (nORFs). nORFs in humans are mostly under 100 codons and are found in...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.263202.120

    authors: Neville MDC,Kohze R,Erady C,Meena N,Hayden M,Cooper DN,Mort M,Prabakaran S

    update_date:2021-01-19 00:00:00

  • Next-generation sequencing identifies the natural killer cell microRNA transcriptome.

    abstract::Natural killer (NK) cells are innate lymphocytes important for early host defense against infectious pathogens and surveillance against malignant transformation. Resting murine NK cells regulate the translation of effector molecule mRNAs (e.g., granzyme B, GzmB) through unclear molecular mechanisms. MicroRNAs (miRNAs)...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.107995.110

    authors: Fehniger TA,Wylie T,Germino E,Leong JW,Magrini VJ,Koul S,Keppel CR,Schneider SE,Koboldt DC,Sullivan RP,Heinz ME,Crosby SD,Nagarajan R,Ramsingh G,Link DC,Ley TJ,Mardis ER

    update_date:2010-11-01 00:00:00

  • Conservation, regulation, synteny, and introns in a large-scale C. briggsae-C. elegans genomic alignment.

    abstract::A new algorithm, WABA, was developed for doing large-scale alignments between genomic DNA of different species. WABA was used to align 8 million bases of Caenorhabditis briggsae genomic DNA against the entire 97-million-base Caenorhabditis elegans genome. The alignment, including C. briggsae homologs of 154 geneticall...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.10.8.1115

    authors: Kent WJ,Zahler AM

    update_date:2000-08-01 00:00:00

  • Probing genomic diversity and evolution of Escherichia coli O157 by single nucleotide polymorphisms.

    abstract::Infections by Shiga toxin-producing Escherichia coli O157:H7 (STEC O157) are the predominant cause of bloody diarrhea and hemolytic uremic syndrome in the United States. In silico comparison of the two complete STEC O157 genomes (Sakai and EDL933) revealed a strikingly high level of sequence identity in orthologous pr...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.4759706

    authors: Zhang W,Qi W,Albert TJ,Motiwala AS,Alland D,Hyytia-Trees EK,Ribot EM,Fields PI,Whittam TS,Swaminathan B

    update_date:2006-06-01 00:00:00

  • Rapid evolution of mouse Y centromere repeat DNA belies recent sequence stability.

    abstract::The Y centromere sequence of house mouse, Mus musculus, remains unknown despite our otherwise significant knowledge of the genome sequence of this important mammalian model organism. Here, we report the complete molecular characterization of the C57BL/6J chromosome Y centromere, which comprises a highly diverged minor...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.092080.109

    authors: Pertile MD,Graham AN,Choo KH,Kalitsis P

    update_date:2009-12-01 00:00:00

  • Decay rates of human mRNAs: correlation with functional characteristics and sequence attributes.

    abstract::Although mRNA decay rates are a key determinant of the steady-state concentration for any given mRNA species, relatively little is known, on a population level, about what factors influence turnover rates and how these rates are integrated into cellular decisions. We decided to measure mRNA decay rates in two human ce...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.1272403

    authors: Yang E,van Nimwegen E,Zavolan M,Rajewsky N,Schroeder M,Magnasco M,Darnell JE Jr

    update_date:2003-08-01 00:00:00

  • DIG-seq: a genome-wide CRISPR off-target profiling method using chromatin DNA.

    abstract::To investigate whether and how CRISPR-Cas9 on-target and off-target activities are affected by chromatin in eukaryotic cells, we first identified a series of identical endogenous DNA sequences present in both open and closed chromatin regions and then measured mutation frequencies at these sites in human cells using C...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.236620.118

    authors: Kim D,Kim JS

    update_date:2018-12-01 00:00:00