Characterization of background noise in capture-based targeted sequencing data.

Abstract:

BACKGROUND: Targeted deep sequencing is increasingly used to detect low-allelic fraction variants; it is therefore essential to characterize the errors that constitute baseline noise and impose a practical limit on detection. In the present study, we systematically evaluate the extent to which errors are incurred during specific steps of the capture-based targeted sequencing process. RESULTS: We removed most sequencing artifacts by filtering out low-quality bases and then analyzed the remaining background noise. By recognizing that plasma DNA is naturally fragmented to a size comparable to that of mono-nucleosomal DNA, we were able to identify and characterize errors that are specifically associated with acoustic shearing. Two-thirds of C:G > A:T errors and one quarter of C:G > G:C errors were attributed to the oxidation of guanine during acoustic shearing, and this was further validated by comparative experiments conducted under different shearing conditions. The acoustic shearing step also causes A > G and A > T substitutions localized to the end bases of sheared DNA fragments, indicating a probable association of these errors with DNA breakage. Finally, the hybrid selection step accounts for one-third of the remaining C:G > A:T errors and one-fifth of the C > T errors. CONCLUSIONS: The results of this study provide a comprehensive summary of the various errors incurred during targeted deep sequencing and their underlying causes. This information will be invaluable in driving technical improvements to this sequencing method, and may increase the future use of targeted deep sequencing for low-allelic fraction variant detection.

journal_name

Genome Biol

journal_title

Genome biology

authors

Park G,Park JK,Shin SH,Jeon HJ,Kim NKD,Kim YJ,Shin HT,Lee E,Lee KH,Son DS,Park WY,Park D

doi

10.1186/s13059-017-1275-2

subject

Has Abstract

pub_date

2017-07-21 00:00:00

pages

136

issue

1

eissn

1474-7596

issn

1474-760X

pii

10.1186/s13059-017-1275-2

journal_volume

18

pub_type

Journal Article
  • Survey of human mitochondrial diseases using new genomic/proteomic tools.

    abstract:BACKGROUND:We have constructed Bayesian prior-based, amino-acid sequence profiles for the complete yeast mitochondrial proteome and used them to develop methods for identifying and characterizing the context of protein mutations that give rise to human mitochondrial diseases. (Bayesian priors are conditional probabilit...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2001-2-6-research0021

    authors: Plasterer TN,Smith TF,Mohr SC

    update_date:2001-01-01 00:00:00

  • Transcriptome analysis reveals new insight into appressorium formation and function in the rice blast fungus Magnaporthe oryzae.

    abstract:BACKGROUND:Rice blast disease is caused by the filamentous Ascomycetous fungus Magnaporthe oryzae and results in significant annual rice yield losses worldwide. Infection by this and many other fungal plant pathogens requires the development of a specialized infection cell called an appressorium. The molecular processe...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2008-9-5-r85

    authors: Oh Y,Donofrio N,Pan H,Coughlan S,Brown DE,Meng S,Mitchell T,Dean RA

    update_date:2008-01-01 00:00:00

  • Quantifying the mechanisms of domain gain in animal proteins.

    abstract:BACKGROUND:Protein domains are protein regions that are shared among different proteins and are frequently functionally and structurally independent from the rest of the protein. Novel domain combinations have a major role in evolutionary innovation. However, the relative contributions of the different molecular mechan...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2010-11-7-r74

    authors: Buljan M,Frankish A,Bateman A

    update_date:2010-01-01 00:00:00

  • Phytophthora capsici-tomato interaction features dramatic shifts in gene expression associated with a hemi-biotrophic lifestyle.

    abstract:BACKGROUND:Plant-microbe interactions feature complex signal interplay between pathogens and their hosts. Phytophthora species comprise a destructive group of fungus-like plant pathogens, collectively affecting a wide range of plants important to agriculture and natural ecosystems. Despite the availability of genome se...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2013-14-6-r63

    authors: Jupe J,Stam R,Howden AJ,Morris JA,Zhang R,Hedley PE,Huitema E

    update_date:2013-06-25 00:00:00

  • Comparative genome sequence analysis underscores mycoparasitism as the ancestral life style of Trichoderma.

    abstract:BACKGROUND:Mycoparasitism, a lifestyle where one fungus is parasitic on another fungus, has special relevance when the prey is a plant pathogen, providing a strategy for biological control of pests for plant protection. Probably, the most studied biocontrol agents are species of the genus Hypocrea/Trichoderma. RESULTS...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2011-12-4-r40

    authors: Kubicek CP,Herrera-Estrella A,Seidl-Seiboth V,Martinez DA,Druzhinina IS,Thon M,Zeilinger S,Casas-Flores S,Horwitz BA,Mukherjee PK,Mukherjee M,Kredics L,Alcaraz LD,Aerts A,Antal Z,Atanasova L,Cervantes-Badillo MG,Challac

    update_date:2011-01-01 00:00:00

  • Sirtuins: Sir2-related NAD-dependent protein deacetylases.

    abstract:Silent information regulator 2 (Sir2) proteins, or sirtuins, are protein deacetylases dependent on nicotinamide adenine dinucleotide (NAD) and are found in organisms ranging from bacteria to humans. In eukaryotes, sirtuins regulate transcriptional repression, recombination, the cell-division cycle, microtubule organizatio...

    journal_title:Genome biology

    pub_type: Journal Article, Review

    doi:10.1186/gb-2004-5-5-224

    authors: North BJ,Verdin E

    update_date:2004-01-01 00:00:00

  • Statistical tests for differential expression in cDNA microarray experiments.

    abstract:Extracting biological information from microarray data requires appropriate statistical methods. The simplest statistical method for detecting differential expression is the t test, which can be used to compare two conditions when there is replication of samples. With more than two conditions, analysis of variance (AN...

    journal_title:Genome biology

    pub_type: Journal Article, Review

    doi:10.1186/gb-2003-4-4-210

    authors: Cui X,Churchill GA

    update_date:2003-01-01 00:00:00

  • The amazing world of bacterial structured RNAs.

    abstract:The discovery of several new structured non-coding RNAs in bacterial and archaeal genomes and metagenomes raises burning questions about their biological and biochemical functions. ...

    journal_title:Genome biology

    pub_type: Journal Article, Review

    doi:10.1186/gb-2010-11-3-108

    authors: Westhof E

    update_date:2010-01-01 00:00:00

  • Reproducible inference of transcription factor footprints in ATAC-seq and DNase-seq datasets using protocol-specific bias modeling.

    abstract:BACKGROUND:DNase-seq and ATAC-seq are broadly used methods to assay open chromatin regions genome-wide. The single nucleotide resolution of DNase-seq has been further exploited to infer transcription factor binding sites (TFBSs) in regulatory regions through footprinting. Recent studies have demonstrated the sequence b...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-019-1654-y

    authors: Karabacak Calviello A,Hirsekorn A,Wurmus R,Yusuf D,Ohler U

    update_date:2019-02-21 00:00:00

  • Comprehensive assessment of computational algorithms in predicting cancer driver mutations.

    abstract:BACKGROUND:The initiation and subsequent evolution of cancer are largely driven by a relatively small number of somatic mutations with critical functional impacts, so-called driver mutations. Identifying driver mutations in a patient's tumor cells is a central task in the era of precision cancer medicine. Over the deca...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-020-01954-z

    authors: Chen H,Li J,Wang Y,Ng PK,Tsang YH,Shaw KR,Mills GB,Liang H

    update_date:2020-02-20 00:00:00

  • Epigenetic modifications of histones in cancer.

    abstract:The epigenetic modifications of histones are versatile marks that are intimately connected to development and disease pathogenesis including human cancers. In this review, we will discuss the many different types of histone modifications and the biological processes with which they are involved. Specifically, we revie...

    journal_title:Genome biology

    pub_type: Journal Article, Review

    doi:10.1186/s13059-019-1870-5

    authors: Zhao Z,Shilatifard A

    update_date:2019-11-20 00:00:00

  • All Your Base: a fast and accurate probabilistic approach to base calling.

    abstract:The accuracy of base calls produced by Illumina sequencers is adversely affected by several processes, with laser cross-talk and cluster phasing being prominent. We introduce an explicit statistical model of the sequencing process that generalizes current models of phasing and cross-talk and forms the basis of a base ...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2012-13-2-r13

    authors: Massingham T,Goldman N

    update_date:2012-02-29 00:00:00

  • Allele-specific copy number analysis of tumor samples with aneuploidy and tumor heterogeneity.

    abstract:We describe a bioinformatic tool, Tumor Aberration Prediction Suite (TAPS), for the identification of allele-specific copy numbers in tumor samples using data from Affymetrix SNP arrays. It includes detailed visualization of genomic segment characteristics and iterative pattern recognition for copy number identificati...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2011-12-10-r108

    authors: Rasmussen M,Sundström M,Göransson Kultima H,Botling J,Micke P,Birgisson H,Glimelius B,Isaksson A

    update_date:2011-10-24 00:00:00

  • Molecular mechanisms of spindle function.

    abstract:The key molecules involved in regulating the assembly and function of the mitotic spindle are shared by evolutionarily divergent species. Studies in different model systems are leading to convergent conclusions about the central role of microtubule nucleation and dynamics and of kinesin-related motor proteins in spind...

    journal_title:Genome biology

    pub_type: Journal Article, Review

    doi:10.1186/gb-2000-1-1-reviews101

    authors: Walczak CE

    update_date:2000-01-01 00:00:00

  • POSaM: a fast, flexible, open-source, inkjet oligonucleotide synthesizer and microarrayer.

    abstract:DNA arrays are valuable tools in molecular biology laboratories. Their rapid acceptance was aided by the release of plans for a pin-spotting microarrayer by researchers at Stanford. Inkjet microarraying is a flexible, complementary technique that allows the synthesis of arrays of any oligonucleotide sequences de novo....

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2004-5-8-r58

    authors: Lausted C,Dahl T,Warren C,King K,Smith K,Johnson M,Saleem R,Aitchison J,Hood L,Lasky SR

    update_date:2004-01-01 00:00:00

  • GNB5 mutation causes a novel neuropsychiatric disorder featuring attention deficit hyperactivity disorder, severely impaired language development and normal cognition.

    abstract:BACKGROUND:Neuropsychiatric disorders are common forms of disability in humans. Despite recent progress in deciphering the genetics of these disorders, their phenotypic complexity continues to be a major challenge. Mendelian neuropsychiatric disorders are rare but their study has the potential to unravel novel mechanis...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-016-1061-6

    authors: Shamseldin HE,Masuho I,Alenizi A,Alyamani S,Patil DN,Ibrahim N,Martemyanov KA,Alkuraya FS

    update_date:2016-09-27 00:00:00

  • Games with a scientific purpose.

    abstract:The protein folding game Foldit shows that games are an effective way to recruit, engage and organize ordinary citizens to help solve difficult scientific problems. ...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2011-12-12-135

    authors: Good BM,Su AI

    update_date:2011-12-28 00:00:00

  • xCell: digitally portraying the tissue cellular heterogeneity landscape.

    abstract:Tissues are complex milieus consisting of numerous cell types. Several recent methods have attempted to enumerate cell subsets from transcriptomes. However, the available methods have used limited sources for training and give only a partial portrayal of the full cellular landscape. Here we present xCell, a novel gene...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-017-1349-1

    authors: Aran D,Hu Z,Butte AJ

    update_date:2017-11-15 00:00:00

  • Deep sequencing reveals cell-type-specific patterns of single-cell transcriptome variation.

    abstract:BACKGROUND:Differentiation of metazoan cells requires execution of different gene expression programs but recent single-cell transcriptome profiling has revealed considerable variation within cells of seeming identical phenotype. This brings into question the relationship between transcriptome states and cell phenotype...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-015-0683-4

    authors: Dueck H,Khaladkar M,Kim TK,Spaethling JM,Francis C,Suresh S,Fisher SA,Seale P,Beck SG,Bartfai T,Kuhn B,Eberwine J,Kim J

    update_date:2015-06-09 00:00:00

  • Predicting the effects of frameshifting indels.

    abstract:Each human has approximately 50 to 280 frameshifting indels, yet their implications are unknown. We created SIFT Indel, a prediction method for frameshifting indels that has 84% accuracy. The percentage of human frameshifting indels predicted to be gene-damaging is negatively correlated with allele frequency. We also ...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2012-13-2-r9

    authors: Hu J,Ng PC

    update_date:2012-02-09 00:00:00

  • EpiTEome: Simultaneous detection of transposable element insertion sites and their DNA methylation levels.

    abstract:The genome-wide investigation of DNA methylation levels has been limited to reference transposable element positions. The methylation analysis of non-reference and mobile transposable elements has only recently been performed, but required both genome resequencing and MethylC-seq datasets. We have created epiTEome, a ...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-017-1232-0

    authors: Daron J,Slotkin RK

    update_date:2017-05-12 00:00:00

  • Quantitative protein expression profiling reveals extensive post-transcriptional regulation and post-translational modifications in schizont-stage malaria parasites.

    abstract:BACKGROUND:Malaria is one of the most important infectious diseases and is caused by parasitic protozoa of the genus Plasmodium. Previously, quantitative characterization of the P. falciparum transcriptome demonstrated that the strictly controlled progression of these parasites through their intra-erythrocytic develo...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2008-9-12-r177

    authors: Foth BJ,Zhang N,Mok S,Preiser PR,Bozdech Z

    update_date:2008-01-01 00:00:00

  • Barley landraces are characterized by geographically heterogeneous genomic origins.

    abstract:BACKGROUND:The genetic provenance of domesticated plants and the routes along which they were disseminated in prehistory have been a long-standing source of debate. Much of this debate has focused on identifying centers of origins for individual crops. However, many important crops show clear genetic signatures of mult...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-015-0712-3

    authors: Poets AM,Fang Z,Clegg MT,Morrell PL

    update_date:2015-08-21 00:00:00

  • A genome-wide screen for modifiers of transgene variegation identifies genes with critical roles in development.

    abstract:BACKGROUND:Some years ago we established an N-ethyl-N-nitrosourea screen for modifiers of transgene variegation in the mouse and a preliminary description of the first six mutant lines, named MommeD1-D6, has been published. We have reported the underlying genes in three cases: MommeD1 is a mutation in SMC hinge domain ...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2008-9-12-r182

    authors: Ashe A,Morgan DK,Whitelaw NC,Bruxner TJ,Vickaryous NK,Cox LL,Butterfield NC,Wicking C,Blewitt ME,Wilkins SJ,Anderson GJ,Cox TC,Whitelaw E

    update_date:2008-01-01 00:00:00

  • Quantification of cell identity from single-cell gene expression profiles.

    abstract:The definition of cell identity is a central problem in biology. While single-cell RNA-seq provides a wealth of information regarding cell states, better methods are needed to map their identity, especially during developmental transitions. Here, we use repositories of cell type-specific transcriptomes to quantify ide...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-015-0580-x

    authors: Efroni I,Ip PL,Nawy T,Mello A,Birnbaum KD

    update_date:2015-01-22 00:00:00

  • Protein-protein interactions of the hyperthermophilic archaeon Pyrococcus horikoshii OT3.

    abstract:BACKGROUND:Although 2,061 proteins of Pyrococcus horikoshii OT3, a hyperthermophilic archaeon, have been predicted from the recently completed genome sequence, the majority of proteins show no similarity to those from other organisms and are thus hypothetical proteins of unknown function. Because most proteins operate ...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2005-6-12-r98

    authors: Usui K,Katayama S,Kanamori-Katayama M,Ogawa C,Kai C,Okada M,Kawai J,Arakawa T,Carninci P,Itoh M,Takio K,Miyano M,Kidoaki S,Matsuda T,Hayashizaki Y,Suzuki H

    update_date:2005-01-01 00:00:00

  • PureCLIP: capturing target-specific protein-RNA interaction footprints from single-nucleotide CLIP-seq data.

    abstract:The iCLIP and eCLIP techniques facilitate the detection of protein-RNA interaction sites at high resolution, based on diagnostic events at crosslink sites. However, previous methods do not explicitly model the specifics of iCLIP and eCLIP truncation patterns and possible biases. We developed PureCLIP ( https://github....

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-017-1364-2

    authors: Krakau S,Richard H,Marsico A

    update_date:2017-12-28 00:00:00

  • Hepatic steatosis risk is partly driven by increased de novo lipogenesis following carbohydrate consumption.

    abstract:BACKGROUND:Diet is a major contributor to metabolic disease risk, but there is controversy as to whether increased incidences of diseases such as non-alcoholic fatty liver disease arise from consumption of saturated fats or free sugars. Here, we investigate whether a sub-set of triacylglycerols (TAGs) were associated w...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-018-1439-8

    authors: Sanders FWB,Acharjee A,Walker C,Marney L,Roberts LD,Imamura F,Jenkins B,Case J,Ray S,Virtue S,Vidal-Puig A,Kuh D,Hardy R,Allison M,Forouhi N,Murray AJ,Wareham N,Vacca M,Koulman A,Griffin JL

    update_date:2018-06-20 00:00:00

  • Genotyping structural variants in pangenome graphs using the vg toolkit.

    abstract:Structural variants (SVs) remain challenging to represent and study relative to point mutations despite their demonstrated importance. We show that variation graphs, as implemented in the vg toolkit, provide an effective means for leveraging SV catalogs for short-read SV genotyping experiments. We benchmark vg against...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-020-1941-7

    authors: Hickey G,Heller D,Monlong J,Sibbesen JA,Sirén J,Eizenga J,Dawson ET,Garrison E,Novak AM,Paten B

    update_date:2020-02-12 00:00:00

  • Genome-wide investigation of light and carbon signaling interactions in Arabidopsis.

    abstract:BACKGROUND:Light and carbon are two essential signals influencing plant growth and development. Little is known about how carbon and light signaling pathways intersect or influence one another to affect gene expression. RESULTS:Microarrays are used to investigate carbon and light signaling interactions at a genome-wid...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2004-5-2-r10

    authors: Thum KE,Shin MJ,Palenchar PM,Kouranov A,Coruzzi GM

    update_date:2004-01-01 00:00:00