Abstract:
BACKGROUND: Targeted deep sequencing is increasingly used to detect low-allelic fraction variants; it is therefore essential that errors that constitute baseline noise and impose a practical limit on detection are characterized. In the present study, we systematically evaluate the extent to which errors are incurred during specific steps of the capture-based targeted sequencing process.
RESULTS: We removed most sequencing artifacts by filtering out low-quality bases and then analyzed the remaining background noise. By recognizing that plasma DNA is naturally fragmented to a size comparable to that of mono-nucleosomal DNA, we were able to identify and characterize errors that are specifically associated with acoustic shearing. Two-thirds of C:G > A:T errors and one quarter of C:G > G:C errors were attributed to the oxidation of guanine during acoustic shearing, and this was further validated by comparative experiments conducted under different shearing conditions. The acoustic shearing step also causes A > G and A > T substitutions localized to the end bases of sheared DNA fragments, indicating a probable association of these errors with DNA breakage. Finally, the hybrid selection step contributes to one-third of the remaining C:G > A:T and one-fifth of the C > T errors.
CONCLUSIONS: The results of this study provide a comprehensive summary of various errors incurred during targeted deep sequencing, and their underlying causes. This information will be invaluable to drive technical improvements in this sequencing method, and may increase the future usage of targeted deep sequencing methods for low-allelic fraction variant detection.
journal_name: Genome Biol
journal_title: Genome biology
authors: Park G, Park JK, Shin SH, Jeon HJ, Kim NKD, Kim YJ, Shin HT, Lee E, Lee KH, Son DS, Park WY, Park D
doi: 10.1186/s13059-017-1275-2
subject: Has Abstract
pub_date: 2017-07-21 00:00:00
pages: 136
issue: 1
eissn: 1474-7596
issn: 1474-760X
pii: 10.1186/s13059-017-1275-2
journal_volume: 18
pub_type: Journal article

Related articles
Genome Biology article collection
abstract:BACKGROUND:We have constructed Bayesian prior-based, amino-acid sequence profiles for the complete yeast mitochondrial proteome and used them to develop methods for identifying and characterizing the context of protein mutations that give rise to human mitochondrial diseases. (Bayesian priors are conditional probabilit...
journal_title:Genome biology
pub_type: Journal article
doi:10.1186/gb-2001-2-6-research0021
updated: 2001-01-01 00:00:00
abstract:BACKGROUND:Rice blast disease is caused by the filamentous Ascomycetous fungus Magnaporthe oryzae and results in significant annual rice yield losses worldwide. Infection by this and many other fungal plant pathogens requires the development of a specialized infection cell called an appressorium. The molecular processe...
journal_title:Genome biology
pub_type: Journal article
doi:10.1186/gb-2008-9-5-r85
updated: 2008-01-01 00:00:00
abstract:BACKGROUND:Protein domains are protein regions that are shared among different proteins and are frequently functionally and structurally independent from the rest of the protein. Novel domain combinations have a major role in evolutionary innovation. However, the relative contributions of the different molecular mechan...
journal_title:Genome biology
pub_type: Journal article
doi:10.1186/gb-2010-11-7-r74
updated: 2010-01-01 00:00:00
abstract:BACKGROUND:Plant-microbe interactions feature complex signal interplay between pathogens and their hosts. Phytophthora species comprise a destructive group of fungus-like plant pathogens, collectively affecting a wide range of plants important to agriculture and natural ecosystems. Despite the availability of genome se...
journal_title:Genome biology
pub_type: Journal article
doi:10.1186/gb-2013-14-6-r63
updated: 2013-06-25 00:00:00
abstract:BACKGROUND:Mycoparasitism, a lifestyle where one fungus is parasitic on another fungus, has special relevance when the prey is a plant pathogen, providing a strategy for biological control of pests for plant protection. Probably, the most studied biocontrol agents are species of the genus Hypocrea/Trichoderma. RESULTS...
journal_title:Genome biology
pub_type: Journal article
doi:10.1186/gb-2011-12-4-r40
updated: 2011-01-01 00:00:00
abstract:Silent information regulator 2 (Sir2) proteins, or sirtuins, are protein deacetylases dependent on nicotinamide adenine dinucleotide (NAD) and are found in organisms ranging from bacteria to humans. In eukaryotes, sirtuins regulate transcriptional repression, recombination, the cell-division cycle, microtubule organizatio...
journal_title:Genome biology
pub_type: Journal article, Review
doi:10.1186/gb-2004-5-5-224
updated: 2004-01-01 00:00:00
abstract:Extracting biological information from microarray data requires appropriate statistical methods. The simplest statistical method for detecting differential expression is the t test, which can be used to compare two conditions when there is replication of samples. With more than two conditions, analysis of variance (AN...
journal_title:Genome biology
pub_type: Journal article, Review
doi:10.1186/gb-2003-4-4-210
updated: 2003-01-01 00:00:00
abstract:The discovery of several new structured non-coding RNAs in bacterial and archaeal genomes and metagenomes raises burning questions about their biological and biochemical functions. ...
journal_title:Genome biology
pub_type: Journal article, Review
doi:10.1186/gb-2010-11-3-108
updated: 2010-01-01 00:00:00
abstract:BACKGROUND:DNase-seq and ATAC-seq are broadly used methods to assay open chromatin regions genome-wide. The single nucleotide resolution of DNase-seq has been further exploited to infer transcription factor binding sites (TFBSs) in regulatory regions through footprinting. Recent studies have demonstrated the sequence b...
journal_title:Genome biology
pub_type: Journal article
doi:10.1186/s13059-019-1654-y
updated: 2019-02-21 00:00:00
abstract:BACKGROUND:The initiation and subsequent evolution of cancer are largely driven by a relatively small number of somatic mutations with critical functional impacts, so-called driver mutations. Identifying driver mutations in a patient's tumor cells is a central task in the era of precision cancer medicine. Over the deca...
journal_title:Genome biology
pub_type: Journal article
doi:10.1186/s13059-020-01954-z
updated: 2020-02-20 00:00:00
abstract:The epigenetic modifications of histones are versatile marks that are intimately connected to development and disease pathogenesis including human cancers. In this review, we will discuss the many different types of histone modifications and the biological processes with which they are involved. Specifically, we revie...
journal_title:Genome biology
pub_type: Journal article, Review
doi:10.1186/s13059-019-1870-5
updated: 2019-11-20 00:00:00
abstract:The accuracy of base calls produced by Illumina sequencers is adversely affected by several processes, with laser cross-talk and cluster phasing being prominent. We introduce an explicit statistical model of the sequencing process that generalizes current models of phasing and cross-talk and forms the basis of a base ...
journal_title:Genome biology
pub_type: Journal article
doi:10.1186/gb-2012-13-2-r13
updated: 2012-02-29 00:00:00
abstract:We describe a bioinformatic tool, Tumor Aberration Prediction Suite (TAPS), for the identification of allele-specific copy numbers in tumor samples using data from Affymetrix SNP arrays. It includes detailed visualization of genomic segment characteristics and iterative pattern recognition for copy number identificati...
journal_title:Genome biology
pub_type: Journal article
doi:10.1186/gb-2011-12-10-r108
updated: 2011-10-24 00:00:00
abstract:The key molecules involved in regulating the assembly and function of the mitotic spindle are shared by evolutionarily divergent species. Studies in different model systems are leading to convergent conclusions about the central role of microtubule nucleation and dynamics and of kinesin-related motor proteins in spind...
journal_title:Genome biology
pub_type: Journal article, Review
doi:10.1186/gb-2000-1-1-reviews101
updated: 2000-01-01 00:00:00
abstract:DNA arrays are valuable tools in molecular biology laboratories. Their rapid acceptance was aided by the release of plans for a pin-spotting microarrayer by researchers at Stanford. Inkjet microarraying is a flexible, complementary technique that allows the synthesis of arrays of any oligonucleotide sequences de novo....
journal_title:Genome biology
pub_type: Journal article
doi:10.1186/gb-2004-5-8-r58
updated: 2004-01-01 00:00:00
abstract:BACKGROUND:Neuropsychiatric disorders are common forms of disability in humans. Despite recent progress in deciphering the genetics of these disorders, their phenotypic complexity continues to be a major challenge. Mendelian neuropsychiatric disorders are rare but their study has the potential to unravel novel mechanis...
journal_title:Genome biology
pub_type: Journal article
doi:10.1186/s13059-016-1061-6
updated: 2016-09-27 00:00:00
abstract:The protein folding game Foldit shows that games are an effective way to recruit, engage and organize ordinary citizens to help solve difficult scientific problems. ...
journal_title:Genome biology
pub_type: Journal article
doi:10.1186/gb-2011-12-12-135
updated: 2011-12-28 00:00:00
abstract:Tissues are complex milieus consisting of numerous cell types. Several recent methods have attempted to enumerate cell subsets from transcriptomes. However, the available methods have used limited sources for training and give only a partial portrayal of the full cellular landscape. Here we present xCell, a novel gene...
journal_title:Genome biology
pub_type: Journal article
doi:10.1186/s13059-017-1349-1
updated: 2017-11-15 00:00:00
abstract:BACKGROUND:Differentiation of metazoan cells requires execution of different gene expression programs but recent single-cell transcriptome profiling has revealed considerable variation within cells of seeming identical phenotype. This brings into question the relationship between transcriptome states and cell phenotype...
journal_title:Genome biology
pub_type: Journal article
doi:10.1186/s13059-015-0683-4
updated: 2015-06-09 00:00:00
abstract:Each human has approximately 50 to 280 frameshifting indels, yet their implications are unknown. We created SIFT Indel, a prediction method for frameshifting indels that has 84% accuracy. The percentage of human frameshifting indels predicted to be gene-damaging is negatively correlated with allele frequency. We also ...
journal_title:Genome biology
pub_type: Journal article
doi:10.1186/gb-2012-13-2-r9
updated: 2012-02-09 00:00:00
abstract:The genome-wide investigation of DNA methylation levels has been limited to reference transposable element positions. The methylation analysis of non-reference and mobile transposable elements has only recently been performed, but required both genome resequencing and MethylC-seq datasets. We have created epiTEome, a ...
journal_title:Genome biology
pub_type: Journal article
doi:10.1186/s13059-017-1232-0
updated: 2017-05-12 00:00:00
abstract:BACKGROUND:Malaria is one of the most important infectious diseases and is caused by parasitic protozoa of the genus Plasmodium. Previously, quantitative characterization of the P. falciparum transcriptome demonstrated that the strictly controlled progression of these parasites through their intra-erythrocytic develo...
journal_title:Genome biology
pub_type: Journal article
doi:10.1186/gb-2008-9-12-r177
updated: 2008-01-01 00:00:00
abstract:BACKGROUND:The genetic provenance of domesticated plants and the routes along which they were disseminated in prehistory have been a long-standing source of debate. Much of this debate has focused on identifying centers of origins for individual crops. However, many important crops show clear genetic signatures of mult...
journal_title:Genome biology
pub_type: Journal article
doi:10.1186/s13059-015-0712-3
updated: 2015-08-21 00:00:00
abstract:BACKGROUND:Some years ago we established an N-ethyl-N-nitrosourea screen for modifiers of transgene variegation in the mouse and a preliminary description of the first six mutant lines, named MommeD1-D6, has been published. We have reported the underlying genes in three cases: MommeD1 is a mutation in SMC hinge domain ...
journal_title:Genome biology
pub_type: Journal article
doi:10.1186/gb-2008-9-12-r182
updated: 2008-01-01 00:00:00
abstract:The definition of cell identity is a central problem in biology. While single-cell RNA-seq provides a wealth of information regarding cell states, better methods are needed to map their identity, especially during developmental transitions. Here, we use repositories of cell type-specific transcriptomes to quantify ide...
journal_title:Genome biology
pub_type: Journal article
doi:10.1186/s13059-015-0580-x
updated: 2015-01-22 00:00:00
abstract:BACKGROUND:Although 2,061 proteins of Pyrococcus horikoshii OT3, a hyperthermophilic archaeon, have been predicted from the recently completed genome sequence, the majority of proteins show no similarity to those from other organisms and are thus hypothetical proteins of unknown function. Because most proteins operate ...
journal_title:Genome biology
pub_type: Journal article
doi:10.1186/gb-2005-6-12-r98
updated: 2005-01-01 00:00:00
abstract:The iCLIP and eCLIP techniques facilitate the detection of protein-RNA interaction sites at high resolution, based on diagnostic events at crosslink sites. However, previous methods do not explicitly model the specifics of iCLIP and eCLIP truncation patterns and possible biases. We developed PureCLIP ( https://github....
journal_title:Genome biology
pub_type: Journal article
doi:10.1186/s13059-017-1364-2
updated: 2017-12-28 00:00:00
abstract:BACKGROUND:Diet is a major contributor to metabolic disease risk, but there is controversy as to whether increased incidences of diseases such as non-alcoholic fatty liver disease arise from consumption of saturated fats or free sugars. Here, we investigate whether a sub-set of triacylglycerols (TAGs) were associated w...
journal_title:Genome biology
pub_type: Journal article
doi:10.1186/s13059-018-1439-8
updated: 2018-06-20 00:00:00
abstract:Structural variants (SVs) remain challenging to represent and study relative to point mutations despite their demonstrated importance. We show that variation graphs, as implemented in the vg toolkit, provide an effective means for leveraging SV catalogs for short-read SV genotyping experiments. We benchmark vg against...
journal_title:Genome biology
pub_type: Journal article
doi:10.1186/s13059-020-1941-7
updated: 2020-02-12 00:00:00
abstract:BACKGROUND:Light and carbon are two essential signals influencing plant growth and development. Little is known about how carbon and light signaling pathways intersect or influence one another to affect gene expression. RESULTS:Microarrays are used to investigate carbon and light signaling interactions at a genome-wid...
journal_title:Genome biology
pub_type: Journal article
doi:10.1186/gb-2004-5-2-r10
updated: 2004-01-01 00:00:00