Benchmarking of computational error-correction methods for next-generation sequencing data.

Abstract:

BACKGROUND: Recent advancements in next-generation sequencing have rapidly improved our ability to study genomic material at an unprecedented scale. Despite substantial improvements in sequencing technologies, errors present in the data still risk confounding downstream analysis and limiting the applicability of sequencing technologies in clinical tools. Computational error correction promises to eliminate sequencing errors, but the relative accuracy of error correction algorithms remains unknown. RESULTS: In this paper, we evaluate the ability of error correction algorithms to fix errors across different types of datasets that contain various levels of heterogeneity. We highlight the advantages and limitations of computational error correction techniques across different domains of biology, including immunogenomics and virology. To demonstrate the efficacy of our technique, we apply the UMI-based high-fidelity sequencing protocol to eliminate sequencing errors from both simulated data and the raw reads. We then perform a realistic evaluation of error-correction methods. CONCLUSIONS: In terms of accuracy, we find that method performance varies substantially across different types of datasets with no single method performing best on all types of examined data. Finally, we also identify the techniques that offer a good balance between precision and sensitivity.

journal_name

Genome Biol

journal_title

Genome biology

authors

Mitchell K,Brito JJ,Mandric I,Wu Q,Knyazev S,Chang S,Martin LS,Karlsberg A,Gerasimov E,Littman R,Hill BL,Wu NC,Yang HT,Hsieh K,Chen L,Littman E,Shabani T,Enik G,Yao D,Sun R,Schroeder J,Eskin E,Zelikovsky A,S

doi

10.1186/s13059-020-01988-3

subject

Has Abstract

pub_date

2020-03-17 00:00:00

pages

71

issue

1

eissn

1474-760X

issn

1474-7596

pii

10.1186/s13059-020-01988-3

journal_volume

21

pub_type

Journal Article
  • Genome-wide mapping of FOXM1 binding reveals co-binding with estrogen receptor alpha in breast cancer cells.

    abstract:BACKGROUND:The forkhead transcription factor FOXM1 is a key regulator of the cell cycle. It is frequently over-expressed in cancer and is emerging as an important therapeutic target. In breast cancer FOXM1 expression is linked with estrogen receptor (ERα) activity and resistance to endocrine therapies, with high levels...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2013-14-1-r6

    authors: Sanders DA,Ross-Innes CS,Beraldi D,Carroll JS,Balasubramanian S

    update_date:2013-01-24 00:00:00

  • An evaluation of methods correcting for cell-type heterogeneity in DNA methylation studies.

    abstract:BACKGROUND:Many different methods exist to adjust for variability in cell-type mixture proportions when analyzing DNA methylation studies. Here we present the result of an extensive simulation study, built on cell-separated DNA methylation profiles from Illumina Infinium 450K methylation data, to compare the performanc...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-016-0935-y

    authors: McGregor K,Bernatsky S,Colmegna I,Hudson M,Pastinen T,Labbe A,Greenwood CM

    update_date:2016-05-03 00:00:00

  • Rapid gene isolation in barley and wheat by mutant chromosome sequencing.

    abstract:Identification of causal mutations in barley and wheat is hampered by their large genomes and suppressed recombination. To overcome these obstacles, we have developed MutChromSeq, a complexity reduction approach based on flow sorting and sequencing of mutant chromosomes, to identify induced mutations by comparison to ...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-016-1082-1

    authors: Sánchez-Martín J,Steuernagel B,Ghosh S,Herren G,Hurni S,Adamski N,Vrána J,Kubaláková M,Krattinger SG,Wicker T,Doležel J,Keller B,Wulff BB

    update_date:2016-10-31 00:00:00

  • Topoisomerase II beta interacts with cohesin and CTCF at topological domain borders.

    abstract:BACKGROUND:Type II DNA topoisomerases (TOP2) regulate DNA topology by generating transient double stranded breaks during replication and transcription. Topoisomerase II beta (TOP2B) facilitates rapid gene expression and functions at the later stages of development and differentiation. To gain new insight into the genom...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-016-1043-8

    authors: Uusküla-Reimand L,Hou H,Samavarchi-Tehrani P,Rudan MV,Liang M,Medina-Rivera A,Mohammed H,Schmidt D,Schwalie P,Young EJ,Reimand J,Hadjur S,Gingras AC,Wilson MD

    update_date:2016-08-31 00:00:00

  • Systematic identification of genetic influences on methylation across the human life course.

    abstract:BACKGROUND:The influence of genetic variation on complex diseases is potentially mediated through a range of highly dynamic epigenetic processes exhibiting temporal variation during development and later life. Here we present a catalogue of the genetic influences on DNA methylation (methylation quantitative trait loci ...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-016-0926-z

    authors: Gaunt TR,Shihab HA,Hemani G,Min JL,Woodward G,Lyttleton O,Zheng J,Duggirala A,McArdle WL,Ho K,Ring SM,Evans DM,Davey Smith G,Relton CL

    update_date:2016-03-31 00:00:00

  • Transcriptional profiling of long non-coding RNAs and novel transcribed regions across a diverse panel of archived human cancers.

    abstract:BACKGROUND:Molecular characterization of tumors has been critical for identifying important genes in cancer biology and for improving tumor classification and diagnosis. Long non-coding RNAs, as a new, relatively unstudied class of transcripts, provide a rich opportunity to identify both functional drivers and cancer-t...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2012-13-8-r75

    authors: Brunner AL,Beck AH,Edris B,Sweeney RT,Zhu SX,Li R,Montgomery K,Varma S,Gilks T,Guo X,Foley JW,Witten DM,Giacomini CP,Flynn RA,Pollack JR,Tibshirani R,Chang HY,van de Rijn M,West RB

    update_date:2012-08-28 00:00:00

  • Genome-wide analyses of Shavenbaby target genes reveals distinct features of enhancer organization.

    abstract:BACKGROUND:Developmental programs are implemented by regulatory interactions between Transcription Factors (TFs) and their target genes, which remain poorly understood. While recent studies have focused on regulatory cascades of TFs that govern early development, little is known about how the ultimate effectors of cell...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2013-14-8-r86

    authors: Menoret D,Santolini M,Fernandes I,Spokony R,Zanet J,Gonzalez I,Latapie Y,Ferrer P,Rouault H,White KP,Besse P,Hakim V,Aerts S,Payre F,Plaza S

    update_date:2013-08-23 00:00:00

  • Inhibition of RNA polymerase II allows controlled mobilisation of retrotransposons for plant breeding.

    abstract:BACKGROUND:Retrotransposons play a central role in plant evolution and could be a powerful endogenous source of genetic and epigenetic variability for crop breeding. To ensure genome integrity several silencing mechanisms have evolved to repress retrotransposon mobility. Even though retrotransposons fully depend on tra...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-017-1265-4

    authors: Thieme M,Lanciano S,Balzergue S,Daccord N,Mirouze M,Bucher E

    update_date:2017-07-07 00:00:00

  • Surfing waves of data in San Diego: sophisticated analyses provide a broad view of human genetic diversity.

    abstract:A report on the 64th annual American Society of Human Genetics meeting held in San Diego, USA, 18-22 October, 2014. ...

    journal_title:Genome biology

    pub_type:

    doi:10.1186/s13059-014-0562-4

    authors: Reppell M,Koch E,Peter BM,Novembre J

    update_date:2014-12-17 00:00:00

  • Large scale genomic reorganization of topological domains at the HoxD locus.

    abstract:BACKGROUND:The transcriptional activation of HoxD genes during mammalian limb development involves dynamic interactions with two topologically associating domains (TADs) flanking the HoxD cluster. In particular, the activation of the most posterior HoxD genes in developing digits is controlled by regulatory elements lo...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-017-1278-z

    authors: Fabre PJ,Leleu M,Mormann BH,Lopez-Delisle L,Noordermeer D,Beccari L,Duboule D

    update_date:2017-08-07 00:00:00

  • Signal sequence analysis of expressed sequence tags from the nematode Nippostrongylus brasiliensis and the evolution of secreted proteins in parasites.

    abstract:BACKGROUND:Parasitism is a highly successful mode of life and one that requires suites of gene adaptations to permit survival within a potentially hostile host. Among such adaptations is the secretion of proteins capable of modifying or manipulating the host environment. Nippostrongylus brasiliensis is a well-studied m...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2004-5-6-r39

    authors: Harcus YM,Parkinson J,Fernández C,Daub J,Selkirk ME,Blaxter ML,Maizels RM

    update_date:2004-01-01 00:00:00

  • Altered retinal microRNA expression profile in a mouse model of retinitis pigmentosa.

    abstract:BACKGROUND:The role played by microRNAs (miRs) as common regulators in physiologic processes such as development and various disease states was recently highlighted. Retinitis pigmentosa (RP) linked to RHO (which encodes rhodopsin) is the most frequent form of inherited retinal degeneration that leads to blindness, for...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2007-8-11-r248

    authors: Loscher CJ,Hokamp K,Kenna PF,Ivens AC,Humphries P,Palfi A,Farrar GJ

    update_date:2007-01-01 00:00:00

  • Characterization and comparative profiling of the small RNA transcriptomes in two phases of locust.

    abstract:BACKGROUND:All the reports on insect small RNAs come from holometabolous insects whose genome sequence data are available. Therefore, study of hemimetabolous insect small RNAs could provide more insights into evolution and function of small RNAs in insects. The locust is an important, economically harmful hemimetabolou...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2009-10-1-r6

    authors: Wei Y,Chen S,Yang P,Ma Z,Kang L

    update_date:2009-01-01 00:00:00

  • ReMixT: clone-specific genomic structure estimation in cancer.

    abstract:Somatic evolution of malignant cells produces tumors composed of multiple clonal populations, distinguished in part by rearrangements and copy number changes affecting chromosomal segments. Whole genome sequencing mixes the signals of sampled populations, diluting the signals of clone-specific aberrations, and complic...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-017-1267-2

    authors: McPherson AW,Roth A,Ha G,Chauve C,Steif A,de Souza CPE,Eirew P,Bouchard-Côté A,Aparicio S,Sahinalp SC,Shah SP

    update_date:2017-07-27 00:00:00

  • An international effort towards developing standards for best practices in analysis, interpretation and reporting of clinical genome sequencing results in the CLARITY Challenge.

    abstract:BACKGROUND:There is tremendous potential for genome sequencing to improve clinical diagnosis and care once it becomes routinely accessible, but this will require formalizing research methods into clinical best practices in the areas of sequence data generation, analysis, interpretation and reporting. The CLARITY Challe...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2014-15-3-r53

    authors: Brownstein CA,Beggs AH,Homer N,Merriman B,Yu TW,Flannery KC,DeChene ET,Towne MC,Savage SK,Price EN,Holm IA,Luquette LJ,Lyon E,Majzoub J,Neupert P,McCallie D Jr,Szolovits P,Willard HF,Mendelsohn NJ,Temme R,Finkel R

    update_date:2014-03-25 00:00:00

  • The relationship between proteome size, structural disorder and organism complexity.

    abstract:BACKGROUND:Sequencing the genomes of the first few eukaryotes created the impression that gene number shows no correlation with organism complexity, often referred to as the G-value paradox. Several attempts have previously been made to resolve this paradox, citing multifunctionality of proteins, alternative splicing, ...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2011-12-12-r120

    authors: Schad E,Tompa P,Hegyi H

    update_date:2011-12-19 00:00:00

  • The small RNA diversity from Medicago truncatula roots under biotic interactions evidences the environmental plasticity of the miRNAome.

    abstract:BACKGROUND:Legume roots show a remarkable plasticity to adapt their architecture to biotic and abiotic constraints, including symbiotic interactions. However, global analysis of miRNA regulation in roots is limited, and a global view of the evolution of miRNA-mediated diversification in different ecotypes is lacking. ...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-014-0457-4

    authors: Formey D,Sallet E,Lelandais-Brière C,Ben C,Bustos-Sanmamed P,Niebel A,Frugier F,Combier JP,Debellé F,Hartmann C,Poulain J,Gavory F,Wincker P,Roux C,Gentzbittel L,Gouzy J,Crespi M

    update_date:2014-09-24 00:00:00

  • Systematic analysis of dark and camouflaged genes reveals disease-relevant genes hiding in plain sight.

    abstract:BACKGROUND:The human genome contains "dark" gene regions that cannot be adequately assembled or aligned using standard short-read sequencing technologies, preventing researchers from identifying mutations within these gene regions that may be relevant to human disease. Here, we identify regions with few mappable reads ...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-019-1707-2

    authors: Ebbert MTW,Jensen TD,Jansen-West K,Sens JP,Reddy JS,Ridge PG,Kauwe JSK,Belzil V,Pregent L,Carrasquillo MM,Keene D,Larson E,Crane P,Asmann YW,Ertekin-Taner N,Younkin SG,Ross OA,Rademakers R,Petrucelli L,Fryer JD

    update_date:2019-05-20 00:00:00

  • Membrane transporters and protein traffic networks differentially affecting metal tolerance: a genomic phenotyping study in yeast.

    abstract:BACKGROUND:The cellular mechanisms that underlie metal toxicity and detoxification are rather variegated and incompletely understood. Genomic phenotyping was used to assess the roles played by all nonessential Saccharomyces cerevisiae proteins in modulating cell viability after exposure to cadmium, nickel, and other me...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2008-9-4-r67

    authors: Ruotolo R,Marchini G,Ottonello S

    update_date:2008-04-07 00:00:00

  • Mathematical models in mammalian cell biology.

    abstract:A report on the Conference on Systems Biology of Mammalian Cells, Dresden, Germany, 22-24 May 2008. ...

    journal_title:Genome biology

    pub_type:

    doi:10.1186/gb-2008-9-7-316

    authors: Herzel H,Blüthgen N

    update_date:2008-01-01 00:00:00

  • An integrated computational pipeline and database to support whole-genome sequence annotation.

    abstract:We describe here our experience in annotating the Drosophila melanogaster genome sequence, in the course of which we developed several new open-source software tools and a database schema to support large-scale genome annotation. We have developed these into an integrated and reusable software system for whole-genome ...

    journal_title:Genome biology

    pub_type: Journal Article, Review

    doi:10.1186/gb-2002-3-12-research0081

    authors: Mungall CJ,Misra S,Berman BP,Carlson J,Frise E,Harris N,Marshall B,Shu S,Kaminker JS,Prochnik SE,Smith CD,Smith E,Tupy JL,Wiel C,Rubin GM,Lewis SE

    update_date:2002-01-01 00:00:00

  • An improved method for detecting and delineating genomic regions with altered gene expression in cancer.

    abstract:Genomic regions with altered gene expression are a characteristic feature of cancer cells. We present a novel method for identifying such regions in gene expression maps. This method is based on total variation minimization, a classical signal restoration technique. In systematic evaluations, we show that our method c...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2008-9-1-r13

    authors: Nilsson B,Johansson M,Heyden A,Nelander S,Fioretos T

    update_date:2008-01-21 00:00:00

  • A Drosophila protein-interaction map centered on cell-cycle regulators.

    abstract:BACKGROUND:Maps depicting binary interactions between proteins can be powerful starting points for understanding biological systems. A proven technology for generating such maps is high-throughput yeast two-hybrid screening. In the most extensive screen to date, a Gal4-based two-hybrid system was used recently to detec...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2004-5-12-r96

    authors: Stanyon CA,Liu G,Mangiola BA,Patel N,Giot L,Kuang B,Zhang H,Zhong J,Finley RL Jr

    update_date:2004-01-01 00:00:00

  • Comparative biology and genomics join forces to decipher the diversity of life.

    abstract:A report on the Cold Spring Harbor Laboratory meeting on the Evolution of Developmental Diversity, Cold Spring Harbor, NY, USA, 17-21 April 2002. ...

    journal_title:Genome biology

    pub_type:

    doi:10.1186/gb-2002-3-8-reports4023

    authors: King N

    update_date:2002-07-15 00:00:00

  • A fuzzy gene expression-based computational approach improves breast cancer prognostication.

    abstract:Early gene expression studies classified breast tumors into at least three clinically relevant subtypes. Although most current gene signatures are prognostic for estrogen receptor (ER) positive/human epidermal growth factor receptor 2 (HER2) negative breast cancers, few are informative for ER negative/HER2 negative an...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2010-11-2-r18

    authors: Haibe-Kains B,Desmedt C,Rothé F,Piccart M,Sotiriou C,Bontempi G

    update_date:2010-01-01 00:00:00

  • Visualization of pseudogenes in intracellular bacteria reveals the different tracks to gene destruction.

    abstract:BACKGROUND:Pseudogenes reveal ancestral gene functions. Some obligate intracellular bacteria, such as Mycobacterium leprae and Rickettsia spp., carry substantial fractions of pseudogenes. Until recently, horizontal gene transfers were considered to be rare events in obligate host-associated bacteria. RESULTS:We presen...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2008-9-2-r42

    authors: Fuxelius HH,Darby AC,Cho NH,Andersson SG

    update_date:2008-01-01 00:00:00

  • Reduced intrinsic DNA curvature leads to increased mutation rate.

    abstract:BACKGROUND:Mutation rates vary across the genome. Many trans factors that influence mutation rates have been identified, as have specific sequence motifs at the 1-7-bp scale, but cis elements remain poorly characterized. The lack of understanding regarding why different sequences have different mutation rates hampers o...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-018-1525-y

    authors: Duan C,Huan Q,Chen X,Wu S,Carey LB,He X,Qian W

    update_date:2018-09-14 00:00:00

  • Asymmetric relationships between proteins shape genome evolution.

    abstract:BACKGROUND:The relationships between proteins are often asymmetric: one protein (A) depends for its function on another protein (B), but the second protein does not depend on the first. In metabolic networks there are multiple pathways that converge into one central pathway. The enzymes in the converging pathways depen...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2009-10-2-r19

    authors: Notebaart RA,Kensche PR,Huynen MA,Dutilh BE

    update_date:2009-02-12 00:00:00

  • Direct measurement of transcription rates reveals multiple mechanisms for configuration of the Arabidopsis ambient temperature response.

    abstract:BACKGROUND:Sensing and responding to ambient temperature is important for controlling growth and development of many organisms, in part by regulating mRNA levels. mRNA abundance can change with temperature, but it is unclear whether this results from changes in transcription or decay rates, and whether passive or activ...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2014-15-3-r45

    authors: Sidaway-Lee K,Costa MJ,Rand DA,Finkenstadt B,Penfield S

    update_date:2014-03-03 00:00:00

  • External signals shape the epigenome.

    abstract:A new study shows how a single cytokine, interleukin-4, regulates hematopoietic lineage choice by activating the JAK3-STAT6 pathway, which causes dendritic-cell-specific DNA demethylation. ...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-016-0884-5

    authors: Lennartsson A

    update_date:2016-02-01 00:00:00