Comprehensive evaluation of structural variation detection algorithms for whole genome sequencing.

Abstract:

BACKGROUND: Structural variations (SVs) or copy number variations (CNVs) greatly impact the functions of the genes encoded in the genome and are responsible for diverse human diseases. Although a number of existing SV detection algorithms can detect many types of SVs using whole genome sequencing (WGS) data, no single algorithm can call every type of SV with high precision and high recall.

RESULTS: We comprehensively evaluate the performance of 69 existing SV detection algorithms using multiple simulated and real WGS datasets. The results highlight a subset of algorithms that accurately call SVs depending on specific types and size ranges of the SVs and that accurately determine breakpoints, sizes, and genotypes of the SVs. We enumerate potentially good algorithms for each SV category, among which GRIDSS, Lumpy, SVseq2, SoftSV, Manta, and Wham perform better in the deletion or duplication categories. To improve the accuracy of SV calling, we systematically evaluate the accuracy of overlapping calls between possible combinations of algorithms for every type and size range of SVs. The results demonstrate that both the precision and recall for overlapping calls vary depending on the combinations of specific algorithms rather than the combinations of methods used in the algorithms.

CONCLUSION: These results suggest that careful selection of the algorithms for each type and size range of SVs is required for accurate calling of SVs. The selection of specific pairs of algorithms for overlapping calls promises to effectively improve the SV detection accuracy.
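The idea of scoring overlapping calls from a pair of algorithms can be illustrated with a minimal sketch. It is not the authors' evaluation pipeline: the `SV` record, the function names, the 50% reciprocal-overlap matching rule, and the toy coordinates are all assumptions chosen for illustration, and the abstract does not state which matching criterion the study used.

```python
from typing import List, NamedTuple, Tuple


class SV(NamedTuple):
    """Hypothetical minimal SV record (not a format defined by the paper)."""
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # exclusive
    svtype: str  # e.g. "DEL" or "DUP"


def reciprocal_overlap(a: SV, b: SV) -> float:
    """Return the smaller of the two overlap fractions, or 0.0 if the calls do not overlap."""
    if a.chrom != b.chrom or a.svtype != b.svtype:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return min(shared / (a.end - a.start), shared / (b.end - b.start))


def matches(a: SV, b: SV, min_ro: float = 0.5) -> bool:
    """Assumed matching rule: same type and at least min_ro reciprocal overlap."""
    return reciprocal_overlap(a, b) >= min_ro


def overlapping_calls(calls_a: List[SV], calls_b: List[SV], min_ro: float = 0.5) -> List[SV]:
    """Keep the calls of algorithm A that are supported by at least one call of algorithm B."""
    return [a for a in calls_a if any(matches(a, b, min_ro) for b in calls_b)]


def precision_recall(calls: List[SV], truth: List[SV], min_ro: float = 0.5) -> Tuple[float, float]:
    """Precision = matched calls / all calls; recall = recovered truth SVs / all truth SVs."""
    matched_calls = sum(any(matches(c, t, min_ro) for t in truth) for c in calls)
    recovered = sum(any(matches(t, c, min_ro) for c in calls) for t in truth)
    precision = matched_calls / len(calls) if calls else 0.0
    recall = recovered / len(truth) if truth else 0.0
    return precision, recall


# Toy data (invented for illustration only).
truth = [SV("chr1", 1000, 2000, "DEL"), SV("chr1", 5000, 6000, "DEL"), SV("chr2", 100, 900, "DUP")]
caller_a = [SV("chr1", 1010, 1990, "DEL"), SV("chr1", 5100, 6100, "DEL"), SV("chr1", 8000, 8500, "DEL")]
caller_b = [SV("chr1", 990, 2010, "DEL"), SV("chr1", 5050, 6050, "DEL"), SV("chr3", 10, 500, "DEL")]

single = precision_recall(caller_a, truth)
consensus = precision_recall(overlapping_calls(caller_a, caller_b), truth)
print("caller A alone:", single)
print("A overlapping B:", consensus)
```

In this toy case the false-positive call made only by caller A is dropped from the overlap set, so precision rises while recall is unchanged. With other pairs the overlap set can also lose true calls, which is consistent with the abstract's point that the outcome depends on the specific pair of algorithms.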

journal_name

Genome Biol

journal_title

Genome biology

authors

Kosugi S,Momozawa Y,Liu X,Terao C,Kubo M,Kamatani Y

doi

10.1186/s13059-019-1720-5

pub_date

2019-06-03 00:00:00

pages

117

issue

1

eissn

1474-760X

issn

1474-7596

pii

10.1186/s13059-019-1720-5

journal_volume

20

pub_type

Journal Article
  • Avianbase: a community resource for bird genomics.

    abstract: Giving access to sequence and annotation data for genome assemblies is important because, while facilitating research, it places both assembly and annotation quality under scrutiny, resulting in improvements to both. Therefore we announce Avianbase, a resource for bird genomics, which provides access to data released ...

    journal_title: Genome biology

    pub_type: Letter

    doi: 10.1186/s13059-015-0588-2

    authors: Eöry L,Gilbert MT,Li C,Li B,Archibald A,Aken BL,Zhang G,Jarvis E,Flicek P,Burt DW

    update_date: 2015-01-29 00:00:00

  • Recurrent insertion and duplication generate networks of transposable element sequences in the Drosophila melanogaster genome.

    abstract: BACKGROUND: The recent availability of genome sequences has provided unparalleled insights into the broad-scale patterns of transposable element (TE) sequences in eukaryotic genomes. Nevertheless, the difficulties that TEs pose for genome assembly and annotation have prevented detailed, quantitative inferences about the...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/gb-2006-7-11-r112

    authors: Bergman CM,Quesneville H,Anxolabéhère D,Ashburner M

    update_date: 2006-01-01 00:00:00

  • To be or not to be a piRNA: genomic origin and processing of piRNAs.

    abstract: Piwi-interacting RNAs (piRNAs) originate from genomic regions dubbed piRNA clusters. How cluster transcripts are selected for processing into piRNAs is not understood. We discuss evidence for the involvement of chromatin structure and maternally inherited piRNAs in determining their fate. ...

    journal_title: Genome biology

    pub_type: Journal Article, Review

    doi: 10.1186/gb4154

    authors: Le Thomas A,Tóth KF,Aravin AA

    update_date: 2014-01-27 00:00:00

  • Toxicity in mice expressing short hairpin RNAs gives new insight into RNAi.

    abstract: Short hairpin RNAs can provide stable gene silencing via RNA interference. Recent studies have shown toxicity in vivo that appears to be related to saturation of the endogenous microRNA pathway. Will these findings limit the therapeutic use of such hairpins? ...

    journal_title: Genome biology

    pub_type: Journal Article, Review

    doi: 10.1186/gb-2006-7-8-231

    authors: Snøve O Jr,Rossi JJ

    update_date: 2006-01-01 00:00:00

  • The uses of genome-wide yeast mutant collections.

    abstract: We assess five years of usage of the major genome-wide collections of mutants from Saccharomyces cerevisiae: single deletion mutants, double mutants conferring 'synthetic' lethality and the 'TRIPLES' collection of mutants obtained by random transposon insertion. Over 100 experimental conditions have been tested and mo...

    journal_title: Genome biology

    pub_type: Journal Article, Review

    doi: 10.1186/gb-2004-5-7-229

    authors: Scherens B,Goffeau A

    update_date: 2004-01-01 00:00:00

  • Species-wide distribution of highly polymorphic minisatellite markers suggests past and present genetic exchanges among house mouse subspecies.

    abstract: BACKGROUND: Four hypervariable minisatellite loci were scored on a panel of 116 individuals of various geographical origins representing a large part of the diversity present in house mouse subspecies. Internal structures of alleles were determined by minisatellite variant repeat mapping PCR to produce maps of interming...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/gb-2007-8-5-r80

    authors: Bonhomme F,Rivals E,Orth A,Grant GR,Jeffreys AJ,Bois PR

    update_date: 2007-01-01 00:00:00

  • Anticipating the 1,000 dollar genome.

    abstract: A new generation of DNA-sequencing platforms will become commercially available over the next few years. These instruments will enable re-sequencing of human genomes at a previously unimagined throughput and low cost. Here, I examine why the 1,000 dollar human genome is an important goal for research and clinical diag...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/gb-2006-7-7-112

    authors: Mardis ER

    update_date: 2006-01-01 00:00:00

  • The diversity of endothelial cells: a challenge for therapeutic angiogenesis.

    abstract: Vascular endothelia comprise a diverse population of cells that specialize in response to genetic programs and environmental cues to take on distinct roles in different vessels, tissues, and organs, and in response to pathophysiological stresses. Characterization of endothelial-cell diversity will facilitate the devel...

    journal_title: Genome biology

    pub_type: Journal Article, Review

    doi: 10.1186/gb-2004-5-2-207

    authors: Conway EM,Carmeliet P

    update_date: 2004-01-01 00:00:00

  • RNA methylomes reveal the m6A-mediated regulation of DNA demethylase gene SlDML2 in tomato fruit ripening.

    abstract: BACKGROUND: Methylation of nucleotides, notably in the forms of 5-methylcytosine (5mC) in DNA and N6-methyladenosine (m6A) in mRNA, carries important information for gene regulation. 5mC has been elucidated to participate in the regulation of fruit ripening, whereas the function of m6A in this process and the interplay ...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/s13059-019-1771-7

    authors: Zhou L,Tian S,Qin G

    update_date: 2019-08-06 00:00:00

  • Redistribution of H3K27me3 upon DNA hypomethylation results in de-repression of Polycomb target genes.

    abstract: BACKGROUND: DNA methylation and the Polycomb repression system are epigenetic mechanisms that play important roles in maintaining transcriptional repression. Recent evidence suggests that DNA methylation can attenuate the binding of Polycomb protein components to chromatin and thus plays a role in determining their geno...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/gb-2013-14-3-r25

    authors: Reddington JP,Perricone SM,Nestor CE,Reichmann J,Youngson NA,Suzuki M,Reinhardt D,Dunican DS,Prendergast JG,Mjoseng H,Ramsahoye BH,Whitelaw E,Greally JM,Adams IR,Bickmore WA,Meehan RR

    update_date: 2013-03-25 00:00:00

  • Mechanisms of aging in senescence-accelerated mice.

    abstract: BACKGROUND: Progressive neurological dysfunction is a key aspect of human aging. Because of underlying differences in the aging of mice and humans, useful mouse models have been difficult to obtain and study. We have used gene-expression analysis and polymorphism screening to study molecular senescence of the retina and...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/gb-2005-6-6-r48

    authors: Carter TA,Greenhall JA,Yoshida S,Fuchs S,Helton R,Swaroop A,Lockhart DJ,Barlow C

    update_date: 2005-01-01 00:00:00

  • FAN-C: a feature-rich framework for the analysis and visualisation of chromosome conformation capture data.

    abstract: Chromosome conformation capture data, particularly from high-throughput approaches such as Hi-C, are typically very complex to analyse. Existing analysis tools are often single-purpose, or limited in compatibility to a small number of data formats, frequently making Hi-C analyses tedious and time-consuming. Here, we p...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/s13059-020-02215-9

    authors: Kruse K,Hug CB,Vaquerizas JM

    update_date: 2020-12-17 00:00:00

  • Preferential binding of HIF-1 to transcriptionally active loci determines cell-type specific response to hypoxia.

    abstract: BACKGROUND: Hypoxia-inducible factor 1 (HIF-1) plays a key role in cellular adaptation to hypoxia. To better understand the determinants of HIF-1 binding and transactivation, we used ChIP-chip and gene expression profiling to define the relationship between the epigenetic landscape, sites of HIF-1 binding, and genes tra...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/gb-2009-10-10-r113

    authors: Xia X,Kung AL

    update_date: 2009-01-01 00:00:00

  • Games with a scientific purpose.

    abstract: The protein folding game Foldit shows that games are an effective way to recruit, engage and organize ordinary citizens to help solve difficult scientific problems. ...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/gb-2011-12-12-135

    authors: Good BM,Su AI

    update_date: 2011-12-28 00:00:00

  • Supervised harvesting of expression trees.

    abstract: BACKGROUND: We propose a new method for supervised learning from gene expression data. We call it 'tree harvesting'. This technique starts with a hierarchical clustering of genes, then models the outcome variable as a sum of the average expression profiles of chosen clusters and their products. It can be applied to many...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/gb-2001-2-1-research0003

    authors: Hastie T,Tibshirani R,Botstein D,Brown P

    update_date: 2001-01-01 00:00:00

  • New tricks for old NODs.

    abstract: Recent work has identified the human NOD-like receptor NLRX1 as a negative regulator of intracellular signaling leading to type I interferon production. Here we discuss these findings and the questions and implications they raise regarding the function of NOD-like receptors in the antiviral response. ...

    journal_title: Genome biology

    pub_type: Journal Article, Review

    doi: 10.1186/gb-2008-9-4-217

    authors: Pietras EM,Cheng G

    update_date: 2008-04-25 00:00:00

  • MicroPro: using metagenomic unmapped reads to provide insights into human microbiota and disease associations.

    abstract: We develop a metagenomic data analysis pipeline, MicroPro, that takes into account all reads from known and unknown microbial organisms and associates viruses with complex diseases. We utilize MicroPro to analyze four metagenomic datasets relating to colorectal cancer, type 2 diabetes, and liver cirrhosis and show tha...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/s13059-019-1773-5

    authors: Zhu Z,Ren J,Michail S,Sun F

    update_date: 2019-08-06 00:00:00

  • Probing the yeast proteome for RNA-processing factors.

    abstract: A method has been developed to identify proteins required for the biogenesis of non-coding RNA in yeast, using a microarray to screen for aberrant patterns of RNA processing in mutant strains, and new proteins involved in the processing of ribosomal and non-coding RNAs have been found. ...

    journal_title: Genome biology

    pub_type: Journal Article, Review

    doi: 10.1186/gb-2003-4-10-229

    authors: Granneman S,Baserga SJ

    update_date: 2003-01-01 00:00:00

  • Modeling double strand break susceptibility to interrogate structural variation in cancer.

    abstract: BACKGROUND: Structural variants (SVs) are known to play important roles in a variety of cancers, but their origins and functional consequences are still poorly understood. Many SVs are thought to emerge from errors in the repair processes following DNA double strand breaks (DSBs). RESULTS: We used experimentally quantif...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/s13059-019-1635-1

    authors: Ballinger TJ,Bouwman BAM,Mirzazadeh R,Garnerone S,Crosetto N,Semple CA

    update_date: 2019-02-08 00:00:00

  • Systematic evaluation of CRISPR-Cas systems reveals design principles for genome editing in human cells.

    abstract: BACKGROUND: While CRISPR-Cas systems hold tremendous potential for engineering the human genome, it is unclear how well each system performs against one another in both non-homologous end joining (NHEJ)-mediated and homology-directed repair (HDR)-mediated genome editing. RESULTS: We systematically compare five different...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/s13059-018-1445-x

    authors: Wang Y,Liu KI,Sutrisnoh NB,Srinivasan H,Zhang J,Li J,Zhang F,Lalith CRJ,Xing H,Shanmugam R,Foo JN,Yeo HT,Ooi KH,Bleckwehl T,Par YYR,Lee SM,Ismail NNB,Sanwari NAB,Lee STV,Lew J,Tan MH

    update_date: 2018-05-29 00:00:00

  • A new recruit for the army of the men of death.

    abstract: The army of the men of death, in John Bunyan's memorable phrase, has a new recruit, and fear has a new face: a face wearing a surgical mask. ...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/gb-2003-4-7-113

    authors: Petsko GA

    update_date: 2003-01-01 00:00:00

  • Epigenetic engineering and the art of epigenetic manipulation.

    abstract: A report on the Epigenetic Engineering Meeting hosted by the Barts Institute of Cancer, held in London, UK, May 7, 2014. ...

    journal_title: Genome biology

    pub_type:

    doi: 10.1186/gb4179

    authors: Magnani L

    update_date: 2014-01-01 00:00:00

  • Multiclass classification of microarray data with repeated measurements: application to cancer.

    abstract: Prediction of the diagnostic category of a tissue sample from its gene-expression profile and selection of relevant genes for class prediction have important applications in cancer research. We have developed the uncorrelated shrunken centroid (USC) and error-weighted, uncorrelated shrunken centroid (EWUSC) algorithms...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/gb-2003-4-12-r83

    authors: Yeung KY,Bumgarner RE

    update_date: 2003-01-01 00:00:00

  • Extensive localization of long noncoding RNAs to the cytosol and mono- and polyribosomal complexes.

    abstract: BACKGROUND: Long noncoding RNAs (lncRNAs) form an abundant class of transcripts, but the function of the majority of them remains elusive. While it has been shown that some lncRNAs are bound by ribosomes, it has also been convincingly demonstrated that these transcripts do not code for proteins. To obtain a comprehensiv...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/gb-2014-15-1-r6

    authors: van Heesch S,van Iterson M,Jacobi J,Boymans S,Essers PB,de Bruijn E,Hao W,MacInnes AW,Cuppen E,Simonis M

    update_date: 2014-01-07 00:00:00

  • Individual mRNA expression profiles reveal the effects of specific microRNAs.

    abstract: BACKGROUND: MicroRNAs (miRNAs) are oligoribonucleotides with an important role in regulation of gene expression at the level of translation. Despite imperfect target complementarity, they can also significantly reduce mRNA levels. The validity of miRNA target gene predictions is difficult to assess at the protein level....

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/gb-2008-9-5-r82

    authors: Arora A,Simpson DA

    update_date: 2008-01-01 00:00:00

  • Minimal genome-wide human CRISPR-Cas9 library.

    abstract: CRISPR guide RNA libraries have been iteratively improved to provide increasingly efficient reagents, although their large size is a barrier for many applications. We design an optimised minimal genome-wide human CRISPR-Cas9 library (MinLibCas9) by mining existing large-scale gene loss-of-function datasets, resulting ...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/s13059-021-02268-4

    authors: Gonçalves E,Thomas M,Behan FM,Picco G,Pacini C,Allen F,Vinceti A,Sharma M,Jackson DA,Price S,Beaver CM,Dovey O,Parry-Smith D,Iorio F,Parts L,Yusa K,Garnett MJ

    update_date: 2021-01-21 00:00:00

  • The greatest catch: big game fishing for mRNA-bound proteins.

    abstract: Purification of proteins cross-linked to mRNAs has identified 800 mRNA-binding proteins and their characteristics. ...

    journal_title: Genome biology

    pub_type:

    doi: 10.1186/gb4030

    authors: Sibley CR,Attig J,Ule J

    update_date: 2012-07-17 00:00:00

  • Clustering of phosphorylation site recognition motifs can be exploited to predict the targets of cyclin-dependent kinase.

    abstract: Protein kinases are critical to cellular signalling and post-translational gene regulation, but their biological substrates are difficult to identify. We show that cyclin-dependent kinase (CDK) consensus motifs are frequently clustered in CDK substrate proteins. Based on this, we introduce a new computational strategy...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/gb-2007-8-2-r23

    authors: Moses AM,Hériché JK,Durbin R

    update_date: 2007-01-01 00:00:00

  • SCALE: modeling allele-specific gene expression by single-cell RNA sequencing.

    abstract: Allele-specific expression is traditionally studied by bulk RNA sequencing, which measures average expression across cells. Single-cell RNA sequencing allows the comparison of expression distribution between the two alleles of a diploid organism and the characterization of allele-specific bursting. Here, we propose SC...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/s13059-017-1200-8

    authors: Jiang Y,Zhang NR,Li M

    update_date: 2017-04-26 00:00:00

  • Intraepithelial gamma delta T cells exposed by functional genomics.

    abstract: Epithelial tissues house gammadelta T cells, which are important for the mucosal immune system and may be involved in controlling malignancies, infections and inflammation. Whole-genome gene-expression analysis provides a new way to study the signals required for the activation of gammadelta T cells, their mode of act...

    journal_title: Genome biology

    pub_type: Journal Article, Review

    doi: 10.1186/gb-2001-2-11-reviews1031

    authors: Boismenu R,Havran WL

    update_date: 2001-01-01 00:00:00