Benchmarking computational tools for polymorphic transposable element detection.

Abstract:

Transposable elements (TEs) are an important source of human genetic variation with demonstrable effects on phenotype. Recently, a number of computational methods for the detection of polymorphic TE (polyTE) insertion sites from next-generation sequence data have been developed. The use of such tools will become increasingly important as the pace of human genome sequencing accelerates. For this report, we performed a comparative benchmarking and validation analysis of polyTE detection tools in an effort to inform their selection and use by the TE research community. We analyzed a core set of seven tools with respect to ease of use and accessibility, polyTE detection performance and runtime parameters. An experimentally validated set of 893 human polyTE insertions was used for this purpose, along with a series of simulated data sets that allowed us to assess the impact of sequence coverage on tool performance. The recently developed tool MELT showed the best overall performance followed by Mobster and then RetroSeq. PolyTE detection tools can best detect Alu insertion events in the human genome with reduced reliability for L1 insertions and substantially lowered performance for SVA insertions. We also show evidence that different polyTE detection tools are complementary with respect to their ability to detect a complete set of insertion events. Accordingly, a combined approach, coupled with manual inspection of individual results, may yield the best overall performance. In addition to the benchmarking results, we also provide notes on tool installation and usage as well as suggestions for future polyTE detection algorithm development.
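The kind of comparison the abstract describes, scoring each tool's calls against a validated insertion set and then checking whether tools complement each other, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than a detail from the paper: the function name `score_calls`, the `(chrom, position, te_family)` tuple layout, the 100 bp matching window, and the toy coordinates.

```python
from bisect import bisect_left
from collections import defaultdict


def score_calls(predicted, truth, window=100):
    """Score predicted insertion sites against validated sites.

    Both arguments are iterables of (chrom, position, te_family) tuples.
    A call counts as a true positive when it lies within `window` bp of a
    not-yet-matched validated site on the same chromosome and TE family.
    The window size is an illustrative assumption.
    """
    truth = sorted(set(truth))
    positions_by_key = defaultdict(list)
    for chrom, pos, family in truth:
        # Positions stay sorted per key because `truth` is sorted.
        positions_by_key[(chrom, family)].append(pos)

    calls = sorted(set(predicted))
    found = set()  # validated sites recovered by at least one call
    for chrom, pos, family in calls:
        positions = positions_by_key.get((chrom, family), [])
        i = bisect_left(positions, pos)
        best = None
        # Only the two neighbours around the insertion point can be nearest.
        for j in (i - 1, i):
            if 0 <= j < len(positions):
                site = (chrom, positions[j], family)
                dist = abs(positions[j] - pos)
                if dist <= window and site not in found:
                    if best is None or dist < best[0]:
                        best = (dist, site)
        if best is not None:
            found.add(best[1])

    tp = len(found)
    precision = tp / len(calls) if calls else 0.0
    recall = tp / len(truth) if truth else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return {"tp": tp, "precision": precision, "recall": recall,
            "f1": f1, "found": found}


# Toy data (hypothetical coordinates, not taken from the study).
validated = [("chr1", 1000, "Alu"), ("chr1", 5000, "L1"), ("chr2", 300, "SVA")]
tool_a = [("chr1", 1010, "Alu"), ("chr1", 9000, "Alu")]
tool_b = [("chr1", 1005, "Alu"), ("chr1", 4990, "L1")]

result_a = score_calls(tool_a, validated)
result_b = score_calls(tool_b, validated)

# Complementarity: validated sites recovered by either tool.
combined_recall = len(result_a["found"] | result_b["found"]) / len(validated)
```

Taking the union of the recovered validated sites, rather than the union of raw calls, avoids counting two tools' calls at the same locus twice when estimating combined recall.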

journal_name

Brief Bioinform

authors

Rishishwar L,Mariño-Ramírez L,Jordan IK

doi

10.1093/bib/bbw072

subject

Has Abstract

pub_date

2017-11-01 00:00:00

pages

908-918

issue

6

eissn

1477-4054

issn

1467-5463

pii

bbw072

journal_volume

18

pub_type

Journal Article

  • Irinotecan and vandetanib create synergies for treatment of pancreatic cancer patients with concomitant TP53 and KRAS mutations.

    abstract:BACKGROUND:The most frequently mutated gene pairs in pancreatic adenocarcinoma (PAAD) are KRAS and TP53, and our goal is to illustrate the multiomics and molecular dynamics landscapes of KRAS/TP53 mutation and also to obtain prospective novel drugs for KRAS- and TP53-mutated PAAD patients. Moreover, we also made an att...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbaa149

    authors: Kaushik AC,Wang YJ,Wang X,Wei DQ

    update_date:2020-07-31 00:00:00

  • Links between kinetic data and sequences in the alpha/beta-hydrolases fold database.

    abstract:While the number of sequenced genes is increasing dramatically, the number of different protein structural families is expected to be more limited. Changes in enzymatic activity or protein interactions can dramatically modify the role of homologous proteins in different organisms or mutants. However, experimental data...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/2.1.30

    authors: Chatonnet A,Cousin X,Robinson A

    update_date:2001-03-01 00:00:00

  • The mechanistic, diagnostic and therapeutic novel nucleic acids for hepatocellular carcinoma emerging in past score years.

    abstract:Despite The Central Dogma states the destiny of gene as 'DNA makes RNA and RNA makes protein', the nucleic acids not only store and transmit genetic information but also, surprisingly, join in intracellular vital movement as a regulator of gene expression. Bioinformatics has contributed to knowledge for a series of em...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbaa023

    authors: Zhang S,Zhou Y,Wang Y,Wang Z,Xiao Q,Zhang Y,Lou Y,Qiu Y,Zhu F

    update_date:2020-04-06 00:00:00

  • Toward more realistic drug-target interaction predictions.

    abstract:A number of supervised machine learning models have recently been introduced for the prediction of drug-target interactions based on chemical structure and genomic sequence information. Although these models could offer improved means for many network pharmacology applications, such as repositioning of drugs for new t...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbu010

    authors: Pahikkala T,Airola A,Pietilä S,Shakyawar S,Szwajda A,Tang J,Aittokallio T

    update_date:2015-03-01 00:00:00

  • Deep-DRM: a computational method for identifying disease-related metabolites based on graph deep learning approaches.

    abstract:MOTIVATION:The functional changes of the genes, RNAs and proteins will eventually be reflected in the metabolic level. Increasing number of researchers have researched mechanism, biomarkers and targeted drugs by metabolites. However, compared with our knowledge about genes, RNAs, and proteins, we still know few about d...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbaa212

    authors: Zhao T,Hu Y,Cheng L

    update_date:2020-10-13 00:00:00

  • RNA-mediated translation regulation in viral genomes: computational advances in the recognition of sequences and structures.

    abstract:RNA structures are widely distributed across all life forms. The global conformation of these structures is defined by a variety of constituent structural units such as helices, hairpin loops, kissing-loop motifs and pseudoknots, which often behave in a modular way. Their ubiquitous distribution is associated with a v...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbz054

    authors: Gupta A,Bansal M

    update_date:2020-07-15 00:00:00

  • Towards a comprehensive picture of the genetic landscape of complex traits.

    abstract:The formation of phenotypic traits, such as biomass production, tumor volume and viral abundance, undergoes a complex process in which interactions between genes and developmental stimuli take place at each level of biological organization from cells to organisms. Traditional studies emphasize the impact of genes by d...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbs049

    authors: Wang Z,Wang Y,Wang N,Wang J,Wang Z,Vallejos CE,Wu R

    update_date:2014-01-01 00:00:00

  • TrimNet: learning molecular representation from triplet messages for biomedicine.

    abstract:MOTIVATION:Computational methods accelerate drug discovery and play an important role in biomedicine, such as molecular property prediction and compound-protein interaction (CPI) identification. A key challenge is to learn useful molecular representation. In the early years, molecular properties are mainly calculated b...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbaa266

    authors: Li P,Li Y,Hsieh CY,Zhang S,Liu X,Liu H,Song S,Yao X

    update_date:2020-11-04 00:00:00

  • TRCirc: a resource for transcriptional regulation information of circRNAs.

    abstract:In recent years, high-throughput genomic technologies like chromatin immunoprecipitation sequencing (ChIp-seq) and transcriptome sequencing (RNA-seq) have been becoming both more refined and less expensive, making them more accessible. Many circular RNAs (circRNAs) that originate from back-spliced exons have been iden...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bby083

    authors: Tang Z,Li X,Zhao J,Qian F,Feng C,Li Y,Zhang J,Jiang Y,Yang Y,Wang Q,Li C

    update_date:2019-11-27 00:00:00

  • Tools for the functional interpretation of metabolomic experiments.

    abstract:The so-called 'omics' approaches used in modern biology aim at massively characterizing the molecular repertories of living systems at different levels. Metabolomics is one of the last additions to the 'omics' family and it deals with the characterization of the set of metabolites in a given biological system. As meta...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbs055

    authors: Chagoyen M,Pazos F

    update_date:2013-11-01 00:00:00

  • Accounting for differential variability in detecting differentially methylated regions.

    abstract:DNA methylation plays an essential role in cancer. Differential variability (DV) in cancer was recently observed that contributes to cancer heterogeneity and has been shown to be crucial in detecting epigenetic field defects, DNA methylation alterations happening early in carcinogenesis. As neighboring CpG sites are h...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbx097

    authors: Wang Y,Teschendorff AE,Widschwendter M,Wang S

    update_date:2019-01-18 00:00:00

  • Using prior knowledge from cellular pathways and molecular networks for diagnostic specimen classification.

    abstract:For many complex diseases, an earlier and more reliable diagnosis is considered a key prerequisite for developing more effective therapies to prevent or delay disease progression. Classical statistical learning approaches for specimen classification using omics data, however, often cannot provide diagnostic models wit...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbv044

    authors: Glaab E

    update_date:2016-05-01 00:00:00

  • Bioinformatics approaches for genomics and post genomics applications of next-generation sequencing.

    abstract:Technical advances such as the development of molecular cloning, Sanger sequencing, PCR and oligonucleotide microarrays are key to our current capacity to sequence, annotate and study complete organismal genomes. Recent years have seen the development of a variety of so-called 'next-generation' sequencing platforms, w...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article, Review

    doi:10.1093/bib/bbp046

    authors: Horner DS,Pavesi G,Castrignanò T,De Meo PD,Liuni S,Sammeth M,Picardi E,Pesole G

    update_date:2010-03-01 00:00:00

  • InstaDock: A single-click graphical user interface for molecular docking-based virtual high-throughput screening.

    abstract:Exploring protein-ligand interactions is a subject of immense interest, as it provides deeper insights into molecular recognition, mechanism of interaction and subsequent functions. Predicting an accurate model for a protein-ligand interaction is a challenging task. Molecular docking is a computational method used for...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbaa279

    authors: Mohammad T,Mathur Y,Hassan MI

    update_date:2020-10-26 00:00:00

  • Comparative genome assembly.

    abstract:One of the most complex and computationally intensive tasks of genome sequence analysis is genome assembly. Even today, few centres have the resources, in both software and hardware, to assemble a genome from the thousands or millions of individual sequences generated in a whole-genome shotgun sequencing project. With...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/5.3.237

    authors: Pop M,Phillippy A,Delcher AL,Salzberg SL

    update_date:2004-09-01 00:00:00

  • LARMD: integration of bioinformatic resources to profile ligand-driven protein dynamics with a case on the activation of estrogen receptor.

    abstract:Protein dynamics is central to all biological processes, including signal transduction, cellular regulation and biological catalysis. Among them, in-depth exploration of ligand-driven protein dynamics contributes to an optimal understanding of protein function, which is particularly relevant to drug discovery. Hence, ...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbz141

    authors: Yang JF,Wang F,Chen YZ,Hao GF,Yang GF

    update_date:2020-12-01 00:00:00

  • Opportunities for community awareness platforms in personal genomics and bioinformatics education.

    abstract:Precision and personalized medicine will be increasingly based on the integration of various type of information, particularly electronic health records and genome sequences. The availability of cheap genome sequencing services and the information interoperability will increase the role of online bioinformatics analys...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbw078

    authors: Bianchi L,Liò P

    update_date:2017-11-01 00:00:00

  • Unraveling chloroplast transcriptomes with ChloroSeq, an organelle RNA-Seq bioinformatics pipeline.

    abstract:Online sequence repositories are teeming with RNA sequencing (RNA-Seq) data from a wide range of eukaryotes. Although most of these data sets contain large numbers of organelle-derived reads, researchers tend to ignore these data, focusing instead on the nuclear-derived transcripts. Consequently, GenBank contains mass...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbw088

    authors: Smith DR,Sanitá Lima M

    update_date:2017-11-01 00:00:00

  • Identifying miRNAs, targets and functions.

    abstract:microRNAs (miRNAs) are small endogenous non-coding RNAs that function as the universal specificity factors in post-transcriptional gene silencing. Discovering miRNAs, identifying their targets and further inferring miRNA functions have been a critical strategy for understanding normal biological processes of miRNAs an...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article, Review

    doi:10.1093/bib/bbs075

    authors: Liu B,Li J,Cairns MJ

    update_date:2014-01-01 00:00:00

  • Common introns within orthologous genes: software and application to plants.

    abstract:The residence of spliceosomal introns within protein-coding genes can fluctuate over time, with genes gaining, losing or conserving introns in a complex process that is not entirely understood. One approach for studying intron evolution is to compare introns with respect to position and type within closely related gen...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbp051

    authors: Wilkerson MD,Ru Y,Brendel VP

    update_date:2009-11-01 00:00:00

  • Comprehensive characterization of tissue-specific circular RNAs in the human and mouse genomes.

    abstract:Circular RNA (circRNA) is a group of RNA family generated by RNA circularization, which was discovered ubiquitously across different species and tissues. However, there is no global view of tissue specificity for circRNAs to date. Here we performed the comprehensive analysis to characterize the features of human and m...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbw081

    authors: Xia S,Feng J,Lei L,Hu J,Xia L,Wang J,Xiang Y,Liu L,Zhong S,Han L,He C

    update_date:2017-11-01 00:00:00

  • Automated glycopeptide analysis--review of current state and future directions.

    abstract:Glycosylation of proteins is involved in immune defense, cell-cell adhesion, cellular recognition and pathogen binding and is one of the most common and complex post-translational modifications. Science is still struggling to assign detailed mechanisms and functions to this form of conjugation. Even the structural ana...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article, Review

    doi:10.1093/bib/bbs045

    authors: Dallas DC,Martin WF,Hua S,German JB

    update_date:2013-05-01 00:00:00

  • Exploring the function of genetic variants in the non-coding genomic regions: approaches for identifying human regulatory variants affecting gene expression.

    abstract:Understanding the genetic basis of human traits/diseases and the underlying mechanisms of how these traits/diseases are affected by genetic variations is critical for public health. Current genome-wide functional genomics data uncovered a large number of functional elements in the noncoding regions of human genome, pr...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article, Review

    doi:10.1093/bib/bbu018

    authors: Li MJ,Yan B,Sham PC,Wang J

    update_date:2015-05-01 00:00:00

  • Docking of peptides to GPCRs using a combination of CABS-dock with FlexPepDock refinement.

    abstract:The structural description of peptide ligands bound to G protein-coupled receptors (GPCRs) is important for the discovery of new drugs and deeper understanding of the molecular mechanisms of life. Here we describe a three-stage protocol for the molecular docking of peptides to GPCRs using a set of different programs: ...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbaa109

    authors: Badaczewska-Dawid AE,Kmiecik S,Koliński M

    update_date:2020-06-10 00:00:00

  • Empirical comparison and analysis of web-based cell-penetrating peptide prediction tools.

    abstract:Cell-penetrating peptides (CPPs) facilitate the delivery of therapeutically relevant molecules, including DNA, proteins and oligonucleotides, into cells both in vitro and in vivo. This unique ability explores the possibility of CPPs as therapeutic delivery and its potential applications in clinical therapy. Over the l...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bby124

    authors: Su R,Hu J,Zou Q,Manavalan B,Wei L

    update_date:2020-03-23 00:00:00

  • Allotetraploid and autotetraploid models of linkage analysis.

    abstract:As a group of important plant species in agriculture and biology, polyploids have been increasingly studied in terms of their genome structure and organization. There are two types of polyploids, allopolyploids and autopolyploids, each resulting from a different genetic origin, which undergo meiotic divisions of a dis...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbt075

    authors: Xu F,Tong C,Lyu Y,Bo W,Pang X,Wu R

    update_date:2015-01-01 00:00:00

  • Dr AFC: drug repositioning through anti-fibrosis characteristic.

    abstract:Fibrosis is a key component in the pathogenic mechanism of a variety of diseases. These diseases involving fibrosis may share common mechanisms and therapeutic targets, and therefore common intervention strategies and medicines may be applicable for these diseases. For this reason, deliberately introducing anti-fibros...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbaa115

    authors: Wu D,Gao W,Li X,Tian C,Jiao N,Fang S,Xiao J,Xu Z,Zhu L,Zhang G,Zhu R

    update_date:2020-06-22 00:00:00

  • A review of bioinformatics education in the UK.

    abstract:If the completion of the first draft of the human genome represents the coming of age of bioinformatics, then the emergence of bioinformatics as a university degree subject represents its establishment. In this paper bioinformatics as a subject for formal study is discussed, rather than as a subject for research, and ...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article, Review

    doi:10.1093/bib/4.1.7

    authors: Counsell D

    update_date:2003-03-01 00:00:00

  • Public data and open source tools for multi-assay genomic investigation of disease.

    abstract:Molecular interrogation of a biological sample through DNA sequencing, RNA and microRNA profiling, proteomics and other assays, has the potential to provide a systems level approach to predicting treatment response and disease progression, and to developing precision therapies. Large publicly funded projects have gene...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article, Review

    doi:10.1093/bib/bbv080

    authors: Kannan L,Ramos M,Re A,El-Hachem N,Safikhani Z,Gendoo DM,Davis S,Gomez-Cabrero D,Castelo R,Hansen KD,Carey VJ,Morgan M,Culhane AC,Haibe-Kains B,Waldron L

    update_date:2016-07-01 00:00:00

  • Improving structure-based virtual screening performance via learning from scoring function components.

    abstract:Scoring functions (SFs) based on complex machine learning (ML) algorithms have gradually emerged as a promising alternative to overcome the weaknesses of classical SFs. However, extensive efforts have been devoted to the development of SFs based on new protein-ligand interaction representations and advanced alternativ...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbaa094

    authors: Xiong GL,Ye WL,Shen C,Lu AP,Hou TJ,Cao DS

    update_date:2020-06-04 00:00:00