Toward more realistic drug-target interaction predictions.

Abstract:

A number of supervised machine learning models have recently been introduced for the prediction of drug-target interactions based on chemical structure and genomic sequence information. Although these models could offer improved means for many network pharmacology applications, such as repositioning of drugs for new therapeutic uses, the prediction models are often being constructed and evaluated under overly simplified settings that do not reflect the real-life problem in practical applications. Using quantitative drug-target bioactivity assays for kinase inhibitors, as well as a popular benchmarking data set of binary drug-target interactions for enzyme, ion channel, nuclear receptor and G protein-coupled receptor targets, we illustrate here the effects of four factors that may lead to dramatic differences in the prediction results: (i) problem formulation (standard binary classification or more realistic regression formulation), (ii) evaluation data set (drug and target families in the application use case), (iii) evaluation procedure (simple or nested cross-validation) and (iv) experimental setting (whether training and test sets share common drugs and targets, only drugs or targets or neither). Each of these factors should be taken into consideration to avoid reporting overoptimistic drug-target interaction prediction results. We also suggest guidelines on how to make the supervised drug-target interaction prediction studies more realistic in terms of such model formulations and evaluation setups that better address the inherent complexity of the prediction task in the practical applications, as well as novel benchmarking data sets that capture the continuous nature of the drug-target interactions for kinase inhibitors.
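
Factor (iv) in the abstract, the experimental setting, can be illustrated with a small sketch. The code below is not the authors' implementation; it is a minimal illustration, assuming the data are a list of (drug, target) pairs, of how a train/test split could be built so that the two sets share drugs and targets, share only targets (new drugs), share only drugs (new targets), or share neither. The function name split_pairs, the setting labels and the test_frac parameter are hypothetical names chosen for this example.

```python
import random


def split_pairs(pairs, setting, test_frac=0.25, seed=0):
    """Split (drug, target) pairs into train and test lists.

    setting (hypothetical labels, not taken from the paper):
      "shared_both" - random pairs are held out, so drugs and targets
                      can occur in both sets
      "new_drugs"   - test drugs never occur in training
      "new_targets" - test targets never occur in training
      "new_both"    - neither test drugs nor test targets occur in training
    """
    rng = random.Random(seed)
    pairs = list(pairs)

    if setting == "shared_both":
        rng.shuffle(pairs)
        k = max(1, round(test_frac * len(pairs)))
        return pairs[k:], pairs[:k]

    if setting not in ("new_drugs", "new_targets", "new_both"):
        raise ValueError(f"unknown setting: {setting!r}")

    def hold_out(items):
        # Pick a random subset of drugs (or targets) reserved for testing.
        items = sorted(items)
        k = max(1, round(test_frac * len(items)))
        return set(rng.sample(items, k))

    test_drugs = hold_out({d for d, _ in pairs}) if setting != "new_targets" else set()
    test_targets = hold_out({t for _, t in pairs}) if setting != "new_drugs" else set()

    train, test = [], []
    for d, t in pairs:
        d_new, t_new = d in test_drugs, t in test_targets
        if setting == "new_both":
            if d_new and t_new:
                test.append((d, t))
            elif not d_new and not t_new:
                train.append((d, t))
            # Pairs with exactly one held-out entity are discarded.
        elif d_new or t_new:
            test.append((d, t))
        else:
            train.append((d, t))
    return train, test


# Toy data: every combination of 8 drugs and 6 targets.
pairs = [(f"D{i}", f"T{j}") for i in range(8) for j in range(6)]
train, test = split_pairs(pairs, "new_drugs")
shared = {d for d, _ in train} & {d for d, _ in test}
print(len(train), len(test), shared)
```

Evaluating one model under each of the four settings is one way to check the abstract's warning: results reported only for the setting where training and test sets share drugs and targets may be overoptimistic for new drugs or new targets.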

journal_name

Brief Bioinform

authors

Pahikkala T,Airola A,Pietilä S,Shakyawar S,Szwajda A,Tang J,Aittokallio T

doi

10.1093/bib/bbu010

subject

Has Abstract

pub_date

2015-03-01 00:00:00

pages

325-37

issue

2

eissn

1477-4054

issn

1467-5463

pii

bbu010

journal_volume

16

pub_type

Journal Article

  • Computational prediction and analysis of species-specific fungi phosphorylation via feature optimization strategy.

    abstract::Protein phosphorylation is a reversible and ubiquitous post-translational modification that primarily occurs at serine, threonine and tyrosine residues and regulates a variety of biological processes. In this paper, we first briefly summarized the current progresses in computational prediction of eukaryotic protein ph...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bby122

    authors: Cao M,Chen G,Yu J,Shi S

    update_date:2020-03-23 00:00:00

  • The digital revolution in phenotyping.

    abstract::Phenotypes have gained increased notoriety in the clinical and biological domain owing to their application in numerous areas such as the discovery of disease genes and drug targets, phylogenetics and pharmacogenomics. Phenotypes, defined as observable characteristics of organisms, can be seen as one of the bridges th...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbv083

    authors: Oellrich A,Collier N,Groza T,Rebholz-Schuhmann D,Shah N,Bodenreider O,Boland MR,Georgiev I,Liu H,Livingston K,Luna A,Mallon AM,Manda P,Robinson PN,Rustici G,Simon M,Wang L,Winnenburg R,Dumontier M

    update_date:2016-09-01 00:00:00

  • Elucidating the editome: bioinformatics approaches for RNA editing detection.

    abstract::RNA editing is a widespread co/posttranscriptional mechanism affecting primary RNAs by specific nucleotide modifications, which plays relevant roles in molecular processes including regulation of gene expression and/or the processing of noncoding RNAs. In recent years, the detection of editing sites has been improved ...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article, Review

    doi:10.1093/bib/bbx129

    authors: Diroma MA,Ciaccia L,Pesole G,Picardi E

    update_date:2019-03-22 00:00:00

  • Systems pharmacology in drug discovery and therapeutic insight for herbal medicines.

    abstract::Systems pharmacology is an emerging field that integrates systems biology and pharmacology to advance the process of drug discovery, development and the understanding of therapeutic mechanisms. The aim of the present work is to highlight the role that the systems pharmacology plays across the traditional herbal medici...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbt035

    authors: Huang C,Zheng C,Li Y,Wang Y,Lu A,Yang L

    update_date:2014-09-01 00:00:00

  • Tools for the functional interpretation of metabolomic experiments.

    abstract::The so-called 'omics' approaches used in modern biology aim at massively characterizing the molecular repertories of living systems at different levels. Metabolomics is one of the last additions to the 'omics' family and it deals with the characterization of the set of metabolites in a given biological system. As meta...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbs055

    authors: Chagoyen M,Pazos F

    update_date:2013-11-01 00:00:00

  • Sequencing technologies and tools for short tandem repeat variation detection.

    abstract::Short tandem repeats are highly polymorphic and associated with a wide range of phenotypic variation, some of which cause neurodegenerative disease in humans. With advances in high-throughput sequencing technologies, there are novel opportunities to study genetic variation. While available sequencing technologies and ...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article, Review

    doi:10.1093/bib/bbu001

    authors: Cao MD,Balasubramanian S,Bodén M

    update_date:2015-03-01 00:00:00

  • Reproducible probe-level analysis of the Affymetrix Exon 1.0 ST array with R/Bioconductor.

    abstract::The presence of different transcripts of a gene across samples can be analysed by whole-transcriptome microarrays. Reproducing results from published microarray data represents a challenge owing to the vast amounts of data and the large variety of preprocessing and filtering steps used before the actual analysis is ca...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbt011

    authors: Rodrigo-Domingo M,Waagepetersen R,Bødker JS,Falgreen S,Kjeldsen MK,Johnsen HE,Dybkær K,Bøgsted M

    update_date:2014-07-01 00:00:00

  • Survey of miRNA-miRNA cooperative regulation principles across cancer types.

    abstract::Cooperative regulation among multiple microRNAs (miRNAs) is a complex type of posttranscriptional regulation in human; however, the global view of the system-level regulatory principles across cancers is still unclear. Here, we investigated miRNA-miRNA cooperative regulatory landscape across 18 cancer types and summar...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article, Review

    doi:10.1093/bib/bby038

    authors: Shao T,Wang G,Chen H,Xie Y,Jin X,Bai J,Xu J,Li X,Huang J,Jin Y,Li Y

    update_date:2019-09-27 00:00:00

  • Molecular dynamics simulations for genetic interpretation in protein coding regions: where we are, where to go and when.

    abstract::The increasing ease with which massive genetic information can be obtained from patients or healthy individuals has stimulated the development of interpretive bioinformatics tools as aids in clinical practice. Most such tools analyze evolutionary information and simple physical-chemical properties to predict whether r...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbz146

    authors: Galano-Frutos JJ,García-Cebollada H,Sancho J

    update_date:2021-01-18 00:00:00

  • Computational recognition for long non-coding RNA (lncRNA): Software and databases.

    abstract::Since the completion of the Human Genome Project, it has been widely established that most DNA is not transcribed into proteins. These non-protein-coding regions are believed to be moderators within transcriptional and post-transcriptional processes, which play key roles in the onset of diseases. Long non-coding RNAs ...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article, Review

    doi:10.1093/bib/bbv114

    authors: Yotsukura S,duVerle D,Hancock T,Natsume-Kitatani Y,Mamitsuka H

    update_date:2017-01-01 00:00:00

  • Structural database resources for biological macromolecules.

    abstract::This Briefing reviews the widely used, currently active, up-to-date databases derived from the worldwide Protein Data Bank (PDB) to facilitate browsing, finding and exploring its entries. These databases contain visualization and analysis tools tailored to specific kinds of molecules and interactions, often including ...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbw049

    authors: Abriata LA

    update_date:2017-07-01 00:00:00

  • LiBis: an ultrasensitive alignment augmentation for low-input bisulfite sequencing.

    abstract::The cell-free DNA (cfDNA) methylation profile in liquid biopsy has been utilized to diagnose early-stage disease and estimate therapy response. However, typical clinical procedures are capable of purifying only very small amounts of cfDNA. Whole-genome bisulfite sequencing (WGBS) is the gold standard for measuring DNA...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbaa332

    authors: Yin Y,Li J,Li J,Lee M,Zhao S,Guo L,Li J,Zhang M,Huang Y,Li XN,Deng Z,Sun D

    update_date:2020-12-15 00:00:00

  • Cloud 3D-QSAR: a web tool for the development of quantitative structure-activity relationship models in drug discovery.

    abstract::Effective drug discovery contributes to the treatment of numerous diseases but is limited by high costs and long cycles. The Quantitative Structure-Activity Relationship (QSAR) method was introduced to evaluate the activity of a large number of compounds virtually, reducing the time and labor costs required for chemic...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbaa276

    authors: Wang YL,Wang F,Shi XX,Jia CY,Wu FX,Hao GF,Yang GF

    update_date:2020-11-03 00:00:00

  • Data-driven rational biosynthesis design: from molecules to cell factories.

    abstract::A proliferation of chemical, reaction and enzyme databases, new computational methods and software tools for data-driven rational biosynthesis design have emerged in recent years. With the coming of the era of big data, particularly in the bio-medical field, data-driven rational biosynthesis design could potentially b...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbz065

    authors: Chen F,Yuan L,Ding S,Tian Y,Hu QN

    update_date:2020-07-15 00:00:00

  • Agents in bioinformatics, computational and systems biology.

    abstract::The adoption of agent technologies and multi-agent systems constitutes an emerging area in bioinformatics. In this article, we report on the activity of the Working Group on Agents in Bioinformatics (BIOAGENTS) founded during the first AgentLink III Technical Forum meeting on the 2nd of July, 2004, in Rome. The meetin...

    journal_title:Briefings in bioinformatics

    pub_type:

    doi:10.1093/bib/bbl014

    authors: Merelli E,Armano G,Cannata N,Corradini F,d'Inverno M,Doms A,Lord P,Martin A,Milanesi L,Möller S,Schroeder M,Luck M

    update_date:2007-01-01 00:00:00

  • Allotetraploid and autotetraploid models of linkage analysis.

    abstract::As a group of important plant species in agriculture and biology, polyploids have been increasingly studied in terms of their genome structure and organization. There are two types of polyploids, allopolyploids and autopolyploids, each resulting from a different genetic origin, which undergo meiotic divisions of a dis...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbt075

    authors: Xu F,Tong C,Lyu Y,Bo W,Pang X,Wu R

    update_date:2015-01-01 00:00:00

  • AlzRiskMR database: an online database for the impact of exposure factors on Alzheimer's disease.

    abstract::In view of great difficulties in the pathogenesis analysis of Alzheimer's disease (AD) presently, profiling the modifiable risk factors is crucial for early detection and intervention of AD. However, the causal associations among them have yet to be identified, and the effective integration and application of these da...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbaa213

    authors: Wang Z,Meng L,Liu H,Shen L,Ji HF

    update_date:2020-09-21 00:00:00

  • Gene-based mediation analysis in epigenetic studies.

    abstract::Mediation analysis has been a useful tool for investigating the effect of mediators that lie in the path from the independent variable to the outcome. With the increasing dimensionality of mediators such as in (epi)genomics studies, high-dimensional mediation model is needed. In this work, we focus on epigenetic studi...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbaa113

    authors: Fang R,Yang H,Gao Y,Cao H,Goode EL,Cui Y

    update_date:2020-07-01 00:00:00

  • HVIDB: a comprehensive database for human-virus protein-protein interactions.

    abstract::While leading to millions of people's deaths every year the treatment of viral infectious diseases remains a huge public health challenge. Therefore, an in-depth understanding of human-virus protein-protein interactions (PPIs) as the molecular interface between a virus and its host cell is of paramount importance to ob...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbaa425

    authors: Yang X,Lian X,Fu C,Wuchty S,Yang S,Zhang Z

    update_date:2021-01-30 00:00:00

  • Docking of peptides to GPCRs using a combination of CABS-dock with FlexPepDock refinement.

    abstract::The structural description of peptide ligands bound to G protein-coupled receptors (GPCRs) is important for the discovery of new drugs and deeper understanding of the molecular mechanisms of life. Here we describe a three-stage protocol for the molecular docking of peptides to GPCRs using a set of different programs: ...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbaa109

    authors: Badaczewska-Dawid AE,Kmiecik S,Koliński M

    update_date:2020-06-10 00:00:00

  • A bi-Poisson model for clustering gene expression profiles by RNA-seq.

    abstract::With the availability of gene expression data by RNA-seq, powerful statistical approaches for grouping similar gene expression profiles across different environments have become increasingly important. We describe and assess a computational model for clustering genes into distinct groups based on the pattern of gene e...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbt029

    authors: Wang N,Wang Y,Hao H,Wang L,Wang Z,Wang J,Wu R

    update_date:2014-07-01 00:00:00

  • Identification and comprehensive characterization of lncRNAs with copy number variations and their driving transcriptional perturbed subpathways reveal functional significance for cancer.

    abstract::Numerous studies have shown that copy number variation (CNV) in lncRNA regions play critical roles in the initiation and progression of cancer. However, our knowledge about their functionalities is still limited. Here, we firstly provided a computational method to identify lncRNAs with copy number variation (lncRNAs-C...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbz113

    authors: Xu Y,Wu T,Li F,Dong Q,Wang J,Shang D,Xu Y,Zhang C,Dou Y,Hu C,Yang H,Zheng X,Zhang Y,Wang L,Li X

    update_date:2020-12-01 00:00:00

  • Conceptual framework and pilot study to benchmark phylogenomic databases based on reference gene trees.

    abstract::Phylogenomic databases provide orthology predictions for species with fully sequenced genomes. Although the goal seems well-defined, the content of these databases differs greatly. Seven ortholog databases (Ensembl Compara, eggNOG, HOGENOM, InParanoid, OMA, OrthoDB, Panther) were compared on the basis of reference tre...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbr034

    authors: Boeckmann B,Robinson-Rechavi M,Xenarios I,Dessimoz C

    update_date:2011-09-01 00:00:00

  • Dr AFC: drug repositioning through anti-fibrosis characteristic.

    abstract::Fibrosis is a key component in the pathogenic mechanism of a variety of diseases. These diseases involving fibrosis may share common mechanisms and therapeutic targets, and therefore common intervention strategies and medicines may be applicable for these diseases. For this reason, deliberately introducing anti-fibros...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbaa115

    authors: Wu D,Gao W,Li X,Tian C,Jiao N,Fang S,Xiao J,Xu Z,Zhu L,Zhang G,Zhu R

    update_date:2020-06-22 00:00:00

  • In silico signaling modeling to understand cancer pathways and treatment responses.

    abstract::Precision medicine has changed thinking in cancer therapy, highlighting a better understanding of the individual clinical interventions. But what role do the drivers and pathways identified from pan-cancer genome analysis play in the tumor? In this letter, we will highlight the importance of in silico modeling in prec...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbz033

    authors: Kunz M,Jeromin J,Fuchs M,Christoph J,Veronesi G,Flentje M,Nietzer S,Dandekar G,Dandekar T

    update_date:2020-05-21 00:00:00

  • Towards scaling elementary flux mode computation.

    abstract::While elementary flux mode (EFM) analysis is now recognized as a cornerstone computational technique for cellular pathway analysis and engineering, EFM application to genome-scale models remains computationally prohibitive. This article provides a review of aspects of EFM computation that elucidates bottlenecks in sca...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbz094

    authors: Ullah E,Yosafshahi M,Hassoun S

    update_date:2020-12-01 00:00:00

  • Dynamics of transcriptional and post-transcriptional regulation.

    abstract::Despite gene expression programs being notoriously complex, RNA abundance is usually assumed as a proxy for transcriptional activity. Recently developed approaches, able to disentangle transcriptional and post-transcriptional regulatory processes, have revealed a more complex scenario. It is now possible to work out h...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbaa389

    authors: Furlan M,de Pretis S,Pelizzola M

    update_date:2020-12-22 00:00:00

  • TOD-CUP: a gene expression rank-based majority vote algorithm for tissue origin diagnosis of cancers of unknown primary.

    abstract::Gene expression profiling holds great potential as a new approach to histological diagnosis and precision medicine of cancers of unknown primary (CUP). Batch effects and different data types greatly decrease the predictive performance of biomarker-based algorithms, and few methods have been widely applied to identify ...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbaa031

    authors: Shen Y,Chu Q,Yin X,He Y,Bai P,Wang Y,Fang W,Timko MP,Fan L,Jiang W

    update_date:2020-04-08 00:00:00

  • Class-imbalanced classifiers for high-dimensional data.

    abstract::A class-imbalanced classifier is a decision rule to predict the class membership of new samples from an available data set where the class sizes differ considerably. When the class sizes are very different, most standard classification algorithms may favor the larger (majority) class resulting in poor accuracy in the ...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article, Review

    doi:10.1093/bib/bbs006

    authors: Lin WJ,Chen JJ

    update_date:2013-01-01 00:00:00

  • Conceptual and computational framework for logical modelling of biological networks deregulated in diseases.

    abstract::Mathematical models can serve as a tool to formalize biological knowledge from diverse sources, to investigate biological questions in a formal way, to test experimental hypotheses, to predict the effect of perturbations and to identify underlying mechanisms. We present a pipeline of computational tools that performs ...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbx163

    authors: Montagud A,Traynard P,Martignetti L,Bonnet E,Barillot E,Zinovyev A,Calzone L

    update_date:2019-07-19 00:00:00