The scoring of poses in protein-protein docking: current capabilities and future directions.

Abstract:

BACKGROUND: Protein-protein docking, which aims to predict the structure of a protein-protein complex from its unbound components, remains an unresolved challenge in structural bioinformatics. An important step is the ranking of docked poses using a scoring function, for which many methods have been developed. There is a need to explore the differences and commonalities of these methods with each other, as well as with functions developed in the fields of molecular dynamics and homology modelling.

RESULTS: We present an evaluation of 115 scoring functions on an unbound docking decoy benchmark covering 118 complexes for which a near-native solution can be found, yielding top 10 success rates of up to 58%. Hierarchical clustering is performed, so as to group together functions which identify near-natives in similar subsets of complexes. Three set theoretic approaches are used to identify pairs of scoring functions capable of correctly scoring different complexes. This shows that functions in different clusters capture different aspects of binding and are likely to work together synergistically.

CONCLUSIONS: All functions designed specifically for docking perform well, indicating that functions are transferable between sampling methods. We also identify promising methods from the field of homology modelling. Further, differential success rates by docking difficulty and solution quality suggest a need for flexibility-dependent scoring. Investigating pairs of scoring functions, the set theoretic measures identify known scoring strategies as well as a number of novel approaches, indicating promising augmentations of traditional scoring methods. Such augmentation and parameter combination strategies are discussed in the context of the learning-to-rank paradigm.
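The evaluation described in the abstract can be illustrated with a minimal sketch. It assumes each scoring function yields, per complex, a list of near-native flags ordered from best-scored pose to worst. The function names and data layout are hypothetical. The two pairwise measures used here (union size and Jaccard distance) are common stand-ins chosen for illustration, not necessarily the three set theoretic measures used in the paper, which the abstract does not specify.

```python
from itertools import combinations


def top_n_success(ranked_is_near_native, n=10):
    """Return True if any of the n best-scored poses is near-native."""
    return any(ranked_is_near_native[:n])


def success_sets(results, n=10):
    """Map each scoring function to the set of complexes it scores successfully.

    `results` is a hypothetical layout: {function: {complex: [bool, ...]}},
    with each flag list ordered from best-scored pose to worst.
    """
    return {
        fn: {cpx for cpx, flags in per_complex.items() if top_n_success(flags, n)}
        for fn, per_complex in results.items()
    }


def complementary_pairs(sets):
    """Rank pairs of functions by how many complexes they cover together.

    Returns tuples (fn_a, fn_b, union_size, jaccard_distance), largest union
    first. A high Jaccard distance means the two functions succeed on
    largely different complexes.
    """
    pairs = []
    for a, b in combinations(sorted(sets), 2):
        union = sets[a] | sets[b]
        inter = sets[a] & sets[b]
        jaccard_dist = 1 - len(inter) / len(union) if union else 0.0
        pairs.append((a, b, len(union), jaccard_dist))
    return sorted(pairs, key=lambda p: (-p[2], -p[3]))


if __name__ == "__main__":
    # Toy data only; not taken from the benchmark.
    toy = {
        "fn_A": {"c1": [True, False], "c2": [False, False]},
        "fn_B": {"c1": [False, False], "c2": [True, False]},
    }
    sets = success_sets(toy, n=1)
    for fn, ok in sets.items():
        print(fn, "success rate:", len(ok) / len(toy[fn]))
    print(complementary_pairs(sets))
```

A pairwise distance of this kind could also feed a hierarchical clustering routine to group functions that succeed on similar subsets of complexes, in the spirit of the clustering the abstract mentions.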

journal_name

BMC Bioinformatics

journal_title

BMC bioinformatics

authors

Moal IH,Torchala M,Bates PA,Fernández-Recio J

doi

10.1186/1471-2105-14-286

pub_date

2013-10-01 00:00:00

pages

286

issn

1471-2105

pii

1471-2105-14-286

journal_volume

14

pub_type

Journal Article
  • Nanopore-based kinetics analysis of individual antibody-channel and antibody-antigen interactions.

    abstract:BACKGROUND:The UNO/RIC Nanopore Detector provides a new way to study the binding and conformational changes of individual antibodies. Many critical questions regarding antibody function are still unresolved, questions that can be approached in a new way with the nanopore detector. RESULTS:We present evidence that diff...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-8-S7-S20

    authors: Winters-Hilt S,Morales E,Amin I,Stoyanov A

    update_date:2007-11-01 00:00:00

  • Estimating the individualized HIV-1 genetic barrier to resistance using a nelfinavir fitness landscape.

    abstract:BACKGROUND:Failure on Highly Active Anti-Retroviral Treatment is often accompanied with development of antiviral resistance to one or more drugs included in the treatment. In general, the virus is more likely to develop resistance to drugs with a lower genetic barrier. Previously, we developed a method to reverse engin...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-409

    authors: Theys K,Deforche K,Beheydt G,Moreau Y,van Laethem K,Lemey P,Camacho RJ,Rhee SY,Shafer RW,Van Wijngaerden E,Vandamme AM

    update_date:2010-08-03 00:00:00

  • Automated peptide mapping and protein-topographical annotation of proteomics data.

    abstract:BACKGROUND:In quantitative proteomics, peptide mapping is a valuable approach to combine positional quantitative information with topographical and domain information of proteins. Quantitative proteomic analysis of cell surface shedding is an exemplary application area of this approach. RESULTS:We developed ImproViser...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-15-207

    authors: Videm P,Gunasekaran D,Schröder B,Mayer B,Biniossek ML,Schilling O

    update_date:2014-06-19 00:00:00

  • Evaluation of absolute quantitation by nonlinear regression in probe-based real-time PCR.

    abstract:BACKGROUND:In real-time PCR data analysis, the cycle threshold (CT) method is currently the gold standard. This method is based on an assumption of equal PCR efficiency in all reactions, and precision may suffer if this condition is not met. Nonlinear regression analysis (NLR) or curve fitting has therefore been sugges...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-7-107

    authors: Goll R,Olsen T,Cui G,Florholmen J

    update_date:2006-03-03 00:00:00

  • Unsupervised deep learning reveals prognostically relevant subtypes of glioblastoma.

    abstract:BACKGROUND:One approach to improving the personalized treatment of cancer is to understand the cellular signaling transduction pathways that cause cancer at the level of the individual patient. In this study, we used unsupervised deep learning to learn the hierarchical structure within cancer gene expression data. Deep...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-017-1798-2

    authors: Young JD,Cai C,Lu X

    update_date:2017-10-03 00:00:00

  • BCDForest: a boosting cascade deep forest model towards the classification of cancer subtypes based on gene expression data.

    abstract:BACKGROUND:The classification of cancer subtypes is of great importance to cancer disease diagnosis and therapy. Many supervised learning approaches have been applied to cancer subtype classification in the past few years, especially of deep learning based approaches. Recently, the deep forest model has been proposed a...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-018-2095-4

    authors: Guo Y,Liu S,Li Z,Shang X

    update_date:2018-04-11 00:00:00

  • Improved functional prediction of proteins by learning kernel combinations in multilabel settings.

    abstract:BACKGROUND:We develop a probabilistic model for combining kernel matrices to predict the function of proteins. It extends previous approaches in that it can handle multiple labels which naturally appear in the context of protein function. RESULTS:Explicit modeling of multilabels significantly improves the capability o...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-8-S2-S12

    authors: Roth V,Fischer B

    update_date:2007-05-03 00:00:00

  • Predicting anatomic therapeutic chemical classification codes using tiered learning.

    abstract:BACKGROUND:The low success rate and high cost of drug discovery requires the development of new paradigms to identify molecules of therapeutic value. The Anatomical Therapeutic Chemical (ATC) Code System is a World Health Organization (WHO) proposed classification that assigns multi-level codes to compounds based on th...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-017-1660-6

    authors: Olson T,Singh R

    update_date:2017-06-07 00:00:00

  • Sequencing error correction without a reference genome.

    abstract:BACKGROUND:Next (second) generation sequencing is an increasingly important tool for many areas of molecular biology, however, care must be taken when interpreting its output. Even a low error rate can cause a large number of errors due to the high number of nucleotides being sequenced. Identifying sequencing errors fr...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-14-367

    authors: Sleep JA,Schreiber AW,Baumann U

    update_date:2013-12-18 00:00:00

  • NPBSS: a new PacBio sequencing simulator for generating the continuous long reads with an empirical model.

    abstract:BACKGROUND:PacBio sequencing platform offers longer read lengths than the second-generation sequencing technologies. It has revolutionized de novo genome assembly and enabled the automated reconstruction of reference-quality genomes. Due to its extremely wide range of application areas, fast sequencing simulation syste...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-018-2208-0

    authors: Wei ZG,Zhang SW

    update_date:2018-05-22 00:00:00

  • Fast discovery and visualization of conserved regions in DNA sequences using quasi-alignment.

    abstract:BACKGROUND:Next Generation Sequencing techniques are producing enormous amounts of biological sequence data and analysis becomes a major computational problem. Currently, most analysis, especially the identification of conserved regions, relies heavily on Multiple Sequence Alignment and its various heuristics such as p...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-14-S11-S2

    authors: Nagar A,Hahsler M

    update_date:2013-01-01 00:00:00

  • A Web-based and Grid-enabled dChip version for the analysis of large sets of gene expression data.

    abstract:BACKGROUND:Microarray techniques are one of the main methods used to investigate thousands of gene expression profiles for enlightening complex biological processes responsible for serious diseases, with a great scientific impact and a wide application area. Several standalone applications had been developed in order t...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-480

    authors: Corradi L,Fato M,Porro I,Scaglione S,Torterolo L

    update_date:2008-11-13 00:00:00

  • Probe-specific mixed-model approach to detect copy number differences using multiplex ligation-dependent probe amplification (MLPA).

    abstract:BACKGROUND:MLPA method is a potentially useful semi-quantitative method to detect copy number alterations in targeted regions. In this paper, we propose a method for the normalization procedure based on a non-linear mixed-model, as well as a new approach for determining the statistical significance of altered probes ba...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-261

    authors: González JR,Carrasco JL,Armengol L,Villatoro S,Jover L,Yasui Y,Estivill X

    update_date:2008-06-04 00:00:00

  • Markov clustering versus affinity propagation for the partitioning of protein interaction graphs.

    abstract:BACKGROUND:Genome scale data on protein interactions are generally represented as large networks, or graphs, where hundreds or thousands of proteins are linked to one another. Since proteins tend to function in groups, or complexes, an important goal has been to reliably identify protein complexes from these graphs. Th...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-10-99

    authors: Vlasblom J,Wodak SJ

    update_date:2009-03-30 00:00:00

  • Predicting nucleosome positioning using a duration Hidden Markov Model.

    abstract:BACKGROUND:The nucleosome is the fundamental packing unit of DNAs in eukaryotic cells. Its detailed positioning on the genome is closely related to chromosome functions. Increasing evidence has shown that genomic DNA sequence itself is highly predictive of nucleosome positioning genome-wide. Therefore a fast software t...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-346

    authors: Xi L,Fondufe-Mittendorf Y,Xia L,Flatow J,Widom J,Wang JP

    update_date:2010-06-24 00:00:00

  • A multiresolution approach to automated classification of protein subcellular location images.

    abstract:BACKGROUND:Fluorescence microscopy is widely used to determine the subcellular location of proteins. Efforts to determine location on a proteome-wide basis create a need for automated methods to analyze the resulting images. Over the past ten years, the feasibility of using machine learning methods to recognize all maj...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-8-210

    authors: Chebira A,Barbotin Y,Jackson C,Merryman T,Srinivasa G,Murphy RF,Kovacević J

    update_date:2007-06-19 00:00:00

  • GenomeCAT: a versatile tool for the analysis and integrative visualization of DNA copy number variants.

    abstract:BACKGROUND:The analysis of DNA copy number variants (CNV) has increasing impact in the field of genetic diagnostics and research. However, the interpretation of CNV data derived from high resolution array CGH or NGS platforms is complicated by the considerable variability of the human genome. Therefore, tools for multi...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-016-1430-x

    authors: Tebel K,Boldt V,Steininger A,Port M,Ebert G,Ullmann R

    update_date:2017-01-06 00:00:00

  • ChemEx: information extraction system for chemical data curation.

    abstract:BACKGROUND:Manual chemical data curation from publications is error-prone, time consuming, and hard to maintain up-to-date data sets. Automatic information extraction can be used as a tool to reduce these problems. Since chemical structures usually described in images, information extraction needs to combine structure ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-13-S17-S9

    authors: Tharatipyakul A,Numnark S,Wichadakul D,Ingsriswang S

    update_date:2012-01-01 00:00:00

  • Multi-view feature selection for identifying gene markers: a diversified biological data driven approach.

    abstract:BACKGROUND:In recent years, to investigate challenging bioinformatics problems, the utilization of multiple genomic and proteomic sources has become immensely popular among researchers. One such issue is feature or gene selection and identifying relevant and non-redundant marker genes from high dimensional gene express...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03810-0

    authors: Acharya S,Cui L,Pan Y

    update_date:2020-12-30 00:00:00

  • m6Acomet: large-scale functional prediction of individual m6A RNA methylation sites from an RNA co-methylation network.

    abstract:BACKGROUND:Over one hundred different types of post-transcriptional RNA modifications have been identified in human. Researchers discovered that RNA modifications can regulate various biological processes, and RNA methylation, especially N6-methyladenosine, has become one of the most researched topics in epigenetics. ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-2840-3

    authors: Wu X,Wei Z,Chen K,Zhang Q,Su J,Liu H,Zhang L,Meng J

    update_date:2019-05-02 00:00:00

  • IILLS: predicting virus-receptor interactions based on similarity and semi-supervised learning.

    abstract:BACKGROUND:Viral infectious diseases are the serious threat for human health. The receptor-binding is the first step for the viral infection of hosts. To more effectively treat human viral infectious diseases, the hidden virus-receptor interactions must be discovered. However, current computational methods for predicti...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-3278-3

    authors: Yan C,Duan G,Wu FX,Wang J

    update_date:2019-12-27 00:00:00

  • Graph based fusion of miRNA and mRNA expression data improves clinical outcome prediction in prostate cancer.

    abstract:BACKGROUND:One of the main goals in cancer studies including high-throughput microRNA (miRNA) and mRNA data is to find and assess prognostic signatures capable of predicting clinical outcome. Both mRNA and miRNA expression changes in cancer diseases are described to reflect clinical characteristics like staging and pro...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-12-488

    authors: Gade S,Porzelius C,Fälth M,Brase JC,Wuttig D,Kuner R,Binder H,Sültmann H,Beissbarth T

    update_date:2011-12-21 00:00:00

  • 'Unite and conquer': enhanced prediction of protein subcellular localization by integrating multiple specialized tools.

    abstract:BACKGROUND:Knowing the subcellular location of proteins provides clues to their function as well as the interconnectivity of biological processes. Dozens of tools are available for predicting protein location in the eukaryotic cell. Each tool performs well on certain data sets, but their predictions often disagree for ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-8-420

    authors: Shen YQ,Burger G

    update_date:2007-10-29 00:00:00

  • Leveraging TCGA gene expression data to build predictive models for cancer drug response.

    abstract:BACKGROUND:Machine learning has been utilized to predict cancer drug response from multi-omics data generated from sensitivities of cancer cell lines to different therapeutic compounds. Here, we build machine learning models using gene expression data from patients' primary tumor tissues to predict whether a patient wi...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03690-4

    authors: Clayton EA,Pujol TA,McDonald JF,Qiu P

    update_date:2020-09-30 00:00:00

  • An SVM-based system for predicting protein subnuclear localizations.

    abstract:BACKGROUND:The large gap between the number of protein sequences in databases and the number of functionally characterized proteins calls for the development of a fast computational tool for the prediction of subnuclear and subcellular localizations generally applicable to protein sequences. The information on localiza...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-6-291

    authors: Lei Z,Dai Y

    update_date:2005-12-07 00:00:00

  • Local search for the generalized tree alignment problem.

    abstract:BACKGROUND:A phylogeny postulates shared ancestry relationships among organisms in the form of a binary tree. Phylogenies attempt to answer an important question posed in biology: what are the ancestor-descendent relationships between organisms? At the core of every biological problem lies a phylogenetic component. The...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-14-66

    authors: Varón A,Wheeler WC

    update_date:2013-02-26 00:00:00

  • Components of the antigen processing and presentation pathway revealed by gene expression microarray analysis following B cell antigen receptor (BCR) stimulation.

    abstract:BACKGROUND:Activation of naïve B lymphocytes by extracellular ligands, e.g. antigen, lipopolysaccharide (LPS) and CD40 ligand, induces a combination of common and ligand-specific phenotypic changes through complex signal transduction pathways. For example, although all three of these ligands induce proliferation, only ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-7-237

    authors: Lee JA,Sinkovits RS,Mock D,Rab EL,Cai J,Yang P,Saunders B,Hsueh RC,Choi S,Subramaniam S,Scheuermann RH,Alliance for Cellular Signaling.

    update_date:2006-05-02 00:00:00

  • FANTOM: Functional and taxonomic analysis of metagenomes.

    abstract:BACKGROUND:Interpretation of quantitative metagenomics data is important for our understanding of ecosystem functioning and assessing differences between various environmental samples. There is a need for an easy to use tool to explore the often complex metagenomics data in taxonomic and functional context. RESULTS:He...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-14-38

    authors: Sanli K,Karlsson FH,Nookaew I,Nielsen J

    update_date:2013-02-01 00:00:00

  • CorrelaGenes: a new tool for the interpretation of the human transcriptome.

    abstract:BACKGROUND:The amount of gene expression data available in public repositories has grown exponentially in the last years, now requiring new data mining tools to transform them in information easily accessible to biologists. RESULTS:By exploiting expression data publicly available in the Gene Expression Omnibus (GEO) d...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-15-S1-S6

    authors: Cremaschi P,Rovida S,Sacchi L,Lisa A,Calvi F,Montecucco A,Biamonti G,Bione S,Sacchi G

    update_date:2014-01-01 00:00:00

  • Colonyzer: automated quantification of micro-organism growth characteristics on solid agar.

    abstract:BACKGROUND:High-throughput screens comparing growth rates of arrays of distinct micro-organism cultures on solid agar are useful, rapid methods of quantifying genetic interactions. Growth rate is an informative phenotype which can be estimated by measuring cell densities at one or more times after inoculation. Precise ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-287

    authors: Lawless C,Wilkinson DJ,Young A,Addinall SG,Lydall DA

    update_date:2010-05-28 00:00:00