Using the multi-objective optimization replica exchange Monte Carlo enhanced sampling method for protein-small molecule docking.

Abstract:

BACKGROUND: In this study, we extended the replica exchange Monte Carlo (REMC) sampling method to protein-small molecule docking conformational prediction using RosettaLigand. In contrast to the traditional Monte Carlo (MC) and REMC sampling methods, the proposed methods use Pareto front information from multi-objective optimization to guide the selection of replicas for exchange. RESULTS: Using Pareto front information to select lower-energy conformations as representative replica structures can facilitate convergence over the available conformational space, including available near-native structures. Furthermore, our approach directly provides min-min scenario Pareto optimal solutions, as well as a hybrid of the min-min and max-min scenario Pareto optimal solutions with lower-energy conformations, for use as structure templates in the REMC sampling method. These methods were validated through a thorough analysis of a benchmark data set containing 16 test cases. An in-depth comparison of the MC, REMC, multi-objective optimization-REMC (MO-REMC), and hybrid MO-REMC (HMO-REMC) sampling methods was performed to illustrate the differences between the four conformational search strategies. CONCLUSIONS: Our findings demonstrate that the MO-REMC and HMO-REMC conformational sampling methods are powerful approaches for obtaining protein-small molecule docking conformational predictions based on the binding energy of complexes in RosettaLigand.
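
The two ingredients named in the abstract, a min-min Pareto front and a replica exchange step, can be illustrated with a minimal Python sketch. This is not the authors' implementation and does not call RosettaLigand. The two objectives are hypothetical score terms, the function names are invented for illustration, and the exchange rule is the standard textbook REMC Metropolis criterion, assumed here rather than taken from the paper.

```python
import math


def pareto_front_min_min(points):
    """Return the points that no other point dominates when BOTH
    objectives are minimized (the "min-min" scenario).

    Each point is a (objective_1, objective_2) tuple; the objectives are
    hypothetical score terms, not the paper's actual energy components.
    """
    front = []
    for i, p in enumerate(points):
        dominated = any(
            q[0] <= p[0] and q[1] <= p[1] and (q[0] < p[0] or q[1] < p[1])
            for j, q in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(p)
    return front


def exchange_probability(energy_i, energy_j, temp_i, temp_j, k_b=1.0):
    """Textbook replica exchange acceptance probability:
    min(1, exp[(beta_i - beta_j) * (E_i - E_j)]), with beta = 1 / (k_b * T).
    Assumed here for illustration; the paper may use a different rule.
    """
    delta = (1.0 / (k_b * temp_i) - 1.0 / (k_b * temp_j)) * (energy_i - energy_j)
    return 1.0 if delta >= 0 else math.exp(delta)


if __name__ == "__main__":
    # Hypothetical two-objective scores for four candidate conformations.
    scores = [(1.0, 5.0), (2.0, 2.0), (3.0, 3.0), (5.0, 1.0)]
    print(pareto_front_min_min(scores))
    # Swap attempt between a cold replica (T=1) and a hot replica (T=2).
    print(exchange_probability(-10.0, -12.0, 1.0, 2.0))
```

In this sketch, the non-dominated conformations would serve as candidate replicas, and the exchange probability decides whether two replicas at neighboring temperatures swap configurations.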

journal_name

BMC Bioinformatics

journal_title

BMC bioinformatics

authors

Wang H,Liu H,Cai L,Wang C,Lv Q

doi

10.1186/s12859-017-1733-6

subject

Has Abstract

pub_date

2017-07-10 00:00:00

pages

327

issue

1

issn

1471-2105

pii

10.1186/s12859-017-1733-6

journal_volume

18

pub_type

Journal Article
  • Combining techniques for screening and evaluating interaction terms on high-dimensional time-to-event data.

    abstract:BACKGROUND:Molecular data, e.g. arising from microarray technology, is often used for predicting survival probabilities of patients. For multivariate risk prediction models on such high-dimensional data, there are established techniques that combine parameter estimation and variable selection. One big challenge is to i...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-15-58

    authors: Sariyar M,Hoffmann I,Binder H

    update_date:2014-02-26 00:00:00

  • MergeAlign: improving multiple sequence alignment performance by dynamic reconstruction of consensus multiple sequence alignments.

    abstract:BACKGROUND:The generation of multiple sequence alignments (MSAs) is a crucial step for many bioinformatic analyses. Thus improving MSA accuracy and identifying potential errors in MSAs is important for a wide range of post-genomic research. We present a novel method called MergeAlign which constructs consensus MSAs fro...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-13-117

    authors: Collingridge PW,Kelly S

    update_date:2012-05-30 00:00:00

  • Low degree metabolites explain essential reactions and enhance modularity in biological networks.

    abstract:BACKGROUND:Recently there has been a lot of interest in identifying modules at the level of genetic and metabolic networks of organisms, as well as in identifying single genes and reactions that are essential for the organism. A goal of computational and systems biology is to go beyond identification towards an explana...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-7-118

    authors: Samal A,Singh S,Giri V,Krishna S,Raghuram N,Jain S

    update_date:2006-03-08 00:00:00

  • Pripper: prediction of caspase cleavage sites from whole proteomes.

    abstract:BACKGROUND:Caspases are a family of proteases that have central functions in programmed cell death (apoptosis) and inflammation. Caspases mediate their effects through aspartate-specific cleavage of their target proteins, and at present almost 400 caspase substrates are known. There are several methods developed to pre...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-320

    authors: Piippo M,Lietzén N,Nevalainen OS,Salmi J,Nyman TA

    update_date:2010-06-15 00:00:00

  • DeepCryoPicker: fully automated deep neural network for single protein particle picking in cryo-EM.

    abstract:BACKGROUND:Cryo-electron microscopy (Cryo-EM) is widely used in the determination of the three-dimensional (3D) structures of macromolecules. Particle picking from 2D micrographs remains a challenging early step in the Cryo-EM pipeline due to the diversity of particle shapes and the extremely low signal-to-noise ratio ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03809-7

    authors: Al-Azzawi A,Ouadou A,Max H,Duan Y,Tanner JJ,Cheng J

    update_date:2020-11-09 00:00:00

  • phyloXML: XML for evolutionary biology and comparative genomics.

    abstract:BACKGROUND:Evolutionary trees are central to a wide range of biological studies. In many of these studies, tree nodes and branches need to be associated (or annotated) with various attributes. For example, in studies concerned with organismal relationships, tree nodes are associated with taxonomic names, whereas tree b...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-10-356

    authors: Han MV,Zmasek CM

    update_date:2009-10-27 00:00:00

  • Evolutionary Pareto-optimization of stably folding peptides.

    abstract:BACKGROUND:As a rule, peptides are more flexible and unstructured than proteins with their substantial stabilizing hydrophobic cores. Nevertheless, a few stably folding peptides have been discovered. This raises the question whether there may be more such peptides that are unknown as yet. These molecules could be helpf...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-109

    authors: Gronwald W,Hohm T,Hoffmann D

    update_date:2008-02-19 00:00:00

  • Computational identification of ubiquitylation sites from protein sequences.

    abstract:BACKGROUND:Ubiquitylation plays an important role in regulating protein functions. Recently, experimental methods were developed toward effective identification of ubiquitylation sites. To efficiently explore more undiscovered ubiquitylation sites, this study aims to develop an accurate sequence-based prediction method...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-310

    authors: Tung CW,Ho SY

    update_date:2008-07-15 00:00:00

  • Methodology capture: discriminating between the "best" and the rest of community practice.

    abstract:BACKGROUND:The methodologies we use both enable and help define our research. However, as experimental complexity has increased the choice of appropriate methodologies has become an increasingly difficult task. This makes it difficult to keep track of available bioinformatics software, let alone the most suitable proto...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-359

    authors: Eales JM,Pinney JW,Stevens RD,Robertson DL

    update_date:2008-09-01 00:00:00

  • Detecting broad domains and narrow peaks in ChIP-seq data with hiddenDomains.

    abstract:BACKGROUND:Correctly identifying genomic regions enriched with histone modifications and transcription factors is key to understanding their regulatory and developmental roles. Conceptually, these regions are divided into two categories, narrow peaks and broad domains, and different algorithms are used to identify each...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-016-0991-z

    authors: Starmer J,Magnuson T

    update_date:2016-03-24 00:00:00

  • The textual characteristics of traditional and Open Access scientific journals are similar.

    abstract:BACKGROUND:Recent years have seen an increased amount of natural language processing (NLP) work on full text biomedical journal publications. Much of this work is done with Open Access journal articles. Such work assumes that Open Access articles are representative of biomedical publications in general and that methods...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-10-183

    authors: Verspoor K,Cohen KB,Hunter L

    update_date:2009-06-15 00:00:00

  • KRLMM: an adaptive genotype calling method for common and low frequency variants.

    abstract:BACKGROUND:SNP genotyping microarrays have revolutionized the study of complex disease. The current range of commercially available genotyping products contain extensive catalogues of low frequency and rare variants. Existing SNP calling algorithms have difficulty dealing with these low frequency variants, as the under...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-15-158

    authors: Liu R,Dai Z,Yeager M,Irizarry RA,Ritchie ME

    update_date:2014-05-23 00:00:00

  • Meta-aligner: long-read alignment based on genome statistics.

    abstract:BACKGROUND:Current development of sequencing technologies is towards generating longer and noisier reads. Evidently, accurate alignment of these reads play an important role in any downstream analysis. Similarly, reducing the overall cost of sequencing is related to the time consumption of the aligner. The tradeoff bet...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-017-1518-y

    authors: Nashta-Ali D,Aliyari A,Ahmadian Moghadam A,Edrisi MA,Motahari SA,Hossein Khalaj B

    update_date:2017-02-23 00:00:00

  • A database of phylogenetically atypical genes in archaeal and bacterial genomes, identified using the DarkHorse algorithm.

    abstract:BACKGROUND:The process of horizontal gene transfer (HGT) is believed to be widespread in Bacteria and Archaea, but little comparative data is available addressing its occurrence in complete microbial genomes. Collection of high-quality, automated HGT prediction data based on phylogenetic evidence has previously been im...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-419

    authors: Podell S,Gaasterland T,Allen EE

    update_date:2008-10-07 00:00:00

  • Reranking candidate gene models with cross-species comparison for improved gene prediction.

    abstract:BACKGROUND:Most gene finders score candidate gene models with state-based methods, typically HMMs, by combining local properties (coding potential, splice donor and acceptor patterns, etc). Competing models with similar state-based scores may be distinguishable with additional information. In particular, functional and...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-433

    authors: Liu Q,Crammer K,Pereira FC,Roos DS

    update_date:2008-10-14 00:00:00

  • Assessing and predicting protein interactions by combining manifold embedding with multiple information integration.

    abstract:BACKGROUND:Protein-protein interactions (PPIs) play crucial roles in virtually every aspect of cellular function within an organism. Over the last decade, the development of novel high-throughput techniques has resulted in enormous amounts of data and provided valuable resources for studying protein interactions. Howev...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-13-S7-S3

    authors: Lei YK,You ZH,Ji Z,Zhu L,Huang DS

    update_date:2012-05-08 00:00:00

  • Predicting and improving the protein sequence alignment quality by support vector regression.

    abstract:BACKGROUND:For successful protein structure prediction by comparative modeling, in addition to identifying a good template protein with known structure, obtaining an accurate sequence alignment between a query protein and a template protein is critical. It has been known that the alignment accuracy can vary significant...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-8-471

    authors: Lee M,Jeong CS,Kim D

    update_date:2007-12-03 00:00:00

  • Robust pathway sampling in phenotype prediction. Application to triple negative breast cancer.

    abstract:BACKGROUND:Phenotype prediction problems are usually considered ill-posed, as the amount of samples is very limited with respect to the scrutinized genetic probes. This fact complicates the sampling of the defective genetic pathways due to the high number of possible discriminatory genetic networks involved. In this re...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-3356-6

    authors: Cernea A,Fernández-Martínez JL,deAndrés-Galiana EJ,Fernández-Ovies FJ,Alvarez-Machancoses O,Fernández-Muñiz Z,Saligan LN,Sonis ST

    update_date:2020-03-11 00:00:00

  • HMM Logos for visualization of protein families.

    abstract:BACKGROUND:Profile Hidden Markov Models (pHMMs) are a widely used tool for protein family research. Up to now, however, there exists no method to visualize all of their central aspects graphically in an intuitively understandable way. RESULTS:We present a visualization method that incorporates both emission and transi...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-5-7

    authors: Schuster-Böckler B,Schultz J,Rahmann S

    update_date:2004-01-21 00:00:00

  • BINDER: computationally inferring a gene regulatory network for Mycobacterium abscessus.

    abstract:BACKGROUND:Although many of the genic features in Mycobacterium abscessus have been fully validated, a comprehensive understanding of the regulatory elements remains lacking. Moreover, there is little understanding of how the organism regulates its transcriptomic profile, enabling cells to survive in hostile environmen...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-3042-8

    authors: Staunton PM,Miranda-CasoLuengo AA,Loftus BJ,Gormley IC

    update_date:2019-09-10 00:00:00

  • CLU: a new algorithm for EST clustering.

    abstract:BACKGROUND:The continuous flow of EST data remains one of the richest sources for discoveries in modern biology. The first step in EST data mining is usually associated with EST clustering, the process of grouping of original fragments according to their annotation, similarity to known genomic DNA or each other. Cluste...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-6-S2-S3

    authors: Ptitsyn A,Hide W

    update_date:2005-07-15 00:00:00

  • EGNAS: an exhaustive DNA sequence design algorithm.

    abstract:BACKGROUND:The molecular recognition based on the complementary base pairing of deoxyribonucleic acid (DNA) is the fundamental principle in the fields of genetics, DNA nanotechnology and DNA computing. We present an exhaustive DNA sequence design algorithm that allows to generate sets containing a maximum number of seq...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-13-138

    authors: Kick A,Bönsch M,Mertig M

    update_date:2012-06-20 00:00:00

  • QPath: a method for querying pathways in a protein-protein interaction network.

    abstract:BACKGROUND:Sequence comparison is one of the most prominent tools in biological research, and is instrumental in studying gene function and evolution. The rapid development of high-throughput technologies for measuring protein interactions calls for extending this fundamental operation to the level of pathways in prote...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-7-199

    authors: Shlomi T,Segal D,Ruppin E,Sharan R

    update_date:2006-04-10 00:00:00

  • Application of the common base method to regression and analysis of covariance (ANCOVA) in qPCR experiments and subsequent relative expression calculation.

    abstract:BACKGROUND:Quantitative polymerase chain reaction (qPCR) is the technique of choice for quantifying gene expression. While the technique itself is well established, approaches for the analysis of qPCR data continue to improve. RESULTS:Here we expand on the common base method to develop procedures for testing linear re...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03696-y

    authors: Ganger MT,Dietz GD,Headley P,Ewing SJ

    update_date:2020-09-29 00:00:00

  • ICEKAT: an interactive online tool for calculating initial rates from continuous enzyme kinetic traces.

    abstract:BACKGROUND:Continuous enzyme kinetic assays are often used in high-throughput applications, as they allow rapid acquisition of large amounts of kinetic data and increased confidence compared to discontinuous assays. However, data analysis is often rate-limiting in high-throughput enzyme assays, as manual inspection and...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-3513-y

    authors: Olp MD,Kalous KS,Smith BC

    update_date:2020-05-14 00:00:00

  • Web-TCGA: an online platform for integrated analysis of molecular cancer data sets.

    abstract:BACKGROUND:The Cancer Genome Atlas (TCGA) is a pool of molecular data sets publicly accessible and freely available to cancer researchers anywhere around the world. However, wide spread use is limited since an advanced knowledge of statistics and statistical software is required. RESULTS:In order to improve accessibil...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-016-0917-9

    authors: Deng M,Brägelmann J,Schultze JL,Perner S

    update_date:2016-02-06 00:00:00

  • Venn-diaNet : venn diagram based network propagation analysis framework for comparing multiple biological experiments.

    abstract:BACKGROUND:The main research topic in this paper is how to compare multiple biological experiments using transcriptome data, where each experiment is measured and designed to compare control and treated samples. Comparison of multiple biological experiments is usually performed in terms of the number of DEGs in an arbi...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-3302-7

    authors: Hur B,Kang D,Lee S,Moon JH,Lee G,Kim S

    update_date:2019-12-27 00:00:00

  • Integrating multiple protein-protein interaction networks to prioritize disease genes: a Bayesian regression approach.

    abstract:BACKGROUND:The identification of genes responsible for human inherited diseases is one of the most challenging tasks in human genetics. Recent studies based on phenotype similarity and gene proximity have demonstrated great success in prioritizing candidate genes for human diseases. However, most of these methods rely ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-12-S1-S11

    authors: Zhang W,Sun F,Jiang R

    update_date:2011-02-15 00:00:00

  • SitesIdentify: a protein functional site prediction tool.

    abstract:BACKGROUND:The rate of protein structures being deposited in the Protein Data Bank surpasses the capacity to experimentally characterise them and therefore computational methods to analyse these structures have become increasingly important. Identifying the region of the protein most likely to be involved in function i...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-10-379

    authors: Bray T,Chan P,Bougouffa S,Greaves R,Doig AJ,Warwicker J

    update_date:2009-11-18 00:00:00

  • Classification of viral zoonosis through receptor pattern analysis.

    abstract:BACKGROUND:Viral zoonosis, the transmission of a virus from its primary vertebrate reservoir species to humans, requires ubiquitous cellular proteins known as receptor proteins. Zoonosis can occur not only through direct transmission from vertebrates to humans, but also through intermediate reservoirs or other environm...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-12-96

    authors: Bae SE,Son HS

    update_date:2011-04-13 00:00:00