Structural and Sequence Similarity Makes a Significant Impact on Machine-Learning-Based Scoring Functions for Protein-Ligand Interactions.

Abstract:

The prediction of protein-ligand binding affinity has recently been improved remarkably by machine-learning-based scoring functions. For example, using a set of simple descriptors representing atomic distance counts, RF-Score improves the Pearson correlation coefficient to about 0.8 on the core set of the PDBbind 2007 database, which is significantly higher than the performance of any conventional scoring function on the same benchmark. A few studies have discussed the performance of machine-learning-based methods, but the reason for this improvement remains unclear. In this study, by systematically controlling the structural and sequence similarity between the training and test proteins of the PDBbind benchmark, we demonstrate that protein structural and sequence similarity has a significant impact on machine-learning-based methods. After removal of training proteins that are highly similar to the test proteins, as identified by structure alignment and sequence alignment, machine-learning-based methods trained on the new training sets no longer outperform the conventional scoring functions. In contrast, the performance of conventional functions like X-Score is relatively stable no matter what training data are used to fit the weights of its energy terms.
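The similarity-controlled protocol described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the dictionary-based similarity lookup, and the cutoff values are assumptions made for illustration. The similarity scores themselves would come from a structure or sequence alignment tool, which is not shown here.

```python
from math import sqrt


def filter_training_set(train_ids, test_ids, similarity, cutoff):
    """Keep training proteins whose similarity to every test protein is below cutoff.

    `similarity` is a hypothetical dict mapping (train_id, test_id) to a score
    in [0, 1]; missing pairs are treated as 0.0 (no detectable similarity).
    """
    kept = []
    for tr in train_ids:
        max_sim = max(
            (similarity.get((tr, te), 0.0) for te in test_ids), default=0.0
        )
        if max_sim < cutoff:
            kept.append(tr)
    return kept


def pearson(xs, ys):
    """Pearson correlation coefficient between predicted and measured affinities."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Toy data: three training proteins, two test proteins (all values invented).
toy_similarity = {
    ("a", "t1"): 0.95,
    ("b", "t1"): 0.30,
    ("b", "t2"): 0.50,
    ("c", "t2"): 0.20,
}
strict = filter_training_set(["a", "b", "c"], ["t1", "t2"], toy_similarity, 0.4)
loose = filter_training_set(["a", "b", "c"], ["t1", "t2"], toy_similarity, 0.9)
```

Under these assumptions, one would retrain the scoring function on each filtered training set and report `pearson` on the unchanged test set, so that any drop in correlation can be attributed to the removed similar proteins.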

journal_name

J Chem Inf Model

authors

Li Y,Yang J

doi

10.1021/acs.jcim.7b00049

subject

Has Abstract

pub_date

2017-04-24 00:00:00

pages

1007-1012

issue

4

eissn

1549-960X

issn

1549-9596

journal_volume

57

pub_type

Journal Article
  • Molecular simulations of aromatase reveal new insights into the mechanism of ligand binding.

    abstract::CYP19A1, also known as aromatase or estrogen synthetase, is the rate-limiting enzyme in the biosynthesis of estrogens from their corresponding androgens. Several clinically used breast cancer therapies target aromatase. In this work, explicitly solvated all-atom molecular dynamics simulations of aromatase with a model...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci400225w

    authors: Park J,Czapla L,Amaro RE

    update_date:2013-08-26 00:00:00

  • CRDOCK: an ultrafast multipurpose protein-ligand docking tool.

    abstract::An ultrafast docking and virtual screening program, CRDOCK, is presented that contains (1) a search engine that can use a variety of sampling methods and an initial energy evaluation function, (2) several energy minimization algorithms for fine tuning the binding poses, and (3) different scoring functions. This modula...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci300194a

    authors: Cortés Cabrera Á,Klett J,Dos Santos HG,Perona A,Gil-Redondo R,Francis SM,Priego EM,Gago F,Morreale A

    update_date:2012-08-27 00:00:00

  • Novel Consensus Architecture To Improve Performance of Large-Scale Multitask Deep Learning QSAR Models.

    abstract::Advances in the development of high-throughput screening and automated chemistry have rapidly accelerated the production of chemical and biological data, much of them freely accessible through literature aggregator services such as ChEMBL and PubChem. Here, we explore how to use this comprehensive mapping of chemical ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00526

    authors: Zakharov AV,Zhao T,Nguyen DT,Peryea T,Sheils T,Yasgar A,Huang R,Southall N,Simeonov A

    update_date:2019-11-25 00:00:00

  • Characterization of DNA primary sequences by a new similarity/diversity measure based on the partial ordering.

    abstract::The similarity/diversity measures play a fundamental role in library searching, virtual screening, and quantitative structure-activity relationship/quantitative structure-property relationship modeling as well as in genomics and proteomics. In this paper, a new similarity/diversity measure is proposed as a new approac...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci060099e

    authors: Todeschini R,Consonni V,Mauri A,Ballabio D

    update_date:2006-09-01 00:00:00

  • Virtual Screening with Generative Topographic Maps: How Many Maps Are Required?

    abstract::Universal generative topographic maps (GTMs) provide two-dimensional representations of chemical space selected for their "polypharmacological competence", that is, the ability to simultaneously represent meaningful activity and property landscapes, associated with many distinct targets and properties. Several such GT...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.8b00650

    authors: Casciuc I,Zabolotna Y,Horvath D,Marcou G,Bajorath J,Varnek A

    update_date:2019-01-28 00:00:00

  • Homology modeling and docking evaluation of aminergic G protein-coupled receptors.

    abstract::We report the development of homology models of dopamine (D(2), D(3), and D(4)), serotonin (5-HT(1B), 5-HT(2A), 5-HT(2B), and 5-HT(2C)), histamine (H(1)), and muscarinic (M(1)) receptors, based on the high-resolution structure of the beta(2)-adrenergic receptor. The homology models were built and refined using Prime. ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci900444q

    authors: McRobb FM,Capuano B,Crosby IT,Chalmers DK,Yuriev E

    update_date:2010-04-26 00:00:00

  • Leave-cluster-out cross-validation is appropriate for scoring functions derived from diverse protein data sets.

    abstract::With the emergence of large collections of protein-ligand complexes complemented by binding data, as found in PDBbind or BindingMOAD, new opportunities for parametrizing and evaluating scoring functions have arisen. With huge data collections available, it becomes feasible to fit scoring functions in a QSAR style, i.e...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci100264e

    authors: Kramer C,Gedeck P

    update_date:2010-11-22 00:00:00

  • Flux (1): a virtual synthesis scheme for fragment-based de novo design.

    abstract::It is demonstrated that the fragmentation of druglike molecules by applying simplistic pseudo-retrosynthesis results in a stock of chemically meaningful building blocks for de novo molecule generation. A stochastic search algorithm in conjunction with ligand-based similarity scoring (Flux: fragment-based ligand builde...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci0503560

    authors: Fechner U,Schneider G

    update_date:2006-03-01 00:00:00

  • Annotation of Allosteric Compounds to Enhance Bioactivity Modeling for Class A GPCRs.

    abstract::Proteins often have both orthosteric and allosteric binding sites. Endogenous ligands, such as hormones and neurotransmitters, bind to the orthosteric site, while synthetic ligands may bind to orthosteric or allosteric sites, which has become a focal point in drug discovery. Usually, such allosteric modulators bind to...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c00695

    authors: Burggraaff L,van Veen A,Lam CC,van Vlijmen HWT,IJzerman AP,van Westen GJP

    update_date:2020-10-26 00:00:00

  • Generalized topological indices. Modeling gas-phase rate coefficients of atmospheric relevance.

    abstract::We develop the idea that the use of ad hoc molecular descriptors in QSAR/QSPR studies is not an optimal solution. Instead, we propose to optimize these descriptors for the specific properties under study. In the case of topological indices (TIs) we propose the use of the generalized topological indices (GTIs), which a...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci600448b

    authors: Estrada E,Matamala AR

    update_date:2007-05-01 00:00:00

  • Insights on the facet specific adsorption of amino acids and peptides toward platinum.

    abstract::Engineering shape-controlled bionanomaterials requires comprehensive understanding of interactions between biomolecules and inorganic surfaces. We explore the origin of facet-selective binding of peptides adsorbed onto Pt(100) and Pt(111) crystallographic planes. Using molecular dynamics simulations, we show that upon...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci400630d

    authors: Ramakrishnan SK,Martin M,Cloitre T,Firlej L,Cuisinier FJ,Gergely C

    update_date:2013-12-23 00:00:00

  • Isomerization and Decomposition of 2-Methylfuran with External Forces.

    abstract::The primary goal of this project was to evaluate the performance of the Standard and Enforced Geometry Optimization (SEGO) method which we have recently developed. The SEGO method has been designed for an automatic location of multiple minima on the molecular Potential Energy Surface (PES), and its usefulness has been...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00352

    authors: Brzyska A,Woliński K

    update_date:2019-08-26 00:00:00

  • Symplectic molecular dynamics simulations on specially designed parallel computers.

    abstract::We have developed a computer program for molecular dynamics (MD) simulation that implements the Split Integration Symplectic Method (SISM) and is designed to run on specialized parallel computers. The MD integration is performed by the SISM, which analytically treats high-frequency vibrational motion and thus enables ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci050216q

    authors: Borstnik U,Janezic D

    update_date:2005-11-01 00:00:00

  • L-arginine binding to human inducible nitric oxide synthase: an antisymmetric funnel route toward isoform-specific inhibitors?

    abstract::Nitric oxide (NO) is an important signaling molecule produced by a family of enzymes called nitric oxide synthases (NOS). Because NO is involved in various pathological conditions, the development of potent and isoform-selective NOS inhibitors is an important challenge. In the present study, the dimer of oxygenase dom...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci100422v

    authors: Floquet N,Hernandez JF,Boucher JL,Martinez J

    update_date:2011-06-27 00:00:00

  • Role of halogen bonds in thyroid hormone receptor selectivity: pharmacophore-based 3D-QSSR studies.

    abstract::Most physiological effects of thyroid hormones are mediated by the two thyroid hormone receptor subtypes, TRalpha and TRbeta. Several pharmacological effects mediated by TRbeta might be beneficial in important medical conditions such as obesity, hypercholesterolemia and diabetes, and selective TRbeta activation may el...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci900316e

    authors: Valadares NF,Salum LB,Polikarpov I,Andricopulo AD,Garratt RC

    update_date:2009-11-01 00:00:00

  • Matched Molecular Series Analysis for ADME Property Prediction.

    abstract::Generation and prioritization of new molecules are the most central part of the drug design process. Matched molecular series analysis (MMSA) has recently been proposed as a formal approach that captures both of these key elements of design. In order to better understand the power of MMSA and its specific limitations,...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c00269

    authors: Awale M,Riniker S,Kramer C

    update_date:2020-06-22 00:00:00

  • Structure-based CoMFA as a predictive model - CYP2C9 inhibitors as a test case.

    abstract::In this study, we tried to establish a general scheme to create a model that could predict the affinity of small compounds to their target proteins. This scheme consists of a search for ligand-binding sites on a protein, a generation of bound conformations (poses) of ligands in each of the sites by docking, identifica...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci800313h

    authors: Yasuo K,Yamaotsu N,Gouda H,Tsujishita H,Hirono S

    update_date:2009-04-01 00:00:00

  • Molecular Dynamics Simulations of Membrane-Bound STIM1 to Investigate Conformational Changes during STIM1 Activation upon Calcium Release.

    abstract::Calcium is involved in important intracellular processes, such as intracellular signaling from cell membrane receptors to the nucleus. Typically, calcium levels are kept at less than 100 nM in the nucleus and cytosol, but some calcium is stored in the endoplasmic reticulum (ER) lumen for rapid release to activate intr...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.6b00475

    authors: Mukherjee S,Karolak A,Debant M,Buscaglia P,Renaudineau Y,Mignen O,Guida WC,Brooks WH

    update_date:2017-02-27 00:00:00

  • BCL::MolAlign: Three-Dimensional Small Molecule Alignment for Pharmacophore Mapping.

    abstract::Small molecule flexible alignment is a critical component of both ligand- and structure-based methods in computer-aided drug discovery. Despite its importance, the availability of high-quality flexible alignment software packages is limited. Here, we present BCL::MolAlign, a freely available property-based molecular a...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00020

    authors: Brown BP,Mendenhall J,Meiler J

    update_date:2019-02-25 00:00:00

  • Development of novel statistical potentials describing cation-pi interactions in proteins and comparison with semiempirical and quantum chemistry approaches.

    abstract::Novel statistical potentials derived from known protein structures are presented. They are designed to describe cation-pi and amino-pi interactions between a positively charged amino acid or an amino acid carrying a partially charged amino group and an aromatic moiety. These potentials are based on the propensity of r...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci050395b

    authors: Gilis D,Biot C,Buisine E,Dehouck Y,Rooman M

    update_date:2006-03-01 00:00:00

  • QM/MM calculations in drug discovery: a useful method for studying binding phenomena?

    abstract::Herein we investigate whether QM/MM could prove useful as a tool to study the often subtle binding phenomena found within pharmaceutical drug discovery programs. The goal of this investigation is to determine whether it is possible to employ high level QM/MM calculations to answer specific questions around a binding e...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci800419j

    authors: Gleeson MP,Gleeson D

    update_date:2009-03-01 00:00:00

  • Evaluating QM/MM Free Energy Surfaces for Ranking Cysteine Protease Covalent Inhibitors.

    abstract::One tactic for cysteine protease inhibition is to form a covalent bond between an electrophilic atom of the inhibitor and the thiol of the catalytic cysteine. In this study, we evaluate the reaction free energy obtained from a hybrid quantum mechanical/molecular mechanical (QM/MM) free energy profile as a predictor of...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00847

    authors: da Costa CHS,Bonatto V,Dos Santos AM,Lameira J,Leitão A,Montanari CA

    update_date:2020-02-24 00:00:00

  • Benchmark performance of MultiCASE Inc. software in Ames mutagenicity set.

    abstract::The predictive performances of MC4PC were evaluated using its learning machine functionality. Its superior characteristics are demonstrated in this following up study using the newly published Ames mutagenicity benchmark set. ...

    journal_title:Journal of chemical information and modeling

    pub_type: Comment, Letter

    doi:10.1021/ci1000899

    authors: Saiakhov RD,Klopman G

    update_date:2010-09-27 00:00:00

  • Prediction of molecular solvation free energy based on the optimization of atomic solvation parameters with genetic algorithm.

    abstract::We propose an improved solvent contact model to estimate the solvation free energy of an organic molecule from individual atomic contributions. The modification of the solvation model involves the optimization of three kinds of parameters in the solvation free energy function: atomic fragmental volume, maximum atomic ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci600453b

    authors: Kang H,Choi H,Park H

    update_date:2007-03-01 00:00:00

  • GESSE: Predicting Drug Side Effects from Drug-Target Relationships.

    abstract::The in silico prediction of unwanted side effects (SEs) caused by the promiscuous behavior of drugs and their targets is highly relevant to the pharmaceutical industry. Considerable effort is now being put into computational and experimental screening of several suspected off-target proteins in the hope that SEs might...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.5b00120

    authors: Pérez-Nueno VI,Souchet M,Karaboga AS,Ritchie DW

    update_date:2015-09-28 00:00:00

  • Random forest models to predict aqueous solubility.

    abstract::Random Forest regression (RF), Partial-Least-Squares (PLS) regression, Support Vector Machines (SVM), and Artificial Neural Networks (ANN) were used to develop QSPR models for the prediction of aqueous solubility, based on experimental data for 988 organic molecules. The Random Forest regression model predicted aqueou...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci060164k

    authors: Palmer DS,O'Boyle NM,Glen RC,Mitchell JB

    update_date:2007-01-01 00:00:00

  • Algorithm for reaction classification.

    abstract::Reaction classification has important applications, and many approaches to classification have been applied. Our own algorithm tests all maximum common substructures (MCS) between all reactant and product molecules in order to find an atom mapping containing the minimum chemical distance (MCD). Recent publications hav...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci400442f

    authors: Kraut H,Eiblmaier J,Grethe G,Löw P,Matuszczyk H,Saller H

    update_date:2013-11-25 00:00:00

  • The ensemble performance index: an improved measure for assessing ensemble pose prediction performance.

    abstract::We present a theoretical study on the performance of ensemble docking methodologies considering multiple protein structures. We perform a theoretical analysis of pose prediction experiments which is completely unbiased, as we make no assumptions about specific scoring functions, search paradigms, protein structures, o...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci2002796

    authors: Korb O,McCabe P,Cole J

    update_date:2011-11-28 00:00:00

  • Use of 3D QSAR models for database screening: a feasibility study.

    abstract::The applicability and scope of 3D QSAR methods (CoMFA, CoMSIA) to screen databases are examined. A protocol requiring minimal user intervention has been established to align training and test set molecules using FlexS. As model system isozymes of human carbonic anhydrase (hCA) are used, all results are exemplified stu...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci7002945

    authors: Hillebrecht A,Klebe G

    update_date:2008-02-01 00:00:00

  • Structure-Based Kinase Profiling To Understand the Polypharmacological Behavior of Therapeutic Molecules.

    abstract::Several drugs elicit their therapeutic efficacy by modulating multiple cellular targets and possess varied polypharmacological actions. The identification of the molecular targets of a potent bioactive molecule is essential in determining its overall polypharmacological profile. Experimental procedures are expensive a...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00227

    authors: Dutta D,Das R,Mandal C,Mandal C

    update_date:2018-01-22 00:00:00