Structural and Sequence Similarity Makes a Significant Impact on Machine-Learning-Based Scoring Functions for Protein-Ligand Interactions.

Abstract:

The prediction of protein-ligand binding affinity has recently been improved remarkably by machine-learning-based scoring functions. For example, using a set of simple descriptors representing atomic distance counts, RF-Score improves the Pearson correlation coefficient to about 0.8 on the core set of the PDBbind 2007 database, which is significantly higher than the performance of any conventional scoring function on the same benchmark. A few studies have discussed the performance of machine-learning-based methods, but the reason for this improvement remains unclear. In this study, by systematically controlling the structural and sequence similarity between the training and test proteins of the PDBbind benchmark, we demonstrate that protein structural and sequence similarity has a significant impact on machine-learning-based methods. After removal of the training proteins that are highly similar to the test proteins, as identified by structure alignment and sequence alignment, machine-learning-based methods trained on the new training sets no longer outperform the conventional scoring functions. In contrast, the performance of conventional functions like X-Score is relatively stable no matter what training data are used to fit the weights of its energy terms.
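The filtering step described in the abstract can be illustrated with a minimal sketch. It assumes a precomputed similarity matrix with one row per training protein and one column per test protein; the function names, the toy values, and the 0.5 cutoff are hypothetical and are not taken from the paper. The Pearson helper shows the metric the abstract reports, not the authors' evaluation code.

```python
from math import sqrt


def filter_training_set(similarity, cutoff):
    """Return indices of training proteins to keep.

    similarity[i][j] is an assumed, precomputed similarity score between
    training protein i and test protein j. A training protein is kept only
    if its highest similarity to any test protein does not exceed cutoff.
    """
    return [i for i, row in enumerate(similarity) if max(row) <= cutoff]


def pearson_r(predicted, measured):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    sd_p = sqrt(sum((p - mean_p) ** 2 for p in predicted))
    sd_m = sqrt(sum((m - mean_m) ** 2 for m in measured))
    return cov / (sd_p * sd_m)


# Toy data (hypothetical): 4 training proteins, 2 test proteins.
similarity = [
    [0.95, 0.10],  # near-duplicate of test protein 0, removed
    [0.20, 0.30],  # kept
    [0.40, 0.99],  # near-duplicate of test protein 1, removed
    [0.05, 0.15],  # kept
]
kept = filter_training_set(similarity, cutoff=0.5)
```

In such a setup, a scoring function would be retrained on the `kept` subset only, and `pearson_r` would then be computed between its predictions and the measured affinities of the test set, repeating the procedure for several cutoffs.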

journal_name

J Chem Inf Model

authors

Li Y,Yang J

doi

10.1021/acs.jcim.7b00049

pub_date

2017-04-24 00:00:00

pages

1007-1012

issue

4

issn

1549-9596

eissn

1549-960X

journal_volume

57

pub_type

Journal Article
  • Searching for recursively defined generic chemical patterns in nonenumerated fragment spaces.

    abstract::Retrieving molecules with specific structural features is a fundamental requirement of today's molecular database technologies. Estimates claim the chemical space relevant for drug discovery to be around 10⁶⁰ molecules. This figure is many orders of magnitude larger than the amount of molecules conventional databases ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci400107k

    authors: Ehrlich HC,Henzler AM,Rarey M

    update_date:2013-07-22 00:00:00

  • Toward high throughput 3D virtual screening using spherical harmonic surface representations.

    abstract::Searching chemical databases for possible drug leads is often one of the main activities conducted during the early stages of a drug development project. This article shows that spherical harmonic molecular shape representations provide a powerful way to search and cluster small-molecule databases rapidly and accurate...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci7001507

    authors: Mavridis L,Hudson BD,Ritchie DW

    update_date:2007-09-01 00:00:00

  • Criterion for evaluating the predictive ability of nonlinear regression models without cross-validation.

    abstract::We propose predictive performance criteria for nonlinear regression models without cross-validation. The proposed criteria are the determination coefficient and the root-mean-square error for the midpoints between k-nearest-neighbor data points. These criteria can be used to evaluate predictive ability after the regre...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci4003766

    authors: Kaneko H,Funatsu K

    update_date:2013-09-23 00:00:00

  • Facile Solutions to the Problems Associated with Chemical Information and Mathematical Symbolism While Using Machine Translation Tools.

    abstract::Advances in computer-aided translation technology have made tremendous progress in accuracy in the past few years. Chemical Abstracts Service of the American Chemical Society summarizes scientific works from more than 50 languages and allows the users to search papers in nine selected languages. Currently, only the ab...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c00274

    authors: Wahab MF,Zulfiqar S,Sarwar MI,Lieberwirth I

    update_date:2020-07-27 00:00:00

  • The valence state combination model: a generic framework for handling tautomers and protonation states.

    abstract::The consistent handling of molecules is probably the most basic and important requirement in the field of cheminformatics. Reliable results can only be obtained if the underlying calculations are independent of the specific way molecules are represented in the input data. However, ensuring consistency is a complex tas...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci400724v

    authors: Urbaczek S,Kolodzik A,Rarey M

    update_date:2014-03-24 00:00:00

  • LiCABEDS II. Modeling of ligand selectivity for G-protein-coupled cannabinoid receptors.

    abstract::The cannabinoid receptor subtype 2 (CB2) is a promising therapeutic target for blood cancer, pain relief, osteoporosis, and immune system disease. The recent withdrawal of Rimonabant, which targets another closely related cannabinoid receptor (CB1), accentuates the importance of selectivity for the development of CB2 ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci3003914

    authors: Ma C,Wang L,Yang P,Myint KZ,Xie XQ

    update_date:2013-01-28 00:00:00

  • Effect of structural stress on the flexibility and adaptability of HIV-1 protease.

    abstract::Resistance remains a major issue with regards to HIV-1 protease, despite the availability of numerous HIV-1 protease inhibitors and copious amounts of structural and binding data. In an effort to improve our understanding of how HIV-1 protease is able to "outsmart" new drugs, we have investigated the flexibility of HI...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci2000677

    authors: Oehme DP,Wilson DJ,Brownlee RT

    update_date:2011-05-23 00:00:00

  • Getting Docking into Shape Using Negative Image-Based Rescoring.

    abstract::The failure of default scoring functions to ensure virtual screening enrichment is a persistent problem for the molecular docking algorithms used in structure-based drug discovery. To remedy this problem, elaborate rescoring and postprocessing schemes have been developed with a varying degree of success, specificity, ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00383

    authors: Kurkinen ST,Lätti S,Pentikäinen OT,Postila PA

    update_date:2019-08-26 00:00:00

  • Exploring Topological Pharmacophore Graphs for Scaffold Hopping.

    abstract::The primary goal of ligand-based virtual screening is to identify active compounds consisting of a core scaffold that is not found in the current active compound pool. Scaffold hopping is the term used for this purpose. In the present study, topological representations of pharmacophore features on chemical graphs were...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c00098

    authors: Nakano H,Miyao T,Funatsu K

    update_date:2020-04-27 00:00:00

  • Structural basis for the mutation-induced dysfunction of human CYP2J2: a computational study.

    abstract::Arachidonic acid is an essential fatty acid in cells, acting as a key inflammatory intermediate in inflammatory reactions. In cardiac tissues, CYP2J2 can adopt arachidonic acid as a major substrate to produce epoxyeicosatrienoic acids (EETs), which can protect endothelial cells from ischemic or hypoxic injuries and ha...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci400003p

    authors: Cong S,Ma XT,Li YX,Wang JF

    update_date:2013-06-24 00:00:00

  • Protein kinases: docking and homology modeling reliability.

    abstract::A database of about 700 high-resolution kinase structures was used to test the reliability of 17 docking procedures (using six docking software packages) by means of self- and cross-docking studies. The analysis of about 80 000 docking calculations suggests that the docking of an unknown ligand into a kinase has a pro...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci100161z

    authors: Tuccinardi T,Botta M,Giordano A,Martinelli A

    update_date:2010-08-23 00:00:00

  • Get Your Atoms in Order--An Open-Source Implementation of a Novel and Robust Molecular Canonicalization Algorithm.

    abstract::Finding a canonical ordering of the atoms in a molecule is a prerequisite for generating a unique representation of the molecule. The canonicalization of a molecule is usually accomplished by applying some sort of graph relaxation algorithm, the most common of which is the Morgan algorithm. There are known issues with...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.5b00543

    authors: Schneider N,Sayle RA,Landrum GA

    update_date:2015-10-26 00:00:00

  • Determination of Structural Ensembles of Flexible Molecules in Solution from NMR Data Undergoing Spin Diffusion.

    abstract::Spin diffusion is a formidable problem when interpreting NMR data of chemical compounds. We developed a method to reconstruct the conformational ensemble of flexible molecules displaying spin diffusion, which minimizes the subjective bias in the interpretation of experimental data and which can be used routinely to ob...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00259

    authors: Vasile F,Tiana G

    update_date:2019-06-24 00:00:00

  • VMD Store-A VMD Plugin to Browse, Discover, and Install VMD Extensions.

    abstract::Herein we present the VMD Store, an open-source VMD plugin that simplifies the way that users browse, discover, install, update, and uninstall extensions for the Visual Molecular Dynamics (VMD) software. The VMD Store obtains data about all the indexed VMD extensions hosted on GitHub and presents a one-click mechanism...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00739

    authors: Fernandes HS,Sousa SF,Cerqueira NMFSA

    update_date:2019-11-25 00:00:00

  • Whole-molecule calculation of log p based on molar volume, hydrogen bonds, and simulated 13C NMR spectra.

    abstract::The prediction of Log P is usually accomplished using either substructure or whole-molecule approaches. However, these methods are complicated, and previous whole-molecule approaches have not been successful for the prediction of Log P in very complex molecules. The observed chemical shifts in nuclear magnetic resonan...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci049643e

    authors: Schnackenberg LK,Beger RD

    update_date:2005-03-01 00:00:00

  • CRDOCK: an ultrafast multipurpose protein-ligand docking tool.

    abstract::An ultrafast docking and virtual screening program, CRDOCK, is presented that contains (1) a search engine that can use a variety of sampling methods and an initial energy evaluation function, (2) several energy minimization algorithms for fine tuning the binding poses, and (3) different scoring functions. This modula...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci300194a

    authors: Cortés Cabrera Á,Klett J,Dos Santos HG,Perona A,Gil-Redondo R,Francis SM,Priego EM,Gago F,Morreale A

    update_date:2012-08-27 00:00:00

  • Growth of ligand-target interaction data in ChEMBL is associated with increasing and activity measurement-dependent compound promiscuity.

    abstract::Compounds with high-confidence target annotations and activity measurements in the original and current release of the ChEMBL database have been compared to better understand how the growth of compound activity data might influence the spectrum of ligand-target interactions and the degree of target promiscuity among a...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci3003304

    authors: Hu Y,Bajorath J

    update_date:2012-10-22 00:00:00

  • Ensemble docking into multiple crystallographically derived protein structures: an evaluation based on the statistical analysis of enrichments.

    abstract::Docking into multiple receptor conformations ("ensemble docking") has been proposed, and employed, in the hope that it may account for receptor flexibility in virtual screening and thus provide higher enrichments than docking into single rigid receptor structures. The statistical analyses presented in this paper provi...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci900407c

    authors: Craig IR,Essex JW,Spiegel K

    update_date:2010-04-26 00:00:00

  • Training a scoring function for the alignment of small molecules.

    abstract::A comprehensive data set of aligned ligands with highly similar binding pockets from the Protein Data Bank has been built. Based on this data set, a scoring function for recognizing good alignment poses for small molecules has been developed. This function is based on atoms and hydrogen-bond projected features. The co...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci100227h

    authors: Chan SL,Labute P

    update_date:2010-09-27 00:00:00

  • iUmami-SCM: A Novel Sequence-Based Predictor for Prediction and Analysis of Umami Peptides Using a Scoring Card Method with Propensity Scores of Dipeptides.

    abstract::Umami or the taste of monosodium glutamate represents one of the major attractive taste modalities in humans. Therefore, knowledge about biophysical and biochemical properties of the umami taste is important for both scientific research and the food industry. Experimental approaches for predicting umami peptides are l...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c00707

    authors: Charoenkwan P,Yana J,Nantasenamat C,Hasan MM,Shoombuatong W

    update_date:2020-12-28 00:00:00

  • Comparative analysis of binding energy of chymostatin with human cathepsin A and its homologous proteins by molecular orbital calculation.

    abstract::Cathepsin A is a mammalian lysosomal enzyme that catalyzes the hydrolysis of the carboxy-terminal amino acids of polypeptides and also regulates beta-galactosidase and neuraminidase-1 activities through the formation of a multienzymic complex in lysosomes. Human cathepsin A (hCathA), yeast carboxypeptidase (CPY), and ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci060093p

    authors: Yoshida T,Lepp Z,Kadota Y,Satoh Y,Itoh K,Chuman H

    update_date:2006-09-01 00:00:00

  • T-Cell Receptor Binding Affects the Dynamics of the Peptide/MHC-I Complex.

    abstract::The recognition of peptide/MHC by T-cell receptors is one of the most important interactions in the adaptive immune system. A large number of computational studies have investigated the structural dynamics of this interaction. However, to date only limited attention has been paid to differences between the dynamics of...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.5b00511

    authors: Knapp B,Deane CM

    update_date:2016-01-25 00:00:00

  • Structural insight into the unique binding properties of pyridylethanol(phenylethyl)amine inhibitor in human CYP51.

    abstract::Sterol 14α-demethylase (CYP51) is the main drug target for the treatment of fungal infections. The discovery of new efficient fungal CYP51 inhibitors requires an understanding of the structural requirements for selectivity for the fungal over the human ortholog. In this study, a binding mode of the pyridylethanol(phen...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci500556k

    authors: Zelenko U,Hodošček M,Rozman D,Golič Grdadolnik S

    update_date:2014-12-22 00:00:00

  • Stability studies of transition-metal linkage isomers using quantum mechanical methods. Groups 11 and 12 transition metals.

    abstract::Several hypotheses to elucidate the linkage isomer preference of the thiocyanate (SCN(-)) ion have been offered. For complexes with small coordination numbers (i.e., 1 and 2) and groups 11 (Cu-triad) and 12 (Zn-triad) metals, different levels of theory and a variety of basis sets have been employed to study linkage is...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci050050t

    authors: Buda C,Kazi AB,Dinescu A,Cundari TR

    update_date:2005-07-01 00:00:00

  • Ranking chemical structures for drug discovery: a new machine learning approach.

    abstract::With chemical libraries increasingly containing millions of compounds or more, there is a fast-growing need for computational methods that can rank or prioritize compounds for screening. Machine learning methods have shown considerable promise for this task; indeed, classification methods such as support vector machin...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci9003865

    authors: Agarwal S,Dugar D,Sengupta S

    update_date:2010-05-24 00:00:00

  • Accurate prediction of adsorption energies on graphene, using a dispersion-corrected semiempirical method including solvation.

    abstract::The accurate prediction of the adsorption energies of unsaturated molecules on graphene in the presence of water is essential for the design of molecules that can modify its properties and that can aid its processability. We here show that a semiempirical MO method corrected for dispersive interactions (PM6-DH2) can p...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci5003729

    authors: Vincent MA,Hillier IH

    update_date:2014-08-25 00:00:00

  • GalaxyDock: protein-ligand docking with flexible protein side-chains.

    abstract::An important issue in developing protein-ligand docking methods is how to incorporate receptor flexibility. Consideration of receptor flexibility using an ensemble of precompiled receptor conformations or by employing an effectively enlarged binding pocket has been reported to be useful. However, direct consideration ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci300342z

    authors: Shin WH,Seok C

    update_date:2012-12-21 00:00:00

  • GPCR-Bench: A Benchmarking Set and Practitioners' Guide for G Protein-Coupled Receptor Docking.

    abstract::Virtual screening is routinely used to discover new ligands and in particular new ligand chemotypes for G protein-coupled receptors (GPCRs). To prepare for a virtual screen, we often tailor a docking protocol that will enable us to select the best candidates for further screening. To aid this, we created GPCR-Bench, a...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.5b00660

    authors: Weiss DR,Bortolato A,Tehan B,Mason JS

    update_date:2016-04-25 00:00:00

  • Virtual drug screen schema based on multiview similarity integration and ranking aggregation.

    abstract::The current drug virtual screen (VS) methods mainly include two categories. i.e., ligand/target structure-based virtual screen and that, utilizing protein-ligand interaction fingerprint information based on the large number of complex structures. Since the former one focuses on the one-side information while the later...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci200481c

    authors: Kang H,Sheng Z,Zhu R,Huang Q,Liu Q,Cao Z

    update_date:2012-03-26 00:00:00

  • Molecular Self-Assembly Strategy for Encapsulation of an Amphipathic α-Helical Antimicrobial Peptide into the Different Polymeric and Copolymeric Nanoparticles.

    abstract::Encapsulation of peptide and protein-based drugs in polymeric nanoparticles is one of the fundamental fields in controlled-release drug delivery systems. The molecular mechanisms of absorption of peptides to the polymeric nanoparticles are still unknown, and there is no precise molecular data on the encapsulation proc...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.8b00641

    authors: Jafari M,Doustdar F,Mehrnejad F

    update_date:2019-01-28 00:00:00