Random Forest Refinement of Pairwise Potentials for Protein-Ligand Decoy Detection.

Abstract:

An accurate scoring function is expected to correctly select the most stable structure from a set of pose candidates. One can hypothesize that a scoring function's ability to identify the most stable structure might be improved by emphasizing the most relevant atom pairwise interactions. However, it is hard to evaluate the relative importance of each atom pair using traditional means. With the introduction of machine learning (ML) methods, it has become possible to determine the relative importance of each atom pair present in a scoring function. In this work, we use the Random Forest (RF) method to refine a pair potential developed by our laboratory (GARF; Zhang, Z. J. Chem. Theory Comput. 2018, 14, 5045) by identifying relevant atom pairs that optimize the performance of the potential on our given task. Our goal is to construct an ML model that can accurately differentiate the native ligand binding pose from candidate poses using a potential refined by RF optimization. We successfully constructed RF models on an unbalanced data set with the "comparison" concept, and the resultant RF models were tested on CASF-2013 (Li, Y. J. Chem. Inf. Model. 2014, 54, 1700). In a comparison of the performance of our RF models against 29 scoring functions, we found that our models outperformed the other scoring functions in predicting the native pose. In addition, we created two artificially designed potential function sets to address the importance of the GARF potential in the RF models: (1) a scrambled probability function set, which was obtained by mixing up atom pairs and probability functions in GARF, and (2) a uniform probability function set, which shares the same peak positions with GARF but has fixed peak heights. A comparison of the accuracy of RF models based on the scrambled, uniform, and original GARF potentials clearly showed that the peak positions in the GARF potential are important while the well depths are not.
All code and data used in this work are available at https://github.com/JunPei000/random_forest_protein_ligand_decoy_detection .
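The abstract does not spell out the "comparison" concept or how the scrambled set was built, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that each pose is described by a vector of per-atom-pair scores, that a training sample is the difference between two poses' vectors (which yields one positive and one negative sample per native/decoy pair and so balances the classes), and that scrambling means reassigning the existing functions to permuted atom pairs. The names `comparison_pairs`, `scramble_potential`, and the toy numbers are hypothetical.

```python
import random


def comparison_pairs(native, decoys):
    """Build a balanced two-class set from one native pose and its decoys.

    Assumption: each sample is the element-wise difference between the
    feature vectors of two poses; label 1 means the first pose is native.
    """
    samples = []
    for decoy in decoys:
        forward = [n - d for n, d in zip(native, decoy)]
        backward = [-x for x in forward]
        samples.append((forward, 1))   # native minus decoy
        samples.append((backward, 0))  # decoy minus native
    return samples


def scramble_potential(potential, seed=0):
    """Control set: reassign the existing functions to permuted atom pairs."""
    pairs = sorted(potential)
    shuffled = pairs[:]
    random.Random(seed).shuffle(shuffled)
    return {new: potential[old] for old, new in zip(pairs, shuffled)}


# Toy data (hypothetical): three atom-pair scores per pose for one complex.
native = [-1.2, -0.8, -0.5]
decoys = [[-0.9, -0.1, -0.4], [-0.2, -0.7, 0.3]]
data = comparison_pairs(native, decoys)
```

Under these assumptions, the resulting `(features, label)` samples could be passed to any random forest implementation, and the feature importances it reports would indicate which atom pairs matter most for telling native poses from decoys.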

journal_name

J Chem Inf Model

authors

Pei J,Zheng Z,Kim H,Song LF,Walworth S,Merz MR,Merz KM Jr

doi

10.1021/acs.jcim.9b00356

subject

Has Abstract

pub_date

2019-07-22 00:00:00

pages

3305-3315

issue

7

eissn

1549-960X

issn

1549-9596

journal_volume

59

pub_type

Journal Article
  • Optimal Measurement Network of Pairwise Differences.

    abstract::When both the difference between two quantities and their individual values can be measured or computationally predicted, multiple quantities can be determined from the measurements or predictions of select individual quantities and select pairwise differences. These measurements and predictions form a network connect...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00528

    authors: Xu H

    update_date:2019-11-25 00:00:00

  • Development of an informatics platform for therapeutic protein and peptide analytics.

    abstract::The momentum gained by research on biologics has not been met yet with equal thrust on the informatics side. There is a noticeable lack of software for data management that empowers the bench scientists working on the development of biologic therapeutics. SARvision|Biologics is a tool to analyze data associated with b...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci400333x

    authors: Hansen MR,Villar HO,Feyfant E

    update_date:2013-10-28 00:00:00

  • Benchmark Sets for Binding Hot Spot Identification in Fragment-Based Ligand Discovery.

    abstract::Binding hot spots are regions of proteins that, due to their potentially high contribution to the binding free energy, have high propensity to bind small molecules. We present benchmark sets for testing computational methods for the identification of binding hot spots with emphasis on fragment-based ligand discovery. ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c00877

    authors: Wakefield AE,Yueh C,Beglov D,Castilho MS,Kozakov D,Keserű GM,Whitty A,Vajda S

    update_date:2020-12-28 00:00:00

  • Nonadditivity Analysis.

    abstract::We introduce the statistics behind a novel type of SAR analysis named "nonadditivity analysis". On the basis of all pairs of matched pairs within a given data set, the approach analyzes whether the same transformations between related molecules have the same effect, i.e., whether they are additive. Assuming that the e...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00631

    authors: Kramer C

    update_date:2019-09-23 00:00:00

  • Imputation of Assay Bioactivity Data Using Deep Learning.

    abstract::We describe a novel deep learning neural network method and its application to impute assay pIC50 values. Unlike conventional machine learning approaches, this method is trained on sparse bioactivity data as input, typical of that found in public and commercial databases, enabling it to learn directly from correlation...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.8b00768

    authors: Whitehead TM,Irwin BWJ,Hunt P,Segall MD,Conduit GJ

    update_date:2019-03-25 00:00:00

  • Combining 3-D quantitative structure-activity relationship with ligand based and structure based alignment procedures for in silico screening of new hepatitis C virus NS5B polymerase inhibitors.

    abstract::The viral NS5B RNA-dependent RNA-polymerase (RdRp) is one of the best-studied and promising targets for the development of novel therapeutics against hepatitis C virus (HCV). Allosteric inhibition of this enzyme has emerged as a viable strategy toward blocking replication of viral RNA in cell based systems. Herein, we...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci9004749

    authors: Musmuca I,Caroli A,Mai A,Kaushik-Basu N,Arora P,Ragno R

    update_date:2010-04-26 00:00:00

  • Probing the Binding Pathway of BRACO19 to a Parallel-Stranded Human Telomeric G-Quadruplex Using Molecular Dynamics Binding Simulation with AMBER DNA OL15 and Ligand GAFF2 Force Fields.

    abstract::Human telomeric DNA G-quadruplex has been identified as a good therapeutic target in cancer treatment. G-quadruplex-specific ligands that stabilize the G-quadruplex have great potential to be developed as anticancer agents. Two crystal structures (an apo form of parallel stranded human telomeric G-quadruplex and its h...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00287

    authors: Machireddy B,Kalra G,Jonnalagadda S,Ramanujachary K,Wu C

    update_date:2017-11-27 00:00:00

  • Open Source Bayesian Models. 1. Application to ADME/Tox and Drug Discovery Datasets.

    abstract::On the order of hundreds of absorption, distribution, metabolism, excretion, and toxicity (ADME/Tox) models have been described in the literature in the past decade which are more often than not inaccessible to anyone but their authors. Public accessibility is also an issue with computational models for bioactivity, a...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.5b00143

    authors: Clark AM,Dole K,Coulon-Spektor A,McNutt A,Grass G,Freundlich JS,Reynolds RC,Ekins S

    update_date:2015-06-22 00:00:00

  • Estimation of ligand efficacies of metabotropic glutamate receptors from conformational forces obtained from molecular dynamics simulations.

    abstract::Group 1 metabotropic glutamate receptors (mGluR) are G-protein coupled receptors with a large bilobate extracellular ligand binding region (LBR) that resembles a Venus fly trap. Closing of this LBR in the presence of a ligand is associated with the activation of the receptor. From conformational sampling of the LBR-li...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci400160x

    authors: Lakkaraju SK,Xue F,Faden AI,MacKerell AD Jr

    update_date:2013-06-24 00:00:00

  • Homology model-guided 3D-QSAR studies of HIV-1 integrase inhibitors.

    abstract::In the present study, we report the exploration of binding modes of potent HIV-1 integrase (IN) inhibitors MK-0518 (raltegravir) and GS-9137 (elvitegravir) as well as chalcone and related amide IN inhibitors we recently synthesized and the development of 3D-QSAR models for integrase inhibition. Homology models of DNA-...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci200485a

    authors: Sharma H,Cheng X,Buolamwini JK

    update_date:2012-02-27 00:00:00

  • Systematic analysis of enzyme-catalyzed reaction patterns and prediction of microbial biodegradation pathways.

    abstract::The roles of chemical compounds in biological systems are now systematically analyzed by high-throughput experimental technologies. To automate the processing and interpretation of large-scale data it is necessary to develop bioinformatics methods to extract information from the chemical structures of these small mole...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci700006f

    authors: Oh M,Yamada T,Hattori M,Goto S,Kanehisa M

    update_date:2007-07-01 00:00:00

  • iUmami-SCM: A Novel Sequence-Based Predictor for Prediction and Analysis of Umami Peptides Using a Scoring Card Method with Propensity Scores of Dipeptides.

    abstract::Umami or the taste of monosodium glutamate represents one of the major attractive taste modalities in humans. Therefore, knowledge about biophysical and biochemical properties of the umami taste is important for both scientific research and the food industry. Experimental approaches for predicting umami peptides are l...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c00707

    authors: Charoenkwan P,Yana J,Nantasenamat C,Hasan MM,Shoombuatong W

    update_date:2020-12-28 00:00:00

  • An Analysis of Different Components of a High-Throughput Screening Library.

    abstract::Since many projects at pharmaceutical organizations get their start from a high-throughput screening (HTS) campaign, improving the quality of the HTS deck can improve the likelihood of discovering a high-quality lead molecule that can be progressed to a drug candidate. Over the past decade, Janssen has implemented sev...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.8b00258

    authors: Saha A,Varghese T,Liu A,Allen SJ,Mirzadegan T,Hack MD

    update_date:2018-10-22 00:00:00

  • Probing fragment complementation by rigid-body docking: in silico reconstitution of calbindin D9k.

    abstract::Fragment complementation is gaining an increasing impact as a nonperturbing method to probe noncovalent interactions within protein supersecondary structures. In this study, the fast Fourier transform rigid-body docking algorithm ZDOCK has been employed for in silico reconstitution of the calcium binding protein calbi...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci0501995

    authors: Dell'Orco D,Seeber M,De Benedetti PG,Fanelli F

    update_date:2005-09-01 00:00:00

  • Alanine Scanning Effects on the Biochemical and Biophysical Properties of Intrinsically Disordered Proteins: A Case Study of the Histidine to Alanine Mutations in Amyloid-β42.

    abstract::Alanine scanning is a tool in molecular biology that is commonly used to evaluate the contribution of a specific amino acid residue to the stability and function of a protein. Additionally, this tool is also used to understand whether the side chain of a specific amino acid residue plays a role in the protein's bioact...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.8b00926

    authors: Coskuner-Weber O,Uversky VN

    update_date:2019-02-25 00:00:00

  • Instrument monitoring, data sharing, and archiving using Common Instrument Middleware Architecture (CIMA).

    abstract::The Common Instrument Middleware Architecture (CIMA) aims at Grid-enabling a wide range of scientific instruments and sensors to enable easy access to and sharing and storage of data produced by these instruments and sensors. This paper describes the implementation of CIMA applied to the field of single-crystal X-ray ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci050368l

    authors: Bramley R,Chiu K,Devadithya T,Gupta N,Hart C,Huffman JC,Huffman K,Ma Y,McMullen DF

    update_date:2006-05-01 00:00:00

  • Molecular Dynamics Simulation of the Conformational Preferences of Pseudouridine Derivatives: Improving the Distribution in the Glycosidic Torsion Space.

    abstract::There are only four derivatives of pseudouridine (Ψ) that are known to occur naturally in RNA as post-transcriptional modifications. We have studied the conformational consequences of pseudouridylation and further modifications using replica exchange molecular dynamics simulations at the nucleoside level, and the simu...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c00369

    authors: Dutta N,Sarzynska J,Lahiri A

    update_date:2020-10-26 00:00:00

  • AlphaSpace: Fragment-Centric Topographical Mapping To Target Protein-Protein Interaction Interfaces.

    abstract::Inhibition of protein-protein interactions (PPIs) is emerging as a promising therapeutic strategy despite the difficulty in targeting such interfaces with drug-like small molecules. PPIs generally feature large and flat binding surfaces as compared to typical drug targets. These features pose a challenge for structura...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.5b00103

    authors: Rooklin D,Wang C,Katigbak J,Arora PS,Zhang Y

    update_date:2015-08-24 00:00:00

  • Pharmacophore identification, in silico screening, and virtual library design for inhibitors of the human factor Xa.

    abstract::Factor Xa inhibitors are innovative anticoagulant agents that provide a better safety/efficacy profile compared to other anticoagulative drugs. A chemical feature-based modeling approach was applied to identify crucial pharmacophore patterns from 3D crystal structures of inhibitors bound to human factor Xa (Pdb entrie...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci049778k

    authors: Krovat EM,Frühwirth KH,Langer T

    update_date:2005-01-01 00:00:00

  • Flexophore, a new versatile 3D pharmacophore descriptor that considers molecular flexibility.

    abstract::A novel pharmacophore descriptor Flexophore is presented, which considers molecular flexibility when comparing descriptor similarities. The descriptor is a complete reduced graph of the underlying molecule. Its nodes are represented by enhanced MM2 atom types, while the edge descriptions encode the molecular flexibili...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci700359j

    authors: von Korff M,Freyss J,Sander T

    update_date:2008-04-01 00:00:00

  • Enrichment factor analyses on G-protein coupled receptors with known crystal structure.

    abstract::G-protein coupled receptors (GPCRs) are highly relevant drug targets. Four GPCRs with known crystal structure were analyzed with docking (AutoDock4) and postdocking (MM-PBSA) in order to evaluate the ability to recognize known antagonists from a larger database of molecular decoys and to predict correct binding modes....

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci4000745

    authors: Anighoro A,Rastelli G

    update_date:2013-04-22 00:00:00

  • In silico renal clearance model using classical Volsurf approach.

    abstract::A data set of 130 diverse compounds containing both central nervous system (CNS) and non-CNS drugs was used to generate a renal clearance model using a classical Volsurf approach. Percentage renal clearance data was used as a biological input. The score plots obtained from principal component analysis and partial leas...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci0503309

    authors: Doddareddy MR,Cho YS,Koh HY,Kim DH,Pae AN

    update_date:2006-05-01 00:00:00

  • Unraveling Energy and Dynamics Determinants to Interpret Protein Functional Plasticity: The Limonene-1,2-epoxide-hydrolase Case Study.

    abstract::The balance between structural stability and functional plasticity in proteins that share common three-dimensional folds is the key factor that drives protein evolvability. The ability to distinguish the parts of homologous proteins that underlie common structural organization patterns from the parts acting as regulat...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.6b00504

    authors: Rinaldi S,Gori A,Annovazzi C,Ferrandi EE,Monti D,Colombo G

    update_date:2017-04-24 00:00:00

  • Regulation of JAK2 activation by Janus homology 2: evidence from molecular dynamics simulations.

    abstract::Janus kinase 2 (JAK2) is a protein tyrosine kinase implicated in signaling by specific members of the cytokine receptor family. Although it has been established that the JAK2 tyrosine kinase is negatively regulated by the JAK homology 2 (JH2) pseudokinase domain, the underlying mechanism of JH2 mediated regulation rem...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci300308g

    authors: Wan S,Coveney PV

    update_date:2012-11-26 00:00:00

  • Exploration of Interfacial Hydration Networks of Target-Ligand Complexes.

    abstract::Interfacial hydration strongly influences interactions between biomolecules. For example, drug-target complexes are often stabilized by hydration networks formed between hydrophilic residues and water molecules at the interface. Exhaustive exploration of hydration networks is challenging for experimental as well as th...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.5b00638

    authors: Jeszenői N,Bálint M,Horváth I,van der Spoel D,Hetényi C

    update_date:2016-01-25 00:00:00

  • Advantages of Relative versus Absolute Data for the Development of Quantitative Structure-Activity Relationship Classification Models.

    abstract::The appropriate selection of a chemical space represented by the data set, the selection of its chemical data representation, the development of a correct modeling process using a robust and reproducible algorithm, and the performance of an exhaustive training and external validation determine the usability and reprod...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00492

    authors: Ruiz IL,Gómez-Nieto MÁ

    update_date:2017-11-27 00:00:00

  • Loop Grafting between Similar Local Environments for Fc-Silent Antibodies.

    abstract::Reduction of the affinity of the fragment crystallizable (Fc) region with immune receptors by substitution of one or a few amino acids, known as Fc-silencing, is an established approach to reduce the immune effector functions of monoclonal antibody therapeutics. This approach to Fc-silencing, however, is problematic a...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b01198

    authors: Lešnik S,Hodošček M,Podobnik B,Konc J

    update_date:2020-11-23 00:00:00

  • Ranking Reversible Covalent Drugs: From Free Energy Perturbation to Fragment Docking.

    abstract::Reversible covalent inhibitors have drawn increasing attention in drug design, as they are likely more potent than noncovalent inhibitors and less toxic than covalent inhibitors. Despite those advantages, the computational prediction of reversible covalent binding presents a formidable challenge because the binding pr...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.8b00959

    authors: Zhang H,Jiang W,Chatterjee P,Luo Y

    update_date:2019-05-28 00:00:00

  • FORTRAN interface for code interoperability in quantum chemistry: the Q5Cost library.

    abstract::Ab initio quantum-chemistry programs produce and use large amounts of data, which are usually stored on disk in the form of binary files. A FORTRAN library, named Q5Cost, has been designed and implemented in order to allow the storage of these data sets in a special data format built with the HDF5 technology. This dat...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci7000567

    authors: Borini S,Monari A,Rossi E,Tajti A,Angeli C,Bendazzoli GL,Cimiraglia R,Emerson A,Evangelisti S,Maynau D,Sanchez-Marin J,Szalay PG

    update_date:2007-05-01 00:00:00

  • Assessing different classification methods for virtual screening.

    abstract::How well do different classification methods perform in selecting the ligands of a protein target out of large compound collections not used to train the model? Support vector machines, random forest, artificial neural networks, k-nearest-neighbor classification with genetic-algorithm-optimized feature selection, tren...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci050519k

    authors: Plewczynski D,Spieser SA,Koch U

    update_date:2006-05-01 00:00:00