Support vector regression scoring of receptor-ligand complexes for rank-ordering and virtual screening of chemical libraries.

Abstract:

The Community Structure-Activity Resource (CSAR) data sets are used to develop and test a scoring function based on a support vector machine in regression mode (SVR). Two scoring functions, SVR-KB and SVR-EP, are derived with the objective of reproducing the trend of the experimental binding affinities provided within the two CSAR data sets. SVR-KB is trained on knowledge-based pairwise potentials, while SVR-EP is trained on physicochemical properties. SVR-KB and SVR-EP were compared to seven other widely used scoring functions: Glide, X-score, GoldScore, ChemScore, Vina, Dock, and PMF. SVR-KB trained with features obtained from three-dimensional complexes of the PDBbind data set outperformed all other scoring functions, including the best-performing of them, X-score, by nearly 0.1 as measured by three correlation coefficients (Pearson, Spearman, and Kendall). However, the higher performance in rank ordering did not translate into greater enrichment in virtual screening, as assessed using the 40 targets of the Directory of Useful Decoys (DUD). To remedy this, a variant of SVR-KB (SVR-KBD) was developed by following a target-specific tailoring strategy that we had previously employed to derive SVM-SP. SVR-KBD showed much higher enrichment, outperforming all other scoring functions tested, and was comparable in performance to our previously derived scoring function SVM-SP.
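
The abstract evaluates rank ordering with the Pearson, Spearman, and Kendall correlation coefficients. The sketch below shows one way these three measures could be computed between predicted scores and experimental affinities. It is not the authors' code: the function names are hypothetical, the affinity values are made up for illustration, and Kendall's coefficient is computed in its simplest tau-a form (no tie correction), which may differ from the variant used in the paper.

```python
from math import sqrt


def pearson(x, y):
    """Pearson linear correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) pairs / all pairs."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)
    return s / (n * (n - 1) / 2)


# Illustrative, made-up values (not data from the paper).
experimental = [4.2, 5.1, 6.3, 7.0, 8.4]
predicted = [4.8, 4.9, 6.9, 6.5, 8.0]

r_pearson = pearson(experimental, predicted)
r_spearman = spearman(experimental, predicted)
r_kendall = kendall_tau_a(experimental, predicted)
```

Pearson measures linear agreement, while Spearman and Kendall depend only on the ordering of the complexes, which is why all three are commonly reported together when the goal is rank ordering rather than exact affinity prediction.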

journal_name

J Chem Inf Model

authors

Li L,Wang B,Meroueh SO

doi

10.1021/ci200078f

subject

Has Abstract

pub_date

2011-09-26 00:00:00

pages

2132-8

issue

9

eissn

1549-960X

issn

1549-9596

journal_volume

51

pub_type

Journal Article

  • Viscosity Prediction of Lubricants by a General Feed-Forward Neural Network.

    abstract::Modern industrial lubricants are often blended with an assortment of chemical additives to improve the performance of the base stock. Machine learning-based predictive models allow fast and veracious derivation of material properties and facilitate novel and innovative material designs. In this study, we outline the d...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b01068

    authors: Loh GC,Lee HC,Tee XY,Chow PS,Zheng JW

    update_date: 2020-03-23 00:00:00

  • Protein Preparation Automatic Protocol for High-Throughput Inverse Virtual Screening: Accelerating the Target Identification by Computational Methods.

    abstract::Structure-based virtual screening is highly used in the early stages of drug discovery to identify new putative lead compounds for a given target. However, when a small molecule elicits a biological effect, but its target is unknown, or the side effects it causes arise from its undesired interaction with unknown count...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00428

    authors: De Vita S,Lauro G,Ruggiero D,Terracciano S,Riccio R,Bifulco G

    update_date: 2019-11-25 00:00:00

  • Transplant-insert-constrain-relax-assemble (TICRA): protein-ligand complex structure modeling and application to kinases.

    abstract::We introduce TICRA (transplant-insert-constrain-relax-assemble), a method for modeling the structure of unknown protein-ligand complexes using the X-ray crystal structures of homologous proteins and ligands with known activity. We present results from modeling the structures of protein kinase-inhibitor complexes using...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci100256u

    authors: Meshkat S,Klon AE,Zou J,Wiseman JS,Konteatis Z

    update_date: 2011-01-24 00:00:00

  • Accurate prediction of adsorption energies on graphene, using a dispersion-corrected semiempirical method including solvation.

    abstract::The accurate prediction of the adsorption energies of unsaturated molecules on graphene in the presence of water is essential for the design of molecules that can modify its properties and that can aid its processability. We here show that a semiempirical MO method corrected for dispersive interactions (PM6-DH2) can p...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci5003729

    authors: Vincent MA,Hillier IH

    update_date: 2014-08-25 00:00:00

  • GPCR-Bench: A Benchmarking Set and Practitioners' Guide for G Protein-Coupled Receptor Docking.

    abstract::Virtual screening is routinely used to discover new ligands and in particular new ligand chemotypes for G protein-coupled receptors (GPCRs). To prepare for a virtual screen, we often tailor a docking protocol that will enable us to select the best candidates for further screening. To aid this, we created GPCR-Bench, a...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.5b00660

    authors: Weiss DR,Bortolato A,Tehan B,Mason JS

    update_date: 2016-04-25 00:00:00

  • Lessons learned in empirical scoring with smina from the CSAR 2011 benchmarking exercise.

    abstract::We describe a general methodology for designing an empirical scoring function and provide smina, a version of AutoDock Vina specially optimized to support high-throughput scoring and user-specified custom scoring functions. Using our general method, the unique capabilities of smina, a set of default interaction terms ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci300604z

    authors: Koes DR,Baumgartner MP,Camacho CJ

    update_date: 2013-08-26 00:00:00

  • Conformational determinants of the activity of antiproliferative factor glycopeptide.

    abstract::The antiproliferative factor (APF) involved in interstitial cystitis is a glycosylated nonapeptide (TVPAAVVVA) containing a sialylated core 1 α-O-disaccharide linked to the N-terminal threonine. The chemical structure of APF was deduced using spectroscopic techniques and confirmed using total synthesis. The synthetic ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci400147s

    authors: Mallajosyula SS,Adams KM,Barchi JJ,MacKerell AD

    update_date: 2013-05-24 00:00:00

  • The valence state combination model: a generic framework for handling tautomers and protonation states.

    abstract::The consistent handling of molecules is probably the most basic and important requirement in the field of cheminformatics. Reliable results can only be obtained if the underlying calculations are independent of the specific way molecules are represented in the input data. However, ensuring consistency is a complex tas...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci400724v

    authors: Urbaczek S,Kolodzik A,Rarey M

    update_date: 2014-03-24 00:00:00

  • Substituted 4,5'-Bithiazoles as Catalytic Inhibitors of Human DNA Topoisomerase IIα.

    abstract::Human type II topoisomerases, molecular motors that alter the DNA topology, are a major target of modern chemotherapy. Groups of catalytic inhibitors represent a new approach to overcome the known limitations of topoisomerase II poisons such as cardiotoxicity and induction of secondary tumors. Here, we present a class...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c00202

    authors: Bergant Loboda K,Janežič M,Štampar M,Žegura B,Filipič M,Perdih A

    update_date: 2020-07-27 00:00:00

  • Exploring Tunable Hyperparameters for Deep Neural Networks with Industrial ADME Data Sets.

    abstract::Deep learning has drawn significant attention in different areas including drug discovery. It has been proposed that it could outperform other machine learning algorithms, especially with big data sets. In the field of pharmaceutical industry, machine learning models are built to understand quantitative structure-acti...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.8b00671

    authors: Zhou Y,Cahya S,Combs SA,Nicolaou CA,Wang J,Desai PV,Shen J

    update_date: 2019-03-25 00:00:00

  • Random forest models to predict aqueous solubility.

    abstract::Random Forest regression (RF), Partial-Least-Squares (PLS) regression, Support Vector Machines (SVM), and Artificial Neural Networks (ANN) were used to develop QSPR models for the prediction of aqueous solubility, based on experimental data for 988 organic molecules. The Random Forest regression model predicted aqueou...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci060164k

    authors: Palmer DS,O'Boyle NM,Glen RC,Mitchell JB

    update_date: 2007-01-01 00:00:00

  • Selective Fusion of Heterogeneous Classifiers for Predicting Substrates of Membrane Transporters.

    abstract::Membrane transporters play a crucial role in determining fate of administered drugs in a biological system. Early identification of plausible transporters for a drug molecule can provide insights into its therapeutic, pharmacokinetic, and toxicological profiles. In the present study, predictive models for classifying ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.6b00508

    authors: Shaikh N,Sharma M,Garg P

    update_date: 2017-03-27 00:00:00

  • TAMkin: a versatile package for vibrational analysis and chemical kinetics.

    abstract::TAMkin is a program for the calculation and analysis of normal modes, thermochemical properties and chemical reaction rates. At present, the output from the frequently applied software programs ADF, CHARMM, CPMD, CP2K, Gaussian, Q-Chem, and VASP can be analyzed. The normal-mode analysis can be performed using a broad ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci100099g

    authors: Ghysels A,Verstraelen T,Hemelsoet K,Waroquier M,Van Speybroeck V

    update_date: 2010-09-27 00:00:00

  • Effect of data standardization on chemical clustering and similarity searching.

    abstract::Standardization is used to ensure that the variables in a similarity calculation make an equal contribution to the computed similarity value. This paper compares the use of seven different methods that have been suggested previously for the standardization of integer-valued or real-valued data, comparing the results w...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci800224h

    authors: Chu CW,Holliday JD,Willett P

    update_date: 2009-02-01 00:00:00

  • Simulation-Based Algorithm for Two-Dimensional Chemical Structure Diagram Generation of Complex Molecules and Ligand-Protein Interactions.

    abstract::Computer programs for structure diagram generation (SDG) are indispensable cheminformatic tools that translate one- or three-dimensional (1D or 3D) chemical structure data stored in electronic formats to human-readable 2D depictions. Although many such programs are known, only a moderate part of chemical space can be ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.6b00391

    authors: Frączek T

    update_date: 2016-12-27 00:00:00

  • Molecular Mechanism, Dynamics, and Energetics of Protein-Mediated Dinucleotide Flipping in a Mismatched DNA: A Computational Study of the RAD4-DNA Complex.

    abstract::DNA damage alters genetic information and adversely affects gene expression pathways leading to various complex genetic disorders and cancers. DNA repair proteins recognize and rectify DNA damage and mismatches with high fidelity. A critical molecular event that occurs during most protein-mediated DNA repair processes...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00636

    authors: Pitta K,Krishnan M

    update_date: 2018-03-26 00:00:00

  • QSAR Modeling of ToxCast Assays Relevant to the Molecular Initiating Events of AOPs Leading to Hepatic Steatosis.

    abstract::Nonalcoholic hepatic steatosis is a worldwide epidemiological concern since it is among the most prominent hepatic diseases. Indeed, research in toxicology and epidemiology has gathered evidence that exposure to endocrine disruptors can perturb cellular homeostasis and cause this disease. Therefore, assessing the like...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.8b00297

    authors: Gadaleta D,Manganelli S,Roncaglioni A,Toma C,Benfenati E,Mombelli E

    update_date: 2018-08-27 00:00:00

  • Instrument monitoring, data sharing, and archiving using Common Instrument Middleware Architecture (CIMA).

    abstract::The Common Instrument Middleware Architecture (CIMA) aims at Grid-enabling a wide range of scientific instruments and sensors to enable easy access to and sharing and storage of data produced by these instruments and sensors. This paper describes the implementation of CIMA applied to the field of single-crystal X-ray ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci050368l

    authors: Bramley R,Chiu K,Devadithya T,Gupta N,Hart C,Huffman JC,Huffman K,Ma Y,McMullen DF

    update_date: 2006-05-01 00:00:00

  • Enrichment analysis for discovering biological associations in phenotypic screens.

    abstract::A phenotypic screen (PS) is used to identify compounds causing a desired phenotype in a complex biological system where mechanisms and targets are largely unknown. Deconvoluting the mechanism of action of actives and identification of relevant targets and pathways remains a formidable challenge. Current methods fail t...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci400245c

    authors: Polyakov VR,Moorcroft ND,Drawid A

    update_date: 2014-02-24 00:00:00

  • Predictive models for cytochrome p450 isozymes based on quantitative high throughput screening data.

    abstract::The human cytochrome P450 (CYP450) isozymes are the most important enzymes in the body to metabolize many endogenous and exogenous substances including environmental toxins and therapeutic drugs. Any unnecessary interactions between a small molecule and CYP450 isozymes may raise a potential to disarm the integrity of ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci200311w

    authors: Sun H,Veith H,Xia M,Austin CP,Huang R

    update_date: 2011-10-24 00:00:00

  • Comparative analysis of binding energy of chymostatin with human cathepsin A and its homologous proteins by molecular orbital calculation.

    abstract::Cathepsin A is a mammalian lysosomal enzyme that catalyzes the hydrolysis of the carboxy-terminal amino acids of polypeptides and also regulates beta-galactosidase and neuraminidase-1 activities through the formation of a multienzymic complex in lysosomes. Human cathepsin A (hCathA), yeast carboxypeptidase (CPY), and ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci060093p

    authors: Yoshida T,Lepp Z,Kadota Y,Satoh Y,Itoh K,Chuman H

    update_date: 2006-09-01 00:00:00

  • Scores of extended connectivity fingerprint as descriptors in QSPR study of melting point and aqueous solubility.

    abstract::QSPR studies, using scores of SciTegic's extended connectivity fingerprint as raw descriptors, were extended to the prediction of melting points and aqueous solubility of organic compounds. Robust partial least-squares models were developed that perform as well as the best published QSPR models for structurally divers...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci800024c

    authors: Zhou D,Alelyunas Y,Liu R

    update_date: 2008-05-01 00:00:00

  • Evaluation of Generalized Born Models for Large Scale Affinity Prediction of Cyclodextrin Host-Guest Complexes.

    abstract::Binding affinity prediction with implicit solvent models remains a challenge in virtual screening for drug discovery. In order to assess the predictive power of implicit solvent models in docking techniques with Amber scoring, three generalized Born models (GBHCT, GBOBCI, and GBOBCII) available in Dock 6.7 were utiliz...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.6b00418

    authors: Zhang H,Yin C,Yan H,van der Spoel D

    update_date: 2016-10-24 00:00:00

  • Molecular Structure-Based Large-Scale Prediction of Chemical-Induced Gene Expression Changes.

    abstract::The quantitative structure-activity relationship (QSAR) approach has been used to model a wide range of chemical-induced biological responses. However, it had not been utilized to model chemical-induced genomewide gene expression changes until very recently, owing to the complexity of training and evaluating a very la...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00281

    authors: Liu R,AbdulHameed MDM,Wallqvist A

    update_date: 2017-09-25 00:00:00

  • The normal-mode entropy in the MM/GBSA method: effect of system truncation, buffer region, and dielectric constant.

    abstract::We have performed a systematic study of the entropy term in the MM/GBSA (molecular mechanics combined with generalized Born and surface-area solvation) approach to calculate ligand-binding affinities. The entropies are calculated by a normal-mode analysis of harmonic frequencies from minimized snapshots of molecular d...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci3001919

    authors: Genheden S,Kuhn O,Mikulskis P,Hoffmann D,Ryde U

    update_date: 2012-08-27 00:00:00

  • Modeling pK Shift in DNA Triplexes Containing Locked Nucleic Acids.

    abstract::The protonation states for nucleic acid bases are difficult to assess experimentally. In the context of DNA triplex, the protonation state of cytidine in the third strand is particularly important, because it needs to be protonated in order to form Hoogsteen hydrogen bonds. A sugar modification, locked nucleic acid (L...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00741

    authors: Hartono YD,Xu Y,Karshikoff A,Nilsson L,Villa A

    update_date: 2018-04-23 00:00:00

  • Modeling compound-target interaction network of traditional Chinese medicines for type II diabetes mellitus: insight for polypharmacology and drug design.

    abstract::In this study, in order to elucidate the action mechanism of traditional Chinese medicines (TCMs) that exhibit clinical efficacy for type II diabetes mellitus (T2DM), an integrated protocol that combines molecular docking and pharmacophore mapping was employed to find the potential inhibitors from TCM for the T2DM-rel...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci400146u

    authors: Tian S,Li Y,Li D,Xu X,Wang J,Zhang Q,Hou T

    update_date: 2013-07-22 00:00:00

  • Improved Computation of Protein-Protein Relative Binding Energies with the Nwat-MMGBSA Method.

    abstract::A MMGBSA variant (here referred to as Nwat-MMGBSA), based on the inclusion of a certain number of explicit water molecules (Nwat) during the calculations, has been tested on a set of 20 protein-protein complexes, using the correlation between predicted and experimental binding energy as the evaluation metric. Besides ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.6b00196

    authors: Maffucci I,Contini A

    update_date: 2016-09-26 00:00:00

  • Residue preference mapping of ligand fragments in the Protein Data Bank.

    abstract::The interaction between small molecules and proteins is one of the major concerns for structure-based drug design because the principles of protein-ligand interactions and molecular recognition are not thoroughly understood. Fortunately, the analysis of protein-ligand complexes in the Protein Data Bank (PDB) enables u...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci100386y

    authors: Wang L,Xie Z,Wipf P,Xie XQ

    update_date: 2011-04-25 00:00:00

  • FORTRAN interface for code interoperability in quantum chemistry: the Q5Cost library.

    abstract::Ab initio quantum-chemistry programs produce and use large amounts of data, which are usually stored on disk in the form of binary files. A FORTRAN library, named Q5Cost, has been designed and implemented in order to allow the storage of these data sets in a special data format built with the HDF5 technology. This dat...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci7000567

    authors: Borini S,Monari A,Rossi E,Tajti A,Angeli C,Bendazzoli GL,Cimiraglia R,Emerson A,Evangelisti S,Maynau D,Sanchez-Marin J,Szalay PG

    update_date: 2007-05-01 00:00:00