Leave-cluster-out cross-validation is appropriate for scoring functions derived from diverse protein data sets.

Abstract:

With the emergence of large collections of protein-ligand complexes complemented by binding data, as found in PDBbind or BindingMOAD, new opportunities for parametrizing and evaluating scoring functions have arisen. With huge data collections available, it becomes feasible to fit scoring functions in a QSAR style, i.e., by defining protein-ligand interaction descriptors and analyzing them with modern machine-learning methods. As in each data modeling ansatz, care has to be taken to validate the model carefully. Here, we show that there are large differences measured in R (0.77 vs 0.46) or R² (0.59 vs 0.21) for a relatively simple scoring function depending on whether it is validated against the PDBbind core set or validated in a leave-cluster-out cross-validation. If proteins from the same family are present in both the training and validation set, the estimated prediction quality from standard validation techniques looks too optimistic.
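The splitting idea behind leave-cluster-out cross-validation can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the abstract does not say how clusters are built, so the sketch assumes each complex already carries a cluster label (here a hypothetical protein-family name), and the function name `leave_cluster_out_splits` and the toy data are invented for the example.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterator, List, Sequence, Tuple


def leave_cluster_out_splits(
    clusters: Sequence[Hashable],
) -> Iterator[Tuple[Hashable, List[int], List[int]]]:
    """Yield (held_out_cluster, train_indices, test_indices), one split per cluster."""
    # Group sample indices by their cluster label, keeping first-seen order.
    members: Dict[Hashable, List[int]] = defaultdict(list)
    for index, label in enumerate(clusters):
        members[label].append(index)

    # Hold out one whole cluster at a time; train on everything else.
    for label, test_idx in members.items():
        train_idx = [i for i, c in enumerate(clusters) if c != label]
        yield label, train_idx, test_idx


# Hypothetical toy data: one protein-family label per protein-ligand complex.
families = ["kinase", "protease", "kinase", "hydrolase", "protease", "kinase"]
splits = list(leave_cluster_out_splits(families))
for held_out, train_idx, test_idx in splits:
    print(held_out, "train:", train_idx, "test:", test_idx)
```

Because every member of the held-out family is removed from training, no protein family appears on both sides of a split, which is the leakage the abstract warns about. A random split of the same six complexes would usually leave kinases in both sets.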

journal_name

J Chem Inf Model

authors

Kramer C,Gedeck P

doi

10.1021/ci100264e

subject

Has Abstract

pub_date

2010-11-22 00:00:00

pages

1961-9

issue

11

eissn

1549-960X

issn

1549-9596

journal_volume

50

pub_type

Journal Article
  • Evaluating Free Energies of Binding and Conservation of Crystallographic Waters Using SZMAP.

    abstract::The SZMAP method computes binding free energies and the corresponding thermodynamic components for water molecules in the binding site of a protein structure [ SZMAP, 1.0.0 ; OpenEye Scientific Software Inc. : Santa Fe, NM, USA , 2011 ]. In this work, the ability of SZMAP to predict water structure and thermodynamic s...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci500746d

    authors: Bayden AS,Moustakas DT,Joseph-McCarthy D,Lamb ML

    update_date:2015-08-24 00:00:00

  • Pharmer: efficient and exact pharmacophore search.

    abstract::Pharmacophore search is a key component of many drug discovery efforts. Pharmer is a new computational approach to pharmacophore search that scales with the breadth and complexity of the query, not the size of the compound library being screened. Two novel methods for organizing pharmacophore data, the Pharmer KDB-tre...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci200097m

    authors: Koes DR,Camacho CJ

    update_date:2011-06-27 00:00:00

  • Free energy calculations give insight into the stereoselective hydroxylation of α-ionones by engineered cytochrome P450 BM3 mutants.

    abstract::Previously, stereoselective hydroxylation of α-ionone by Cytochrome P450 BM3 mutants M01 A82W and M11 L437N was observed. While both mutants hydroxylate α-ionone in a regioselective manner at the C3 position, M01 A82W catalyzes formation of trans-3-OH-α-ionone products whereas M11 L437N exhibits opposite stereoselecti...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci300243n

    authors: de Beer SB,Venkataraman H,Geerke DP,Oostenbrink C,Vermeulen NP

    update_date:2012-08-27 00:00:00

  • Predicting the DNA Conductance Using a Deep Feedforward Neural Network Model.

    abstract::Double-stranded DNA (dsDNA) has been established as an efficient medium for charge migration, bringing it to the forefront of the field of molecular electronics and biological research. The charge migration rate is controlled by the electronic couplings between the two nucleobases of DNA/RNA. These electronic coupling...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c01072

    authors: Aggarwal A,Vinayak V,Bag S,Bhattacharyya C,Waghmare UV,Maiti PK

    update_date:2021-01-25 00:00:00

  • Get Your Atoms in Order--An Open-Source Implementation of a Novel and Robust Molecular Canonicalization Algorithm.

    abstract::Finding a canonical ordering of the atoms in a molecule is a prerequisite for generating a unique representation of the molecule. The canonicalization of a molecule is usually accomplished by applying some sort of graph relaxation algorithm, the most common of which is the Morgan algorithm. There are known issues with...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.5b00543

    authors: Schneider N,Sayle RA,Landrum GA

    update_date:2015-10-26 00:00:00

  • Assessment of the Sampling Performance of Multiple-Copy Dynamics versus a Unique Trajectory.

    abstract::The goal of the present study was to ascertain the differential performance of a long molecular dynamics trajectory versus several shorter ones starting from different points in the phase space and covering the same sampling time. For this purpose, we selected the 16-mer peptide Bak16BH3 as a model for study and carri...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.6b00347

    authors: Perez JJ,Tomas MS,Rubio-Martinez J

    update_date:2016-10-24 00:00:00

  • Prediction of pH-dependent aqueous solubility of druglike molecules.

    abstract::In the present work, the Henderson-Hasselbalch (HH) equation has been employed for the development of a tool for the prediction of pH-dependent aqueous solubility of drugs and drug candidates. A new prediction method for the intrinsic solubility was developed, based on artificial neural networks that have been trained...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci600292q

    authors: Hansen NT,Kouskoumvekaki I,Jørgensen FS,Brunak S,Jónsdóttir SO

    update_date:2006-11-01 00:00:00

  • Evaluation and Characterization of Trk Kinase Inhibitors for the Treatment of Pain: Reliable Binding Affinity Predictions from Theory and Computation.

    abstract::Optimization of ligand binding affinity to the target protein of interest is a primary objective in small-molecule drug discovery. Until now, the prediction of binding affinities by computational methods has not been widely applied in the drug discovery process, mainly because of its lack of accuracy and reproducibili...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.6b00780

    authors: Wan S,Bhati AP,Skerratt S,Omoto K,Shanmugasundaram V,Bagal SK,Coveney PV

    update_date:2017-04-24 00:00:00

  • PiNN: A Python Library for Building Atomic Neural Networks of Molecules and Materials.

    abstract::Atomic neural networks (ANNs) constitute a class of machine learning methods for predicting potential energy surfaces and physicochemical properties of molecules and materials. Despite many successes, developing interpretable ANN architectures and implementing existing ones efficiently are still challenging. This call...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00994

    authors: Shao Y,Hellström M,Mitev PD,Knijff L,Zhang C

    update_date:2020-03-23 00:00:00

  • Probing fragment complementation by rigid-body docking: in silico reconstitution of calbindin D9k.

    abstract::Fragment complementation is gaining an increasing impact as a nonperturbing method to probe noncovalent interactions within protein supersecondary structures. In this study, the fast Fourier transform rigid-body docking algorithm ZDOCK has been employed for in silico reconstitution of the calcium binding protein calbi...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci0501995

    authors: Dell'Orco D,Seeber M,De Benedetti PG,Fanelli F

    update_date:2005-09-01 00:00:00

  • Molecular Structure Extraction from Documents Using Deep Learning.

    abstract::Chemical structure extraction from documents remains a hard problem because of both false positive identification of structures during segmentation and errors in the predicted structures. Current approaches rely on handcrafted rules and subroutines that perform reasonably well generally but still routinely encounter s...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.8b00669

    authors: Staker J,Marshall K,Abel R,McQuaw CM

    update_date:2019-03-25 00:00:00

  • Identifying biologically active compound classes using phenotypic screening data and sampling statistics.

    abstract::Scoring the activity of compounds in phenotypic high-throughput assays presents a unique challenge because of the limited resolution and inherent measurement error of these assays. Techniques that leverage the structural similarity of compounds within an assay can be used to improve the hit-recovery rate from screenin...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci050087d

    authors: Klekota J,Brauner E,Schreiber SL

    update_date:2005-11-01 00:00:00

  • Molecular Dynamics Simulations of Substrate Release from Trypanosoma cruzi UDP-Galactopyranose Mutase.

    abstract::The enzyme UDP-galactopyranose mutase (UGM) represents a promising drug target for the treatment of infections with Trypanosoma cruzi. We have computed the Potential of Mean Force for the release of UDP-galactopyranose from UGM, using Umbrella Sampling simulations. The simulations revealed the conformational changes t...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.8b00675

    authors: Cossio-Pérez R,Pierdominici-Sottile G,Sobrado P,Palma J

    update_date:2019-02-25 00:00:00

  • Equally Weighted Multiscale Elastic Network Model and Its Comparison with Traditional and Parameter-Free Models.

    abstract::Dynamical properties of proteins play an essential role in their function exertion. The elastic network model (ENM) is an effective and efficient tool in characterizing the intrinsic dynamical properties encoded in biomacromolecule structures. The Gaussian network model (GNM) and anisotropic network model (ANM) are th...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c01178

    authors: Gong W,Liu Y,Zhao Y,Wang S,Han Z,Li C

    update_date:2021-01-26 00:00:00

  • ReFlex3D: Refined Flexible Alignment of Molecules Using Shape and Electrostatics.

    abstract::We present an algorithm, ReFlex3D, for the refinement of flexible molecular alignments based on their three-dimensional shape and electrostatic properties. The algorithm is designed to be used with fast conformer generators to refine an initial overlay between two molecules and thus to obtain improved overlaps as judg...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00618

    authors: Schmidt TC,Cosgrove DA,Boström J

    update_date:2018-04-23 00:00:00

  • Development of a computational tool to rival experts in the prediction of sites of metabolism of xenobiotics by p450s.

    abstract::The metabolism of xenobiotics--and more specifically drugs--in the liver is a critical process controlling their half-life. Although there exist experimental methods, which measure the metabolic stability of xenobiotics and identify their metabolites, developing higher throughput predictive methods is an avenue of res...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci3003073

    authors: Campagna-Slater V,Pottel J,Therrien E,Cantin LD,Moitessier N

    update_date:2012-09-24 00:00:00

  • Polarizable Force Field for Molecular Ions Based on the Classical Drude Oscillator.

    abstract::Development of accurate force field parameters for molecular ions in the context of a polarizable energy function based on the classical Drude oscillator is a crucial step toward an accurate polarizable model for modeling and simulations of biological macromolecules. Toward this goal we have undertaken a hierarchical ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.8b00132

    authors: Lin FY,Lopes PEM,Harder E,Roux B,MacKerell AD Jr

    update_date:2018-05-29 00:00:00

  • Scaffold topologies. 1. Exhaustive enumeration up to eight rings.

    abstract::Mapping the chemical space of small organic molecules is approached from a theoretical graph theory viewpoint, in an effort to begin the systematic exploration of molecular topologies. We present an algorithm for exhaustive generation of scaffold topologies with up to eight rings and an efficient comparison method for...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci7003412

    authors: Pollock SN,Coutsias EA,Wester MJ,Oprea TI

    update_date:2008-07-01 00:00:00

  • Coupling of Zinc-Binding and Secondary Structure in Nonfibrillar Aβ40 Peptide Oligomerization.

    abstract::Nonfibrillar neurotoxic amyloid β (Aβ) oligomer structures are typically rich in β-sheets, which could be promoted by metal ions like Zn(2+). Here, using molecular dynamics (MD) simulations, we systematically examined combinations of Aβ40 peptide conformations and Zn(2+) binding modes to probe the effects of secondary...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.5b00063

    authors: Xu L,Shan S,Chen Y,Wang X,Nussinov R,Ma B

    update_date:2015-06-22 00:00:00

  • iUmami-SCM: A Novel Sequence-Based Predictor for Prediction and Analysis of Umami Peptides Using a Scoring Card Method with Propensity Scores of Dipeptides.

    abstract::Umami or the taste of monosodium glutamate represents one of the major attractive taste modalities in humans. Therefore, knowledge about biophysical and biochemical properties of the umami taste is important for both scientific research and the food industry. Experimental approaches for predicting umami peptides are l...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c00707

    authors: Charoenkwan P,Yana J,Nantasenamat C,Hasan MM,Shoombuatong W

    update_date:2020-12-28 00:00:00

  • Determination of Structural Ensembles of Flexible Molecules in Solution from NMR Data Undergoing Spin Diffusion.

    abstract::Spin diffusion is a formidable problem when interpreting NMR data of chemical compounds. We developed a method to reconstruct the conformational ensemble of flexible molecules displaying spin diffusion, which minimizes the subjective bias in the interpretation of experimental data and which can be used routinely to ob...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00259

    authors: Vasile F,Tiana G

    update_date:2019-06-24 00:00:00

  • Comparison of Implicit and Explicit Solvation Models for Iota-Cyclodextrin Conformation Analysis from Replica Exchange Molecular Dynamics.

    abstract::Large ring cyclodextrins have become increasingly important for drug delivery applications. In this work, we have performed replica-exchange molecular dynamics simulations using both implicit and explicit water solvation models to study the conformational diversity of iota-cyclodextrin containing 14 α-1,4 glycosidic l...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.6b00595

    authors: Khuntawee W,Kunaseth M,Rungnim C,Intagorn S,Wolschann P,Kungwan N,Rungrotmongkol T,Hannongbua S

    update_date:2017-04-24 00:00:00

  • Improved CoMFA modeling by optimization of settings.

    abstract::The possibility of improving the predictive ability of comparative molecular field analysis (CoMFA) by settings optimization has been evaluated to show that CoMFA predictive ability can be improved. Ten different CoMFA settings are evaluated, producing a total of 6120 models. This method has been applied to nine diffe...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci049612j

    authors: Peterson SD,Schaal W,Karlén A

    update_date:2006-01-01 00:00:00

  • Regioselectivity prediction of CYP1A2-mediated phase I metabolism.

    abstract::A kinetic, reactivity-binding model has been proposed to predict the regioselectivity of substrates mediated by the CYP1A2 enzyme, which is responsible for the metabolism of planar-conjugated compounds such as caffeine. This model consists of a docking simulation for binding energy and a semiempirical molecular orbit...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci800001m

    authors: Jung J,Kim ND,Kim SY,Choi I,Cho KH,Oh WS,Kim DN,No KT

    update_date:2008-05-01 00:00:00

  • Modeling pKa Shift in DNA Triplexes Containing Locked Nucleic Acids.

    abstract::The protonation states for nucleic acid bases are difficult to assess experimentally. In the context of DNA triplex, the protonation state of cytidine in the third strand is particularly important, because it needs to be protonated in order to form Hoogsteen hydrogen bonds. A sugar modification, locked nucleic acid (L...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00741

    authors: Hartono YD,Xu Y,Karshikoff A,Nilsson L,Villa A

    update_date:2018-04-23 00:00:00

  • CHARMMing: a new, flexible web portal for CHARMM.

    abstract::A new web portal for the CHARMM macromolecular modeling package, CHARMMing (CHARMM interface and graphics, http://www.charmming.org), is presented. This tool provides a user-friendly interface for the preparation, submission, monitoring, and visualization of molecular simulations (i.e., energy minimization, solvation,...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci800133b

    authors: Miller BT,Singh RP,Klauda JB,Hodoscek M,Brooks BR,Woodcock HL 3rd

    update_date:2008-09-01 00:00:00

  • LiCABEDS II. Modeling of ligand selectivity for G-protein-coupled cannabinoid receptors.

    abstract::The cannabinoid receptor subtype 2 (CB2) is a promising therapeutic target for blood cancer, pain relief, osteoporosis, and immune system disease. The recent withdrawal of Rimonabant, which targets another closely related cannabinoid receptor (CB1), accentuates the importance of selectivity for the development of CB2 ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci3003914

    authors: Ma C,Wang L,Yang P,Myint KZ,Xie XQ

    update_date:2013-01-28 00:00:00

  • ANN multiscale model of anti-HIV drugs activity vs AIDS prevalence in the US at county level based on information indices of molecular graphs and social networks.

    abstract::This work is aimed at describing the workflow for a methodology that combines chemoinformatics and pharmacoepidemiology methods and at reporting the first predictive model developed with this methodology. The new model is able to predict complex networks of AIDS prevalence in the US counties, taking into consideration...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci400716y

    authors: González-Díaz H,Herrera-Ibatá DM,Duardo-Sánchez A,Munteanu CR,Orbegozo-Medina RA,Pazos A

    update_date:2014-03-24 00:00:00

  • Exploring Alternative Strategies for the Identification of Potent Compounds Using Support Vector Machine and Regression Modeling.

    abstract::Support vector regression (SVR) is a premier approach for the prediction of compound potency. Given the conceptual link between support vector machine (SVM) and SVR modeling, SVR is capable of accounting for continuous and discontinuous structure-activity relationships (SARs) in potency prediction, which further exten...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.8b00584

    authors: Miyao T,Funatsu K,Bajorath J

    update_date:2019-03-25 00:00:00

  • L-arginine binding to human inducible nitric oxide synthase: an antisymmetric funnel route toward isoform-specific inhibitors?

    abstract::Nitric oxide (NO) is an important signaling molecule produced by a family of enzymes called nitric oxide synthases (NOS). Because NO is involved in various pathological conditions, the development of potent and isoform-selective NOS inhibitors is an important challenge. In the present study, the dimer of oxygenase dom...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci100422v

    authors: Floquet N,Hernandez JF,Boucher JL,Martinez J

    update_date:2011-06-27 00:00:00