Advantages of Relative versus Absolute Data for the Development of Quantitative Structure-Activity Relationship Classification Models.

Abstract:

The appropriate selection of a chemical space represented by the data set, the selection of its chemical data representation, the development of a correct modeling process using a robust and reproducible algorithm, and the performance of an exhaustive training and external validation determine the usability and reproducibility of a quantitative structure-activity relationship (QSAR) classification model. In this paper, we show that the use of relative versus absolute data in the representation of the data sets produces better classification models when the other processes are not modified. Relative data considers a reference frame to measure the chemical characteristics involved in the classification model, refining the data set representation and smoothing the lack of chemical information. Three data sets with different characteristics have been used in this study, and classification models have been built applying the support vector machine algorithm. For randomly selected training and test sets, values of accuracy and area under the receiver operating characteristic curve close to 100% have been obtained for the generation of the models and external validations in all cases.
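
As a rough illustration of the terms used in the abstract, the sketch below shows one possible reading of a "relative" representation (descriptor values expressed as differences from a reference vector) together with the two reported metrics, accuracy and area under the ROC curve. The abstract does not define how the authors build their reference frame, so `to_relative` and its `reference` argument are hypothetical; only the metric definitions are standard. Training the support vector machine itself is omitted.

```python
from typing import List, Sequence


def to_relative(descriptors: Sequence[Sequence[float]],
                reference: Sequence[float]) -> List[List[float]]:
    """Hypothetical relative representation: each descriptor vector is
    re-expressed as its difference from a chosen reference vector.
    This is an assumption, not the paper's stated procedure."""
    return [[x - r for x, r in zip(row, reference)] for row in descriptors]


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples whose predicted class matches the true class."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a random
    positive sample scores higher than a random negative one (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    absolute = [[1.0, 2.0], [3.0, 5.0]]
    print(to_relative(absolute, reference=[1.0, 1.0]))
    print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))
    print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

In practice the classifier scores passed to `roc_auc` would come from a model fitted on the training set and evaluated on the held-out test set, which is the external validation the abstract refers to.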

journal_name

J Chem Inf Model

authors

Ruiz IL,Gómez-Nieto MÁ

doi

10.1021/acs.jcim.7b00492

pub_date

2017-11-27 00:00:00

pages

2776-2788

issue

11

eissn

1549-960X

issn

1549-9596

journal_volume

57

pub_type

Journal Article
  • COSMOsar3D: molecular field analysis based on local COSMO σ-profiles.

    abstract::The COSMO surface polarization charge density σ resulting from quantum chemical calculations combined with a virtual conductor embedding has been widely proven to be a very suitable descriptor for the quantification of interactions of molecules in liquids. In a preceding paper, grid-based local histograms of σ have be...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci300231t

    authors: Klamt A,Thormann M,Wichmann K,Tosco P

    update_date:2012-08-27 00:00:00

  • Comparison of Implicit and Explicit Solvation Models for Iota-Cyclodextrin Conformation Analysis from Replica Exchange Molecular Dynamics.

    abstract::Large ring cyclodextrins have become increasingly important for drug delivery applications. In this work, we have performed replica-exchange molecular dynamics simulations using both implicit and explicit water solvation models to study the conformational diversity of iota-cyclodextrin containing 14 α-1,4 glycosidic l...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.6b00595

    authors: Khuntawee W,Kunaseth M,Rungnim C,Intagorn S,Wolschann P,Kungwan N,Rungrotmongkol T,Hannongbua S

    update_date:2017-04-24 00:00:00

  • Chemical Topic Modeling: Exploring Molecular Data Sets Using a Common Text-Mining Approach.

    abstract::Big data is one of the key transformative factors which increasingly influences all aspects of modern life. Although this transformation brings vast opportunities it also generates novel challenges, not the least of which is organizing and searching this data deluge. The field of medicinal chemistry is not different: ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00249

    authors: Schneider N,Fechner N,Landrum GA,Stiefl N

    update_date:2017-08-28 00:00:00

  • Assessment of the Sampling Performance of Multiple-Copy Dynamics versus a Unique Trajectory.

    abstract::The goal of the present study was to ascertain the differential performance of a long molecular dynamics trajectory versus several shorter ones starting from different points in the phase space and covering the same sampling time. For this purpose, we selected the 16-mer peptide Bak16BH3 as a model for study and carri...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.6b00347

    authors: Perez JJ,Tomas MS,Rubio-Martinez J

    update_date:2016-10-24 00:00:00

  • ColBioS-FlavRC: a collection of bioselective flavonoids and related compounds filtered from high-throughput screening outcomes.

    abstract::Flavonoids, the vastest class of natural polyphenols, are extensively investigated for their multiple benefits on human health. Due to their physicochemical or biological properties, many representatives are considered to exhibit low selectivity among various protein targets or to plague high-throughput screening (HTS...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci5002668

    authors: Avram SI,Pacureanu LM,Bora A,Crisan L,Avram S,Kurunczi L

    update_date:2014-08-25 00:00:00

  • Scaling predictive modeling in drug development with cloud computing.

    abstract::Growing data sets with increased time for analysis is hampering predictive modeling in drug discovery. Model building can be carried out on high-performance computer clusters, but these can be expensive to purchase and maintain. We have evaluated ligand-based modeling on cloud computing resources where computations ar...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci500580y

    authors: Moghadam BT,Alvarsson J,Holm M,Eklund M,Carlsson L,Spjuth O

    update_date:2015-01-26 00:00:00

  • Binding of Cytotoxic Aβ25-35 Peptide to the Dimyristoylphosphatidylcholine Lipid Bilayer.

    abstract::Aβ25-35 is a short, cytotoxic, and naturally occurring fragment of the Alzheimer's Aβ peptide. To map the molecular mechanism of Aβ25-35 binding to the zwitterionic dimyristoylphosphatidylcholine (DMPC) bilayer, we have performed replica exchange with solute tempering molecular dynamics simulations using all-atom expl...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.8b00045

    authors: Smith AK,Klimov DK

    update_date:2018-05-29 00:00:00

  • The ensemble performance index: an improved measure for assessing ensemble pose prediction performance.

    abstract::We present a theoretical study on the performance of ensemble docking methodologies considering multiple protein structures. We perform a theoretical analysis of pose prediction experiments which is completely unbiased, as we make no assumptions about specific scoring functions, search paradigms, protein structures, o...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci2002796

    authors: Korb O,McCabe P,Cole J

    update_date:2011-11-28 00:00:00

  • Insights on the facet specific adsorption of amino acids and peptides toward platinum.

    abstract::Engineering shape-controlled bionanomaterials requires comprehensive understanding of interactions between biomolecules and inorganic surfaces. We explore the origin of facet-selective binding of peptides adsorbed onto Pt(100) and Pt(111) crystallographic planes. Using molecular dynamics simulations, we show that upon...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci400630d

    authors: Ramakrishnan SK,Martin M,Cloitre T,Firlej L,Cuisinier FJ,Gergely C

    update_date:2013-12-23 00:00:00

  • Toward high throughput 3D virtual screening using spherical harmonic surface representations.

    abstract::Searching chemical databases for possible drug leads is often one of the main activities conducted during the early stages of a drug development project. This article shows that spherical harmonic molecular shape representations provide a powerful way to search and cluster small-molecule databases rapidly and accurate...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci7001507

    authors: Mavridis L,Hudson BD,Ritchie DW

    update_date:2007-09-01 00:00:00

  • Virtual exploration of the chemical universe up to 11 atoms of C, N, O, F: assembly of 26.4 million structures (110.9 million stereoisomers) and analysis for new ring systems, stereochemistry, physicochemical properties, compound classes, and drug discove

    abstract::All molecules of up to 11 atoms of C, N, O, and F possible under consideration of simple valency, chemical stability, and synthetic feasibility rules were generated and collected in a database (GDB). GDB contains 26.4 million molecules (110.9 million stereoisomers), including three- and four-membered rings and triple ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci600423u

    authors: Fink T,Reymond JL

    update_date:2007-03-01 00:00:00

  • ChemSchematicResolver: A Toolkit to Decode 2D Chemical Diagrams with Labels and R-Groups into Annotated Chemical Named Entities.

    abstract::The number of journal articles in the scientific domain has grown to the point where it has become impossible for researchers to capitalize on all findings in their relevant discipline. Information is stored in these articles in a number of ways, including figures that describe important results. In organic chemistry,...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c00042

    authors: Beard EJ,Cole JM

    update_date:2020-04-27 00:00:00

  • Imputation of Assay Bioactivity Data Using Deep Learning.

    abstract::We describe a novel deep learning neural network method and its application to impute assay pIC50 values. Unlike conventional machine learning approaches, this method is trained on sparse bioactivity data as input, typical of that found in public and commercial databases, enabling it to learn directly from correlation...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.8b00768

    authors: Whitehead TM,Irwin BWJ,Hunt P,Segall MD,Conduit GJ

    update_date:2019-03-25 00:00:00

  • Automated pharmacophore query optimization with genetic algorithms - a case study using the MC4R system.

    abstract::Due to the recent availability of high quality small molecule databases, such as ZINC and PubChem,1,2 virtual screening is playing an even more important role in identifying biologically relevant molecules in drug discovery campaigns. The success of pharmacophore-based virtual screening (PBVS) relies largely on the ac...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci700089w

    authors: Jia L,Zou J,So SS,Sun H

    update_date:2007-07-01 00:00:00

  • Effects of Ligand Environment in Zr(IV) Assisted Peptide Hydrolysis.

    abstract::In this DFT study, activities of 11 different N2O4, N2O3, and NO2 core containing Zr(IV) complexes, 4,13-diaza-18-crown-6 (I'N2O4), 1,4,10-trioxa-7,13-diazacyclopentadecane (I'N2O3), and 2-(2-methoxy)ethanol (I'NO2), respectively, and their analogues in peptide hydrolysis have been investigated. Based on the experimen...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.6b00781

    authors: Zhang T,Sharma G,Paul TJ,Hoffmann Z,Prabhakar R

    update_date:2017-05-22 00:00:00

  • Template CoMFA: the 3D-QSAR Grail?

    abstract::Template CoMFA, a novel alignment methodology for training or test set structures in 3D-QSAR, is introduced. Its two most significant advantages are its complete automation and its ability to derive a single combined model from multiple structural series affecting a biological target. Its only two inputs are one or mo...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci400696v

    authors: Cramer RD,Wendt B

    update_date:2014-02-24 00:00:00

  • Lessons learned in empirical scoring with smina from the CSAR 2011 benchmarking exercise.

    abstract::We describe a general methodology for designing an empirical scoring function and provide smina, a version of AutoDock Vina specially optimized to support high-throughput scoring and user-specified custom scoring functions. Using our general method, the unique capabilities of smina, a set of default interaction terms ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci300604z

    authors: Koes DR,Baumgartner MP,Camacho CJ

    update_date:2013-08-26 00:00:00

  • An Efficient Lossless Compression Algorithm for Trajectories of Atom Positions and Volumetric Data.

    abstract::We present our newly developed and highly efficient lossless compression algorithm for trajectories of atom positions and volumetric data. The algorithm is designed as a two-step approach. In the first step, efficient polynomial extrapolation schemes reduce the information entropy of the data by exploiting both spatia...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.8b00501

    authors: Brehm M,Thomas M

    update_date:2018-10-22 00:00:00

  • Prediction of pH-dependent aqueous solubility of druglike molecules.

    abstract::In the present work, the Henderson-Hasselbalch (HH) equation has been employed for the development of a tool for the prediction of pH-dependent aqueous solubility of drugs and drug candidates. A new prediction method for the intrinsic solubility was developed, based on artificial neural networks that have been trained...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci600292q

    authors: Hansen NT,Kouskoumvekaki I,Jørgensen FS,Brunak S,Jónsdóttir SO

    update_date:2006-11-01 00:00:00

  • iUmami-SCM: A Novel Sequence-Based Predictor for Prediction and Analysis of Umami Peptides Using a Scoring Card Method with Propensity Scores of Dipeptides.

    abstract::Umami or the taste of monosodium glutamate represents one of the major attractive taste modalities in humans. Therefore, knowledge about biophysical and biochemical properties of the umami taste is important for both scientific research and the food industry. Experimental approaches for predicting umami peptides are l...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c00707

    authors: Charoenkwan P,Yana J,Nantasenamat C,Hasan MM,Shoombuatong W

    update_date:2020-12-28 00:00:00

  • Assessing the Conformational Equilibrium of Carboxylic Acid via Quantum Mechanical and Molecular Dynamics Studies on Acetic Acid.

    abstract::Accurate hydrogen placement in molecular modeling is crucial for studying the interactions and dynamics of biomolecular systems. The carboxyl functional group is a prototypical example of a functional group that requires protonation during structure preparation. To our knowledge, when in their neutral form, carboxylic...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.8b00835

    authors: Lim VT,Bayly CI,Fusti-Molnar L,Mobley DL

    update_date:2019-05-28 00:00:00

  • Combined approach using ligand efficiency, cross-docking, and antitarget hits for wild-type and drug-resistant Y181C HIV-1 reverse transcriptase.

    abstract::New hits against HIV-1 wild-type and Y181C drug-resistant reverse transcriptases were predicted taking into account the possibility of some of the known metabolism interactions. In silico hits against a set of antitargets (i.e., proteins or nucleic acids that are off-targets from the desired pharmaceutical target obje...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci200203h

    authors: García-Sosa AT,Sild S,Takkis K,Maran U

    update_date:2011-10-24 00:00:00

  • Elements of nucleotide specificity in the Trypanosoma brucei mitochondrial RNA editing enzyme RET2.

    abstract::The causative agent of African sleeping sickness, Trypanosoma brucei, undergoes an unusual mitochondrial RNA editing process that is essential for its survival. RNA editing terminal uridylyl transferase 2 of T. brucei (TbRET2) is an indispensable component of the editosome machinery that performs this editing. TbR...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci3001327

    authors: Demir Ö,Amaro RE

    update_date:2012-05-25 00:00:00

  • Full and partial agonism of ionotropic glutamate receptors indicated by molecular dynamics simulations.

    abstract::Ionotropic glutamate receptors (iGluRs) are synaptic proteins that facilitate signal transmission in the central nervous system. Extracellular iGluR cleft closure is linked to receptor activation; however, the mechanism underlying partial agonism is not entirely understood. Full agonists close the bilobed ligand-bindi...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci2000055

    authors: Postila PA,Ylilauri M,Pentikäinen OT

    update_date:2011-05-23 00:00:00

  • GalaxyGPCRloop: Template-Based and Ab Initio Structure Sampling of the Extracellular Loops of G-Protein-Coupled Receptors.

    abstract::The second extracellular loops (ECL2s) of G-protein-coupled receptors (GPCRs) are often involved in GPCR functions, and their structures have important implications in drug discovery. However, structure prediction of ECL2 is difficult because of its long length and the structural diversity among different GPCRs. In th...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.8b00148

    authors: Won J,Lee GR,Park H,Seok C

    update_date:2018-06-25 00:00:00

  • In silico deconstruction of ATP-competitive inhibitors of glycogen synthase kinase-3β.

    abstract::Fragment-based methods have emerged in the last two decades as alternatives to traditional high throughput screenings for the identification of chemical starting points in drug discovery. One arguable yet popular assumption about fragment-based design is that the fragment binding mode remains conserved upon chemical e...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci300355p

    authors: Bisignano P,Lambruschini C,Bicego M,Murino V,Favia AD,Cavalli A

    update_date:2012-12-21 00:00:00

  • Prediction of molecular solvation free energy based on the optimization of atomic solvation parameters with genetic algorithm.

    abstract::We propose an improved solvent contact model to estimate the solvation free energy of an organic molecule from individual atomic contributions. The modification of the solvation model involves the optimization of three kinds of parameters in the solvation free energy function: atomic fragmental volume, maximum atomic ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci600453b

    authors: Kang H,Choi H,Park H

    update_date:2007-03-01 00:00:00

  • In silico target predictions: defining a benchmarking data set and comparison of performance of the multiclass Naïve Bayes and Parzen-Rosenblatt window.

    abstract::In this study, two probabilistic machine-learning algorithms were compared for in silico target prediction of bioactive molecules, namely the well-established Laplacian-modified Naïve Bayes classifier (NB) and the more recently introduced (to Cheminformatics) Parzen-Rosenblatt Window. Both classifiers were trained in ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci300435j

    authors: Koutsoukas A,Lowe R,Kalantarmotamedi Y,Mussa HY,Klaffke W,Mitchell JB,Glen RC,Bender A

    update_date:2013-08-26 00:00:00

  • FAME 3: Predicting the Sites of Metabolism in Synthetic Compounds and Natural Products for Phase 1 and Phase 2 Metabolic Enzymes.

    abstract::In this work we present the third generation of FAst MEtabolizer (FAME 3), a collection of extra trees classifiers for the prediction of sites of metabolism (SoMs) in small molecules such as drugs, druglike compounds, natural products, agrochemicals, and cosmetics. FAME 3 was derived from the MetaQSAR database ( Pedre...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00376

    authors: Šícho M,Stork C,Mazzolari A,de Bruyn Kops C,Pedretti A,Testa B,Vistoli G,Svozil D,Kirchmair J

    update_date:2019-08-26 00:00:00

  • Develop and test a solvent accessible surface area-based model in conformational entropy calculations.

    abstract::It is of great interest in modern drug design to accurately calculate the free energies of protein-ligand or nucleic acid-ligand binding. MM-PBSA (molecular mechanics Poisson-Boltzmann surface area) and MM-GBSA (molecular mechanics generalized Born surface area) have gained popularity in this field. For both methods, ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci300064d

    authors: Wang J,Hou T

    update_date:2012-05-25 00:00:00