Exploring Alternative Strategies for the Identification of Potent Compounds Using Support Vector Machine and Regression Modeling.

Abstract:

Support vector regression (SVR) is a premier approach for the prediction of compound potency. Given the conceptual link between support vector machine (SVM) and SVR modeling, SVR is capable of accounting for continuous and discontinuous structure-activity relationships (SARs) in potency prediction, which further extends the classical quantitative SAR (QSAR) paradigm. In the context of virtual compound screening, compound potency prediction can be applied to identify the most potent compounds that are available or enrich database selection sets with potent compounds. To these ends, we have evaluated new potency prediction strategies. Conventional (direct) potency prediction using SVR was compared to two-stage SVM-SVR modeling and potency prediction using SVR models trained in the presence of active and inactive compounds, a previously unconsidered approach. The latter models were found to maximize the recall of potent compounds but were least accurate in predicting high potency values. For this purpose, direct SVR predictions were preferred. However, the best balance between accurate potency predictions and enrichment of potent compounds in database selection sets was achieved by combined SVM-SVR modeling. Taken together, our findings further extend current approaches for compound potency prediction in virtual compound screening.
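The difference between direct potency prediction and two-stage SVM-SVR selection can be illustrated with a minimal sketch. It is not the authors' protocol, which the abstract does not detail. The function names `two_stage_select` and `direct_select`, the toy feature tuples, and the stand-in models are all assumptions made for illustration; in practice the two callables might wrap a trained classifier and regressor such as scikit-learn's `SVC` and `SVR`.

```python
from typing import Callable, List, Sequence, Tuple

Features = Sequence[float]


def two_stage_select(
    compounds: Sequence[Features],
    is_active: Callable[[Features], bool],
    predict_potency: Callable[[Features], float],
    top_k: int,
) -> List[Tuple[int, float]]:
    """Return (index, predicted potency) for the top_k predicted actives."""
    # Stage 1: keep only compounds the classifier labels as active.
    candidates = [i for i, x in enumerate(compounds) if is_active(x)]
    # Stage 2: predict potency for the surviving candidates only.
    scored = [(i, predict_potency(compounds[i])) for i in candidates]
    # Rank by predicted potency, highest first, and truncate the selection set.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def direct_select(
    compounds: Sequence[Features],
    predict_potency: Callable[[Features], float],
    top_k: int,
) -> List[Tuple[int, float]]:
    """Rank every compound by predicted potency without a classifier filter."""
    scored = [(i, predict_potency(x)) for i, x in enumerate(compounds)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical stand-ins for trained models; real models would be fitted to data.
toy_compounds = [(0.9, 0.1), (0.2, 0.8), (0.7, 0.6), (0.1, 0.1)]
toy_classifier = lambda x: x[0] >= 0.5       # plays the role of the SVM
toy_regressor = lambda x: 5.0 + 3.0 * x[1]   # plays the role of the SVR

two_stage = two_stage_select(toy_compounds, toy_classifier, toy_regressor, top_k=2)
direct = direct_select(toy_compounds, toy_regressor, top_k=2)
```

In this toy setup the second compound receives the highest predicted potency but is filtered out by the classifier, so the two strategies return different selection sets. This is the kind of trade-off between potency accuracy and enrichment that the abstract discusses.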

journal_name

J Chem Inf Model

authors

Miyao T,Funatsu K,Bajorath J

doi

10.1021/acs.jcim.8b00584

subject

Has Abstract

pub_date

2019-03-25 00:00:00

pages

983-992

issue

3

eissn

1549-960X

issn

1549-9596

journal_volume

59

pub_type

journal article
  • Dihedral-based segment identification and classification of biopolymers I: proteins.

    abstract::A new structure classification scheme for biopolymers is introduced, which is solely based on main-chain dihedral angles. It is shown that by dividing a biopolymer into segments containing two central residues, a local classification can be performed. The method is referred to as DISICL, short for Dihedral-based Segme...

    journal_title:Journal of chemical information and modeling

    pub_type: journal article

    doi:10.1021/ci400541d

    authors: Nagy G,Oostenbrink C

    update_date:2014-01-27 00:00:00

  • Relationships between Molecular Complexity, Biological Activity, and Structural Diversity.

    abstract::Following the theoretical model by Hann et al. moderately complex structures are preferable lead compounds since they lead to specific binding events involving the complete ligand molecule. To make this concept usable in practice for library design, we studied several complexity measures on the biological activity of ...

    journal_title:Journal of chemical information and modeling

    pub_type: journal article

    doi:10.1021/ci0503558

    authors: Schuffenhauer A,Brown N,Selzer P,Ertl P,Jacoby E

    update_date:2006-03-01 00:00:00

  • Ligand- and Structure-Based Analysis of Deep Learning-Generated Potential α2a Adrenoceptor Agonists.

    abstract::The α2a adrenoceptor is a medically relevant subtype of the G protein-coupled receptor family. Unfortunately, high-throughput techniques aimed at producing novel drug leads for this receptor have been largely unsuccessful because of the complex pharmacology of adrenergic receptors. As such, cutting-edge in silico liga...

    journal_title:Journal of chemical information and modeling

    pub_type: journal article

    doi:10.1021/acs.jcim.0c01019

    authors: Schultz KJ,Colby SM,Lin VS,Wright AT,Renslow RS

    update_date:2021-01-25 00:00:00

  • Use of surface charges from DFT calculations to predict intestinal absorption.

    abstract::A model for prediction of percent intestinal absorption (%Abs) of neutral molecules was developed based upon surface charges of the molecule calculated by density functional theory (DFT). The surface charges are decomposed into sigma moments which are correlated to a partition coefficient representing transfer of the ...

    journal_title:Journal of chemical information and modeling

    pub_type: journal article

    doi:10.1021/ci049653f

    authors: Jones R,Connolly PC,Klamt A,Diedenhofen M

    update_date:2005-09-01 00:00:00

  • Applicability Domain ANalysis (ADAN): a robust method for assessing the reliability of drug property predictions.

    abstract::We report a novel method called ADAN (Applicability Domain ANalysis) for assessing the reliability of drug property predictions obtained by in silico methods. The assessment provided by ADAN is based on the comparison of the query compound with the training set, using six diverse similarity criteria. For every criteri...

    journal_title:Journal of chemical information and modeling

    pub_type: journal article

    doi:10.1021/ci500172z

    authors: Carrió P,Pinto M,Ecker G,Sanz F,Pastor M

    update_date:2014-05-27 00:00:00

  • Probing the Binding Pathway of BRACO19 to a Parallel-Stranded Human Telomeric G-Quadruplex Using Molecular Dynamics Binding Simulation with AMBER DNA OL15 and Ligand GAFF2 Force Fields.

    abstract::Human telomeric DNA G-quadruplex has been identified as a good therapeutic target in cancer treatment. G-quadruplex-specific ligands that stabilize the G-quadruplex have great potential to be developed as anticancer agents. Two crystal structures (an apo form of parallel stranded human telomeric G-quadruplex and its h...

    journal_title:Journal of chemical information and modeling

    pub_type: journal article

    doi:10.1021/acs.jcim.7b00287

    authors: Machireddy B,Kalra G,Jonnalagadda S,Ramanujachary K,Wu C

    update_date:2017-11-27 00:00:00

  • Identification of ligand templates using local structure alignment for structure-based drug design.

    abstract::With a rapid increase in the number of high-resolution protein-ligand structures, the known protein-ligand structures can be used to gain insight into ligand-binding modes in a target protein. On the basis of the fact that the structurally similar binding sites share information about their ligands, we have developed ...

    journal_title:Journal of chemical information and modeling

    pub_type: journal article

    doi:10.1021/ci300178e

    authors: Lee HS,Im W

    update_date:2012-10-22 00:00:00

  • Flux (2): comparison of molecular mutation and crossover operators for ligand-based de novo design.

    abstract::We implemented a fragment-based de novo design algorithm for a population-based optimization of molecular structures. The concept is grounded on an evolution strategy with mutation and crossover operators for structure breeding. Molecular building blocks were obtained from the pseudo-retrosynthesis of a collection of ...

    journal_title:Journal of chemical information and modeling

    pub_type: journal article

    doi:10.1021/ci6005307

    authors: Fechner U,Schneider G

    update_date:2007-03-01 00:00:00

  • Retrospect and Prospect of Single Particle Cryo-Electron Microscopy: The Class of Integral Membrane Proteins as an Example.

    abstract::A giant technological leap in the field of cryo-electron microscopy (cryo-EM) has assured the achievement of near-atomic resolution structures of biological macromolecules. As a recognition of this accomplishment, the Nobel Prize in Chemistry was awarded in 2017 to Jacques Dubochet, Joachim Frank, and Richard Henderso...

    journal_title:Journal of chemical information and modeling

    pub_type: journal article

    doi:10.1021/acs.jcim.9b01015

    authors: Akbar S,Mozumder S,Sengupta J

    update_date:2020-05-26 00:00:00

  • Jaqpot Quattro: A Novel Computational Web Platform for Modeling and Analysis in Nanoinformatics.

    abstract::Engineered nanomaterials (ENMs) are increasingly infiltrating our lives as a result of their applications across multiple fields. However, ENM formulations may result in the modulation of pathways and mechanisms of toxic action that endanger human health and the environment. Alternative testing methods such as in sili...

    journal_title:Journal of chemical information and modeling

    pub_type: journal article

    doi:10.1021/acs.jcim.7b00223

    authors: Chomenidis C,Drakakis G,Tsiliki G,Anagnostopoulou E,Valsamis A,Doganis P,Sopasakis P,Sarimveis H

    update_date:2017-09-25 00:00:00

  • Whole-molecule calculation of log p based on molar volume, hydrogen bonds, and simulated 13C NMR spectra.

    abstract::The prediction of Log P is usually accomplished using either substructure or whole-molecule approaches. However, these methods are complicated, and previous whole-molecule approaches have not been successful for the prediction of Log P in very complex molecules. The observed chemical shifts in nuclear magnetic resonan...

    journal_title:Journal of chemical information and modeling

    pub_type: journal article

    doi:10.1021/ci049643e

    authors: Schnackenberg LK,Beger RD

    update_date:2005-03-01 00:00:00

  • Expert system for predicting reaction conditions: the Michael reaction case.

    abstract::A generic chemical transformation may often be achieved under various synthetic conditions. However, for any specific reagents, only one or a few among the reported synthetic protocols may be successful. For example, Michael β-addition reactions may proceed under different choices of solvent (e.g., hydrophobic, aproti...

    journal_title:Journal of chemical information and modeling

    pub_type: journal article

    doi:10.1021/ci500698a

    authors: Marcou G,Aires de Sousa J,Latino DA,de Luca A,Horvath D,Rietsch V,Varnek A

    update_date:2015-02-23 00:00:00

  • GPCR-Bench: A Benchmarking Set and Practitioners' Guide for G Protein-Coupled Receptor Docking.

    abstract::Virtual screening is routinely used to discover new ligands and in particular new ligand chemotypes for G protein-coupled receptors (GPCRs). To prepare for a virtual screen, we often tailor a docking protocol that will enable us to select the best candidates for further screening. To aid this, we created GPCR-Bench, a...

    journal_title:Journal of chemical information and modeling

    pub_type: journal article

    doi:10.1021/acs.jcim.5b00660

    authors: Weiss DR,Bortolato A,Tehan B,Mason JS

    update_date:2016-04-25 00:00:00

  • Potent Human Telomerase Inhibitors: Molecular Dynamic Simulations, Multiple Pharmacophore-Based Virtual Screening, and Biochemical Assays.

    abstract::Telomere maintenance is a universal cancer hallmark, and small molecules that disrupt telomere maintenance generally have anticancer properties. Since the vast majority of cancer cells utilize telomerase activity for telomere maintenance, the enzyme has been considered as an anticancer drug target. Recently, rational ...

    journal_title:Journal of chemical information and modeling

    pub_type: journal article

    doi:10.1021/acs.jcim.5b00336

    authors: Shirgahi Talari F,Bagherzadeh K,Golestanian S,Jarstfer M,Amanlou M

    update_date:2015-12-28 00:00:00

  • Turbocharging Matched Molecular Pair Analysis: Optimizing the Identification and Analysis of Pairs.

    abstract::We have applied the two most commonly used methods for automatic matched pair identification, obtained the optimum settings, and discovered that the two methods are synergistic. A turbocharging approach to matched pair analysis is advocated in which a first round (a conservative categorical approach that uses an analo...

    journal_title:Journal of chemical information and modeling

    pub_type: journal article

    doi:10.1021/acs.jcim.7b00335

    authors: Lukac I,Zarnecka J,Griffen EJ,Dossetter AG,St-Gallay SA,Enoch SJ,Madden JC,Leach AG

    update_date:2017-10-23 00:00:00

  • Improving Protein-Ligand Docking Results with High-Throughput Molecular Dynamics Simulations.

    abstract::Structure-based virtual screening relies on classical scoring functions that often fail to reliably discriminate binders from nonbinders. In this work, we present a high-throughput protein-ligand complex molecular dynamics (MD) simulation that uses the output from AutoDock Vina to improve docking results in distinguis...

    journal_title:Journal of chemical information and modeling

    pub_type: journal article

    doi:10.1021/acs.jcim.0c00057

    authors: Guterres H,Im W

    update_date:2020-04-27 00:00:00

  • Criterion for evaluating the predictive ability of nonlinear regression models without cross-validation.

    abstract::We propose predictive performance criteria for nonlinear regression models without cross-validation. The proposed criteria are the determination coefficient and the root-mean-square error for the midpoints between k-nearest-neighbor data points. These criteria can be used to evaluate predictive ability after the regre...

    journal_title:Journal of chemical information and modeling

    pub_type: journal article

    doi:10.1021/ci4003766

    authors: Kaneko H,Funatsu K

    update_date:2013-09-23 00:00:00

  • SARANEA: a freely available program to mine structure-activity and structure-selectivity relationship information in compound data sets.

    abstract::We introduce SARANEA, an open-source Java application for interactive exploration of structure-activity relationship (SAR) and structure-selectivity relationship (SSR) information in compound sets of any source. SARANEA integrates various SAR and SSR analysis functions and utilizes a network-like similarity graph data...

    journal_title:Journal of chemical information and modeling

    pub_type: journal article

    doi:10.1021/ci900416a

    authors: Lounkine E,Wawer M,Wassermann AM,Bajorath J

    update_date:2010-01-01 00:00:00

  • In silico deconstruction of ATP-competitive inhibitors of glycogen synthase kinase-3β.

    abstract::Fragment-based methods have emerged in the last two decades as alternatives to traditional high throughput screenings for the identification of chemical starting points in drug discovery. One arguable yet popular assumption about fragment-based design is that the fragment binding mode remains conserved upon chemical e...

    journal_title:Journal of chemical information and modeling

    pub_type: journal article

    doi:10.1021/ci300355p

    authors: Bisignano P,Lambruschini C,Bicego M,Murino V,Favia AD,Cavalli A

    update_date:2012-12-21 00:00:00

  • Benchmark performance of MultiCASE Inc. software in Ames mutagenicity set.

    abstract::The predictive performances of MC4PC were evaluated using its learning machine functionality. Its superior characteristics are demonstrated in this following up study using the newly published Ames mutagenicity benchmark set. ...

    journal_title:Journal of chemical information and modeling

    pub_type: comment, letter

    doi:10.1021/ci1000899

    authors: Saiakhov RD,Klopman G

    update_date:2010-09-27 00:00:00

  • Template CoMFA: the 3D-QSAR Grail?

    abstract::Template CoMFA, a novel alignment methodology for training or test set structures in 3D-QSAR, is introduced. Its two most significant advantages are its complete automation and its ability to derive a single combined model from multiple structural series affecting a biological target. Its only two inputs are one or mo...

    journal_title:Journal of chemical information and modeling

    pub_type: journal article

    doi:10.1021/ci400696v

    authors: Cramer RD,Wendt B

    update_date:2014-02-24 00:00:00

  • Virtual Screening with Generative Topographic Maps: How Many Maps Are Required?

    abstract::Universal generative topographic maps (GTMs) provide two-dimensional representations of chemical space selected for their "polypharmacological competence", that is, the ability to simultaneously represent meaningful activity and property landscapes, associated with many distinct targets and properties. Several such GT...

    journal_title:Journal of chemical information and modeling

    pub_type: journal article

    doi:10.1021/acs.jcim.8b00650

    authors: Casciuc I,Zabolotna Y,Horvath D,Marcou G,Bajorath J,Varnek A

    update_date:2019-01-28 00:00:00

  • "Social" network of isomers based on bond count distance: algorithms.

    abstract::This paper introduces the concept of an isomer network based on the reaction step counts between pairs of isomers as an alternative means to view and analyze isomer space. The computation of isomer networks is computationally expensive with respect to both run time and memory. Accordingly, this paper focuses on the de...

    journal_title:Journal of chemical information and modeling

    pub_type: journal article

    doi:10.1021/ci4005173

    authors: Kouri TM,Awale M,Slyby JK,Reymond JL,Mehta DP

    update_date:2014-01-27 00:00:00

  • Structural and Functional Characterization of Allatostatin Receptor Type-C of Thaumetopoea pityocampa, a Potential Target for Next-Generation Pest Control Agents.

    abstract::Insect neuropeptide receptors, including allatostatin receptor type C (AstR-C), a G protein-coupled receptor, are among the potential targets for designing next-generation pesticides that despite their importance in offering a new mode-of-action have been overlooked. Focusing on AstR-C of Thaumetopoea pityocampa, a co...

    journal_title:Journal of chemical information and modeling

    pub_type: journal article

    doi:10.1021/acs.jcim.0c00985

    authors: Shahraki A,Işbilir A,Dogan B,Lohse MJ,Durdagi S,Birgul-Iyison N

    update_date:2021-01-21 00:00:00

  • BiKi Life Sciences: A New Suite for Molecular Dynamics and Related Methods in Drug Discovery.

    abstract::In this paper, we introduce the BiKi Life Sciences suite. This software makes it easy for computational medicinal chemists to run ad hoc molecular dynamics protocols in a novel and task-oriented environment; as a notebook, BiKi (acronym of Binding Kinetics) keeps memory of any activity together with dependencies among...

    journal_title:Journal of chemical information and modeling

    pub_type: journal article

    doi:10.1021/acs.jcim.7b00680

    authors: Decherchi S,Bottegoni G,Spitaleri A,Rocchia W,Cavalli A

    update_date:2018-02-26 00:00:00

  • Concept-based semi-automatic classification of drugs.

    abstract::The anatomical therapeutic chemical (ATC) classification system maintained by the World Health Organization provides a global standard for the classification of medical substances and serves as a source for drug repurposing research. Nevertheless, it lacks several drugs that are major players in the global drug market...

    journal_title:Journal of chemical information and modeling

    pub_type: journal article

    doi:10.1021/ci9000844

    authors: Gurulingappa H,Kolárik C,Hofmann-Apitius M,Fluck J

    update_date:2009-08-01 00:00:00

  • A Coarse-Grained Force Field Parameterized for MgCl2 and CaCl2 Aqueous Solutions.

    abstract::Calcium and magnesium ions play important roles in many physicochemical processes. To facilitate the investigation of phenomena related to these ions that occur over large length and time scales, a coarse-grained force field (CGFF) is developed for MgCl2 and CaCl2 aqueous solutions. The ions are modeled by CG beads wi...

    journal_title:Journal of chemical information and modeling

    pub_type: journal article

    doi:10.1021/acs.jcim.7b00206

    authors: Gong Z,Sun H

    update_date:2017-07-24 00:00:00

  • Including explicit water molecules as part of the protein structure in MM/PBSA calculations.

    abstract::Water is the natural medium of molecules in the cell and plays an important role in protein structure, function and interaction with small molecule ligands. However, the widely used molecular mechanics Poisson-Boltzmann surface area (MM/PBSA) method for binding energy calculation does not explicitly take account of wa...

    journal_title:Journal of chemical information and modeling

    pub_type: journal article

    doi:10.1021/ci4001794

    authors: Zhu YL,Beroza P,Artis DR

    update_date:2014-02-24 00:00:00

  • Improving classical substructure-based virtual screening to handle extrapolation challenges.

    abstract::Target-oriented substructure-based virtual screening (sSBVS) of molecules is a promising approach in drug discovery. Yet, there are doubts whether sSBVS is suitable also for extrapolation, that is, for detecting molecules that are very different from those used for training. Herein, we evaluate the predictive power of...

    journal_title:Journal of chemical information and modeling

    pub_type: journal article

    doi:10.1021/ci200472s

    authors: Biniashvili T,Schreiber E,Kliger Y

    update_date:2012-03-26 00:00:00

  • An integrated approach to ligand- and structure-based drug design: development and application to a series of serine protease inhibitors.

    abstract::A novel approach was developed to rationally interface structure- and ligand-based drug design through the rescoring of docking poses and automated generation of molecular alignments for 3D quantitative structure-activity relationship investigations. The procedure was driven by a genetic algorithm optimizing the value...

    journal_title:Journal of chemical information and modeling

    pub_type: journal article

    doi:10.1021/ci800015s

    authors: Nicolotti O,Miscioscia TF,Carotti A,Leonetti F,Carotti A

    update_date:2008-06-01 00:00:00