Imputation of Assay Bioactivity Data Using Deep Learning.

Abstract:

We describe a novel deep learning neural network method and its application to impute assay pIC50 values. Unlike conventional machine learning approaches, this method is trained on sparse bioactivity data as input, typical of that found in public and commercial databases, enabling it to learn directly from correlations between activities measured in different assays. In two case studies on public domain data sets we show that the neural network method outperforms traditional quantitative structure-activity relationship (QSAR) models and other leading approaches. Furthermore, by focusing on only the most confident predictions the accuracy is increased to R² > 0.9 using our method, as compared to R² = 0.44 when reporting all predictions.

journal_name

J Chem Inf Model

authors

Whitehead TM,Irwin BWJ,Hunt P,Segall MD,Conduit GJ

doi

10.1021/acs.jcim.8b00768

pub_date

2019-03-25 00:00:00

pages

1197-1204

issue

3

eissn

1549-960X

issn

1549-9596

journal_volume

59

pub_type

Journal Article
  • Nonadditivity Analysis.

    abstract::We introduce the statistics behind a novel type of SAR analysis named "nonadditivity analysis". On the basis of all pairs of matched pairs within a given data set, the approach analyzes whether the same transformations between related molecules have the same effect, i.e., whether they are additive. Assuming that the e...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00631

    authors: Kramer C

    update_date:2019-09-23 00:00:00

  • Building Graphs To Describe Dynamics, Kinetics, and Energetics in the d-ALa:d-Lac Ligase VanA.

    abstract::The d-Ala:d-Lac ligase, VanA, plays a critical role in the resistance of vancomycin. Indeed, it is involved in the synthesis of a peptidoglycan precursor, to which vancomycin cannot bind. The reaction catalyzed by VanA requires the opening of the so-called "ω-loop", so that the substrates can enter the active site. He...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.6b00211

    authors: Duclert-Savatier N,Bouvier G,Nilges M,Malliavin TE

    update_date:2016-09-26 00:00:00

  • Prediction of the Favorable Hydration Sites in a Protein Binding Pocket and Its Application to Scoring Function Formulation.

    abstract::The important role of water molecules in protein-ligand binding energetics has attracted wide attention in recent years. A range of computational methods has been developed to predict the favorable locations of water molecules in a protein binding pocket. Most of the current methods are based on extensive molecular dy...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00619

    authors: Li Y,Gao Y,Holloway MK,Wang R

    update_date:2020-09-28 00:00:00

  • Molecular simulations of aromatase reveal new insights into the mechanism of ligand binding.

    abstract::CYP19A1, also known as aromatase or estrogen synthetase, is the rate-limiting enzyme in the biosynthesis of estrogens from their corresponding androgens. Several clinically used breast cancer therapies target aromatase. In this work, explicitly solvated all-atom molecular dynamics simulations of aromatase with a model...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci400225w

    authors: Park J,Czapla L,Amaro RE

    update_date:2013-08-26 00:00:00

  • Assessing different classification methods for virtual screening.

    abstract::How well do different classification methods perform in selecting the ligands of a protein target out of large compound collections not used to train the model? Support vector machines, random forest, artificial neural networks, k-nearest-neighbor classification with genetic-algorithm-optimized feature selection, tren...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci050519k

    authors: Plewczynski D,Spieser SA,Koch U

    update_date:2006-05-01 00:00:00

  • Scores of extended connectivity fingerprint as descriptors in QSPR study of melting point and aqueous solubility.

    abstract::QSPR studies, using scores of SciTegic's extended connectivity fingerprint as raw descriptors, were extended to the prediction of melting points and aqueous solubility of organic compounds. Robust partial least-squares models were developed that perform as well as the best published QSPR models for structurally divers...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci800024c

    authors: Zhou D,Alelyunas Y,Liu R

    update_date:2008-05-01 00:00:00

  • Regulation of JAK2 activation by Janus homology 2: evidence from molecular dynamics simulations.

    abstract::Janus kinase 2 (JAK2) is a protein tyrosine kinase implicated in signaling by specific members of the cytokine receptor family. Although it has been established that the JAK2 tyrosine kinase is negatively regulated by the JAK homology 2 (JH2) pseudokinase domain, the underlying mechanism of JH2 mediated regulation rem...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci300308g

    authors: Wan S,Coveney PV

    update_date:2012-11-26 00:00:00

  • Criterion for evaluating the predictive ability of nonlinear regression models without cross-validation.

    abstract::We propose predictive performance criteria for nonlinear regression models without cross-validation. The proposed criteria are the determination coefficient and the root-mean-square error for the midpoints between k-nearest-neighbor data points. These criteria can be used to evaluate predictive ability after the regre...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci4003766

    authors: Kaneko H,Funatsu K

    update_date:2013-09-23 00:00:00

  • RED: a set of molecular descriptors based on Renyi entropy.

    abstract::New molecular descriptors, RED (Renyi entropy descriptors), based on the generalized entropies introduced by Renyi are presented. Topological descriptors based on molecular features have proven to be useful for describing molecular profiles. Renyi entropy is used as a variability measure to contract a feature-pair dis...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci900275w

    authors: Delgado-Soler L,Toral R,Tomás MS,Rubio-Martinez J

    update_date:2009-11-01 00:00:00

  • Trust, but Verify II: A Practical Guide to Chemogenomics Data Curation.

    abstract::There is a growing public concern about the lack of reproducibility of experimental data published in peer-reviewed scientific literature. Herein, we review the most recent alerts regarding experimental data quality and discuss initiatives taken thus far to address this problem, especially in the area of chemical geno...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article, Review

    doi:10.1021/acs.jcim.6b00129

    authors: Fourches D,Muratov E,Tropsha A

    update_date:2016-07-25 00:00:00

  • GPCR-Bench: A Benchmarking Set and Practitioners' Guide for G Protein-Coupled Receptor Docking.

    abstract::Virtual screening is routinely used to discover new ligands and in particular new ligand chemotypes for G protein-coupled receptors (GPCRs). To prepare for a virtual screen, we often tailor a docking protocol that will enable us to select the best candidates for further screening. To aid this, we created GPCR-Bench, a...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.5b00660

    authors: Weiss DR,Bortolato A,Tehan B,Mason JS

    update_date:2016-04-25 00:00:00

  • Develop and test a solvent accessible surface area-based model in conformational entropy calculations.

    abstract::It is of great interest in modern drug design to accurately calculate the free energies of protein-ligand or nucleic acid-ligand binding. MM-PBSA (molecular mechanics Poisson-Boltzmann surface area) and MM-GBSA (molecular mechanics generalized Born surface area) have gained popularity in this field. For both methods, ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci300064d

    authors: Wang J,Hou T

    update_date:2012-05-25 00:00:00

  • Structure-Based Rational Design of Novel Inhibitors Against Fructose-1,6-Bisphosphate Aldolase from Candida albicans.

    abstract::Class II fructose-1,6-bisphosphate aldolases (FBA-II) are attractive new targets for the discovery of drugs to combat invasive fungal infection, because they are absent in animals and higher plants. Although several FBA-II inhibitors have been reported, none of these inhibitors exhibit antifungal effect so far. In thi...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.6b00763

    authors: Han X,Zhu X,Hong Z,Wei L,Ren Y,Wan F,Zhu S,Peng H,Guo L,Rao L,Feng L,Wan J

    update_date:2017-06-26 00:00:00

  • Tautomer Standardization in Chemical Databases: Deriving Business Rules from Quantum Chemistry.

    abstract::Databases of small, potentially bioactive molecules are ubiquitous across the industry and academia. Designed such that each unique compound should appear only once, the multiplicity of ways in which many compounds can be represented means that these databases require methods for standardizing the representation of ch...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c00232

    authors: Baker CM,Kidley NJ,Papachristos K,Hotson M,Carson R,Gravestock D,Pouliot M,Harrison J,Dowling A

    update_date:2020-08-24 00:00:00

  • ReFlex3D: Refined Flexible Alignment of Molecules Using Shape and Electrostatics.

    abstract::We present an algorithm, ReFlex3D, for the refinement of flexible molecular alignments based on their three-dimensional shape and electrostatic properties. The algorithm is designed to be used with fast conformer generators to refine an initial overlay between two molecules and thus to obtain improved overlaps as judg...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00618

    authors: Schmidt TC,Cosgrove DA,Boström J

    update_date:2018-04-23 00:00:00

  • Systematic analysis of enzyme-catalyzed reaction patterns and prediction of microbial biodegradation pathways.

    abstract::The roles of chemical compounds in biological systems are now systematically analyzed by high-throughput experimental technologies. To automate the processing and interpretation of large-scale data it is necessary to develop bioinformatics methods to extract information from the chemical structures of these small mole...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci700006f

    authors: Oh M,Yamada T,Hattori M,Goto S,Kanehisa M

    update_date:2007-07-01 00:00:00

  • Assessment of the Cruzain Cysteine Protease Reversible and Irreversible Covalent Inhibition Mechanism.

    abstract::Reversible and irreversible covalent ligands are advanced cysteine protease inhibitors in the drug development pipeline. K777 is an irreversible inhibitor of cruzain, a necessary enzyme for the survival of the Trypanosoma cruzi (T. cruzi) parasite, the causative agent of Chagas disease. Despite their importance, irrev...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b01138

    authors: Silva JRA,Cianni L,Araujo D,Batista PHJ,de Vita D,Rosini F,Leitão A,Lameira J,Montanari CA

    update_date:2020-03-23 00:00:00

  • CoMFA, CoMSIA, and molecular hologram QSAR studies of novel neuronal nAChRs ligands-open ring analogues of 3-pyridyl ether.

    abstract::3-Pyridyl ethers are excellent nAChRs ligands, which show high subtype selectivity and binding affinity to alpha4beta2 nAChR. Although the quantitative structure-activity relationship (QSAR) of nAChRs ligands has been widely investigated using various classes of compounds, the open ring analogues of 3-pyridyl ethers h...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci0498113

    authors: Zhang H,Li H,Liu C

    update_date:2005-03-01 00:00:00

  • The valence state combination model: a generic framework for handling tautomers and protonation states.

    abstract::The consistent handling of molecules is probably the most basic and important requirement in the field of cheminformatics. Reliable results can only be obtained if the underlying calculations are independent of the specific way molecules are represented in the input data. However, ensuring consistency is a complex tas...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci400724v

    authors: Urbaczek S,Kolodzik A,Rarey M

    update_date:2014-03-24 00:00:00

  • iUmami-SCM: A Novel Sequence-Based Predictor for Prediction and Analysis of Umami Peptides Using a Scoring Card Method with Propensity Scores of Dipeptides.

    abstract::Umami or the taste of monosodium glutamate represents one of the major attractive taste modalities in humans. Therefore, knowledge about biophysical and biochemical properties of the umami taste is important for both scientific research and the food industry. Experimental approaches for predicting umami peptides are l...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c00707

    authors: Charoenkwan P,Yana J,Nantasenamat C,Hasan MM,Shoombuatong W

    update_date:2020-12-28 00:00:00

  • Role of water in ligand binding to maltose-binding protein: insight from a new docking protocol based on the 3D-RISM-KH molecular theory of solvation.

    abstract::Maltose-binding protein is a periplasmic binding protein responsible for transport of maltooligosaccharides through the periplasmic space of Gram-negative bacteria, as a part of the ABC transport system. The molecular mechanisms of the initial ligand binding and induced large scale motion of the protein's domains still...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci500520q

    authors: Huang W,Blinov N,Wishart DS,Kovalenko A

    update_date:2015-02-23 00:00:00

  • TAMkin: a versatile package for vibrational analysis and chemical kinetics.

    abstract::TAMkin is a program for the calculation and analysis of normal modes, thermochemical properties and chemical reaction rates. At present, the output from the frequently applied software programs ADF, CHARMM, CPMD, CP2K, Gaussian, Q-Chem, and VASP can be analyzed. The normal-mode analysis can be performed using a broad ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci100099g

    authors: Ghysels A,Verstraelen T,Hemelsoet K,Waroquier M,Van Speybroeck V

    update_date:2010-09-27 00:00:00

  • Discovery and Evaluation of Anti-Fibrinolytic Plasmin Inhibitors Derived from 5-(4-Piperidyl)isoxazol-3-ol (4-PIOL).

    abstract::Inhibition of plasmin has been found to effectively reduce fibrinolysis and to avoid hemorrhage. This can be achieved by addressing its kringle 1 domain with the known drug and lysine analogue tranexamic acid. Guided by shape similarities toward a previously discovered lead compound, 5-(4-piperidyl)isoxazol-3-ol, a se...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00255

    authors: Schmidt TC,Eriksson PO,Gustafsson D,Cosgrove D,Frølund B,Boström J

    update_date:2017-07-24 00:00:00

  • Predicted Biological Activity of Purchasable Chemical Space.

    abstract::Whereas 400 million distinct compounds are now purchasable within the span of a few weeks, the biological activities of most are unknown. To facilitate access to new chemistry for biology, we have combined the Similarity Ensemble Approach (SEA) with the maximum Tanimoto similarity to the nearest bioactive to predict a...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00316

    authors: Irwin JJ,Gaskins G,Sterling T,Mysinger MM,Keiser MJ

    update_date:2018-01-22 00:00:00

  • Rapid evaluation of synthetic and molecular complexity for in silico chemistry.

    abstract::Methods that rapidly evaluate molecular complexity and synthetic feasibility are becoming increasingly important for in silico chemistry. We propose a new metric based on relative atomic electronegativities and bond parameters that evaluate both synthetic and molecular complexity (SMCM) starting from chemical structur...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci0501387

    authors: Allu TK,Oprea TI

    update_date:2005-09-01 00:00:00

  • Efficiency of Stratification for Ensemble Docking Using Reduced Ensembles.

    abstract::Molecular docking can account for receptor flexibility by combining the docking score over multiple rigid receptor conformations, such as snapshots from a molecular dynamics simulation. Here, we evaluate a number of common snapshot selection strategies using a quality metric from stratified sampling, the efficiency of...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.8b00314

    authors: Xie B,Clark JD,Minh DDL

    update_date:2018-09-24 00:00:00

  • Baseline Model for Predicting Protein-Ligand Unbinding Kinetics through Machine Learning.

    abstract::Derivation of structure-kinetics relationships can help rational design and development of new small-molecule drug candidates with desired residence times. Efforts are now being directed toward the development of efficient computational methods. Currently, there is a lack of solid, high-throughput binding kinetics pre...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c00450

    authors: Amangeldiuly N,Karlov D,Fedorov MV

    update_date:2020-12-28 00:00:00

  • Assessment of the Sampling Performance of Multiple-Copy Dynamics versus a Unique Trajectory.

    abstract::The goal of the present study was to ascertain the differential performance of a long molecular dynamics trajectory versus several shorter ones starting from different points in the phase space and covering the same sampling time. For this purpose, we selected the 16-mer peptide Bak16BH3 as a model for study and carri...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.6b00347

    authors: Perez JJ,Tomas MS,Rubio-Martinez J

    update_date:2016-10-24 00:00:00

  • Searching for New Leads To Treat Epilepsy: Target-Based Virtual Screening for the Discovery of Anticonvulsant Agents.

    abstract::The purpose of this investigation is to contribute to the development of new anticonvulsant drugs to treat patients with refractory epilepsy. We applied a virtual screening protocol that involved the search into molecular databases of new compounds and known drugs to find small molecules that interact with the open co...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00721

    authors: Palestro PH,Enrique N,Goicoechea S,Villalba ML,Sabatier LL,Martin P,Milesi V,Bruno Blanch LE,Gavernet L

    update_date:2018-07-23 00:00:00

  • A Grid Map Based Approach to Identify Nonobvious Ligand Design Opportunities in 3D Protein Structure Ensembles.

    abstract::Three-dimensional protein structures are a key requisite for structure-based drug discovery. For many highly relevant targets, medicinal chemists are confronted with large numbers of target structures in their apo-forms or in complex with a wealth of different ligands. To exploit the full potential of such structure e...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c00051

    authors: Schmalhorst PS,Bergner A

    update_date:2020-04-27 00:00:00