Exploring Tunable Hyperparameters for Deep Neural Networks with Industrial ADME Data Sets.

Abstract:

Deep learning has drawn significant attention in many areas, including drug discovery. It has been proposed that deep learning could outperform other machine learning algorithms, especially on large data sets. In the pharmaceutical industry, machine learning models are built to understand quantitative structure-activity relationships (QSARs) and to predict molecular activities, including absorption, distribution, metabolism, and excretion (ADME) properties, using only molecular structures. Previous reports have demonstrated the advantages of using deep neural networks (DNNs) for QSAR modeling. One of the challenges in building DNN models is identifying the hyperparameters that lead to better generalization. In this study, we investigated several tunable hyperparameters of DNN models on 24 industrial ADME data sets. We analyzed the sensitivity and influence of five hyperparameters: the learning rate, the weight decay for L2 regularization, the dropout rate, the activation function, and the use of batch normalization. This paper focuses on strategies and practices for DNN model building. Further, an optimized model was built for each data set and compared with the benchmark models used in production. Based on our benchmarking results, we propose several practices for building DNN QSAR models.
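
As a rough illustration of what a search over the five hyperparameters named in the abstract might look like, the sketch below enumerates a small grid in plain Python. The candidate values, the name `SEARCH_SPACE`, and the helper `enumerate_configs` are hypothetical choices made for this example; the abstract does not state the ranges or the search strategy the authors used.

```python
from itertools import product

# Hypothetical search space over the five hyperparameters named in the abstract.
# The candidate values are illustrative assumptions, not the paper's settings.
SEARCH_SPACE = {
    "learning_rate": [1e-4, 1e-3, 1e-2],
    "weight_decay": [0.0, 1e-5, 1e-3],   # L2 regularization strength
    "dropout_rate": [0.0, 0.25, 0.5],
    "activation": ["relu", "tanh"],
    "batch_norm": [False, True],
}


def enumerate_configs(space):
    """Return every combination in the grid as a list of dicts."""
    names = list(space)
    value_lists = [space[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


configs = enumerate_configs(SEARCH_SPACE)
print(len(configs))  # 3 * 3 * 3 * 2 * 2 = 108 configurations
```

Each dictionary in `configs` could then be passed to whatever training routine is in use, with the validation error recorded per configuration to judge how sensitive a data set is to each hyperparameter.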

journal_name

J Chem Inf Model

authors

Zhou Y,Cahya S,Combs SA,Nicolaou CA,Wang J,Desai PV,Shen J

doi

10.1021/acs.jcim.8b00671

pub_date

2019-03-25 00:00:00

pages

1005-1016

issue

3

eissn

1549-960X

issn

1549-9596

journal_volume

59

pub_type

Journal Article
  • Combination of Ambiguous and Unambiguous Data in the Restraint-driven Docking of Flexible Peptides with HADDOCK: The Binding of the Spider Toxin PcTx1 to the Acid Sensing Ion Channel (ASIC) 1a.

    abstract::Peptides that bind to ion channels have attracted much interest as potential lead molecules for the development of new drugs and insecticides. However, the structure determination of large peptide-channel complexes using experimental methods is challenging. Thus structural models are often derived from combining exper...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.5b00529

    authors: Deplazes E,Davies J,Bonvin AM,King GF,Mark AE

    update_date:2016-01-25 00:00:00

  • In Silico Study of Membrane Lipid Composition Regulating Conformation and Hydration of Influenza Virus B M2 Channel.

    abstract::The proton conduction of transmembrane influenza virus B M2 (BM2) proton channel is possibly mediated by the membrane environment, but the detailed molecular mechanism is challenging to determine. In this work, how membrane lipid composition regulates the conformation and hydration of BM2 channel is elucidated in sili...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c00329

    authors: Zhang Y,Zhang HX,Zheng QC

    update_date:2020-07-27 00:00:00

  • VMD Store-A VMD Plugin to Browse, Discover, and Install VMD Extensions.

    abstract::Herein we present the VMD Store, an open-source VMD plugin that simplifies the way that users browse, discover, install, update, and uninstall extensions for the Visual Molecular Dynamics (VMD) software. The VMD Store obtains data about all the indexed VMD extensions hosted on GitHub and presents a one-click mechanism...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00739

    authors: Fernandes HS,Sousa SF,Cerqueira NMFSA

    update_date:2019-11-25 00:00:00

  • New combined model for the prediction of regioselectivity in cytochrome P450/3A4 mediated metabolism.

    abstract::Cytochrome P450 3A4 metabolizes nearly 50% of the drugs currently in clinical use with a broad range of substrate specificity. Early prediction of metabolites of xenobiotic compounds is crucial for cost efficient drug discovery and development. We developed a new combined model, MLite, for the prediction of regioselec...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci7003576

    authors: Oh WS,Kim DN,Jung J,Cho KH,No KT

    update_date:2008-03-01 00:00:00

  • Discovery of wild-type and Y181C mutant non-nucleoside HIV-1 reverse transcriptase inhibitors using virtual screening with multiple protein structures.

    abstract::To discover non-nucleoside inhibitors of HIV-1 reverse transcriptase (NNRTIs) that are effective against both wild-type (WT) virus and variants that encode the clinically troublesome Tyr181Cys (Y181C) RT mutation, virtual screening by docking was carried out using three RT structures and more than 2 million commercial...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci900068k

    authors: Nichols SE,Domaoal RA,Thakur VV,Tirado-Rives J,Anderson KS,Jorgensen WL

    update_date:2009-05-01 00:00:00

  • Computational and conformational evaluation of FTase alternative substrates: insight into a novel enzyme binding pocket.

    abstract::Protein farnesyltransferase (FTase) is an important anticancer drug target. In an effort to develop isoprenoid diphosphate-based FTase inhibitors, striking variations have been observed in the ability of conservatively modified analogues to bind to the enzyme. For example, 2Z-GGPP is an alternative substrate with high...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci0496550

    authors: Henriksen BS,Zahn TJ,Evanseck JD,Firestine SM,Gibbs RA

    update_date:2005-07-01 00:00:00

  • Effect of input differences on the results of docking calculations.

    abstract::The sensitivity of docking calculations to the geometry of the input ligand was studied. It was found that even small changes in the ligand input conformation can lead to large differences in the geometries and scores of the resulting docked poses. The accuracy of docked poses produced from different ligand input stru...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci9000629

    authors: Feher M,Williams CI

    update_date:2009-07-01 00:00:00

  • Role of halogen bonds in thyroid hormone receptor selectivity: pharmacophore-based 3D-QSSR studies.

    abstract::Most physiological effects of thyroid hormones are mediated by the two thyroid hormone receptor subtypes, TRalpha and TRbeta. Several pharmacological effects mediated by TRbeta might be beneficial in important medical conditions such as obesity, hypercholesterolemia and diabetes, and selective TRbeta activation may el...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci900316e

    authors: Valadares NF,Salum LB,Polikarpov I,Andricopulo AD,Garratt RC

    update_date:2009-11-01 00:00:00

  • Computational simulations of the interactions between acetyl-coenzyme-A carboxylase and clodinafop: resistance mechanism due to active and nonactive site mutations.

    abstract::Grass weed populations resistant to acetyl-CoA carboxylase-inhibiting (ACCase; EC 6.4.1.2) herbicides represent a major problem for the sustainable development of modern agriculture. In the present study, extensive computational simulations, including homology modeling, molecular dynamics (MD) simulations, and molecul...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci900174d

    authors: Zhu XL,Ge-Fei H,Zhan CG,Yang GF

    update_date:2009-08-01 00:00:00

  • Isomerization and Decomposition of 2-Methylfuran with External Forces.

    abstract::The primary goal of this project was to evaluate the performance of the Standard and Enforced Geometry Optimization (SEGO) method which we have recently developed. The SEGO method has been designed for an automatic location of multiple minima on the molecular Potential Energy Surface (PES), and its usefulness has been...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00352

    authors: Brzyska A,Woliński K

    update_date:2019-08-26 00:00:00

  • Random Forest Refinement of Pairwise Potentials for Protein-Ligand Decoy Detection.

    abstract::An accurate scoring function is expected to correctly select the most stable structure from a set of pose candidates. One can hypothesize that a scoring function's ability to identify the most stable structure might be improved by emphasizing the most relevant atom pairwise interactions. However, it is hard to evaluat...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00356

    authors: Pei J,Zheng Z,Kim H,Song LF,Walworth S,Merz MR,Merz KM Jr

    update_date:2019-07-22 00:00:00

  • Searching for recursively defined generic chemical patterns in nonenumerated fragment spaces.

    abstract::Retrieving molecules with specific structural features is a fundamental requirement of today's molecular database technologies. Estimates claim the chemical space relevant for drug discovery to be around 10⁶⁰ molecules. This figure is many orders of magnitude larger than the amount of molecules conventional databases ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci400107k

    authors: Ehrlich HC,Henzler AM,Rarey M

    update_date:2013-07-22 00:00:00

  • Applicability Domain ANalysis (ADAN): a robust method for assessing the reliability of drug property predictions.

    abstract::We report a novel method called ADAN (Applicability Domain ANalysis) for assessing the reliability of drug property predictions obtained by in silico methods. The assessment provided by ADAN is based on the comparison of the query compound with the training set, using six diverse similarity criteria. For every criteri...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci500172z

    authors: Carrió P,Pinto M,Ecker G,Sanz F,Pastor M

    update_date:2014-05-27 00:00:00

  • Matched Molecular Series Analysis for ADME Property Prediction.

    abstract::Generation and prioritization of new molecules are the most central part of the drug design process. Matched molecular series analysis (MMSA) has recently been proposed as a formal approach that captures both of these key elements of design. In order to better understand the power of MMSA and its specific limitations,...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c00269

    authors: Awale M,Riniker S,Kramer C

    update_date:2020-06-22 00:00:00

  • Selective Fusion of Heterogeneous Classifiers for Predicting Substrates of Membrane Transporters.

    abstract::Membrane transporters play a crucial role in determining fate of administered drugs in a biological system. Early identification of plausible transporters for a drug molecule can provide insights into its therapeutic, pharmacokinetic, and toxicological profiles. In the present study, predictive models for classifying ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.6b00508

    authors: Shaikh N,Sharma M,Garg P

    update_date:2017-03-27 00:00:00

  • Determination of partition coefficient of spin probe between different lipid membrane phases.

    abstract::Model lipid membranes made from binary mixtures of dimyristoylphosphatidylcholine/dipalmitoylphosphatidylcholine (DMPC/DPPC) and dimyristoylphosphatidylcholine/cholesterol (DMPC/Chol) exhibit coexistence of diverse lipid phases at appropriate temperature and composition. Since lipids in different phases show different...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci0501793

    authors: Arsov Z,Strancar J

    update_date:2005-11-01 00:00:00

  • Comparative analysis of binding energy of chymostatin with human cathepsin A and its homologous proteins by molecular orbital calculation.

    abstract::Cathepsin A is a mammalian lysosomal enzyme that catalyzes the hydrolysis of the carboxy-terminal amino acids of polypeptides and also regulates beta-galactosidase and neuraminidase-1 activities through the formation of a multienzymic complex in lysosomes. Human cathepsin A (hCathA), yeast carboxypeptidase (CPY), and ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci060093p

    authors: Yoshida T,Lepp Z,Kadota Y,Satoh Y,Itoh K,Chuman H

    update_date:2006-09-01 00:00:00

  • GPCR-Bench: A Benchmarking Set and Practitioners' Guide for G Protein-Coupled Receptor Docking.

    abstract::Virtual screening is routinely used to discover new ligands and in particular new ligand chemotypes for G protein-coupled receptors (GPCRs). To prepare for a virtual screen, we often tailor a docking protocol that will enable us to select the best candidates for further screening. To aid this, we created GPCR-Bench, a...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.5b00660

    authors: Weiss DR,Bortolato A,Tehan B,Mason JS

    update_date:2016-04-25 00:00:00

  • Structural and Functional Characterization of Allatostatin Receptor Type-C of Thaumetopoea pityocampa, a Potential Target for Next-Generation Pest Control Agents.

    abstract::Insect neuropeptide receptors, including allatostatin receptor type C (AstR-C), a G protein-coupled receptor, are among the potential targets for designing next-generation pesticides that despite their importance in offering a new mode-of-action have been overlooked. Focusing on AstR-C of Thaumetopoea pityocampa, a co...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c00985

    authors: Shahraki A,Işbilir A,Dogan B,Lohse MJ,Durdagi S,Birgul-Iyison N

    update_date:2021-01-21 00:00:00

  • Molecular Dynamics Simulations of Membrane-Bound STIM1 to Investigate Conformational Changes during STIM1 Activation upon Calcium Release.

    abstract::Calcium is involved in important intracellular processes, such as intracellular signaling from cell membrane receptors to the nucleus. Typically, calcium levels are kept at less than 100 nM in the nucleus and cytosol, but some calcium is stored in the endoplasmic reticulum (ER) lumen for rapid release to activate intr...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.6b00475

    authors: Mukherjee S,Karolak A,Debant M,Buscaglia P,Renaudineau Y,Mignen O,Guida WC,Brooks WH

    update_date:2017-02-27 00:00:00

  • Coarse-Grained Prediction of Peptide Binding to G-Protein Coupled Receptors.

    abstract::In this study, we used the Martini Coarse-Grained model with no applied restraints to predict the binding mode of some peptides to G-Protein Coupled Receptors (GPCRs). Both the Neurotensin-1 and the chemokine CXCR4 receptors were used as test cases. Their ligands, NTS8-13 and CVX15 peptides, respectively, were initial...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.6b00503

    authors: Delort B,Renault P,Charlier L,Raussin F,Martinez J,Floquet N

    update_date:2017-03-27 00:00:00

  • Concept-based semi-automatic classification of drugs.

    abstract::The anatomical therapeutic chemical (ATC) classification system maintained by the World Health Organization provides a global standard for the classification of medical substances and serves as a source for drug repurposing research. Nevertheless, it lacks several drugs that are major players in the global drug market...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci9000844

    authors: Gurulingappa H,Kolárik C,Hofmann-Apitius M,Fluck J

    update_date:2009-08-01 00:00:00

  • Identification of ligand templates using local structure alignment for structure-based drug design.

    abstract::With a rapid increase in the number of high-resolution protein-ligand structures, the known protein-ligand structures can be used to gain insight into ligand-binding modes in a target protein. On the basis of the fact that the structurally similar binding sites share information about their ligands, we have developed ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci300178e

    authors: Lee HS,Im W

    update_date:2012-10-22 00:00:00

  • Search for novel aminoglycosides by combining fragment-based virtual screening and 3D-QSAR scoring.

    abstract::Aminoglycosides are antibiotics targeting the 16S RNA A site of the bacterial ribosome. There have been many efforts directed toward design of their synthetic derivatives, however with only few successes. As RNA binders, aminoglycosides are also a difficult target for computational drug design, since most of the exist...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci800361a

    authors: Setny P,Trylska J

    update_date:2009-02-01 00:00:00

  • Coordination of Na(+) by monoamine ligands in dopamine, norepinephrine, and serotonin transporters.

    abstract::The reuptake of neurotransmitters by dopamine, norepinephrine, and serotonin transporters during neuronal transmission requires a sodium gradient. An "ionic mode" of binding proposes that aspartate anchors the ligand's positive charge but ignores the direct role of sodium in ligand binding seen in the only representat...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci700255d

    authors: Xhaard H,Backström V,Denessiouk K,Johnson MS

    update_date:2008-07-01 00:00:00

  • TAMkin: a versatile package for vibrational analysis and chemical kinetics.

    abstract::TAMkin is a program for the calculation and analysis of normal modes, thermochemical properties and chemical reaction rates. At present, the output from the frequently applied software programs ADF, CHARMM, CPMD, CP2K, Gaussian, Q-Chem, and VASP can be analyzed. The normal-mode analysis can be performed using a broad ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci100099g

    authors: Ghysels A,Verstraelen T,Hemelsoet K,Waroquier M,Van Speybroeck V

    update_date:2010-09-27 00:00:00

  • A Coarse-Grained Force Field Parameterized for MgCl2 and CaCl2 Aqueous Solutions.

    abstract::Calcium and magnesium ions play important roles in many physicochemical processes. To facilitate the investigation of phenomena related to these ions that occur over large length and time scales, a coarse-grained force field (CGFF) is developed for MgCl2 and CaCl2 aqueous solutions. The ions are modeled by CG beads wi...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00206

    authors: Gong Z,Sun H

    update_date:2017-07-24 00:00:00

  • QM/MM calculations in drug discovery: a useful method for studying binding phenomena?

    abstract::Herein we investigate whether QM/MM could prove useful as a tool to study the often subtle binding phenomena found within pharmaceutical drug discovery programs. The goal of this investigation is to determine whether it is possible to employ high level QM/MM calculations to answer specific questions around a binding e...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci800419j

    authors: Gleeson MP,Gleeson D

    update_date:2009-03-01 00:00:00

  • Evaluation of Generalized Born Models for Large Scale Affinity Prediction of Cyclodextrin Host-Guest Complexes.

    abstract::Binding affinity prediction with implicit solvent models remains a challenge in virtual screening for drug discovery. In order to assess the predictive power of implicit solvent models in docking techniques with Amber scoring, three generalized Born models (GBHCT, GBOBCI, and GBOBCII) available in Dock 6.7 were utiliz...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.6b00418

    authors: Zhang H,Yin C,Yan H,van der Spoel D

    update_date:2016-10-24 00:00:00

  • Study of chromatographic retention of natural terpenoids by chemoinformatic tools.

    abstract::The study of chromatographic retention of natural products can be used to increase their identification speed in complex biological matrices. In this work, six variables were used to study the retention behavior in reversed phase liquid chromatography of 39 sesquiterpene lactones (SL) from an in-house database using c...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci500581q

    authors: Oliveira TB,Gobbo-Neto L,Schmidt TJ,Da Costa FB

    update_date:2015-01-26 00:00:00