Random forest models to predict aqueous solubility.

Abstract:

Random Forest regression (RF), Partial-Least-Squares (PLS) regression, Support Vector Machines (SVM), and Artificial Neural Networks (ANN) were used to develop QSPR models for the prediction of aqueous solubility, based on experimental data for 988 organic molecules. The Random Forest regression model predicted aqueous solubility more accurately than those created by PLS, SVM, and ANN and offered methods for automatic descriptor selection, an assessment of descriptor importance, and an in-parallel measure of predictive ability, all of which serve to recommend its use. The prediction of log molar solubility for an external test set of 330 molecules that are solid at 25 °C gave r² = 0.89 and RMSE = 0.69 log S units. For a standard data set selected from the literature, the model performed well with respect to other documented methods. Finally, the diversity of the training and test sets is compared to the chemical space occupied by molecules in the MDL Drug Data Report, on the basis of molecular descriptors selected by the regression analysis.
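
The abstract reports test-set performance as r² and RMSE in log S units. As a minimal sketch of how such figures can be computed from observed and predicted log S values, the Python below assumes that r² means the squared Pearson correlation coefficient (the abstract does not define it). The function names and the sample values are hypothetical and are not taken from the paper.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error, in the same units as the inputs (here log S)."""
    n = len(observed)
    squared_errors = ((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(sum(squared_errors) / n)


def pearson_r2(observed, predicted):
    """Squared Pearson correlation between observed and predicted values.

    Assumption: this is the r2 definition intended; a coefficient of
    determination (1 - SS_res / SS_tot) would give a different number.
    """
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)


# Made-up log S values for illustration only; not data from the paper.
observed = [-1.0, -2.0, -3.0, -4.0]
predicted = [-1.5, -2.0, -2.5, -4.0]

if __name__ == "__main__":
    print("RMSE:", rmse(observed, predicted))
    print("r2:", pearson_r2(observed, predicted))
```

RMSE is expressed in the units of the predicted property, which is why the abstract quotes it in log S units, whereas r² is dimensionless.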

journal_name

J Chem Inf Model

authors

Palmer DS,O'Boyle NM,Glen RC,Mitchell JB

doi

10.1021/ci060164k

subject

Has Abstract

pub_date

2007-01-01 00:00:00

pages

150-8

issue

1

eissn

1549-960X

issn

1549-9596

journal_volume

47

pub_type

Journal Article
  • ChemSchematicResolver: A Toolkit to Decode 2D Chemical Diagrams with Labels and R-Groups into Annotated Chemical Named Entities.

    abstract::The number of journal articles in the scientific domain has grown to the point where it has become impossible for researchers to capitalize on all findings in their relevant discipline. Information is stored in these articles in a number of ways, including figures that describe important results. In organic chemistry,...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c00042

    authors: Beard EJ,Cole JM

    update_date:2020-04-27 00:00:00

  • Coarse-Grained Prediction of Peptide Binding to G-Protein Coupled Receptors.

    abstract::In this study, we used the Martini Coarse-Grained model with no applied restraints to predict the binding mode of some peptides to G-Protein Coupled Receptors (GPCRs). Both the Neurotensin-1 and the chemokine CXCR4 receptors were used as test cases. Their ligands, NTS8-13 and CVX15 peptides, respectively, were initial...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.6b00503

    authors: Delort B,Renault P,Charlier L,Raussin F,Martinez J,Floquet N

    update_date:2017-03-27 00:00:00

  • How to Model Inter- and Intramolecular Hydrogen Bond Strengths with Quantum Chemistry.

    abstract::This article presents the computation of both inter- and intramolecular hydrogen bond strengths from first-principles. Quantum chemical calculations conducted at the dispersion-corrected density functional theory level including free energy and solvation contributions are conducted for (i) one-to-one hydrogen-bonded c...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00132

    authors: Bauer CA

    update_date:2019-09-23 00:00:00

  • Discovery of wild-type and Y181C mutant non-nucleoside HIV-1 reverse transcriptase inhibitors using virtual screening with multiple protein structures.

    abstract::To discover non-nucleoside inhibitors of HIV-1 reverse transcriptase (NNRTIs) that are effective against both wild-type (WT) virus and variants that encode the clinically troublesome Tyr181Cys (Y181C) RT mutation, virtual screening by docking was carried out using three RT structures and more than 2 million commercial...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci900068k

    authors: Nichols SE,Domaoal RA,Thakur VV,Tirado-Rives J,Anderson KS,Jorgensen WL

    update_date:2009-05-01 00:00:00

  • 3D QSAR methods: Phase and Catalyst compared.

    abstract::The programs Phase and Catalyst HypoGen are compared for their performance in determining three-dimensional quantitative structure-activity relationships. Eight sets of compounds with measured activity were collected from the public literature and partitioned into suitable training and test sets by an automated proced...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci7000082

    authors: Evans DA,Doman TN,Thorner DA,Bodkin MJ

    update_date:2007-05-01 00:00:00

  • Structure-based design and screen of novel inhibitors for class II 3-hydroxy-3-methylglutaryl coenzyme A reductase from Streptococcus pneumoniae.

    abstract::3-Hydroxy-3-methylglutaryl coenzyme A reductase (HMGR) is a primary target in the current clinical treatment of hypercholesterolemia with specific inhibitors of "statin" family. Statins are excellent inhibitors of the class I (human) enzyme but relatively poor inhibitors of the class II enzyme, which are well-known as...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci300163v

    authors: Li D,Gui J,Li Y,Feng L,Han X,Sun Y,Sun T,Chen Z,Cao Y,Zhang Y,Zhou L,Hu X,Ren Y,Wan J

    update_date:2012-07-23 00:00:00

  • AntiBac-Pred: A Web Application for Predicting Antibacterial Activity of Chemical Compounds.

    abstract::Discovery of new antibacterial agents is a never-ending task of medicinal chemistry. Every new drug brings significant improvement to patients with bacterial infections, but prolonged usage of antibacterials leads to the emergence of resistant strains. Therefore, novel active structures with new modes of action are re...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00436

    authors: Pogodin PV,Lagunin AA,Rudik AV,Druzhilovskiy DS,Filimonov DA,Poroikov VV

    update_date:2019-11-25 00:00:00

  • Facile Solutions to the Problems Associated with Chemical Information and Mathematical Symbolism While Using Machine Translation Tools.

    abstract::Advances in computer-aided translation technology have made tremendous progress in accuracy in the past few years. Chemical Abstracts Service of the American Chemical Society summarizes scientific works from more than 50 languages and allows the users to search papers in nine selected languages. Currently, only the ab...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c00274

    authors: Wahab MF,Zulfiqar S,Sarwar MI,Lieberwirth I

    update_date:2020-07-27 00:00:00

  • Role of halogen bonds in thyroid hormone receptor selectivity: pharmacophore-based 3D-QSSR studies.

    abstract::Most physiological effects of thyroid hormones are mediated by the two thyroid hormone receptor subtypes, TRalpha and TRbeta. Several pharmacological effects mediated by TRbeta might be beneficial in important medical conditions such as obesity, hypercholesterolemia and diabetes, and selective TRbeta activation may el...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci900316e

    authors: Valadares NF,Salum LB,Polikarpov I,Andricopulo AD,Garratt RC

    update_date:2009-11-01 00:00:00

  • Secondary structure characterization based on amino acid composition and availability in proteins.

    abstract::The importance of thorough analyses of the secondary structures in proteins as basic structural units cannot be overemphasized. Although recent computational methods have achieved reasonably high accuracy for predicting secondary structures from amino acid sequences, a simple and fundamental empirical approach to char...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci900452z

    authors: Otaki JM,Tsutsumi M,Gotoh T,Yamamoto H

    update_date:2010-04-26 00:00:00

  • Leave-cluster-out cross-validation is appropriate for scoring functions derived from diverse protein data sets.

    abstract::With the emergence of large collections of protein-ligand complexes complemented by binding data, as found in PDBbind or BindingMOAD, new opportunities for parametrizing and evaluating scoring functions have arisen. With huge data collections available, it becomes feasible to fit scoring functions in a QSAR style, i.e...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci100264e

    authors: Kramer C,Gedeck P

    update_date:2010-11-22 00:00:00

  • RDChiral: An RDKit Wrapper for Handling Stereochemistry in Retrosynthetic Template Extraction and Application.

    abstract::There is a renewed interest in computer-aided synthesis planning, where the vast majority of approaches require the application of retrosynthetic reaction templates. Here we introduce RDChiral, an open-source Python wrapper for RDKit designed to provide consistent handling of stereochemical information in applying ret...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00286

    authors: Coley CW,Green WH,Jensen KF

    update_date:2019-06-24 00:00:00

  • ThermoData Engine (TDE): software implementation of the dynamic data evaluation concept. 9. Extensible thermodynamic constraints for pure compounds and new model developments.

    abstract::ThermoData Engine (TDE) is the first full-scale software implementation of the dynamic data evaluation concept, as reported in this journal. The present article describes the background and implementation for new additions in latest release of TDE. Advances are in the areas of program architecture and quality improvem...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci4005699

    authors: Diky V,Chirico RD,Muzny CD,Kazakov AF,Kroenlein K,Magee JW,Abdulagatov I,Frenkel M

    update_date:2013-12-23 00:00:00

  • Dihedral-based segment identification and classification of biopolymers I: proteins.

    abstract::A new structure classification scheme for biopolymers is introduced, which is solely based on main-chain dihedral angles. It is shown that by dividing a biopolymer into segments containing two central residues, a local classification can be performed. The method is referred to as DISICL, short for Dihedral-based Segme...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci400541d

    authors: Nagy G,Oostenbrink C

    update_date:2014-01-27 00:00:00

  • Consensus QSAR models: do the benefits outweigh the complexity?

    abstract::This study has assessed the use of consensus regression, as compared to single multiple linear regression, models for the development of quantitative structure-activity relationships (QSARs). To provide a comparison, four data sets of varying size and complexity were analyzed: silastic membrane flux, toxicity of pheno...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci700016d

    authors: Hewitt M,Cronin MT,Madden JC,Rowe PH,Johnson C,Obi A,Enoch SJ

    update_date:2007-07-01 00:00:00

  • Sharing Data from Molecular Simulations.

    abstract::Given the need for modern researchers to produce open, reproducible scientific output, the lack of standards and best practices for sharing data and workflows used to produce and analyze molecular dynamics (MD) simulations has become an important issue in the field. There are now multiple well-established packages to ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00665

    authors: Abraham M,Apostolov R,Barnoud J,Bauer P,Blau C,Bonvin AMJJ,Chavent M,Chodera J,Čondić-Jurkić K,Delemotte L,Grubmüller H,Howard RJ,Jordan EJ,Lindahl E,Ollila OHS,Selent J,Smith DGA,Stansfeld PJ,Tiemann JKS,Trellet M

    update_date:2019-10-28 00:00:00

  • A Polarization-Consistent Model for Alcohols to Predict Solvation Free Energies.

    abstract::Classical nonpolarizable models, normally based on a combination of Lennard-Jones sites and point charges, are extensively used to model thermodynamic properties of fluids, including solvation. An important shortcoming of these models is that they do not explicitly account for polarization effects, i.e., a description...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b01005

    authors: Barrera MC,Jorge M

    update_date:2020-03-23 00:00:00

  • H274Y's Effect on Oseltamivir Resistance: What Happens Before the Drug Enters the Binding Site.

    abstract::Increased reports of oseltamivir (OTV)-resistant strains of the influenza virus, such as the H274Y mutation on its neuraminidase (NA), have created some cause for concern. Many studies have been conducted in the attempt to uncover the mechanism of OTV resistance in H274Y NA. However, most of the reported studies on H2...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.5b00331

    authors: Yusuf M,Mohamed N,Mohamad S,Janezic D,Damodaran KV,Wahab HA

    update_date:2016-01-25 00:00:00

  • Structure-Based Discovery of 1H-Indazole-3-carboxamides as a Novel Structural Class of Human GSK-3 Inhibitors.

    abstract::An in silico screening procedure was performed to select new inhibitors of glycogen synthase kinase 3β (GSK-3β), a serine/threonine protein kinase that in the last two decades has emerged as a key target in drug discovery, having been implicated in multiple cellular processes and linked with the pathogenesis of severa...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.5b00486

    authors: Ombrato R,Cazzolla N,Mancini F,Mangano G

    update_date:2015-12-28 00:00:00

  • Equally Weighted Multiscale Elastic Network Model and Its Comparison with Traditional and Parameter-Free Models.

    abstract::Dynamical properties of proteins play an essential role in their function exertion. The elastic network model (ENM) is an effective and efficient tool in characterizing the intrinsic dynamical properties encoded in biomacromolecule structures. The Gaussian network model (GNM) and anisotropic network model (ANM) are th...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c01178

    authors: Gong W,Liu Y,Zhao Y,Wang S,Han Z,Li C

    update_date:2021-01-26 00:00:00

  • Combined approach using ligand efficiency, cross-docking, and antitarget hits for wild-type and drug-resistant Y181C HIV-1 reverse transcriptase.

    abstract::New hits against HIV-1 wild-type and Y181C drug-resistant reverse transcriptases were predicted taking into account the possibility of some of the known metabolism interactions. In silico hits against a set of antitargets (i.e., proteins or nucleic acids that are off-targets from the desired pharmaceutical target obje...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci200203h

    authors: García-Sosa AT,Sild S,Takkis K,Maran U

    update_date:2011-10-24 00:00:00

  • DiSCuS: an open platform for (not only) virtual screening results management.

    abstract::DiSCuS, a "Database System for Compound Selection", has been developed. The primary goal of DiSCuS is to aid researchers in the steps subsequent to generating high-throughput virtual screening (HTVS) results, such as selection of compounds for further study, purchase, or synthesis. To do so, DiSCuS provides (1) a stor...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci400587f

    authors: Wójcikowski M,Zielenkiewicz P,Siedlecki P

    update_date:2014-01-27 00:00:00

  • Locating sweet spots for screening hits and evaluating pan-assay interference filters from the performance analysis of two lead-like libraries.

    abstract::The efficiency of automated compound screening is heavily influenced by the design and the quality of the screening libraries used. We recently reported on the assembly of one diverse and one target-focused lead-like screening library. Using data from 15 enzyme-based screenings conducted using these libraries, their p...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci300382f

    authors: Mok NY,Maxe S,Brenk R

    update_date:2013-03-25 00:00:00

  • Estimation of ligand efficacies of metabotropic glutamate receptors from conformational forces obtained from molecular dynamics simulations.

    abstract::Group 1 metabotropic glutamate receptors (mGluR) are G-protein coupled receptors with a large bilobate extracellular ligand binding region (LBR) that resembles a Venus fly trap. Closing of this LBR in the presence of a ligand is associated with the activation of the receptor. From conformational sampling of the LBR-li...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci400160x

    authors: Lakkaraju SK,Xue F,Faden AI,MacKerell AD Jr

    update_date:2013-06-24 00:00:00

  • What Does the Machine Learn? Knowledge Representations of Chemical Reactivity.

    abstract::In a departure from conventional chemical approaches, data-driven models of chemical reactions have recently been shown to be statistically successful using machine learning. These models, however, are largely black box in character and have not provided the kind of chemical insights that historically advanced the fie...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00721

    authors: Kammeraad JA,Goetz J,Walker EA,Tewari A,Zimmerman PM

    update_date:2020-03-23 00:00:00

  • Including explicit water molecules as part of the protein structure in MM/PBSA calculations.

    abstract::Water is the natural medium of molecules in the cell and plays an important role in protein structure, function and interaction with small molecule ligands. However, the widely used molecular mechanics Poisson-Boltzmann surface area (MM/PBSA) method for binding energy calculation does not explicitly take account of wa...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci4001794

    authors: Zhu YL,Beroza P,Artis DR

    update_date:2014-02-24 00:00:00

  • Gas-phase and solution conformations of selected dimeric structural units of heparin.

    abstract::The molecular structure of four dimeric units (D-E, E-F, F-G, and G-H) of the DEFGH structural unit of heparin, their anionic forms, and their sodium salts have been studied using the B3LYP/6-31+G(d) method. The optimized geometries indicate that the most stable structure of these dimeric units in neutral state is sta...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci060060+

    authors: Remko M,von der Lieth CW

    update_date:2006-07-01 00:00:00

  • Ligand-Based Discovery of a New Scaffold for Allosteric Modulation of the μ-Opioid Receptor.

    abstract::With the hope of discovering effective analgesics with fewer side effects, attention has recently shifted to allosteric modulators of the opioid receptors. In the past two years, the first chemotypes of positive or silent allosteric modulators (PAMs or SAMs, respectively) of μ- and δ-opioid receptor types have been re...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.5b00388

    authors: Bisignano P,Burford NT,Shang Y,Marlow B,Livingston KE,Fenton AM,Rockwell K,Budenholzer L,Traynor JR,Gerritz SW,Alt A,Filizola M

    update_date:2015-09-28 00:00:00

  • Retrospect and Prospect of Single Particle Cryo-Electron Microscopy: The Class of Integral Membrane Proteins as an Example.

    abstract::A giant technological leap in the field of cryo-electron microscopy (cryo-EM) has assured the achievement of near-atomic resolution structures of biological macromolecules. As a recognition of this accomplishment, the Nobel Prize in Chemistry was awarded in 2017 to Jacques Dubochet, Joachim Frank, and Richard Henderso...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b01015

    authors: Akbar S,Mozumder S,Sengupta J

    update_date:2020-05-26 00:00:00

  • Efficient Strategy for the Calculation of Solvation Free Energies in Water and Chloroform at the Quantum Mechanical/Molecular Mechanical Level.

    abstract::The partitioning of solute molecules between immiscible solvents with significantly different polarities is of great importance. The polarization between the solute and solvent molecules plays an essential role in determining the solubility of the solute, which makes computational studies utilizing molecular mechanics...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00001

    authors: Wang M,Li P,Jia X,Liu W,Shao Y,Hu W,Zheng J,Brooks BR,Mei Y

    update_date:2017-10-23 00:00:00