Efficient Corrections for DFT Noncovalent Interactions Based on Ensemble Learning Models.

Abstract:

Machine learning has exhibited powerful capabilities in many areas. However, machine learning models are mostly database dependent, requiring a new model if the database changes. Therefore, a universal model is highly desired to accommodate the widest variety of databases. Fortunately, this universality may be achieved by ensemble learning, which can integrate multiple learners to meet the demands of diversified databases. Therefore, we propose a general procedure for establishing learning ensembles based on noncovalent interaction (NCI) databases. Additionally, accurate NCI computation is quite demanding for first-principles methods, so a competent machine learning model can be an efficient way to obtain high NCI accuracy with minimal computational resources. With these aspects in mind, multiple schemes of ensemble learning models (Bagging, Boosting, and Stacking frameworks) are explored in this study. The models are based on various low-level density functional theory (DFT) calculations for the benchmark databases S66, S22, and X40. All NCIs computed by the DFT calculations can be improved to high-level accuracy (root-mean-square error, RMSE = 0.22 kcal/mol relative to the CCSD(T)/CBS benchmark) by the established ensemble learning models. Compared with single machine learning models, ensemble models show better accuracy (the RMSE of the best model is further lowered by ∼25%), robustness, and goodness-of-fit according to evaluation parameters suggested by the OECD. Among the ensemble learning models, heterogeneous Stacking ensemble models show the most valuable application potential. The standardized procedure for constructing learning ensembles has been applied successfully to several NCI data sets, and it may also be applicable to other chemical databases.
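As an illustration of the general idea only, the following minimal sketch shows how a Bagging-style ensemble could correct low-level interaction energies toward reference values and how the RMSE is evaluated. It is not the authors' model: the energies are synthetic placeholder numbers, the base learner is a simple least-squares line, and the names `DFT`, `REF`, `fit_line`, and `bagged_correction` are hypothetical. The RMSE is computed on the training points themselves, so it says nothing about predictive performance.

```python
import math
import random

# Synthetic placeholder energies in kcal/mol (NOT taken from S66, S22, or X40).
REF = [-5.0, -3.2, -7.1, -1.5, -4.4, -6.3, -2.8, -0.9]      # stand-in "benchmark" values
NOISE = [0.03, -0.04, 0.02, 0.05, -0.03, 0.01, -0.02, 0.04]
DFT = [0.9 * r + 0.3 + e for r, e in zip(REF, NOISE)]        # stand-in "low-level" values


def rmse(pred, ref):
    """Root-mean-square error between predictions and reference values."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref))


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        # Degenerate sample (all x identical): fall back to a constant model.
        return 0.0, mean_y
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return a, mean_y - a * mean_x


def bagged_correction(dft, ref, n_models=25, seed=0):
    """Train n_models linear learners on bootstrap resamples; return an averaging predictor."""
    rng = random.Random(seed)
    n = len(dft)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]            # sample with replacement
        models.append(fit_line([dft[i] for i in idx], [ref[i] for i in idx]))

    def predict(x):
        # Bagging: average the predictions of all base learners.
        return sum(a * x + b for a, b in models) / len(models)

    return predict


if __name__ == "__main__":
    predict = bagged_correction(DFT, REF)
    corrected = [predict(x) for x in DFT]
    print("RMSE before correction:", round(rmse(DFT, REF), 3))
    print("RMSE after correction: ", round(rmse(corrected, REF), 3))
```

Boosting and Stacking differ in how the base learners are combined: Boosting fits learners sequentially on the remaining errors, and Stacking trains a second-level model on the base learners' outputs. Neither is shown in this sketch.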

journal_name

J Chem Inf Model

authors

Li W,Miao W,Cui J,Fang C,Su S,Li H,Hu L,Lu Y,Chen G

doi

10.1021/acs.jcim.8b00878

subject

Has Abstract

pub_date

2019-05-28 00:00:00

pages

1849-1857

issue

5

eissn

1549-960X

issn

1549-9596

journal_volume

59

pub_type

Journal article
  • Polarizable Force Field for Molecular Ions Based on the Classical Drude Oscillator.

    abstract::Development of accurate force field parameters for molecular ions in the context of a polarizable energy function based on the classical Drude oscillator is a crucial step toward an accurate polarizable model for modeling and simulations of biological macromolecules. Toward this goal we have undertaken a hierarchical ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal article

    doi:10.1021/acs.jcim.8b00132

    authors: Lin FY,Lopes PEM,Harder E,Roux B,MacKerell AD Jr

    update_date:2018-05-29 00:00:00

  • Underestimated Halogen Bonds Forming with Protein Backbone in Protein Data Bank.

    abstract::Halogen bonds (XBs) are attracting increasing attention in biological systems. Protein Data Bank (PDB) archives experimentally determined XBs in biological macromolecules. However, no software for structure refinement in X-ray crystallography takes into account XBs, which might result in the weakening or even vanishin...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal article

    doi:10.1021/acs.jcim.7b00235

    authors: Zhang Q,Xu Z,Shi J,Zhu W

    update_date:2017-07-24 00:00:00

  • Discovery of New SIRT2 Inhibitors by Utilizing a Consensus Docking/Scoring Strategy and Structure-Activity Relationship Analysis.

    abstract::SIRT2, which is a NAD+ (nicotinamide adenine dinucleotide) dependent deacetylase, has been demonstrated to play an important role in the occurrence and development of a variety of diseases such as cancer, ischemia-reperfusion, and neurodegenerative diseases. Small molecule inhibitors of SIRT2 are thought to be potenti...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal article

    doi:10.1021/acs.jcim.6b00714

    authors: Huang S,Song C,Wang X,Zhang G,Wang Y,Jiang X,Sun Q,Huang L,Xiang R,Hu Y,Li L,Yang S

    update_date:2017-04-24 00:00:00

  • Protein-protein binding site prediction by local structural alignment.

    abstract::Generalization of an earlier algorithm has led to the development of new local structural alignment algorithms for prediction of protein-protein binding sites. The algorithms use maximum cliques on protein graphs to define structurally similar protein regions. The search for structural neighbors in the new algorithms ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal article

    doi:10.1021/ci100265x

    authors: Carl N,Konc J,Vehar B,Janezic D

    update_date:2010-10-25 00:00:00

  • Mesoscopic simulation of phospholipid membranes, peptides, and proteins with molecular fragment dynamics.

    abstract::Molecular fragment dynamics (MFD) is a variant of dissipative particle dynamics (DPD), a coarse-grained mesoscopic simulation technique for isothermal complex fluids and soft matter systems with particles that are chosen to be adequate fluid elements. MFD chooses its particles to be small molecules which may be connecte...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal article

    doi:10.1021/ci5006096

    authors: Truszkowski A,van den Broek K,Kuhn H,Zielesny A,Epple M

    update_date:2015-05-26 00:00:00

  • Improved CoMFA modeling by optimization of settings.

    abstract::The possibility of improving the predictive ability of comparative molecular field analysis (CoMFA) by settings optimization has been evaluated to show that CoMFA predictive ability can be improved. Ten different CoMFA settings are evaluated, producing a total of 6120 models. This method has been applied to nine diffe...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal article

    doi:10.1021/ci049612j

    authors: Peterson SD,Schaal W,Karlén A

    update_date:2006-01-01 00:00:00

  • Computational Prediction and Biochemical Analyses of New Inverse Agonists for the CB1 Receptor.

    abstract::Human cannabinoid type 1 (CB1) G-protein coupled receptor is a potential therapeutic target for obesity. The previously predicted and experimentally validated ensemble of ligand-free conformations of CB1 [Scott, C. E. et al. Protein Sci. 2013, 22, 101-113; Ahn, K. H. et al. Proteins 2013, 81, 1304-1317] are u...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal article

    doi:10.1021/acs.jcim.5b00581

    authors: Scott CE,Ahn KH,Graf ST,Goddard WA 3rd,Kendall DA,Abrol R

    update_date:2016-01-25 00:00:00

  • Probing the Binding Pathway of BRACO19 to a Parallel-Stranded Human Telomeric G-Quadruplex Using Molecular Dynamics Binding Simulation with AMBER DNA OL15 and Ligand GAFF2 Force Fields.

    abstract::Human telomeric DNA G-quadruplex has been identified as a good therapeutic target in cancer treatment. G-quadruplex-specific ligands that stabilize the G-quadruplex have great potential to be developed as anticancer agents. Two crystal structures (an apo form of parallel stranded human telomeric G-quadruplex and its h...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal article

    doi:10.1021/acs.jcim.7b00287

    authors: Machireddy B,Kalra G,Jonnalagadda S,Ramanujachary K,Wu C

    update_date:2017-11-27 00:00:00

  • OPUS-Rota3: Improving Protein Side-Chain Modeling by Deep Neural Networks and Ensemble Methods.

    abstract::Side-chain modeling is critical for protein structure prediction since the uniqueness of the protein structure is largely determined by its side-chain packing conformation. In this paper, differing from most approaches that rely on rotamer library sampling, we first propose a novel side-chain rotamer prediction method...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal article

    doi:10.1021/acs.jcim.0c00951

    authors: Xu G,Wang Q,Ma J

    update_date:2020-12-28 00:00:00

  • Selective Fusion of Heterogeneous Classifiers for Predicting Substrates of Membrane Transporters.

    abstract::Membrane transporters play a crucial role in determining fate of administered drugs in a biological system. Early identification of plausible transporters for a drug molecule can provide insights into its therapeutic, pharmacokinetic, and toxicological profiles. In the present study, predictive models for classifying ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal article

    doi:10.1021/acs.jcim.6b00508

    authors: Shaikh N,Sharma M,Garg P

    update_date:2017-03-27 00:00:00

  • Combined 3D-QSAR modeling and molecular docking study on indolinone derivatives as inhibitors of 3-phosphoinositide-dependent protein kinase-1.

    abstract::3-Phosphoinositide-dependent protein kinase-1 (PDK1) is a promising target for developing novel anticancer drugs. In order to understand the structure-activity correlation of indolinone-based PDK1 inhibitors, we have carried out a combined molecular docking and three-dimensional quantitative structure-activity relatio...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal article

    doi:10.1021/ci800147v

    authors: AbdulHameed MD,Hamza A,Liu J,Zhan CG

    update_date:2008-09-01 00:00:00

  • GalaxyDock: protein-ligand docking with flexible protein side-chains.

    abstract::An important issue in developing protein-ligand docking methods is how to incorporate receptor flexibility. Consideration of receptor flexibility using an ensemble of precompiled receptor conformations or by employing an effectively enlarged binding pocket has been reported to be useful. However, direct consideration ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal article

    doi:10.1021/ci300342z

    authors: Shin WH,Seok C

    update_date:2012-12-21 00:00:00

  • Factors affecting d-block metal-ligand bond lengths: toward an automated library of molecular geometry for metal complexes.

    abstract::Metal-ligand (M-L) bond lengths for a range of ligands (carboxylates, chlorides, pyridines, water, tertiary phosphines, and alkenes) and a variety of metals have been retrieved from the Cambridge Structural Database, CSD. Analysis of the factors which affect M-L bond lengths (for example, ligand coordination mode, oxi...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal article

    doi:10.1021/ci0500785

    authors: Harris SE,Orpen AG,Bruno IJ,Taylor R

    update_date:2005-11-01 00:00:00

  • Prediction and Experimental Confirmation of Novel Peripheral Cannabinoid-1 Receptor Antagonists.

    abstract::Small molecules targeting peripheral CB1 receptors have therapeutic potential in a variety of disorders including obesity-related, hormonal, and metabolic abnormalities, while avoiding the psychoactive effects in the central nervous system. We applied our in-house algorithm, iterative stochastic elimination, to produc...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal article

    doi:10.1021/acs.jcim.9b00577

    authors: El-Atawneh S,Hirsch S,Hadar R,Tam J,Goldblum A

    update_date:2019-09-23 00:00:00

  • Radial clustergrams: visualizing the aggregate properties of hierarchical clusters.

    abstract::A new radial space-filling method for visualizing cluster hierarchies is presented. The method, referred to as a radial clustergram, arranges the clusters into a series of layers, each representing a different level of the tree. It uses adjacency of nodes instead of links to represent parent-child relationships and al...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal article

    doi:10.1021/ci600427x

    authors: Agrafiotis DK,Bandyopadhyay D,Farnum M

    update_date:2007-01-01 00:00:00

  • Discovery of Inhibitors of Four Bromodomains by Fragment-Anchored Ligand Docking.

    abstract::The high-throughput docking protocol called ALTA-VS (anchor-based library tailoring approach for virtual screening) was developed in 2005 for the efficient in silico screening of large libraries of compounds by preselection of only those molecules that have optimal fragments (anchors) for the protein target. Here we p...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal article

    doi:10.1021/acs.jcim.7b00336

    authors: Marchand JR,Dalle Vedove A,Lolli G,Caflisch A

    update_date:2017-10-23 00:00:00

  • Homology model-guided 3D-QSAR studies of HIV-1 integrase inhibitors.

    abstract::In the present study, we report the exploration of binding modes of potent HIV-1 integrase (IN) inhibitors MK-0518 (raltegravir) and GS-9137 (elvitegravir) as well as chalcone and related amide IN inhibitors we recently synthesized and the development of 3D-QSAR models for integrase inhibition. Homology models of DNA-...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal article

    doi:10.1021/ci200485a

    authors: Sharma H,Cheng X,Buolamwini JK

    update_date:2012-02-27 00:00:00

  • Prediction of pH-dependent aqueous solubility of druglike molecules.

    abstract::In the present work, the Henderson-Hasselbalch (HH) equation has been employed for the development of a tool for the prediction of pH-dependent aqueous solubility of drugs and drug candidates. A new prediction method for the intrinsic solubility was developed, based on artificial neural networks that have been trained...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal article

    doi:10.1021/ci600292q

    authors: Hansen NT,Kouskoumvekaki I,Jørgensen FS,Brunak S,Jónsdóttir SO

    update_date:2006-11-01 00:00:00

  • Characterization of DNA primary sequences by a new similarity/diversity measure based on the partial ordering.

    abstract::The similarity/diversity measures play a fundamental role in library searching, virtual screening, and quantitative structure-activity relationship/quantitative structure-property relationship modeling as well as in genomics and proteomics. In this paper, a new similarity/diversity measure is proposed as a new approac...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal article

    doi:10.1021/ci060099e

    authors: Todeschini R,Consonni V,Mauri A,Ballabio D

    update_date:2006-09-01 00:00:00

  • Evaluating QM/MM Free Energy Surfaces for Ranking Cysteine Protease Covalent Inhibitors.

    abstract::One tactic for cysteine protease inhibition is to form a covalent bond between an electrophilic atom of the inhibitor and the thiol of the catalytic cysteine. In this study, we evaluate the reaction free energy obtained from a hybrid quantum mechanical/molecular mechanical (QM/MM) free energy profile as a predictor of...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal article

    doi:10.1021/acs.jcim.9b00847

    authors: da Costa CHS,Bonatto V,Dos Santos AM,Lameira J,Leitão A,Montanari CA

    update_date:2020-02-24 00:00:00

  • Efficient Strategy for the Calculation of Solvation Free Energies in Water and Chloroform at the Quantum Mechanical/Molecular Mechanical Level.

    abstract::The partitioning of solute molecules between immiscible solvents with significantly different polarities is of great importance. The polarization between the solute and solvent molecules plays an essential role in determining the solubility of the solute, which makes computational studies utilizing molecular mechanics...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal article

    doi:10.1021/acs.jcim.7b00001

    authors: Wang M,Li P,Jia X,Liu W,Shao Y,Hu W,Zheng J,Brooks BR,Mei Y

    update_date:2017-10-23 00:00:00

  • Computational simulations of the interactions between acetyl-coenzyme-A carboxylase and clodinafop: resistance mechanism due to active and nonactive site mutations.

    abstract::Grass weed populations resistant to acetyl-CoA carboxylase-inhibiting (ACCase; EC 6.4.1.2) herbicides represent a major problem for the sustainable development of modern agriculture. In the present study, extensive computational simulations, including homology modeling, molecular dynamics (MD) simulations, and molecul...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal article

    doi:10.1021/ci900174d

    authors: Zhu XL,Ge-Fei H,Zhan CG,Yang GF

    update_date:2009-08-01 00:00:00

  • CHARMMing: a new, flexible web portal for CHARMM.

    abstract::A new web portal for the CHARMM macromolecular modeling package, CHARMMing (CHARMM interface and graphics, http://www.charmming.org), is presented. This tool provides a user-friendly interface for the preparation, submission, monitoring, and visualization of molecular simulations (i.e., energy minimization, solvation,...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal article

    doi:10.1021/ci800133b

    authors: Miller BT,Singh RP,Klauda JB,Hodoscek M,Brooks BR,Woodcock HL 3rd

    update_date:2008-09-01 00:00:00

  • Structure-Based Kinase Profiling To Understand the Polypharmacological Behavior of Therapeutic Molecules.

    abstract::Several drugs elicit their therapeutic efficacy by modulating multiple cellular targets and possess varied polypharmacological actions. The identification of the molecular targets of a potent bioactive molecule is essential in determining its overall polypharmacological profile. Experimental procedures are expensive a...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal article

    doi:10.1021/acs.jcim.7b00227

    authors: Dutta D,Das R,Mandal C,Mandal C

    update_date:2018-01-22 00:00:00

  • First Multitarget Chemo-Bioinformatic Model To Enable the Discovery of Antibacterial Peptides against Multiple Gram-Positive Pathogens.

    abstract::Antimicrobial peptides (AMPs) have emerged as promising therapeutic alternatives to fight against the diverse infections caused by different pathogenic microorganisms. In this context, theoretical approaches in bioinformatics have paved the way toward the creation of several in silico models capable of predicting anti...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal article

    doi:10.1021/acs.jcim.5b00630

    authors: Speck-Planche A,Kleandrova VV,Ruso JM,Cordeiro MN

    update_date:2016-03-28 00:00:00

  • Phytochemical informatics of traditional Chinese medicine and therapeutic relevance.

    abstract::Distribution patterns of 8411 compounds from 240 Chinese herbs were analyzed in relation to the herbal categories of traditional Chinese medicine (TCM), using Random Forest (RF) and self-organizing maps (SOM). RF was used first to construct TCM profiles of individual compounds, which describe their affinities for 28 m...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal article

    doi:10.1021/ci700155t

    authors: Ehrman TM,Barlow DJ,Hylands PJ

    update_date:2007-11-01 00:00:00

  • RDChiral: An RDKit Wrapper for Handling Stereochemistry in Retrosynthetic Template Extraction and Application.

    abstract::There is a renewed interest in computer-aided synthesis planning, where the vast majority of approaches require the application of retrosynthetic reaction templates. Here we introduce RDChiral, an open-source Python wrapper for RDKit designed to provide consistent handling of stereochemical information in applying ret...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal article

    doi:10.1021/acs.jcim.9b00286

    authors: Coley CW,Green WH,Jensen KF

    update_date:2019-06-24 00:00:00

  • Benchmark Sets for Binding Hot Spot Identification in Fragment-Based Ligand Discovery.

    abstract::Binding hot spots are regions of proteins that, due to their potentially high contribution to the binding free energy, have high propensity to bind small molecules. We present benchmark sets for testing computational methods for the identification of binding hot spots with emphasis on fragment-based ligand discovery. ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal article

    doi:10.1021/acs.jcim.0c00877

    authors: Wakefield AE,Yueh C,Beglov D,Castilho MS,Kozakov D,Keserű GM,Whitty A,Vajda S

    update_date:2020-12-28 00:00:00

  • BCL::MolAlign: Three-Dimensional Small Molecule Alignment for Pharmacophore Mapping.

    abstract::Small molecule flexible alignment is a critical component of both ligand- and structure-based methods in computer-aided drug discovery. Despite its importance, the availability of high-quality flexible alignment software packages is limited. Here, we present BCL::MolAlign, a freely available property-based molecular a...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal article

    doi:10.1021/acs.jcim.9b00020

    authors: Brown BP,Mendenhall J,Meiler J

    update_date:2019-02-25 00:00:00

  • 3D-QSAR and docking studies of selective GSK-3beta inhibitors. Comparison with a thieno[2,3-b]pyrrolizinone derivative, a new potential lead for GSK-3beta ligands.

    abstract::The three-dimensional structures of 3-anilino-4-arylmaleimides, selective GSK-3beta inhibitors, were correlated to their biological affinities by 3D-QSAR studies (CoMFA method). The cocrystallographic data of GSK-3beta vs 3-anilino-4-arylmaleimide allowed us to compare 3D-QSAR results to experimental intermolecular in...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal article

    doi:10.1021/ci050008y

    authors: Lescot E,Bureau R,Sopkova-de Oliveira Santos J,Rochais C,Lisowski V,Lancelot JC,Rault S

    update_date:2005-05-01 00:00:00