Abstract:
Machine learning has exhibited powerful capabilities in many areas. However, machine learning models are mostly database dependent, requiring a new model if the database changes. Therefore, a universal model is highly desired to accommodate the widest variety of databases. Fortunately, this universality may be achieved by ensemble learning, which can integrate multiple learners to meet the demands of diversified databases. Therefore, we propose a general procedure for learning ensemble establishment based on noncovalent interactions (NCIs) databases. Additionally, accurate NCI computation is quite demanding for first-principles methods, for which a competent machine learning model can be an efficient solution to obtain high NCI accuracy with minimal computational resources. In regard to these aspects, multiple schemes of ensemble learning models (Bagging, Boosting, and Stacking frameworks) are explored in this study. The models are based on various low levels of density functional theory (DFT) calculations for the benchmark databases S66, S22, and X40. All NCIs computed by the DFT calculations can be improved to high-level accuracy (root-mean-square error RMSE = 0.22 kcal/mol in contrast to CCSD(T)/CBS benchmark) by established ensemble learning models. Compared with single machine learning models, ensemble models show better accuracy (RMSE of the best model is further lowered by ∼25%), robustness and goodness-of-fit according to evaluation parameters suggested by the OECD. Among ensemble learning models, heterogeneous Stacking ensemble models show the most valuable application potential. The standardized procedure of constructing learning ensembles has been well utilized on several NCI data sets, and this procedure may also be applicable for other chemical databases.
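The idea of correcting low-level interaction energies toward a high-level reference with a Bagging ensemble, and scoring the result by RMSE, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the energies are synthetic numbers (not S66, S22, or X40 data), the single-feature linear base learner stands in for whatever learners the authors used, and the function names are hypothetical.

```python
import math
import random


def rmse(pred, ref):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref))


def fit_line(x, y):
    """Ordinary least-squares fit y ~ a*x + b for a single feature."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, my - a * mx


def bagging_fit(x, y, n_models=25, seed=0):
    """Train one linear learner per bootstrap resample of (x, y)."""
    rng = random.Random(seed)
    n = len(x)
    models = []
    while len(models) < n_models:
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        if len(set(xs)) < 2:
            continue  # skip a degenerate resample with no spread in x
        models.append(fit_line(xs, [y[i] for i in idx]))
    return models


def bagging_predict(models, x):
    """Average the predictions of all bootstrap learners."""
    return [sum(a * xi + b for a, b in models) / len(models) for xi in x]


# Synthetic stand-in data (kcal/mol): a "reference" energy and a "low-level"
# energy carrying a systematic error plus noise. These are NOT benchmark values.
rng = random.Random(1)
reference = [rng.uniform(-20.0, -0.5) for _ in range(60)]
low_level = [0.9 * e + 0.4 + rng.gauss(0.0, 0.2) for e in reference]

# Hold out the last 20 points so the ensemble is scored on unseen data.
train_x, test_x = low_level[:40], low_level[40:]
train_y, test_y = reference[:40], reference[40:]

models = bagging_fit(train_x, train_y)
raw_error = rmse(test_x, test_y)
bagged_error = rmse(bagging_predict(models, test_x), test_y)
```

Comparing `raw_error` with `bagged_error` on the held-out points shows how much of the systematic error the ensemble removes in this toy setting. A Stacking ensemble would instead feed the predictions of several different base learners into a second-level model.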
journal_name: J Chem Inf Model
journal_title: Journal of chemical information and modeling
authors: Li W, Miao W, Cui J, Fang C, Su S, Li H, Hu L, Lu Y, Chen G
doi: 10.1021/acs.jcim.8b00878
subject: Has Abstract
pub_date: 2019-05-28 00:00:00
pages: 1849-1857
issue: 5
eissn: 1549-9596
issn: 1549-960X
journal_volume: 59
pub_type: Journal Article
abstract::Development of accurate force field parameters for molecular ions in the context of a polarizable energy function based on the classical Drude oscillator is a crucial step toward an accurate polarizable model for modeling and simulations of biological macromolecules. Toward this goal we have undertaken a hierarchical ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.8b00132
update_date:2018-05-29 00:00:00
abstract::Halogen bonds (XBs) are attracting increasing attention in biological systems. Protein Data Bank (PDB) archives experimentally determined XBs in biological macromolecules. However, no software for structure refinement in X-ray crystallography takes into account XBs, which might result in the weakening or even vanishin...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.7b00235
update_date:2017-07-24 00:00:00
abstract::SIRT2, which is a NAD+ (nicotinamide adenine dinucleotide) dependent deacetylase, has been demonstrated to play an important role in the occurrence and development of a variety of diseases such as cancer, ischemia-reperfusion, and neurodegenerative diseases. Small molecule inhibitors of SIRT2 are thought to be potenti...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.6b00714
update_date:2017-04-24 00:00:00
abstract::Generalization of an earlier algorithm has led to the development of new local structural alignment algorithms for prediction of protein-protein binding sites. The algorithms use maximum cliques on protein graphs to define structurally similar protein regions. The search for structural neighbors in the new algorithms ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci100265x
update_date:2010-10-25 00:00:00
abstract::Molecular fragment dynamics (MFD) is a variant of dissipative particle dynamics (DPD), a coarse-grained mesoscopic simulation technique for isothermal complex fluids and soft matter systems with particles that are chosen to be adequate fluid elements. MFD chooses its particles to be small molecules which may be connecte...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci5006096
update_date:2015-05-26 00:00:00
abstract::The possibility of improving the predictive ability of comparative molecular field analysis (CoMFA) by settings optimization has been evaluated to show that CoMFA predictive ability can be improved. Ten different CoMFA settings are evaluated, producing a total of 6120 models. This method has been applied to nine diffe...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci049612j
update_date:2006-01-01 00:00:00
abstract::Human cannabinoid type 1 (CB1) G-protein coupled receptor is a potential therapeutic target for obesity. The previously predicted and experimentally validated ensemble of ligand-free conformations of CB1 [Scott, C. E. et al. Protein Sci. 2013, 22, 101-113; Ahn, K. H. et al. Proteins 2013, 81, 1304-1317] are u...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.5b00581
update_date:2016-01-25 00:00:00
abstract::Human telomeric DNA G-quadruplex has been identified as a good therapeutic target in cancer treatment. G-quadruplex-specific ligands that stabilize the G-quadruplex have great potential to be developed as anticancer agents. Two crystal structures (an apo form of parallel stranded human telomeric G-quadruplex and its h...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.7b00287
update_date:2017-11-27 00:00:00
abstract::Side-chain modeling is critical for protein structure prediction since the uniqueness of the protein structure is largely determined by its side-chain packing conformation. In this paper, differing from most approaches that rely on rotamer library sampling, we first propose a novel side-chain rotamer prediction method...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.0c00951
update_date:2020-12-28 00:00:00
abstract::Membrane transporters play a crucial role in determining fate of administered drugs in a biological system. Early identification of plausible transporters for a drug molecule can provide insights into its therapeutic, pharmacokinetic, and toxicological profiles. In the present study, predictive models for classifying ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.6b00508
update_date:2017-03-27 00:00:00
abstract::3-Phosphoinositide-dependent protein kinase-1 (PDK1) is a promising target for developing novel anticancer drugs. In order to understand the structure-activity correlation of indolinone-based PDK1 inhibitors, we have carried out a combined molecular docking and three-dimensional quantitative structure-activity relatio...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci800147v
update_date:2008-09-01 00:00:00
abstract::An important issue in developing protein-ligand docking methods is how to incorporate receptor flexibility. Consideration of receptor flexibility using an ensemble of precompiled receptor conformations or by employing an effectively enlarged binding pocket has been reported to be useful. However, direct consideration ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci300342z
update_date:2012-12-21 00:00:00
abstract::Metal-ligand (M-L) bond lengths for a range of ligands (carboxylates, chlorides, pyridines, water, tertiary phosphines, and alkenes) and a variety of metals have been retrieved from the Cambridge Structural Database, CSD. Analysis of the factors which affect M-L bond lengths (for example, ligand coordination mode, oxi...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci0500785
update_date:2005-11-01 00:00:00
abstract::Small molecules targeting peripheral CB1 receptors have therapeutic potential in a variety of disorders including obesity-related, hormonal, and metabolic abnormalities, while avoiding the psychoactive effects in the central nervous system. We applied our in-house algorithm, iterative stochastic elimination, to produc...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.9b00577
update_date:2019-09-23 00:00:00
abstract::A new radial space-filling method for visualizing cluster hierarchies is presented. The method, referred to as a radial clustergram, arranges the clusters into a series of layers, each representing a different level of the tree. It uses adjacency of nodes instead of links to represent parent-child relationships and al...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci600427x
update_date:2007-01-01 00:00:00
abstract::The high-throughput docking protocol called ALTA-VS (anchor-based library tailoring approach for virtual screening) was developed in 2005 for the efficient in silico screening of large libraries of compounds by preselection of only those molecules that have optimal fragments (anchors) for the protein target. Here we p...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.7b00336
update_date:2017-10-23 00:00:00
abstract::In the present study, we report the exploration of binding modes of potent HIV-1 integrase (IN) inhibitors MK-0518 (raltegravir) and GS-9137 (elvitegravir) as well as chalcone and related amide IN inhibitors we recently synthesized and the development of 3D-QSAR models for integrase inhibition. Homology models of DNA-...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci200485a
update_date:2012-02-27 00:00:00
abstract::In the present work, the Henderson-Hasselbalch (HH) equation has been employed for the development of a tool for the prediction of pH-dependent aqueous solubility of drugs and drug candidates. A new prediction method for the intrinsic solubility was developed, based on artificial neural networks that have been trained...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci600292q
update_date:2006-11-01 00:00:00
abstract::The similarity/diversity measures play a fundamental role in library searching, virtual screening, and quantitative structure-activity relationship/quantitative structure-property relationship modeling as well as in genomics and proteomics. In this paper, a new similarity/diversity measure is proposed as a new approac...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci060099e
update_date:2006-09-01 00:00:00
abstract::One tactic for cysteine protease inhibition is to form a covalent bond between an electrophilic atom of the inhibitor and the thiol of the catalytic cysteine. In this study, we evaluate the reaction free energy obtained from a hybrid quantum mechanical/molecular mechanical (QM/MM) free energy profile as a predictor of...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.9b00847
update_date:2020-02-24 00:00:00
abstract::The partitioning of solute molecules between immiscible solvents with significantly different polarities is of great importance. The polarization between the solute and solvent molecules plays an essential role in determining the solubility of the solute, which makes computational studies utilizing molecular mechanics...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.7b00001
update_date:2017-10-23 00:00:00
abstract::Grass weed populations resistant to acetyl-CoA carboxylase-inhibiting (ACCase; EC 6.4.1.2) herbicides represent a major problem for the sustainable development of modern agriculture. In the present study, extensive computational simulations, including homology modeling, molecular dynamics (MD) simulations, and molecul...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci900174d
update_date:2009-08-01 00:00:00
abstract::A new web portal for the CHARMM macromolecular modeling package, CHARMMing (CHARMM interface and graphics, http://www.charmming.org), is presented. This tool provides a user-friendly interface for the preparation, submission, monitoring, and visualization of molecular simulations (i.e., energy minimization, solvation,...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci800133b
update_date:2008-09-01 00:00:00
abstract::Several drugs elicit their therapeutic efficacy by modulating multiple cellular targets and possess varied polypharmacological actions. The identification of the molecular targets of a potent bioactive molecule is essential in determining its overall polypharmacological profile. Experimental procedures are expensive a...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.7b00227
update_date:2018-01-22 00:00:00
abstract::Antimicrobial peptides (AMPs) have emerged as promising therapeutic alternatives to fight against the diverse infections caused by different pathogenic microorganisms. In this context, theoretical approaches in bioinformatics have paved the way toward the creation of several in silico models capable of predicting anti...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.5b00630
update_date:2016-03-28 00:00:00
abstract::Distribution patterns of 8411 compounds from 240 Chinese herbs were analyzed in relation to the herbal categories of traditional Chinese medicine (TCM), using Random Forest (RF) and self-organizing maps (SOM). RF was used first to construct TCM profiles of individual compounds, which describe their affinities for 28 m...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci700155t
update_date:2007-11-01 00:00:00
abstract::There is a renewed interest in computer-aided synthesis planning, where the vast majority of approaches require the application of retrosynthetic reaction templates. Here we introduce RDChiral, an open-source Python wrapper for RDKit designed to provide consistent handling of stereochemical information in applying ret...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.9b00286
update_date:2019-06-24 00:00:00
abstract::Binding hot spots are regions of proteins that, due to their potentially high contribution to the binding free energy, have high propensity to bind small molecules. We present benchmark sets for testing computational methods for the identification of binding hot spots with emphasis on fragment-based ligand discovery. ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.0c00877
update_date:2020-12-28 00:00:00
abstract::Small molecule flexible alignment is a critical component of both ligand- and structure-based methods in computer-aided drug discovery. Despite its importance, the availability of high-quality flexible alignment software packages is limited. Here, we present BCL::MolAlign, a freely available property-based molecular a...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.9b00020
update_date:2019-02-25 00:00:00
abstract::The three-dimensional structures of 3-anilino-4-arylmaleimides, selective GSK-3beta inhibitors, were correlated to their biological affinities by 3D-QSAR studies (CoMFA method). The cocrystallographic data of GSK-3beta vs 3-anilino-4-arylmaleimide allowed us to compare 3D-QSAR results to experimental intermolecular in...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci050008y
update_date:2005-05-01 00:00:00