Abstract:
The prediction of protein-ligand binding affinity has recently been improved remarkably by machine-learning-based scoring functions. For example, using a set of simple descriptors representing atomic distance counts, RF-Score raises the Pearson correlation coefficient to about 0.8 on the core set of the PDBbind 2007 database, which is significantly higher than the performance of any conventional scoring function on the same benchmark. A few studies have discussed the performance of machine-learning-based methods, but the reason for this improvement remains unclear. In this study, by systematically controlling the structural and sequence similarity between the training and test proteins of the PDBbind benchmark, we demonstrate that protein structural and sequence similarity has a significant impact on machine-learning-based methods. After removal of training proteins that are highly similar to the test proteins, as identified by structure alignment and sequence alignment, machine-learning-based methods trained on the new training sets no longer outperform the conventional scoring functions. In contrast, the performance of conventional functions like X-Score is relatively stable no matter what training data are used to fit the weights of its energy terms.
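The similarity-controlled evaluation described in the abstract can be illustrated with a minimal sketch. The function names, the `similarity` dictionary, and the cutoff value are hypothetical and chosen only for illustration; this is not the authors' implementation, and the structure-alignment and sequence-alignment tools that would produce the similarity scores are not reproduced here.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def filter_training_set(train_ids, test_ids, similarity, cutoff):
    """Keep training complexes whose highest similarity to any test complex
    is below `cutoff`.

    `similarity` is an assumed dict mapping (train_id, test_id) pairs to a
    score in [0, 1], e.g. a sequence identity or a structure-alignment score.
    Pairs missing from the dict are treated as having similarity 0.
    """
    kept = []
    for tr in train_ids:
        max_sim = max(
            (similarity.get((tr, te), 0.0) for te in test_ids), default=0.0
        )
        if max_sim < cutoff:
            kept.append(tr)
    return kept
```

In such a setup, a scoring function would be refit on the filtered training set for each cutoff, and `pearson` would compare its predictions with the experimental affinities of the unchanged test set.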
journal_name:J Chem Inf Model
journal_title:Journal of chemical information and modeling
authors:Li Y, Yang J
doi:10.1021/acs.jcim.7b00049
subject:Has Abstract
pub_date:2017-04-24 00:00:00
pages:1007-1012
issue:4
eissn:1549-9596
issn:1549-960X
journal_volume:57
pub_type: Journal article
abstract::CYP19A1, also known as aromatase or estrogen synthetase, is the rate-limiting enzyme in the biosynthesis of estrogens from their corresponding androgens. Several clinically used breast cancer therapies target aromatase. In this work, explicitly solvated all-atom molecular dynamics simulations of aromatase with a model...
journal_title:Journal of chemical information and modeling
pub_type: Journal article
doi:10.1021/ci400225w
update_date:2013-08-26 00:00:00
abstract::An ultrafast docking and virtual screening program, CRDOCK, is presented that contains (1) a search engine that can use a variety of sampling methods and an initial energy evaluation function, (2) several energy minimization algorithms for fine tuning the binding poses, and (3) different scoring functions. This modula...
journal_title:Journal of chemical information and modeling
pub_type: Journal article
doi:10.1021/ci300194a
update_date:2012-08-27 00:00:00
abstract::Advances in the development of high-throughput screening and automated chemistry have rapidly accelerated the production of chemical and biological data, much of them freely accessible through literature aggregator services such as ChEMBL and PubChem. Here, we explore how to use this comprehensive mapping of chemical ...
journal_title:Journal of chemical information and modeling
pub_type: Journal article
doi:10.1021/acs.jcim.9b00526
update_date:2019-11-25 00:00:00
abstract::The similarity/diversity measures play a fundamental role in library searching, virtual screening, and quantitative structure-activity relationship/quantitative structure-property relationship modeling as well as in genomics and proteomics. In this paper, a new similarity/diversity measure is proposed as a new approac...
journal_title:Journal of chemical information and modeling
pub_type: Journal article
doi:10.1021/ci060099e
update_date:2006-09-01 00:00:00
abstract::Universal generative topographic maps (GTMs) provide two-dimensional representations of chemical space selected for their "polypharmacological competence", that is, the ability to simultaneously represent meaningful activity and property landscapes, associated with many distinct targets and properties. Several such GT...
journal_title:Journal of chemical information and modeling
pub_type: Journal article
doi:10.1021/acs.jcim.8b00650
update_date:2019-01-28 00:00:00
abstract::We report the development of homology models of dopamine (D(2), D(3), and D(4)), serotonin (5-HT(1B), 5-HT(2A), 5-HT(2B), and 5-HT(2C)), histamine (H(1)), and muscarinic (M(1)) receptors, based on the high-resolution structure of the beta(2)-adrenergic receptor. The homology models were built and refined using Prime. ...
journal_title:Journal of chemical information and modeling
pub_type: Journal article
doi:10.1021/ci900444q
update_date:2010-04-26 00:00:00
abstract::With the emergence of large collections of protein-ligand complexes complemented by binding data, as found in PDBbind or BindingMOAD, new opportunities for parametrizing and evaluating scoring functions have arisen. With huge data collections available, it becomes feasible to fit scoring functions in a QSAR style, i.e...
journal_title:Journal of chemical information and modeling
pub_type: Journal article
doi:10.1021/ci100264e
update_date:2010-11-22 00:00:00
abstract::It is demonstrated that the fragmentation of druglike molecules by applying simplistic pseudo-retrosynthesis results in a stock of chemically meaningful building blocks for de novo molecule generation. A stochastic search algorithm in conjunction with ligand-based similarity scoring (Flux: fragment-based ligand builde...
journal_title:Journal of chemical information and modeling
pub_type: Journal article
doi:10.1021/ci0503560
update_date:2006-03-01 00:00:00
abstract::Proteins often have both orthosteric and allosteric binding sites. Endogenous ligands, such as hormones and neurotransmitters, bind to the orthosteric site, while synthetic ligands may bind to orthosteric or allosteric sites, which has become a focal point in drug discovery. Usually, such allosteric modulators bind to...
journal_title:Journal of chemical information and modeling
pub_type: Journal article
doi:10.1021/acs.jcim.0c00695
update_date:2020-10-26 00:00:00
abstract::We develop the idea that the use of ad hoc molecular descriptors in QSAR/QSPR studies is not an optimal solution. Instead, we propose to optimize these descriptors for the specific properties under study. In the case of topological indices (TIs) we propose the use of the generalized topological indices (GTIs), which a...
journal_title:Journal of chemical information and modeling
pub_type: Journal article
doi:10.1021/ci600448b
update_date:2007-05-01 00:00:00
abstract::Engineering shape-controlled bionanomaterials requires comprehensive understanding of interactions between biomolecules and inorganic surfaces. We explore the origin of facet-selective binding of peptides adsorbed onto Pt(100) and Pt(111) crystallographic planes. Using molecular dynamics simulations, we show that upon...
journal_title:Journal of chemical information and modeling
pub_type: Journal article
doi:10.1021/ci400630d
update_date:2013-12-23 00:00:00
abstract::The primary goal of this project was to evaluate the performance of the Standard and Enforced Geometry Optimization (SEGO) method which we have recently developed. The SEGO method has been designed for an automatic location of multiple minima on the molecular Potential Energy Surface (PES), and its usefulness has been...
journal_title:Journal of chemical information and modeling
pub_type: Journal article
doi:10.1021/acs.jcim.9b00352
update_date:2019-08-26 00:00:00
abstract::We have developed a computer program for molecular dynamics (MD) simulation that implements the Split Integration Symplectic Method (SISM) and is designed to run on specialized parallel computers. The MD integration is performed by the SISM, which analytically treats high-frequency vibrational motion and thus enables ...
journal_title:Journal of chemical information and modeling
pub_type: Journal article
doi:10.1021/ci050216q
update_date:2005-11-01 00:00:00
abstract::Nitric oxide (NO) is an important signaling molecule produced by a family of enzymes called nitric oxide synthases (NOS). Because NO is involved in various pathological conditions, the development of potent and isoform-selective NOS inhibitors is an important challenge. In the present study, the dimer of oxygenase dom...
journal_title:Journal of chemical information and modeling
pub_type: Journal article
doi:10.1021/ci100422v
update_date:2011-06-27 00:00:00
abstract::Most physiological effects of thyroid hormones are mediated by the two thyroid hormone receptor subtypes, TRalpha and TRbeta. Several pharmacological effects mediated by TRbeta might be beneficial in important medical conditions such as obesity, hypercholesterolemia and diabetes, and selective TRbeta activation may el...
journal_title:Journal of chemical information and modeling
pub_type: Journal article
doi:10.1021/ci900316e
update_date:2009-11-01 00:00:00
abstract::Generation and prioritization of new molecules are the most central part of the drug design process. Matched molecular series analysis (MMSA) has recently been proposed as a formal approach that captures both of these key elements of design. In order to better understand the power of MMSA and its specific limitations,...
journal_title:Journal of chemical information and modeling
pub_type: Journal article
doi:10.1021/acs.jcim.0c00269
update_date:2020-06-22 00:00:00
abstract::In this study, we tried to establish a general scheme to create a model that could predict the affinity of small compounds to their target proteins. This scheme consists of a search for ligand-binding sites on a protein, a generation of bound conformations (poses) of ligands in each of the sites by docking, identifica...
journal_title:Journal of chemical information and modeling
pub_type: Journal article
doi:10.1021/ci800313h
update_date:2009-04-01 00:00:00
abstract::Calcium is involved in important intracellular processes, such as intracellular signaling from cell membrane receptors to the nucleus. Typically, calcium levels are kept at less than 100 nM in the nucleus and cytosol, but some calcium is stored in the endoplasmic reticulum (ER) lumen for rapid release to activate intr...
journal_title:Journal of chemical information and modeling
pub_type: Journal article
doi:10.1021/acs.jcim.6b00475
update_date:2017-02-27 00:00:00
abstract::Small molecule flexible alignment is a critical component of both ligand- and structure-based methods in computer-aided drug discovery. Despite its importance, the availability of high-quality flexible alignment software packages is limited. Here, we present BCL::MolAlign, a freely available property-based molecular a...
journal_title:Journal of chemical information and modeling
pub_type: Journal article
doi:10.1021/acs.jcim.9b00020
update_date:2019-02-25 00:00:00
abstract::Novel statistical potentials derived from known protein structures are presented. They are designed to describe cation-pi and amino-pi interactions between a positively charged amino acid or an amino acid carrying a partially charged amino group and an aromatic moiety. These potentials are based on the propensity of r...
journal_title:Journal of chemical information and modeling
pub_type: Journal article
doi:10.1021/ci050395b
update_date:2006-03-01 00:00:00
abstract::Herein we investigate whether QM/MM could prove useful as a tool to study the often subtle binding phenomena found within pharmaceutical drug discovery programs. The goal of this investigation is to determine whether it is possible to employ high level QM/MM calculations to answer specific questions around a binding e...
journal_title:Journal of chemical information and modeling
pub_type: Journal article
doi:10.1021/ci800419j
update_date:2009-03-01 00:00:00
abstract::One tactic for cysteine protease inhibition is to form a covalent bond between an electrophilic atom of the inhibitor and the thiol of the catalytic cysteine. In this study, we evaluate the reaction free energy obtained from a hybrid quantum mechanical/molecular mechanical (QM/MM) free energy profile as a predictor of...
journal_title:Journal of chemical information and modeling
pub_type: Journal article
doi:10.1021/acs.jcim.9b00847
update_date:2020-02-24 00:00:00
abstract::The predictive performance of MC4PC was evaluated using its learning machine functionality. Its superior characteristics are demonstrated in this follow-up study using the newly published Ames mutagenicity benchmark set. ...
journal_title:Journal of chemical information and modeling
pub_type: Comment, Letter
doi:10.1021/ci1000899
update_date:2010-09-27 00:00:00
abstract::We propose an improved solvent contact model to estimate the solvation free energy of an organic molecule from individual atomic contributions. The modification of the solvation model involves the optimization of three kinds of parameters in the solvation free energy function: atomic fragmental volume, maximum atomic ...
journal_title:Journal of chemical information and modeling
pub_type: Journal article
doi:10.1021/ci600453b
update_date:2007-03-01 00:00:00
abstract::The in silico prediction of unwanted side effects (SEs) caused by the promiscuous behavior of drugs and their targets is highly relevant to the pharmaceutical industry. Considerable effort is now being put into computational and experimental screening of several suspected off-target proteins in the hope that SEs might...
journal_title:Journal of chemical information and modeling
pub_type: Journal article
doi:10.1021/acs.jcim.5b00120
update_date:2015-09-28 00:00:00
abstract::Random Forest regression (RF), Partial-Least-Squares (PLS) regression, Support Vector Machines (SVM), and Artificial Neural Networks (ANN) were used to develop QSPR models for the prediction of aqueous solubility, based on experimental data for 988 organic molecules. The Random Forest regression model predicted aqueou...
journal_title:Journal of chemical information and modeling
pub_type: Journal article
doi:10.1021/ci060164k
update_date:2007-01-01 00:00:00
abstract::Reaction classification has important applications, and many approaches to classification have been applied. Our own algorithm tests all maximum common substructures (MCS) between all reactant and product molecules in order to find an atom mapping containing the minimum chemical distance (MCD). Recent publications hav...
journal_title:Journal of chemical information and modeling
pub_type: Journal article
doi:10.1021/ci400442f
update_date:2013-11-25 00:00:00
abstract::We present a theoretical study on the performance of ensemble docking methodologies considering multiple protein structures. We perform a theoretical analysis of pose prediction experiments which is completely unbiased, as we make no assumptions about specific scoring functions, search paradigms, protein structures, o...
journal_title:Journal of chemical information and modeling
pub_type: Journal article
doi:10.1021/ci2002796
update_date:2011-11-28 00:00:00
abstract::The applicability and scope of 3D QSAR methods (CoMFA, CoMSIA) to screen databases are examined. A protocol requiring minimal user intervention has been established to align training and test set molecules using FlexS. As model system isozymes of human carbonic anhydrase (hCA) are used, all results are exemplified stu...
journal_title:Journal of chemical information and modeling
pub_type: Journal article
doi:10.1021/ci7002945
update_date:2008-02-01 00:00:00
abstract::Several drugs elicit their therapeutic efficacy by modulating multiple cellular targets and possess varied polypharmacological actions. The identification of the molecular targets of a potent bioactive molecule is essential in determining its overall polypharmacological profile. Experimental procedures are expensive a...
journal_title:Journal of chemical information and modeling
pub_type: Journal article
doi:10.1021/acs.jcim.7b00227
update_date:2018-01-22 00:00:00