Abstract:
Machine-learning-based scoring functions have recently brought remarkable improvements to the prediction of protein-ligand binding affinity. For example, using a set of simple descriptors that represent atomic distance counts, RF-Score raises the Pearson correlation coefficient to about 0.8 on the core set of the PDBbind 2007 database, significantly higher than any conventional scoring function on the same benchmark. A few studies have examined the performance of machine-learning-based methods, but the reason for this improvement remains unclear. In this study, by systematically controlling the structural and sequence similarity between the training and test proteins of the PDBbind benchmark, we demonstrate that protein structural and sequence similarity has a significant impact on machine-learning-based methods. Once training proteins that are highly similar to the test proteins, as identified by structure alignment and sequence alignment, are removed, machine-learning-based methods trained on the new training sets no longer outperform the conventional scoring functions. In contrast, the performance of conventional functions such as X-Score is relatively stable no matter what training data are used to fit the weights of their energy terms.
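The similarity-control step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `filter_training_set` and `pearson`, the toy similarity table, and the cutoff value are all hypothetical, and a real study would obtain similarity scores from sequence and structure alignment tools.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def filter_training_set(train_ids, test_ids, similarity, cutoff):
    """Keep training proteins whose highest similarity to any test protein
    does not exceed `cutoff` (hypothetical helper, assumed threshold)."""
    kept = []
    for train_id in train_ids:
        max_sim = max(similarity(train_id, test_id) for test_id in test_ids)
        if max_sim <= cutoff:
            kept.append(train_id)
    return kept


# Toy similarity table (made-up values, for illustration only).
TOY_SIMILARITY = {
    ("a", "t1"): 0.95, ("a", "t2"): 0.20,
    ("b", "t1"): 0.30, ("b", "t2"): 0.40,
    ("c", "t1"): 0.10, ("c", "t2"): 0.99,
}


def toy_similarity(train_id, test_id):
    return TOY_SIMILARITY[(train_id, test_id)]


reduced = filter_training_set(["a", "b", "c"], ["t1", "t2"], toy_similarity, cutoff=0.5)
```

A scoring function would then be retrained on the reduced set and evaluated on the unchanged test set, with `pearson` comparing predicted and experimental affinities at each cutoff.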
journal_name: J Chem Inf Model
journal_title: Journal of chemical information and modeling
authors: Li Y, Yang J
doi: 10.1021/acs.jcim.7b00049
subject: Has Abstract
pub_date: 2017-04-24 00:00:00
pages: 1007-1012
issue: 4
eissn: 1549-9596
issn: 1549-960X
journal_volume: 57
pub_type: Journal Article
abstract::An index of the activation of Class A G-protein-coupled receptors (GPCRs) has been trained using interhelix distances from a series of microsecond molecular-dynamics simulations and tested for 268 published X-ray structures. In a three-class model that includes intermediate structures, 63% of the active structures are...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.9b00604
update_date:2019-09-23 00:00:00
abstract::We report a new classification method for pyranose ring conformations called Best-fit, Four-Membered Plane (BFMP), which describes pyranose ring conformations based on reference planes defined by four atoms. The method is able to characterize all asymmetrical and symmetrical shapes of a pyran ring, is readily automate...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci500325b
update_date:2014-10-27 00:00:00
abstract::How well do different classification methods perform in selecting the ligands of a protein target out of large compound collections not used to train the model? Support vector machines, random forest, artificial neural networks, k-nearest-neighbor classification with genetic-algorithm-optimized feature selection, tren...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci050519k
update_date:2006-05-01 00:00:00
abstract::Given the need for modern researchers to produce open, reproducible scientific output, the lack of standards and best practices for sharing data and workflows used to produce and analyze molecular dynamics (MD) simulations has become an important issue in the field. There are now multiple well-established packages to ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.9b00665
update_date:2019-10-28 00:00:00
abstract::An accurate scoring function is expected to correctly select the most stable structure from a set of pose candidates. One can hypothesize that a scoring function's ability to identify the most stable structure might be improved by emphasizing the most relevant atom pairwise interactions. However, it is hard to evaluat...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.9b00356
update_date:2019-07-22 00:00:00
abstract::Carcinogenicity is an important toxicological endpoint that poses high concern to drug discovery. In this study, we developed a method to extract structural alerts (SAs) and modulating factors of carcinogens on the basis of statistical analyses. First, the Gaston algorithm, a frequent subgraph mining method, was used ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci300266p
update_date:2012-08-27 00:00:00
abstract::Representative molecules from 10 classes of prohibited substances were taken from the World Anti-Doping Agency (WADA) list, augmented by molecules from corresponding activity classes found in the MDDR database. Together with some explicitly allowed compounds, these formed a set of 5245 molecules. Five types of fingerp...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci0601160
update_date:2006-11-01 00:00:00
abstract::Naturally occurring anticancer compounds represent about half of the chemotherapeutic drugs that have been brought to market against cancer to date. Computer-based or in silico virtual screening methods are often used in lead/hit discovery protocols. In this study, the "drug-likeness" of ~400 compounds from Africa...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci5003697
update_date:2014-09-22 00:00:00
abstract::The in silico prediction of unwanted side effects (SEs) caused by the promiscuous behavior of drugs and their targets is highly relevant to the pharmaceutical industry. Considerable effort is now being put into computational and experimental screening of several suspected off-target proteins in the hope that SEs might...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.5b00120
update_date:2015-09-28 00:00:00
abstract::Methods that rapidly evaluate molecular complexity and synthetic feasibility are becoming increasingly important for in silico chemistry. We propose a new metric based on relative atomic electronegativities and bond parameters that evaluate both synthetic and molecular complexity (SMCM) starting from chemical structur...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci0501387
update_date:2005-09-01 00:00:00
abstract::In the present study, we report the exploration of binding modes of potent HIV-1 integrase (IN) inhibitors MK-0518 (raltegravir) and GS-9137 (elvitegravir) as well as chalcone and related amide IN inhibitors we recently synthesized and the development of 3D-QSAR models for integrase inhibition. Homology models of DNA-...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci200485a
update_date:2012-02-27 00:00:00
abstract::In this study, we tried to establish a general scheme to create a model that could predict the affinity of small compounds to their target proteins. This scheme consists of a search for ligand-binding sites on a protein, a generation of bound conformations (poses) of ligands in each of the sites by docking, identifica...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci800313h
update_date:2009-04-01 00:00:00
abstract::This study has assessed the use of consensus regression, as compared to single multiple linear regression, models for the development of quantitative structure-activity relationships (QSARs). To provide a comparison, four data sets of varying size and complexity were analyzed: silastic membrane flux, toxicity of pheno...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci700016d
update_date:2007-07-01 00:00:00
abstract::Template CoMFA, a novel alignment methodology for training or test set structures in 3D-QSAR, is introduced. Its two most significant advantages are its complete automation and its ability to derive a single combined model from multiple structural series affecting a biological target. Its only two inputs are one or mo...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci400696v
update_date:2014-02-24 00:00:00
abstract::A compound's synthetic accessibility (SA) is an important aspect of drug design, since in some cases computer-designed compounds cannot be synthesized. There have been several reports on SA prediction, most of which have focused on the difficulties of synthetic reactions based on retro-synthesis analyses, reaction dat...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci500568d
update_date:2014-12-22 00:00:00
abstract::The protonation states for nucleic acid bases are difficult to assess experimentally. In the context of DNA triplex, the protonation state of cytidine in the third strand is particularly important, because it needs to be protonated in order to form Hoogsteen hydrogen bonds. A sugar modification, locked nucleic acid (L...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.7b00741
update_date:2018-04-23 00:00:00
abstract::In this paper, we introduce the BiKi Life Sciences suite. This software makes it easy for computational medicinal chemists to run ad hoc molecular dynamics protocols in a novel and task-oriented environment; as a notebook, BiKi (acronym of Binding Kinetics) keeps memory of any activity together with dependencies among...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.7b00680
update_date:2018-02-26 00:00:00
abstract::We introduce SARANEA, an open-source Java application for interactive exploration of structure-activity relationship (SAR) and structure-selectivity relationship (SSR) information in compound sets of any source. SARANEA integrates various SAR and SSR analysis functions and utilizes a network-like similarity graph data...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci900416a
update_date:2010-01-01 00:00:00
abstract::The binding affinity and relative maximal efficacy of human A3 adenosine receptor (AR) agonists were each subjected to ligand-based three-dimensional quantitative structure-activity relationship analysis. Comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) used a...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci600501z
update_date:2007-05-01 00:00:00
abstract::Large ring cyclodextrins have become increasingly important for drug delivery applications. In this work, we have performed replica-exchange molecular dynamics simulations using both implicit and explicit water solvation models to study the conformational diversity of iota-cyclodextrin containing 14 α-1,4 glycosidic l...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.6b00595
update_date:2017-04-24 00:00:00
abstract::A generic chemical transformation may often be achieved under various synthetic conditions. However, for any specific reagents, only one or a few among the reported synthetic protocols may be successful. For example, Michael β-addition reactions may proceed under different choices of solvent (e.g., hydrophobic, aproti...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci500698a
update_date:2015-02-23 00:00:00
abstract::We present an algorithm, ReFlex3D, for the refinement of flexible molecular alignments based on their three-dimensional shape and electrostatic properties. The algorithm is designed to be used with fast conformer generators to refine an initial overlay between two molecules and thus to obtain improved overlaps as judg...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.7b00618
update_date:2018-04-23 00:00:00
abstract::Acetohydroxyacid synthase (AHAS) is a thiamin diphosphate-dependent enzyme involved in the biosynthesis of valine, leucine, isoleucine, and lysine. Experimental evidence has shown that mutation of the Gln202 residue results in a decrease in the enzymatic activity, thus suggesting the main role of the carboligation cat...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.9b00863
update_date:2020-02-24 00:00:00
abstract::The recent article "Evaluation of pK(a) Estimation Methods on 211 Druglike Compounds" ( Manchester, J.; et al. J. Chem Inf. Model. 2010, 50, 565-571 ) reports poor results for the program Epik. Here, we highlight likely sources for the poor performance and describe work done to improve the performance. Running Epik in...
journal_title:Journal of chemical information and modeling
pub_type: Comment, Journal Article
doi:10.1021/ci100332m
update_date:2011-01-24 00:00:00
abstract::Our main objective was to compile a data set of high-quality protein-fragment complexes and make it publicly available. Once assembled, the data set was challenged using docking procedures to address the following questions: (i) Can molecular docking correctly reproduce the experimentally solved structures? (ii) How t...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci2003363
update_date:2011-11-28 00:00:00
abstract::As a result of the widespread industrial use of polychlorinated hydrocarbons, they have accumulated in nearly all types of environmental compartments, especially in aquatic systems. Particularly, chloroaromatics are among the most undesirable industrial effluents because of their persistence and toxicity. To predict c...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci0501342
update_date:2005-07-01 00:00:00
abstract::Advances in the development of high-throughput screening and automated chemistry have rapidly accelerated the production of chemical and biological data, much of them freely accessible through literature aggregator services such as ChEMBL and PubChem. Here, we explore how to use this comprehensive mapping of chemical ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.9b00526
update_date:2019-11-25 00:00:00
abstract::Voltage-gated sodium channels (VGSC) are attractive targets for drug discovery because of the broad therapeutic potential of their modulators. On the basis of the structure of marine alkaloid clathrodin, we have recently discovered novel subtype-selective VGSC modulators I and II that were used as starting points for ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci400505e
update_date:2013-12-23 00:00:00
abstract::Understanding which physicochemical properties, or property distributions, are favorable for successful design and development of drugs, nutritional supplements, cosmetics, and agrochemicals is of great importance. In this study we have analyzed molecules from three distinct chemical spaces (i) approved drugs, (ii) hu...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci300487z
update_date:2013-02-25 00:00:00
abstract::Human telomeric DNA G-quadruplex has been identified as a good therapeutic target in cancer treatment. G-quadruplex-specific ligands that stabilize the G-quadruplex have great potential to be developed as anticancer agents. Two crystal structures (an apo form of parallel stranded human telomeric G-quadruplex and its h...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.7b00287
update_date:2017-11-27 00:00:00