Abstract:
In a departure from conventional chemical approaches, data-driven models of chemical reactions have recently been shown to be statistically successful using machine learning. These models, however, are largely black box in character and have not provided the kind of chemical insights that historically advanced the field of chemistry. To examine the knowledge base of machine-learning models (what does the machine learn?), this article deconstructs black-box machine-learning models of a diverse chemical reaction data set. Through experimentation with chemical representations and modeling techniques, the analysis provides insights into how statistical accuracy can arise even when the model lacks informative physical principles. By peeling back the layers of these complicated models, we arrive at a minimal, chemically intuitive model (with no machine learning involved). This model is based on systematic reaction-type classification and Evans-Polanyi relationships within reaction types, which are easily visualized and interpreted. Through exploring this simple model, we gain a deeper understanding of the data set and uncover a means for expert interactions to improve the model's reliability.
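As a rough illustration of the kind of minimal model the abstract describes, the sketch below fits a separate Evans-Polanyi line (activation energy as a linear function of reaction enthalpy) for each reaction type. It is not the authors' implementation: the function names, the reaction-type labels, and the numbers are hypothetical, and the reaction-type classification is assumed to have been done already.

```python
from collections import defaultdict

def fit_evans_polanyi(reactions):
    """Fit Ea = E0 + alpha * dH separately for each reaction type.

    `reactions` is an iterable of (reaction_type, dH, Ea) tuples.
    Returns {reaction_type: (alpha, E0)} from ordinary least squares.
    """
    groups = defaultdict(list)
    for rtype, dh, ea in reactions:
        groups[rtype].append((dh, ea))

    params = {}
    for rtype, points in groups.items():
        n = len(points)
        mean_dh = sum(dh for dh, _ in points) / n
        mean_ea = sum(ea for _, ea in points) / n
        sxx = sum((dh - mean_dh) ** 2 for dh, _ in points)
        if n < 2 or sxx == 0.0:
            # Not enough spread to fit a slope: fall back to the mean barrier.
            params[rtype] = (0.0, mean_ea)
            continue
        sxy = sum((dh - mean_dh) * (ea - mean_ea) for dh, ea in points)
        alpha = sxy / sxx
        params[rtype] = (alpha, mean_ea - alpha * mean_dh)
    return params

def predict_barrier(params, rtype, dh):
    """Predict an activation energy from the fitted line of one reaction type."""
    alpha, e0 = params[rtype]
    return e0 + alpha * dh

# Made-up illustrative numbers (kcal/mol), not data from the article.
toy_data = [
    ("type_A", -20.0, 10.0),
    ("type_A", -10.0, 15.0),
    ("type_A", 0.0, 20.0),
    ("type_B", -5.0, 30.0),
    ("type_B", 5.0, 33.0),
]
ep_params = fit_evans_polanyi(toy_data)
```

Because each reaction type has only a slope and an intercept, the fitted lines can be plotted and inspected directly, which is the interpretability the abstract emphasizes.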
journal_name:J Chem Inf Model
journal_title:Journal of chemical information and modeling
authors:Kammeraad JA, Goetz J, Walker EA, Tewari A, Zimmerman PM
doi:10.1021/acs.jcim.9b00721
subject:Has Abstract
pub_date:2020-03-23 00:00:00
pages:1290-1301
issue:3
eissn:1549-9596
issn:1549-960X
journal_volume:60
pub_type: Journal Article
abstract::Pharmacophore hypotheses were developed for six structurally diverse series of cholecystokinin-B/gastrin receptor (CCK-BR) antagonists. A training set consisting of 33 compounds was carefully selected. The activity spread of the training set molecules was from 0.1 to 2100 nM. The most predictive pharmacophore model (h...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci050257m
update_date:2005-11-01 00:00:00
abstract::The determination of the validity of a QSAR model when applied to new compounds is an important concern in the field of QSAR and QSPR modeling. Various scoring techniques can be applied to specific types of models. We present a technique with which we can state whether a new compound will be well predicted by a previo...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci0497511
update_date:2005-01-01 00:00:00
abstract::Cyanobacterial fructose-1,6-/sedoheptulose-1,7-bisphosphatase (cy-FBP/SBPase) is a potential enzymatic target for screening of novel inhibitors that can combat harmful algal blooms. In the present study, we targeted the substrate binding pocket of cy-FBP/SBPase. A series of novel hit compounds from the SPECs database w...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci4007529
update_date:2014-03-24 00:00:00
abstract::The evaluation of regression QSAR model performance, in fitting, robustness, and external prediction, is of pivotal importance. Over the past decade, different external validation parameters have been proposed: Q(F1)(2), Q(F2)(2), Q(F3)(2), r(m)(2), and the Golbraikh-Tropsha method. Recently, the concordance correlati...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci300084j
update_date:2012-08-27 00:00:00
abstract::Artificial intelligence and multiobjective optimization represent promising solutions to bridge chemical and biological landscapes by addressing the automated de novo design of compounds as a result of a humanlike creative process. In the present study, we conceived a novel pair-based multiobjective approach implement...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.0c00517
update_date:2020-10-26 00:00:00
abstract::A giant technological leap in the field of cryo-electron microscopy (cryo-EM) has assured the achievement of near-atomic resolution structures of biological macromolecules. As a recognition of this accomplishment, the Nobel Prize in Chemistry was awarded in 2017 to Jacques Dubochet, Joachim Frank, and Richard Henderso...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.9b01015
update_date:2020-05-26 00:00:00
abstract::Physicochemical atomic stereodescriptors (PAS) were implemented that represent the chirality of an atomic chiral center on the basis of empirical physicochemical properties of the ligands. The ligands are ranked according to a specific property, and the chiral center takes an S/R-like descriptor relative to that prope...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci600235w
update_date:2006-11-01 00:00:00
abstract::Selecting a small subset of descriptors from a large pool to build a predictive quantitative structure-activity relationship (QSAR) model is an important step in the QSAR modeling process. In general, subset selection is very hard to solve, even approximately, with guaranteed performance bounds. Traditional approaches...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci600563w
update_date:2007-05-01 00:00:00
abstract::Fast and accurate predicting of the binding affinities of large sets of diverse protein−ligand complexes is an important, yet extremely challenging, task in drug discovery. The development of knowledge-based scoring functions exploiting structural information of known protein−ligand complexes represents a valuable con...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci100343j
update_date:2011-02-28 00:00:00
abstract::Voltage-gated sodium channels (VGSC) are attractive targets for drug discovery because of the broad therapeutic potential of their modulators. On the basis of the structure of marine alkaloid clathrodin, we have recently discovered novel subtype-selective VGSC modulators I and II that were used as starting points for ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci400505e
update_date:2013-12-23 00:00:00
abstract::In this work we present the third generation of FAst MEtabolizer (FAME 3), a collection of extra trees classifiers for the prediction of sites of metabolism (SoMs) in small molecules such as drugs, druglike compounds, natural products, agrochemicals, and cosmetics. FAME 3 was derived from the MetaQSAR database ( Pedre...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.9b00376
update_date:2019-08-26 00:00:00
abstract::Screening large libraries of chemicals has been an efficient strategy to discover bioactive compounds; however a portion of the potential for success is limited to the available libraries. Synergizing combinatorial and computational chemistries has emerged as a time-efficient strategy to explore the chemical space mor...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.6b00648
update_date:2017-03-27 00:00:00
abstract::Cytochrome P450 3A4 metabolizes nearly 50% of the drugs currently in clinical use with a broad range of substrate specificity. Early prediction of metabolites of xenobiotic compounds is crucial for cost efficient drug discovery and development. We developed a new combined model, MLite, for the prediction of regioselec...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci7003576
update_date:2008-03-01 00:00:00
abstract::Designing organic saccharide sensors for use in aqueous solution is a nontrivial endeavor. Incorporation of hydrogen bonding groups on a sensor's receptor unit to target saccharides is an obvious strategy but not one that is likely to ensure analyte-receptor interactions over analyte-solvent or receptor-solvent intera...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.8b00987
update_date:2019-05-28 00:00:00
abstract::Spin diffusion is a formidable problem when interpreting NMR data of chemical compounds. We developed a method to reconstruct the conformational ensemble of flexible molecules displaying spin diffusion, which minimizes the subjective bias in the interpretation of experimental data and which can be used routinely to ob...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.9b00259
update_date:2019-06-24 00:00:00
abstract::Protein-protein interactions play a key role in a multitude of biological processes, such as signal transduction, de novo drug design, immune responses, and enzymatic activities. It is of great interest to understand how proteins interact with each other. The general approach is to explore all possible poses and ident...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci5002372
update_date:2014-06-23 00:00:00
abstract::Fragment complementation is gaining an increasing impact as a nonperturbing method to probe noncovalent interactions within protein supersecondary structures. In this study, the fast Fourier transform rigid-body docking algorithm ZDOCK has been employed for in silico reconstitution of the calcium binding protein calbi...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci0501995
update_date:2005-09-01 00:00:00
abstract::We report the synthesis and a study of the structure-activity relationships of a new series of diarylhydrazides as potential selective non-ligand binding pocket androgen receptor antagonists. Their biological activity as antiandrogens in the context of the development of treatments for castration resistant prostate ca...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci400189m
update_date:2013-08-26 00:00:00
abstract::Arachidonic acid is an essential fatty acid in cells, acting as a key inflammatory intermediate in inflammatory reactions. In cardiac tissues, CYP2J2 can adopt arachidonic acid as a major substrate to produce epoxyeicosatrienoic acids (EETs), which can protect endothelial cells from ischemic or hypoxic injuries and ha...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci400003p
update_date:2013-06-24 00:00:00
abstract::The similarity/diversity measures play a fundamental role in library searching, virtual screening, and quantitative structure-activity relationship/quantitative structure-property relationship modeling as well as in genomics and proteomics. In this paper, a new similarity/diversity measure is proposed as a new approac...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci060099e
update_date:2006-09-01 00:00:00
abstract::Reduction of the affinity of the fragment crystallizable (Fc) region with immune receptors by substitution of one or a few amino acids, known as Fc-silencing, is an established approach to reduce the immune effector functions of monoclonal antibody therapeutics. This approach to Fc-silencing, however, is problematic a...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.9b01198
update_date:2020-11-23 00:00:00
abstract::Databases of small, potentially bioactive molecules are ubiquitous across the industry and academia. Designed such that each unique compound should appear only once, the multiplicity of ways in which many compounds can be represented means that these databases require methods for standardizing the representation of ch...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.0c00232
update_date:2020-08-24 00:00:00
abstract::Increasing protein kinase C (PKC) activity is of potential therapeutic value. Its activation involves an interaction between the C1 domain and diacylglycerol (DAG) at intracellular membrane surfaces; DAG mimetics hold promise as new drugs. We previously developed the isophthalate derivative HMI-1a3, an effective but h...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.0c00624
update_date:2020-11-23 00:00:00
abstract::Factor Xa inhibitors are innovative anticoagulant agents that provide a better safety/efficacy profile compared to other anticoagulative drugs. A chemical feature-based modeling approach was applied to identify crucial pharmacophore patterns from 3D crystal structures of inhibitors bound to human factor Xa (Pdb entrie...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci049778k
update_date:2005-01-01 00:00:00
abstract::The homodimeric catabolite activator protein (CAP) regulates the transcription of several bacterial genes based on the cellular concentration of cyclic adenosine monophosphate (cAMP). The binding of cAMP to CAP triggers allosteric communication between the cAMP binding domains (CBD) and DNA binding domains (DBD) of CA...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.0c00617
update_date:2020-12-28 00:00:00
abstract::Deep learning has drawn significant attention in different areas including drug discovery. It has been proposed that it could outperform other machine learning algorithms, especially with big data sets. In the field of pharmaceutical industry, machine learning models are built to understand quantitative structure-acti...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.8b00671
update_date:2019-03-25 00:00:00
abstract::In this study, we have developed a two model system to mimic the active and inactive states of a G-protein coupled receptor specifically the alpha1A adrenergic receptor. We have docked two agonists, epinephrine (phenylamine type) and oxymetazoline (imidazoline type), as well as two antagonists, prazosin and 5-methylur...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci700026v
update_date:2007-09-01 00:00:00
abstract::Molecular docking programs are widely used modeling tools for predicting ligand binding modes and structure based virtual screening. In this study, six molecular docking programs (DOCK, FlexX, GLIDE, ICM, PhDOCK, and Surflex) were evaluated using metrics intended to assess docking pose and virtual screening accuracy. ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci900056c
update_date:2009-06-01 00:00:00
abstract::We propose a hypothesis that "a model of active compound can be provided by integrating information of compounds high-ranked by docking simulation of a random compound library". In our hypothesis, the inclusion of true active compounds in the high-ranked compound is not necessary. We regard the high-ranked compounds a...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci7003384
update_date:2008-03-01 00:00:00
abstract::Soluble low-molecular-weight oligomers formed during the early stage of amyloid aggregation are considered the major toxic species in amyloidosis. The structure-function relationship between oligomeric assemblies and the cytotoxicity in amyloid diseases are still elusive due to the heterogeneous and transient nature o...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.0c01319
update_date:2021-01-14 00:00:00