Abstract:
Machine learning (ML) algorithms are gaining importance in the processing of chemical information and modeling of chemical reactivity problems. In this work, we have developed a perturbation-theory and machine learning (PTML) model combining perturbation theory (PT) and ML algorithms for predicting the yield of a given reaction. For this purpose, we have selected Parham cyclization, which is a general and powerful tool for the synthesis of heterocyclic and carbocyclic compounds. This reaction has both structural variables (substitution pattern on the substrate, internal electrophile, ring size, etc.) and operational variables (organolithium reagent, solvent, temperature, time, etc.), so predicting the effect of changes in substrate design (internal electrophile, halide, etc.) or reaction conditions on the yield is an important task that could help to optimize the reaction design. The PTML model developed uses PT operators to account for perturbations in experimental conditions and/or structural variables of all the molecules involved in a query reaction, compared to a reaction of reference. Thus, a dataset of >100 reactions has been collected for different substrates and internal electrophiles, under different reaction conditions, with a wide range of yields (0-98%). The best PTML model found using General Linear Regression (GLR) has R = 0.88 in training and R = 0.83 in external validation series for 10 000 pairs of query and reference reactions. The PTML model has a final R = 0.95 for all reactions using multiple reactions of reference. We also report a comparative study of linear versus nonlinear PTML models based on artificial neural network (ANN) algorithms. PTML-ANN models (LNN, MLP, RBF) with R ≈ 0.1-0.8 do not outperform the first PTML model. This result confirms the validity of the linearity of the model. Next, we carried out an experimental and theoretical study of nonreported Parham reactions to illustrate the practical use of the PTML model. A 500 000-point simulation and a Hammett analysis of the reactivity space of Parham reactions are also reported.
journal_name:J Chem Inf Model
journal_title:Journal of chemical information and modeling
authors:Simón-Vidal L, García-Calvo O, Oteo U, Arrasate S, Lete E, Sotomayor N, González-Díaz H
doi:10.1021/acs.jcim.8b00286
subject:Has Abstract
pub_date:2018-07-23 00:00:00
pages:1384-1396
issue:7
eissn:1549-9596
issn:1549-960X
journal_volume:58
pub_type: Journal Article

abstract::Template CoMFA, a novel alignment methodology for training or test set structures in 3D-QSAR, is introduced. Its two most significant advantages are its complete automation and its ability to derive a single combined model from multiple structural series affecting a biological target. Its only two inputs are one or mo...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci400696v
update_date:2014-02-24 00:00:00
abstract::We investigate unexpectedly short non-covalent distances (<85% of the sum of van der Waals radii) in X-ray crystal structures of proteins. We curate over 11 000 high-quality protein crystal structures and an ultra-high-resolution (1.2 Å or better) subset containing >900 structures. Although our non-covalent distance c...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.9b00144
update_date:2019-05-28 00:00:00
abstract::Growing data sets with increased time for analysis is hampering predictive modeling in drug discovery. Model building can be carried out on high-performance computer clusters, but these can be expensive to purchase and maintain. We have evaluated ligand-based modeling on cloud computing resources where computations ar...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci500580y
update_date:2015-01-26 00:00:00
abstract::Partial covalent interactions (PCIs) in proteins, which include hydrogen bonds, salt bridges, cation-π, and π-π interactions, contribute to thermodynamic stability and facilitate interactions with other biomolecules. Several score functions have been developed within the Rosetta protein modeling framework that identif...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.7b00398
update_date:2018-05-29 00:00:00
abstract::The human cytochrome P450 (CYP450) isozymes are the most important enzymes in the body to metabolize many endogenous and exogenous substances including environmental toxins and therapeutic drugs. Any unnecessary interactions between a small molecule and CYP450 isozymes may raise a potential to disarm the integrity of ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci200311w
update_date:2011-10-24 00:00:00
abstract::Prediction of compound properties from structure via quantitative structure-activity relationship and machine-learning approaches is an important computational chemistry task in small-molecule drug research. Though many such properties are dependent on three-dimensional structures or even conformer ensembles, the majo...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.8b00151
update_date:2018-05-29 00:00:00
abstract::Up to now, publicly available data sets to build and evaluate Ames mutagenicity prediction tools have been very limited in terms of size and chemical space covered. In this report we describe a new unique public Ames mutagenicity data set comprising about 6500 nonconfidential compounds (available as SMILES strings and...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci900161g
update_date:2009-09-01 00:00:00
abstract::A major concern of chemogenomics is to associate drug activity with biological variables. Several reports have clustered cell line drug activity profiles as well as drug activity-gene expression correlation profiles and noted that the resulting groupings differ but still reflect mechanism of action. The present paper ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci060073n
update_date:2007-01-01 00:00:00
abstract::CDC25 phosphatases play critical roles in cell cycle regulation and are attractive targets for anticancer therapies. Several small non-peptide molecules are known to inhibit CDC25, but many of them appear to form a covalent bond with the enzyme or act through oxidation of the thiolate group of the catalytic cysteine. ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci700313e
update_date:2008-01-01 00:00:00
abstract::The antiproliferative factor (APF) involved in interstitial cystitis is a glycosylated nonapeptide (TVPAAVVVA) containing a sialylated core 1 α-O-disaccharide linked to the N-terminal threonine. The chemical structure of APF was deduced using spectroscopic techniques and confirmed using total synthesis. The synthetic ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci400147s
update_date:2013-05-24 00:00:00
abstract::Increasing protein kinase C (PKC) activity is of potential therapeutic value. Its activation involves an interaction between the C1 domain and diacylglycerol (DAG) at intracellular membrane surfaces; DAG mimetics hold promise as new drugs. We previously developed the isophthalate derivative HMI-1a3, an effective but h...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.0c00624
update_date:2020-11-23 00:00:00
abstract::In this study, in order to elucidate the action mechanism of traditional Chinese medicines (TCMs) that exhibit clinical efficacy for type II diabetes mellitus (T2DM), an integrated protocol that combines molecular docking and pharmacophore mapping was employed to find the potential inhibitors from TCM for the T2DM-rel...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci400146u
update_date:2013-07-22 00:00:00
abstract::Solute-solvent interactions are critical for biomolecular stability and recognition. Explicit solvent molecular dynamics (MD) simulations are routinely used to probe such interactions. However, detailed analyses and interpretation of the hydration patterns seen in MD simulations can be both complex and time-consuming....
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.8b00453
update_date:2019-07-22 00:00:00
abstract::Molecular docking can account for receptor flexibility by combining the docking score over multiple rigid receptor conformations, such as snapshots from a molecular dynamics simulation. Here, we evaluate a number of common snapshot selection strategies using a quality metric from stratified sampling, the efficiency of...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.8b00314
update_date:2018-09-24 00:00:00
abstract::Nonfibrillar neurotoxic amyloid β (Aβ) oligomer structures are typically rich in β-sheets, which could be promoted by metal ions like Zn(2+). Here, using molecular dynamics (MD) simulations, we systematically examined combinations of Aβ40 peptide conformations and Zn(2+) binding modes to probe the effects of secondary...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.5b00063
update_date:2015-06-22 00:00:00
abstract::On the order of hundreds of absorption, distribution, metabolism, excretion, and toxicity (ADME/Tox) models have been described in the literature in the past decade which are more often than not inaccessible to anyone but their authors. Public accessibility is also an issue with computational models for bioactivity, a...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.5b00143
update_date:2015-06-22 00:00:00
abstract::Advances in computer-aided translation technology have made tremendous progress in accuracy in the past few years. Chemical Abstracts Service of the American Chemical Society summarizes scientific works from more than 50 languages and allows the users to search papers in nine selected languages. Currently, only the ab...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.0c00274
update_date:2020-07-27 00:00:00
abstract::Universal generative topographic maps (GTMs) provide two-dimensional representations of chemical space selected for their "polypharmacological competence", that is, the ability to simultaneously represent meaningful activity and property landscapes, associated with many distinct targets and properties. Several such GT...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.8b00650
update_date:2019-01-28 00:00:00
abstract::In this study, we tried to establish a general scheme to create a model that could predict the affinity of small compounds to their target proteins. This scheme consists of a search for ligand-binding sites on a protein, a generation of bound conformations (poses) of ligands in each of the sites by docking, identifica...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci800313h
update_date:2009-04-01 00:00:00
abstract::The sensitivity of docking calculations to the geometry of the input ligand was studied. It was found that even small changes in the ligand input conformation can lead to large differences in the geometries and scores of the resulting docked poses. The accuracy of docked poses produced from different ligand input stru...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci9000629
update_date:2009-07-01 00:00:00
abstract::Taking into account dynamical behavior and/or structural inaccuracies of receptor-ligand systems becomes increasingly important in structure-based drug design. Here, we describe the development of consensus Adaptation of Fields for Molecular Comparison (AFMoC) (abbreviated as AFMoCcon) models that account for multiple...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci7002472
update_date:2007-11-01 00:00:00
abstract::Generation and prioritization of new molecules are the most central part of the drug design process. Matched molecular series analysis (MMSA) has recently been proposed as a formal approach that captures both of these key elements of design. In order to better understand the power of MMSA and its specific limitations,...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.0c00269
update_date:2020-06-22 00:00:00
abstract::A compound's synthetic accessibility (SA) is an important aspect of drug design, since in some cases computer-designed compounds cannot be synthesized. There have been several reports on SA prediction, most of which have focused on the difficulties of synthetic reactions based on retro-synthesis analyses, reaction dat...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci500568d
update_date:2014-12-22 00:00:00
abstract::Metal-ligand (M-L) bond lengths for a range of ligands (carboxylates, chlorides, pyridines, water, tertiary phosphines, and alkenes) and a variety of metals have been retrieved from the Cambridge Structural Database, CSD. Analysis of the factors which affect M-L bond lengths (for example, ligand coordination mode, oxi...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci0500785
update_date:2005-11-01 00:00:00
abstract::Fast and accurate predicting of the binding affinities of large sets of diverse protein−ligand complexes is an important, yet extremely challenging, task in drug discovery. The development of knowledge-based scoring functions exploiting structural information of known protein−ligand complexes represents a valuable con...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci100343j
update_date:2011-02-28 00:00:00
abstract::Inhibition of protein-protein interactions (PPIs) is emerging as a promising therapeutic strategy despite the difficulty in targeting such interfaces with drug-like small molecules. PPIs generally feature large and flat binding surfaces as compared to typical drug targets. These features pose a challenge for structura...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.5b00103
update_date:2015-08-24 00:00:00
abstract::Enrichment of ligands versus property-matched decoys is widely used to test and optimize docking library screens. However, the unconstrained optimization of enrichment alone can mislead, leading to false confidence in prospective performance. This can arise by over-optimizing for enrichment against property-matched de...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.0c00598
update_date:2021-01-25 00:00:00
abstract::As computational drug design becomes increasingly reliant on virtual screening and on high-throughput 3D modeling, the need for fast, robust, and reliable methods for sampling molecular conformations has become greater than ever. Furthermore, chemical novelty is at a premium, forcing medicinal chemists to explore more...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci900238a
update_date:2009-10-01 00:00:00
abstract::Covalent inhibitors have been gaining increased attention in drug discovery due to their beneficial properties such as long residence time, high biochemical efficiency, and specificity. Optimization of covalent inhibitors is a complex task that involves parallel monitoring of the noncovalent recognition elements and t...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.0c00834
update_date:2020-12-28 00:00:00
abstract::Due to the importance of hot-spots (HS) detection and the efficiency of computational methodologies, several HS detecting approaches have been developed. The current paper presents new models to predict HS for protein-protein and protein-nucleic acid interactions with better statistics compared with the ones currently...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci500760m
update_date:2015-05-26 00:00:00