Abstract:
Machine learning (ML) algorithms are gaining importance in the processing of chemical information and the modeling of chemical reactivity problems. In this work, we have developed a perturbation-theory and machine learning (PTML) model combining perturbation theory (PT) and ML algorithms for predicting the yield of a given reaction. For this purpose, we have selected Parham cyclization, which is a general and powerful tool for the synthesis of heterocyclic and carbocyclic compounds. This reaction has both structural variables (substitution pattern on the substrate, internal electrophile, ring size, etc.) and operational variables (organolithium reagent, solvent, temperature, time, etc.), so predicting the effect of changes in substrate design (internal electrophile, halide, etc.) or reaction conditions on the yield is an important task that could help to optimize the reaction design. The PTML model developed uses PT operators to account for perturbations in experimental conditions and/or structural variables of all the molecules involved in a query reaction, compared to a reaction of reference. Thus, a dataset of >100 reactions has been collected for different substrates and internal electrophiles, under different reaction conditions, with a wide range of yields (0-98%). The best PTML model found using General Linear Regression (GLR) has R = 0.88 in training and R = 0.83 in external validation series for 10 000 pairs of query and reference reactions. The PTML model has a final R = 0.95 for all reactions using multiple reactions of reference. We also report a comparative study of linear versus nonlinear PTML models based on artificial neural network (ANN) algorithms. PTML-ANN models (LNN, MLP, RBF) with R ≈ 0.1-0.8 do not outperform the first PTML model. This result supports the linear form of the model. Next, we carried out an experimental and theoretical study of nonreported Parham reactions to illustrate the practical use of the PTML model. A 500 000-point simulation and a Hammett analysis of the reactivity space of Parham reactions are also reported.
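The query-versus-reference idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the toy reactions, the two numeric descriptors, the function names, and the specific linear form (predicted query yield = intercept + a × reference yield + Σ b_k × (query descriptor − reference descriptor)). Averaging over several references is likewise only one plausible reading of "multiple reactions of reference".

```python
from itertools import permutations

# Hypothetical toy data: each reaction has two numeric descriptors "x"
# (stand-ins for structural/operational variables) and a yield in %.
reactions = [
    {"x": [0.0, 0.0], "yield": 10.0},
    {"x": [1.0, 0.0], "yield": 12.0},
    {"x": [0.0, 1.0], "yield": 7.0},
    {"x": [1.0, 1.0], "yield": 9.0},
    {"x": [2.0, 1.0], "yield": 11.0},
]


def pair_features(query, reference):
    """Features for one query/reference pair: intercept, reference yield,
    and one perturbation term (query minus reference) per descriptor."""
    deltas = [q - r for q, r in zip(query["x"], reference["x"])]
    return [1.0, reference["yield"]] + deltas


def solve(a, b):
    """Solve the square system a w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - s) / m[i][i]
    return w


def fit(rows, targets):
    """Ordinary least squares through the normal equations (X^T X) w = X^T y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve(xtx, xty)


def predict(weights, query, reference):
    """Predicted query yield relative to a single reference reaction."""
    return sum(w * f for w, f in zip(weights, pair_features(query, reference)))


def predict_multi(weights, query, references):
    """Average the single-reference predictions over several references."""
    values = [predict(weights, query, ref) for ref in references]
    return sum(values) / len(values)


# Every ordered (query, reference) pair of distinct reactions is one training case.
pairs = list(permutations(reactions, 2))
rows = [pair_features(q, r) for q, r in pairs]
targets = [q["yield"] for q, _ in pairs]
weights = fit(rows, targets)

new_reaction = {"x": [2.0, 0.0]}  # hypothetical unseen query
print(predict_multi(weights, new_reaction, reactions))
```

Pairing every reaction with every other one is what turns a dataset of about a hundred reactions into thousands of training cases, which is consistent with the 10 000 pairs quoted in the abstract. A real model would use chemically meaningful descriptors and a held-out validation series rather than this noise-free toy set.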
journal_name:J Chem Inf Model
journal_title:Journal of chemical information and modeling
authors:Simón-Vidal L, García-Calvo O, Oteo U, Arrasate S, Lete E, Sotomayor N, González-Díaz H
doi:10.1021/acs.jcim.8b00286
subject:Has Abstract
pub_date:2018-07-23 00:00:00
pages:1384-1396
issue:7
eissn:1549-9596
issn:1549-960X
journal_volume:58
pub_type: Journal Article
abstract::A new web portal for the CHARMM macromolecular modeling package, CHARMMing (CHARMM interface and graphics, http://www.charmming.org), is presented. This tool provides a user-friendly interface for the preparation, submission, monitoring, and visualization of molecular simulations (i.e., energy minimization, solvation,...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci800133b
update_date:2008-09-01 00:00:00
abstract::The primary goal of this project was to evaluate the performance of the Standard and Enforced Geometry Optimization (SEGO) method which we have recently developed. The SEGO method has been designed for an automatic location of multiple minima on the molecular Potential Energy Surface (PES), and its usefulness has been...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.9b00352
update_date:2019-08-26 00:00:00
abstract::A database of about 700 high-resolution kinase structures was used to test the reliability of 17 docking procedures (using six docking software packages) by means of self- and cross-docking studies. The analysis of about 80 000 docking calculations suggests that the docking of an unknown ligand into a kinase has a pro...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci100161z
update_date:2010-08-23 00:00:00
abstract::Halogen bonds (XBs) are attracting increasing attention in biological systems. Protein Data Bank (PDB) archives experimentally determined XBs in biological macromolecules. However, no software for structure refinement in X-ray crystallography takes into account XBs, which might result in the weakening or even vanishin...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.7b00235
update_date:2017-07-24 00:00:00
abstract::We present a succession of structural changes involved in hormone peptide activation of a prototypical GPCR. Microsecond molecular dynamics simulation generated conformational ensembles reveal propagation of structural changes through key "microswitches" within human AT1R bound to native hormone. The endocrine octa-pe...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.8b00583
update_date:2019-01-28 00:00:00
abstract::Virtual screening is routinely used to discover new ligands and in particular new ligand chemotypes for G protein-coupled receptors (GPCRs). To prepare for a virtual screen, we often tailor a docking protocol that will enable us to select the best candidates for further screening. To aid this, we created GPCR-Bench, a...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.5b00660
update_date:2016-04-25 00:00:00
abstract::Molecular fingerprints are widely used for similarity-based virtual screening in drug discovery projects. In this paper we discuss the performance and the complementarity of nine two-dimensional fingerprints (Daylight, Unity, AlFi, Hologram, CATS, TRUST, Molprint 2D, ChemGPS, and ALOGP) in retrieving active molecules ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci0504723
update_date:2006-05-01 00:00:00
abstract::There is a growing public concern about the lack of reproducibility of experimental data published in peer-reviewed scientific literature. Herein, we review the most recent alerts regarding experimental data quality and discuss initiatives taken thus far to address this problem, especially in the area of chemical geno...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article, Review
doi:10.1021/acs.jcim.6b00129
update_date:2016-07-25 00:00:00
abstract::In this article, we present a systematic way to classify a family of high-genus fullerenes (HGFs) by decomposing them into two types of necklike structures, which are the negatively curved parts of parent toroidal carbon nanotubes. By replacing the faces of a uniform polyhedron with these necks, an HGF polyhedron corr...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci9001124
update_date:2009-07-01 00:00:00
abstract::Protein-protein interactions (PPIs) play vital roles in regulating biological processes, such as cellular and signaling pathways. Hotspots are certain residues located at protein-protein interfaces that contribute more in protein-protein binding than other residues. Research on the mutational effects of hotspots is im...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.0c00966
update_date:2021-01-25 00:00:00
abstract::Protein-protein interactions play a key role in a multitude of biological processes, such as signal transduction, de novo drug design, immune responses, and enzymatic activities. It is of great interest to understand how proteins interact with each other. The general approach is to explore all possible poses and ident...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci5002372
update_date:2014-06-23 00:00:00
abstract::Molecular docking can account for receptor flexibility by combining the docking score over multiple rigid receptor conformations, such as snapshots from a molecular dynamics simulation. Here, we evaluate a number of common snapshot selection strategies using a quality metric from stratified sampling, the efficiency of...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.8b00314
update_date:2018-09-24 00:00:00
abstract::A MMGBSA variant (here referred to as Nwat-MMGBSA), based on the inclusion of a certain number of explicit water molecules (Nwat) during the calculations, has been tested on a set of 20 protein-protein complexes, using the correlation between predicted and experimental binding energy as the evaluation metric. Besides ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.6b00196
update_date:2016-09-26 00:00:00
abstract::Due to the recent availability of high quality small molecule databases, such as ZINC and PubChem,1,2 virtual screening is playing an even more important role in identifying biologically relevant molecules in drug discovery campaigns. The success of pharmacophore-based virtual screening (PBVS) relies largely on the ac...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci700089w
update_date:2007-07-01 00:00:00
abstract::Pharmacophore search is a key component of many drug discovery efforts. Pharmer is a new computational approach to pharmacophore search that scales with the breadth and complexity of the query, not the size of the compound library being screened. Two novel methods for organizing pharmacophore data, the Pharmer KDB-tre...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci200097m
update_date:2011-06-27 00:00:00
abstract::A new methodology to describe the interactions in "receptor-ligand" complexes is presented. The methodology is based on a combination of the 3D/4D QSAR BiS/MC and CoCon algorithms. The first algorithm performs the restricted docking of compounds to receptor pockets. The second determines the relationships between the ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci800405n
update_date:2009-06-01 00:00:00
abstract::Janus kinase 2 (JAK2) is a protein tyrosine kinase implicated in signaling by specific members of the cytokine receptor family. Although it has been established that the JAK2 tyrosine kinase is negatively regulated by the JAK homology 2 (JH2) pseudokinase domain, the underlying mechanism of JH2 mediated regulation rem...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci300308g
update_date:2012-11-26 00:00:00
abstract::Novel statistical potentials derived from known protein structures are presented. They are designed to describe cation-pi and amino-pi interactions between a positively charged amino acid or an amino acid carrying a partially charged amino group and an aromatic moiety. These potentials are based on the propensity of r...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci050395b
update_date:2006-03-01 00:00:00
abstract::Telomere maintenance is a universal cancer hallmark, and small molecules that disrupt telomere maintenance generally have anticancer properties. Since the vast majority of cancer cells utilize telomerase activity for telomere maintenance, the enzyme has been considered as an anticancer drug target. Recently, rational ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.5b00336
update_date:2015-12-28 00:00:00
abstract::Deep learning methods applied to problems in chemoinformatics often require the use of recursive neural networks to handle data with graphical structure and variable size. We present a useful classification of recursive neural network approaches into two classes, the inner and outer approach. The inner approach uses r...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.7b00384
update_date:2018-02-26 00:00:00
abstract::Group 1 metabotropic glutamate receptors (mGluR) are G-protein coupled receptors with a large bilobate extracellular ligand binding region (LBR) that resembles a Venus fly trap. Closing of this LBR in the presence of a ligand is associated with the activation of the receptor. From conformational sampling of the LBR-li...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci400160x
update_date:2013-06-24 00:00:00
abstract::Deep learning has demonstrated significant potential in advancing state of the art in many problem domains, especially those benefiting from automated feature extraction. Yet, the methodology has seen limited adoption in the field of ligand-based virtual screening (LBVS) as traditional approaches typically require lar...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.0c00622
update_date:2020-10-26 00:00:00
abstract::The balance between structural stability and functional plasticity in proteins that share common three-dimensional folds is the key factor that drives protein evolvability. The ability to distinguish the parts of homologous proteins that underlie common structural organization patterns from the parts acting as regulat...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.6b00504
update_date:2017-04-24 00:00:00
abstract::Development of accurate force field parameters for molecular ions in the context of a polarizable energy function based on the classical Drude oscillator is a crucial step toward an accurate polarizable model for modeling and simulations of biological macromolecules. Toward this goal we have undertaken a hierarchical ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.8b00132
update_date:2018-05-29 00:00:00
abstract::SIRT2, which is a NAD+ (nicotinamide adenine dinucleotide) dependent deacetylase, has been demonstrated to play an important role in the occurrence and development of a variety of diseases such as cancer, ischemia-reperfusion, and neurodegenerative diseases. Small molecule inhibitors of SIRT2 are thought to be potenti...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.6b00714
update_date:2017-04-24 00:00:00
abstract::We implemented a fragment-based de novo design algorithm for a population-based optimization of molecular structures. The concept is grounded on an evolution strategy with mutation and crossover operators for structure breeding. Molecular building blocks were obtained from the pseudo-retrosynthesis of a collection of ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci6005307
update_date:2007-03-01 00:00:00
abstract::The membrane permeability of cyclic peptides and peptidomimetics, which are generally larger and more complex than typical drug molecules, is likely strongly influenced by the conformational behavior of these compounds in polar and apolar environments. The size and complexity of peptides often limit their bioavailabil...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.6b00251
update_date:2016-08-22 00:00:00
abstract::This work is aimed at describing the workflow for a methodology that combines chemoinformatics and pharmacoepidemiology methods and at reporting the first predictive model developed with this methodology. The new model is able to predict complex networks of AIDS prevalence in the US counties, taking into consideration...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci400716y
update_date:2014-03-24 00:00:00
abstract::A compound's synthetic accessibility (SA) is an important aspect of drug design, since in some cases computer-designed compounds cannot be synthesized. There have been several reports on SA prediction, most of which have focused on the difficulties of synthetic reactions based on retro-synthesis analyses, reaction dat...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci500568d
update_date:2014-12-22 00:00:00
abstract::Designing organic saccharide sensors for use in aqueous solution is a nontrivial endeavor. Incorporation of hydrogen bonding groups on a sensor's receptor unit to target saccharides is an obvious strategy but not one that is likely to ensure analyte-receptor interactions over analyte-solvent or receptor-solvent intera...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.8b00987
update_date:2019-05-28 00:00:00