Abstract:
Databases of small, potentially bioactive molecules are ubiquitous across the industry and academia. Designed such that each unique compound should appear only once, the multiplicity of ways in which many compounds can be represented means that these databases require methods for standardizing the representation of chemistry. This is commonly achieved through the use of "Chemistry Business Rules", sets of predefined rules that describe the "house style" of the database in question. At Syngenta, the historical approach to the design of chemistry business rules has been to focus on consistency of representation, with chemical relevance given secondary consideration. In this work, we overturn that convention. Through the use of quantum chemistry calculations, we define a set of chemistry business rules for tautomer standardization that reproduces gas-phase energetic preferences. We go on to show that, compared to our historic approach, this method yields tautomers that are in better agreement with those observed experimentally in condensed phases and that are better suited for use in predictive models.
journal_name: J Chem Inf Model
journal_title: Journal of chemical information and modeling
authors: Baker CM, Kidley NJ, Papachristos K, Hotson M, Carson R, Gravestock D, Pouliot M, Harrison J, Dowling A
doi: 10.1021/acs.jcim.0c00232
subject: Has Abstract
pub_date: 2020-08-24 00:00:00
pages: 3781-3791
issue: 8
eissn: 1549-9596
issn: 1549-960X
journal_volume: 60
pub_type: Journal Article
abstract::The similarity/diversity measures play a fundamental role in library searching, virtual screening, and quantitative structure-activity relationship/quantitative structure-property relationship modeling as well as in genomics and proteomics. In this paper, a new similarity/diversity measure is proposed as a new approac...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci060099e
update_date:2006-09-01 00:00:00
abstract::This study has assessed the use of consensus regression, as compared to single multiple linear regression, models for the development of quantitative structure-activity relationships (QSARs). To provide a comparison, four data sets of varying size and complexity were analyzed: silastic membrane flux, toxicity of pheno...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci700016d
update_date:2007-07-01 00:00:00
abstract::Up to now, publicly available data sets to build and evaluate Ames mutagenicity prediction tools have been very limited in terms of size and chemical space covered. In this report we describe a new unique public Ames mutagenicity data set comprising about 6500 nonconfidential compounds (available as SMILES strings and...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci900161g
update_date:2009-09-01 00:00:00
abstract::Inhibition of plasmin has been found to effectively reduce fibrinolysis and to avoid hemorrhage. This can be achieved by addressing its kringle 1 domain with the known drug and lysine analogue tranexamic acid. Guided by shape similarities toward a previously discovered lead compound, 5-(4-piperidyl)isoxazol-3-ol, a se...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.7b00255
update_date:2017-07-24 00:00:00
abstract::Functional diversity of the three-finger-protein domain (TFPD) had been acquired via hypervariability of some sequence positions and extensive insertion/deletion of short AA-segments that caused multidimensional drift of several sequence attributes such as the overall (HI) and local hydrophobicity levels, the isoelect...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.5b00322
update_date:2015-09-28 00:00:00
abstract::Retrieving molecules with specific structural features is a fundamental requirement of today's molecular database technologies. Estimates claim the chemical space relevant for drug discovery to be around 10⁶⁰ molecules. This figure is many orders of magnitude larger than the amount of molecules conventional databases ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci400107k
update_date:2013-07-22 00:00:00
abstract::We propose a hypothesis that "a model of active compound can be provided by integrating information of compounds high-ranked by docking simulation of a random compound library". In our hypothesis, the inclusion of true active compounds in the high-ranked compound is not necessary. We regard the high-ranked compounds a...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci7003384
update_date:2008-03-01 00:00:00
abstract::The quantitative structure-activity relationship (QSAR) approach has been used to model a wide range of chemical-induced biological responses. However, it had not been utilized to model chemical-induced genomewide gene expression changes until very recently, owing to the complexity of training and evaluating a very la...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.7b00281
update_date:2017-09-25 00:00:00
abstract::In this study, two probabilistic machine-learning algorithms were compared for in silico target prediction of bioactive molecules, namely the well-established Laplacian-modified Naïve Bayes classifier (NB) and the more recently introduced (to Cheminformatics) Parzen-Rosenblatt Window. Both classifiers were trained in ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci300435j
update_date:2013-08-26 00:00:00
abstract::Following the theoretical model by Hann et al. moderately complex structures are preferable lead compounds since they lead to specific binding events involving the complete ligand molecule. To make this concept usable in practice for library design, we studied several complexity measures on the biological activity of ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci0503558
update_date:2006-03-01 00:00:00
abstract::The roles of chemical compounds in biological systems are now systematically analyzed by high-throughput experimental technologies. To automate the processing and interpretation of large-scale data it is necessary to develop bioinformatics methods to extract information from the chemical structures of these small mole...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci700006f
update_date:2007-07-01 00:00:00
abstract::This paper presents an exploratory study of a novel method for flexible 3-D similarity searching based on autocorrelation vectors and smoothed bounded distance matrices. Although the new approach is unable to outperform an existing 2-D similarity searching in terms of enrichment factors, it is able to retrieve differe...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci0503863
update_date:2006-03-01 00:00:00
abstract::Designing organic saccharide sensors for use in aqueous solution is a nontrivial endeavor. Incorporation of hydrogen bonding groups on a sensor's receptor unit to target saccharides is an obvious strategy but not one that is likely to ensure analyte-receptor interactions over analyte-solvent or receptor-solvent intera...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.8b00987
update_date:2019-05-28 00:00:00
abstract::Histone deacetylases (HDACs) are an important class of drug targets for the treatment of cancers, neurodegenerative diseases, and other types of diseases. Virtual screening (VS) has become fairly effective approaches for drug discovery of novel and highly selective histone deacetylase inhibitors (HDACIs). To facilitat...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci5005515
update_date:2015-02-23 00:00:00
abstract::In this DFT study, activities of 11 different N2O4, N2O3, and NO2 core containing Zr(IV) complexes, 4,13-diaza-18-crown-6 (I'N2O4), 1,4,10-trioxa-7,13-diazacyclopentadecane (I'N2O3), and 2-(2-methoxy)ethanol (I'NO2), respectively, and their analogues in peptide hydrolysis have been investigated. Based on the experimen...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.6b00781
update_date:2017-05-22 00:00:00
abstract::Umami or the taste of monosodium glutamate represents one of the major attractive taste modalities in humans. Therefore, knowledge about biophysical and biochemical properties of the umami taste is important for both scientific research and the food industry. Experimental approaches for predicting umami peptides are l...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.0c00707
update_date:2020-12-28 00:00:00
abstract::The consistent handling of molecules is probably the most basic and important requirement in the field of cheminformatics. Reliable results can only be obtained if the underlying calculations are independent of the specific way molecules are represented in the input data. However, ensuring consistency is a complex tas...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci400724v
update_date:2014-03-24 00:00:00
abstract::Although there are several databases that contain data on many metabolites and reactions in biochemical pathways, there is still a big gap in the numbers between experimentally identified enzymes and metabolites. It is supposed that many catalytic enzyme genes are still unknown. Although there are previous studies tha...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.5b00216
update_date:2016-03-28 00:00:00
abstract::We describe here our tool named PyPLIF HIPPOS, which was newly developed to analyze the docking results of AutoDock Vina and PLANTS. Its predecessor, PyPLIF (https://github.com/radifar/pyplif), is a molecular interaction fingerprinting tool for the docking results of PLANTS, exclusively. Unlike its predecessor, PyPLIF...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.0c00305
update_date:2020-08-24 00:00:00
abstract::Increased reports of oseltamivir (OTV)-resistant strains of the influenza virus, such as the H274Y mutation on its neuraminidase (NA), have created some cause for concern. Many studies have been conducted in the attempt to uncover the mechanism of OTV resistance in H274Y NA. However, most of the reported studies on H2...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.5b00331
update_date:2016-01-25 00:00:00
abstract::This study examines the dependence of molecular alignment accuracy on a variety of factors including the choice of molecular template, alignment method, conformational flexibility, and type of protein target. We used eight test systems for which X-ray data on 145 ligand-protein complexes were available. The use of X-r...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci060134h
update_date:2006-09-01 00:00:00
abstract::We present an induced fit docking approach called Adaptive BP-Dock that integrates perturbation response scanning (PRS) with the flexible docking protocol of RosettaLigand in an adaptive manner. We first perturb the binding pocket residues of a receptor and obtain a new conformation based on the residue response fluct...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.5b00587
update_date:2016-04-25 00:00:00
abstract::Covalent inhibitors have been gaining increased attention in drug discovery due to their beneficial properties such as long residence time, high biochemical efficiency, and specificity. Optimization of covalent inhibitors is a complex task that involves parallel monitoring of the noncovalent recognition elements and t...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.0c00834
update_date:2020-12-28 00:00:00
abstract::The applicability and scope of 3D QSAR methods (CoMFA, CoMSIA) to screen databases are examined. A protocol requiring minimal user intervention has been established to align training and test set molecules using FlexS. As model system isozymes of human carbonic anhydrase (hCA) are used, all results are exemplified stu...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci7002945
update_date:2008-02-01 00:00:00
abstract::It is demonstrated that the fragmentation of druglike molecules by applying simplistic pseudo-retrosynthesis results in a stock of chemically meaningful building blocks for de novo molecule generation. A stochastic search algorithm in conjunction with ligand-based similarity scoring (Flux: fragment-based ligand builde...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci0503560
update_date:2006-03-01 00:00:00
abstract::The important role of water molecules in protein-ligand binding energetics has attracted wide attention in recent years. A range of computational methods has been developed to predict the favorable locations of water molecules in a protein binding pocket. Most of the current methods are based on extensive molecular dy...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.9b00619
update_date:2020-09-28 00:00:00
abstract::A database has been derived from recently reported [60]fullerene derivatives, and their binding scores with HIV-1 PR have been computed using docking techniques. Computational methods have been used to predict which derivatives may have high binding affinities, and for these compounds biological tests have been perfor...
journal_title:Journal of chemical information and modeling
pub_type: Letter
doi:10.1021/ci900047s
update_date:2009-05-01 00:00:00
abstract::Standardization is used to ensure that the variables in a similarity calculation make an equal contribution to the computed similarity value. This paper compares the use of seven different methods that have been suggested previously for the standardization of integer-valued or real-valued data, comparing the results w...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci800224h
update_date:2009-02-01 00:00:00
abstract::Engineered nanomaterials (ENMs) are increasingly infiltrating our lives as a result of their applications across multiple fields. However, ENM formulations may result in the modulation of pathways and mechanisms of toxic action that endanger human health and the environment. Alternative testing methods such as in sili...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.7b00223
update_date:2017-09-25 00:00:00
abstract::Whereas 400 million distinct compounds are now purchasable within the span of a few weeks, the biological activities of most are unknown. To facilitate access to new chemistry for biology, we have combined the Similarity Ensemble Approach (SEA) with the maximum Tanimoto similarity to the nearest bioactive to predict a...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.7b00316
update_date:2018-01-22 00:00:00