Searching for recursively defined generic chemical patterns in nonenumerated fragment spaces.


Retrieving molecules with specific structural features is a fundamental requirement of today's molecular database technologies. Estimates put the chemical space relevant for drug discovery at around 10⁶⁰ molecules. This figure is many orders of magnitude larger than the number of molecules conventional databases retain today and will store in the future. An elegant description of such a large chemical space is provided by the concept of fragment spaces. A fragment space comprises fragments, which are molecules with open valences, together with rules describing how to connect these fragments into products. Due to the combinatorial nature of fragment spaces, a complete enumeration of their products is intractable. We present an algorithm to search fragment spaces for generic chemical patterns as present in the SMARTS chemical pattern language. Our method allows specification of the chemical surrounding of an atom in a query and, therefore, enables a chemically intuitive search. During the search, the costly enumeration of products is avoided. The result is a fragment space that exactly describes all possible molecules that contain the user-defined pattern. We evaluated the algorithm in three different drug development use cases and performed a large-scale statistical analysis with 738 SMARTS patterns on three publicly available fragment spaces. Our results show the ability of the algorithm to explore the chemical space around known active molecules, to analyze fragment spaces for the presence of likely toxic molecules, and to identify complex macromolecular structures under additional structural constraints. By searching the fragment space in its nonenumerated form, spaces covering up to 10¹⁹ molecules can be examined in times ranging between 47 s and 19 min depending on the complexity of the query pattern.
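
To illustrate why working on the nonenumerated form matters, the following is a minimal sketch under strong simplifying assumptions, not the authors' algorithm. It models a hypothetical toy fragment space in which every fragment has at most one open valence on each side, so products are linear chains, and it counts the products by dynamic programming without ever building them. The fragment names, the link types `L1` and `L2`, and the function `count_products` are all invented for illustration; symmetry-equivalent products are not merged.

```python
from collections import defaultdict

# Hypothetical toy fragment space: (name, left_link, right_link).
# A link of None means there is no open valence on that side.
FRAGMENTS = [
    ("start_a", None, "L1"),
    ("start_b", None, "L2"),
    ("mid_a", "L1", "L2"),
    ("mid_b", "L2", "L1"),
    ("mid_c", "L1", "L1"),
    ("end_a", "L1", None),
    ("end_b", "L2", None),
]

# Hypothetical connection rules: (open right link, accepted left link).
RULES = {("L1", "L1"), ("L2", "L2")}


def count_products(fragments, rules, max_fragments):
    """Count linear products of 2..max_fragments fragments without enumerating them."""
    # open_counts[link] = number of partial chains whose open right link is `link`
    open_counts = defaultdict(int)
    for _, left, right in fragments:
        if left is None and right is not None:
            open_counts[right] += 1

    total = 0
    for _ in range(max_fragments - 1):
        next_counts = defaultdict(int)
        for open_link, n in open_counts.items():
            for _, left, right in fragments:
                if left is None or (open_link, left) not in rules:
                    continue  # fragment cannot extend this chain
                if right is None:
                    total += n  # chain is closed: n complete products
                else:
                    next_counts[right] += n  # chain stays open
        open_counts = next_counts
    return total


if __name__ == "__main__":
    for size in (2, 3, 4, 20):
        print(size, count_products(FRAGMENTS, RULES, size))
```

The work per step depends only on the number of fragments and link types, while the number of products grows with every added chain position. Real fragment spaces allow branching and richer rules, so this sketch only conveys the counting idea.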


J Chem Inf Model


Ehrlich HC,Henzler AM,Rarey M






2013-07-22 00:00:00












  • Leave-cluster-out cross-validation is appropriate for scoring functions derived from diverse protein data sets.

    abstract::With the emergence of large collections of protein-ligand complexes complemented by binding data, as found in PDBbind or BindingMOAD, new opportunities for parametrizing and evaluating scoring functions have arisen. With huge data collections available, it becomes feasible to fit scoring functions in a QSAR style, i.e...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article


    authors: Kramer C,Gedeck P

    update_date: 2010-11-22 00:00:00

  • PyPLIF HIPPOS: A Molecular Interaction Fingerprinting Tool for Docking Results of AutoDock Vina and PLANTS.

    abstract::We describe here our tool named PyPLIF HIPPOS, which was newly developed to analyze the docking results of AutoDock Vina and PLANTS. Its predecessor, PyPLIF, is a molecular interaction fingerprinting tool for the docking results of PLANTS, exclusively. Unlike its predecessor, PyPLIF...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article


    authors: Istyastono EP,Radifar M,Yuniarti N,Prasasty VD,Mungkasi S

    update_date: 2020-08-24 00:00:00

  • Chemical Topic Modeling: Exploring Molecular Data Sets Using a Common Text-Mining Approach.

    abstract::Big data is one of the key transformative factors which increasingly influences all aspects of modern life. Although this transformation brings vast opportunities it also generates novel challenges, not the least of which is organizing and searching this data deluge. The field of medicinal chemistry is not different: ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article


    authors: Schneider N,Fechner N,Landrum GA,Stiefl N

    update_date: 2017-08-28 00:00:00

  • Trans and Cis Conformations of the Antihypertensive Drug Valsartan Respectively Lock the Inactive and Active-like States of Angiotensin II Type 1 Receptor: A Molecular Dynamics Study.

    abstract::Angiotensin II type 1 receptor (AT1R) is the principal regulator of blood pressure in humans. The overactivation of AT1R by the stimulation of angiotensin II would result in high blood pressure. To prevent hypertension, nonpeptide "sartan" drugs, such as valsartan (VST), have been developed to competitively block the ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article


    authors: Wang L,Yan F

    update_date: 2018-10-22 00:00:00

  • An Analysis of Different Components of a High-Throughput Screening Library.

    abstract::Since many projects at pharmaceutical organizations get their start from a high-throughput screening (HTS) campaign, improving the quality of the HTS deck can improve the likelihood of discovering a high-quality lead molecule that can be progressed to a drug candidate. Over the past decade, Janssen has implemented sev...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article


    authors: Saha A,Varghese T,Liu A,Allen SJ,Mirzadegan T,Hack MD

    update_date: 2018-10-22 00:00:00

  • Retrospect and Prospect of Single Particle Cryo-Electron Microscopy: The Class of Integral Membrane Proteins as an Example.

    abstract::A giant technological leap in the field of cryo-electron microscopy (cryo-EM) has assured the achievement of near-atomic resolution structures of biological macromolecules. As a recognition of this accomplishment, the Nobel Prize in Chemistry was awarded in 2017 to Jacques Dubochet, Joachim Frank, and Richard Henderso...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article


    authors: Akbar S,Mozumder S,Sengupta J

    update_date: 2020-05-26 00:00:00

  • In silico prediction of aqueous solubility: the solubility challenge.

    abstract::The dissolution of a chemical into water is a process fundamental to both chemistry and biology. The persistence of a chemical within the environment and the effects of a chemical within the body are dependent primarily upon aqueous solubility. With the well-documented limitations hindering the accurate experimental d...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article


    authors: Hewitt M,Cronin MT,Enoch SJ,Madden JC,Roberts DW,Dearden JC

    update_date: 2009-11-01 00:00:00

  • Gas-phase and solution conformations of selected dimeric structural units of heparin.

    abstract::The molecular structure of four dimeric units (D-E, E-F, F-G, and G-H) of the DEFGH structural unit of heparin, their anionic forms, and their sodium salts have been studied using the B3LYP/6-31+G(d) method. The optimized geometries indicate that the most stable structure of these dimeric units in neutral state is sta...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article


    authors: Remko M,von der Lieth CW

    update_date: 2006-07-01 00:00:00

  • Evaluating Free Energies of Binding and Conservation of Crystallographic Waters Using SZMAP.

    abstract::The SZMAP method computes binding free energies and the corresponding thermodynamic components for water molecules in the binding site of a protein structure [ SZMAP, 1.0.0 ; OpenEye Scientific Software Inc. : Santa Fe, NM, USA , 2011 ]. In this work, the ability of SZMAP to predict water structure and thermodynamic s...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article


    authors: Bayden AS,Moustakas DT,Joseph-McCarthy D,Lamb ML

    update_date: 2015-08-24 00:00:00

  • A Coarse-Grained Force Field Parameterized for MgCl2 and CaCl2 Aqueous Solutions.

    abstract::Calcium and magnesium ions play important roles in many physicochemical processes. To facilitate the investigation of phenomena related to these ions that occur over large length and time scales, a coarse-grained force field (CGFF) is developed for MgCl2 and CaCl2 aqueous solutions. The ions are modeled by CG beads wi...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article


    authors: Gong Z,Sun H

    update_date: 2017-07-24 00:00:00

  • Structural and Functional Characterization of Allatostatin Receptor Type-C of Thaumetopoea pityocampa, a Potential Target for Next-Generation Pest Control Agents.

    abstract::Insect neuropeptide receptors, including allatostatin receptor type C (AstR-C), a G protein-coupled receptor, are among the potential targets for designing next-generation pesticides that despite their importance in offering a new mode-of-action have been overlooked. Focusing on AstR-C of Thaumetopoea pityocampa, a co...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article


    authors: Shahraki A,Işbilir A,Dogan B,Lohse MJ,Durdagi S,Birgul-Iyison N

    update_date: 2021-01-21 00:00:00

  • Protein kinases: docking and homology modeling reliability.

    abstract::A database of about 700 high-resolution kinase structures was used to test the reliability of 17 docking procedures (using six docking software packages) by means of self- and cross-docking studies. The analysis of about 80 000 docking calculations suggests that the docking of an unknown ligand into a kinase has a pro...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article


    authors: Tuccinardi T,Botta M,Giordano A,Martinelli A

    update_date: 2010-08-23 00:00:00

  • Benchmark performance of MultiCASE Inc. software in Ames mutagenicity set.

    abstract::The predictive performances of MC4PC were evaluated using its learning machine functionality. Its superior characteristics are demonstrated in this follow-up study using the newly published Ames mutagenicity benchmark set. ...

    journal_title:Journal of chemical information and modeling

    pub_type: Comment, Letter


    authors: Saiakhov RD,Klopman G

    update_date: 2010-09-27 00:00:00

  • An Efficient Lossless Compression Algorithm for Trajectories of Atom Positions and Volumetric Data.

    abstract::We present our newly developed and highly efficient lossless compression algorithm for trajectories of atom positions and volumetric data. The algorithm is designed as a two-step approach. In the first step, efficient polynomial extrapolation schemes reduce the information entropy of the data by exploiting both spatia...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article


    authors: Brehm M,Thomas M

    update_date: 2018-10-22 00:00:00

  • Study of Data Set Modelability: Modelability, Rivality, and Weighted Modelability Indexes.

    abstract::The knowledge of the capacity of a data set to be modeled in the first stages of the building of quantitative structure-activity relationship (QSAR) prediction models is an important issue because it might reduce the effort and time necessary to select or reject data sets and in refining the data set's composition. Th...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article


    authors: Luque Ruiz I,Gómez-Nieto MÁ

    update_date: 2018-09-24 00:00:00

  • Applicability Domain ANalysis (ADAN): a robust method for assessing the reliability of drug property predictions.

    abstract::We report a novel method called ADAN (Applicability Domain ANalysis) for assessing the reliability of drug property predictions obtained by in silico methods. The assessment provided by ADAN is based on the comparison of the query compound with the training set, using six diverse similarity criteria. For every criteri...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article


    authors: Carrió P,Pinto M,Ecker G,Sanz F,Pastor M

    update_date: 2014-05-27 00:00:00

  • Molecular Mechanism, Dynamics, and Energetics of Protein-Mediated Dinucleotide Flipping in a Mismatched DNA: A Computational Study of the RAD4-DNA Complex.

    abstract::DNA damage alters genetic information and adversely affects gene expression pathways leading to various complex genetic disorders and cancers. DNA repair proteins recognize and rectify DNA damage and mismatches with high fidelity. A critical molecular event that occurs during most protein-mediated DNA repair processes...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article


    authors: Pitta K,Krishnan M

    update_date: 2018-03-26 00:00:00

  • Full and partial agonism of ionotropic glutamate receptors indicated by molecular dynamics simulations.

    abstract::Ionotropic glutamate receptors (iGluRs) are synaptic proteins that facilitate signal transmission in the central nervous system. Extracellular iGluR cleft closure is linked to receptor activation; however, the mechanism underlying partial agonism is not entirely understood. Full agonists close the bilobed ligand-bindi...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article


    authors: Postila PA,Ylilauri M,Pentikäinen OT

    update_date: 2011-05-23 00:00:00

  • Coordination of Na(+) by monoamine ligands in dopamine, norepinephrine, and serotonin transporters.

    abstract::The reuptake of neurotransmitters by dopamine, norepinephrine, and serotonin transporters during neuronal transmission requires a sodium gradient. An "ionic mode" of binding proposes that aspartate anchors the ligand's positive charge but ignores the direct role of sodium in ligand binding seen in the only representat...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article


    authors: Xhaard H,Backström V,Denessiouk K,Johnson MS

    update_date: 2008-07-01 00:00:00

  • Toward high throughput 3D virtual screening using spherical harmonic surface representations.

    abstract::Searching chemical databases for possible drug leads is often one of the main activities conducted during the early stages of a drug development project. This article shows that spherical harmonic molecular shape representations provide a powerful way to search and cluster small-molecule databases rapidly and accurate...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article


    authors: Mavridis L,Hudson BD,Ritchie DW

    update_date: 2007-09-01 00:00:00

  • Pharmer: efficient and exact pharmacophore search.

    abstract::Pharmacophore search is a key component of many drug discovery efforts. Pharmer is a new computational approach to pharmacophore search that scales with the breadth and complexity of the query, not the size of the compound library being screened. Two novel methods for organizing pharmacophore data, the Pharmer KDB-tre...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article


    authors: Koes DR,Camacho CJ

    update_date: 2011-06-27 00:00:00

  • Molecular Simulation of αvβ6 Integrin Inhibitors.

    abstract::The urgent need for new treatments for the chronic lung disease idiopathic pulmonary fibrosis (IPF) motivates research into antagonists of the RGD binding integrin αvβ6, a protein linked to the initiation and progression of the disease. Molecular dynamics (MD) simulations of αvβ6 in complex with its natural ligand, pr...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article


    authors: Guest EE,Oatley SA,Macdonald SJF,Hirst JD

    update_date: 2020-11-23 00:00:00

  • Time-Domain Analysis of Molecular Dynamics Trajectories Using Deep Neural Networks: Application to Activity Ranking of Tankyrase Inhibitors.

    abstract::Molecular dynamics simulations provide valuable insights into the behavior of molecular systems. Extending the recent trend of using machine learning techniques to predict physicochemical properties from molecular dynamics data, we propose to consider the trajectories as multidimensional time series represented by 2D ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article


    authors: Berishvili VP,Perkin VO,Voronkov AE,Radchenko EV,Syed R,Venkata Ramana Reddy C,Pillay V,Kumar P,Choonara YE,Kamal A,Palyulin VA

    update_date: 2019-08-26 00:00:00

  • GDP Release from the Open Conformation of Gα Requires Allosteric Signaling from the Agonist-Bound Human β2 Adrenergic Receptor.

    abstract::G-protein-coupled receptors (GPCRs) transmit signals into the cell in response to ligand binding at its extracellular domain, which is characterized by the coupling of agonist-induced receptor conformational change to guanine nucleotide (GDP) exchange with guanosine triphosphate on a heterotrimeric (αβγ) guanine nucle...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article


    authors: Kumar V,Hoag H,Sader S,Scorese N,Liu H,Wu C

    update_date: 2020-08-24 00:00:00

  • Combinatorial × computational × cheminformatics (C3) approach to characterization of congeneric libraries of organic pollutants.

    abstract::Congeners are molecules based on the same carbon skeleton but are different by the number of substituents and/or a substitution pattern. Examples are 1-chloronaphthalene, 1,4-dichloronaphthalene, and 1,3,8-trichloronaphthalene. Various persistent organic pollutants (POPs) exist in the environment as families of congen...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article


    authors: Haranczyk M,Urbaszek P,Ng EG,Puzyn T

    update_date: 2012-11-26 00:00:00

  • Ligand-Based Discovery of a New Scaffold for Allosteric Modulation of the μ-Opioid Receptor.

    abstract::With the hope of discovering effective analgesics with fewer side effects, attention has recently shifted to allosteric modulators of the opioid receptors. In the past two years, the first chemotypes of positive or silent allosteric modulators (PAMs or SAMs, respectively) of μ- and δ-opioid receptor types have been re...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article


    authors: Bisignano P,Burford NT,Shang Y,Marlow B,Livingston KE,Fenton AM,Rockwell K,Budenholzer L,Traynor JR,Gerritz SW,Alt A,Filizola M

    update_date: 2015-09-28 00:00:00

  • TAMkin: a versatile package for vibrational analysis and chemical kinetics.

    abstract::TAMkin is a program for the calculation and analysis of normal modes, thermochemical properties and chemical reaction rates. At present, the output from the frequently applied software programs ADF, CHARMM, CPMD, CP2K, Gaussian, Q-Chem, and VASP can be analyzed. The normal-mode analysis can be performed using a broad ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article


    authors: Ghysels A,Verstraelen T,Hemelsoet K,Waroquier M,Van Speybroeck V

    update_date: 2010-09-27 00:00:00

  • Conformational analysis of macrocycles: finding what common search methods miss.

    abstract::As computational drug design becomes increasingly reliant on virtual screening and on high-throughput 3D modeling, the need for fast, robust, and reliable methods for sampling molecular conformations has become greater than ever. Furthermore, chemical novelty is at a premium, forcing medicinal chemists to explore more...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article


    authors: Bonnet P,Agrafiotis DK,Zhu F,Martin E

    update_date: 2009-10-01 00:00:00

  • Molecular Mechanism underlying PRMT1 Dimerization for SAM Binding and Methylase Activity.

    abstract::Protein arginine methyltransferases (PRMTs) catalyze the posttranslational methylation of arginine, which is important in a range of biological processes, including epigenetic regulation, signal transduction, and cancer progression. Although previous studies of PRMT1 mutants suggest that the dimerization arm and the N...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article


    authors: Zhou R,Xie Y,Hu H,Hu G,Patel VS,Zhang J,Yu K,Huang Y,Jiang H,Liang Z,Zheng YG,Luo C

    update_date: 2015-12-28 00:00:00

  • Influence of protonation, tautomeric, and stereoisomeric states on protein-ligand docking results.

    abstract::In this work, we present a systematical investigation of the influence of ligand protonation states, stereoisomers, and tautomers on results obtained with the two protein-ligand docking programs GOLD and PLANTS. These different states were generated with a fully automated tool, called SPORES (Structure PrOtonation and...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article


    authors: ten Brink T,Exner TE

    update_date: 2009-06-01 00:00:00