Get Your Atoms in Order--An Open-Source Implementation of a Novel and Robust Molecular Canonicalization Algorithm.

Abstract:

Finding a canonical ordering of the atoms in a molecule is a prerequisite for generating a unique representation of the molecule. The canonicalization of a molecule is usually accomplished by applying some sort of graph relaxation algorithm, the most common of which is the Morgan algorithm. There are known issues with that algorithm that lead to noncanonical atom orderings as well as problems when it is applied to large molecules like proteins. Furthermore, each cheminformatics toolkit or software provides its own version of a canonical ordering, most based on unpublished algorithms, which also complicates the generation of a universal unique identifier for molecules. We present an alternative canonicalization approach that uses a standard stable-sorting algorithm instead of a Morgan-like index. Two new invariants that allow canonical ordering of molecules with dependent chirality as well as those with highly symmetrical cyclic graphs have been developed. The new approach proved to be robust and fast when tested on the 1.45 million compounds of the ChEMBL 20 data set in different scenarios like random renumbering of input atoms or SMILES round tripping. Our new algorithm is able to generate a canonical order of the atoms of protein molecules within a few milliseconds. The novel algorithm is implemented in the open-source cheminformatics toolkit RDKit. With this paper, we provide a reference Python implementation of the algorithm that could easily be integrated in any cheminformatics toolkit. This provides a first step toward a common standard for canonical atom ordering to generate a universal unique identifier for molecules other than InChI.
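To make the idea of ranking atoms by sorting invariants more concrete, here is a minimal Python sketch. It is not the authors' reference implementation and not RDKit code: the function name `refine_ranks`, the choice of initial invariant (element symbol plus heavy-atom degree), and the example molecule are all assumptions made for illustration. The sketch only refines atom ranks until the number of classes stops growing. It leaves symmetric atoms tied and ignores chirality, so it omits the tie-breaking step and the two new invariants that the abstract describes.

```python
def refine_ranks(symbols, bonds):
    """Return one rank per atom; symmetric atoms keep the same rank.

    symbols: list of element symbols, indexed by atom number (hypothetical input format)
    bonds:   list of (i, j) index pairs between bonded heavy atoms
    """
    n = len(symbols)
    neighbors = [[] for _ in range(n)]
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    def dense_ranks(keys):
        # Sort the distinct keys; atoms with equal keys share one rank.
        order = {key: rank for rank, key in enumerate(sorted(set(keys)))}
        return [order[key] for key in keys]

    # Assumed initial invariant: element symbol and heavy-atom degree.
    ranks = dense_ranks([(symbols[i], len(neighbors[i])) for i in range(n)])

    while True:
        # New key: the atom's own rank followed by its sorted neighbor ranks.
        keys = [
            (ranks[i], tuple(sorted(ranks[j] for j in neighbors[i])))
            for i in range(n)
        ]
        new_ranks = dense_ranks(keys)
        if len(set(new_ranks)) == len(set(ranks)):
            return new_ranks  # no class was split, so refinement has converged
        ranks = new_ranks


# Isopropanol heavy atoms, CC(C)O, in two different input numberings.
ranks_a = refine_ranks(["C", "C", "C", "O"], [(0, 1), (1, 2), (1, 3)])
ranks_b = refine_ranks(["O", "C", "C", "C"], [(0, 2), (1, 2), (2, 3)])
```

Because the keys are built only from element symbols, degrees, and ranks, never from input indices, each atom receives the same rank under either numbering. The two methyl carbons stay tied, which is exactly the situation a full canonicalization algorithm has to resolve with deterministic tie breaking.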

journal_name

J Chem Inf Model

authors

Schneider N,Sayle RA,Landrum GA

doi

10.1021/acs.jcim.5b00543

subject

Has Abstract

pub_date

2015-10-26 00:00:00

pages

2111-20

issue

10

eissn

1549-960X

issn

1549-9596

journal_volume

55

pub_type

Journal Article
  • Ligand coordinate analysis of SC-558 from the active site to the surface of COX-2: a molecular dynamics study.

    abstract::We have performed a ligand coordinate analysis to monitor the movement of the inhibitor SC-558 from the active site of the COX-2 protein to the exterior using molecular dynamics techniques. This study provides an insight into the intermolecular interactions formed by the ligand during this journey. The published cryst...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci050142i

    authors: Sai Ram KV,Rambabu G,Sarma JA,Desiraju GR

    update_date:2006-07-01 00:00:00

  • Rigorous Computational Study Reveals What Docking Overlooks: Double Trouble from Membrane Association in Protein Kinase C Modulators.

    abstract::Increasing protein kinase C (PKC) activity is of potential therapeutic value. Its activation involves an interaction between the C1 domain and diacylglycerol (DAG) at intracellular membrane surfaces; DAG mimetics hold promise as new drugs. We previously developed the isophthalate derivative HMI-1a3, an effective but h...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c00624

    authors: Lautala S,Provenzani R,Koivuniemi A,Kulig W,Talman V,Róg T,Tuominen RK,Yli-Kauhaluoma J,Bunker A

    update_date:2020-11-23 00:00:00

  • Exploring inhibitor release pathways in histone deacetylases using random acceleration molecular dynamics simulations.

    abstract::Molecular channel exploration perseveres to be the prominent solution for eliciting structure and accessibility of active site and other internal spaces of macromolecules. The volume and silhouette characterization of these channels provides answers for the issues of substrate access and ligand swapping between the ob...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci200584f

    authors: Kalyaanamoorthy S,Chen YP

    update_date:2012-02-27 00:00:00

  • Computational and conformational evaluation of FTase alternative substrates: insight into a novel enzyme binding pocket.

    abstract::Protein farnesyltransferase (FTase) is an important anticancer drug target. In an effort to develop isoprenoid diphosphate-based FTase inhibitors, striking variations have been observed in the ability of conservatively modified analogues to bind to the enzyme. For example, 2Z-GGPP is an alternative substrate with high...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci0496550

    authors: Henriksen BS,Zahn TJ,Evanseck JD,Firestine SM,Gibbs RA

    update_date:2005-07-01 00:00:00

  • BCL::MolAlign: Three-Dimensional Small Molecule Alignment for Pharmacophore Mapping.

    abstract::Small molecule flexible alignment is a critical component of both ligand- and structure-based methods in computer-aided drug discovery. Despite its importance, the availability of high-quality flexible alignment software packages is limited. Here, we present BCL::MolAlign, a freely available property-based molecular a...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00020

    authors: Brown BP,Mendenhall J,Meiler J

    update_date:2019-02-25 00:00:00

  • "Social" network of isomers based on bond count distance: algorithms.

    abstract::This paper introduces the concept of an isomer network based on the reaction step counts between pairs of isomers as an alternative means to view and analyze isomer space. The computation of isomer networks is computationally expensive with respect to both run time and memory. Accordingly, this paper focuses on the de...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci4005173

    authors: Kouri TM,Awale M,Slyby JK,Reymond JL,Mehta DP

    update_date:2014-01-27 00:00:00

  • Receptor-based virtual ligand screening for the identification of novel CDC25 phosphatase inhibitors.

    abstract::CDC25 phosphatases play critical roles in cell cycle regulation and are attractive targets for anticancer therapies. Several small non-peptide molecules are known to inhibit CDC25, but many of them appear to form a covalent bond with the enzyme or act through oxidation of the thiolate group of the catalytic cysteine. ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci700313e

    authors: Montes M,Braud E,Miteva MA,Goddard ML,Mondésert O,Kolb S,Brun MP,Ducommun B,Garbay C,Villoutreix BO

    update_date:2008-01-01 00:00:00

  • In Silico Classifiers for the Assessment of Drug Proarrhythmicity.

    abstract::Drug-induced torsade de pointes (TdP) is a life-threatening ventricular arrhythmia responsible for the withdrawal of many drugs from the market. Although currently used TdP risk-assessment methods are effective, they are expensive and prone to produce false positives. In recent years, in silico cardiac simulations hav...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c00201

    authors: Llopis-Lorente J,Gomis-Tena J,Cano J,Romero L,Saiz J,Trenor B

    update_date:2020-10-26 00:00:00

  • Unveiling the Atomic-Level Determinants of Acylase-Ligand Complexes: An Experimental and Computational Study.

    abstract::The industrial production of higher-generation semisynthetic cephalosporins starts from 7-aminocephalosporanic acid (7-ACA), which is obtained by deacylation of the naturally occurring antibiotic cephalosporin C (CephC). The enzymatic process in which CephC is directly converted into 7-ACA by a cephalosporin C acylase...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.5b00535

    authors: Mollica L,Conti G,Pollegioni L,Cavalli A,Rosini E

    update_date:2015-10-26 00:00:00

  • Heterogeneous Dielectric Implicit Membrane Model for the Calculation of MMPBSA Binding Free Energies.

    abstract::Membrane-bound protein receptors are a primary biological drug target, but the computational analysis of membrane proteins has been limited. In order to improve molecular mechanics Poisson-Boltzmann surface area (MMPBSA) binding free energy calculations for membrane protein-ligand systems, we have optimized a new hete...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00363

    authors: Greene D,Qi R,Nguyen R,Qiu T,Luo R

    update_date:2019-06-24 00:00:00

  • PyCGTOOL: Automated Generation of Coarse-Grained Molecular Dynamics Models from Atomistic Trajectories.

    abstract::Development of coarse-grained (CG) molecular dynamics models is often a laborious process which commonly relies upon approximations to similar models, rather than systematic parametrization. PyCGTOOL automates much of the construction of CG models via calculation of both equilibrium values and force constants of inter...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00096

    authors: Graham JA,Essex JW,Khalid S

    update_date:2017-04-24 00:00:00

  • Benchmark Sets for Binding Hot Spot Identification in Fragment-Based Ligand Discovery.

    abstract::Binding hot spots are regions of proteins that, due to their potentially high contribution to the binding free energy, have high propensity to bind small molecules. We present benchmark sets for testing computational methods for the identification of binding hot spots with emphasis on fragment-based ligand discovery. ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c00877

    authors: Wakefield AE,Yueh C,Beglov D,Castilho MS,Kozakov D,Keserű GM,Whitty A,Vajda S

    update_date:2020-12-28 00:00:00

  • AlphaSpace: Fragment-Centric Topographical Mapping To Target Protein-Protein Interaction Interfaces.

    abstract::Inhibition of protein-protein interactions (PPIs) is emerging as a promising therapeutic strategy despite the difficulty in targeting such interfaces with drug-like small molecules. PPIs generally feature large and flat binding surfaces as compared to typical drug targets. These features pose a challenge for structura...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.5b00103

    authors: Rooklin D,Wang C,Katigbak J,Arora PS,Zhang Y

    update_date:2015-08-24 00:00:00

  • Exploration of the accessible chemical space of acyclic alkanes.

    abstract::Saturated acyclic alkanes show steric strain if they are highly branched and, in extreme cases, fall apart rapidly at room temperature. Consequently, attempts to count the number of isomeric forms for a given molecular formula that neglect this physical consideration will inevitably overestimate the size of the availa...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci700246b

    authors: Paton RS,Goodman JM

    update_date:2007-11-01 00:00:00

  • Bridging molecular docking to membrane molecular dynamics to investigate GPCR-ligand recognition: the human A₂A adenosine receptor as a key study.

    abstract::G protein-coupled receptors (GPCRs) represent the largest family of cell-surface receptors and about one-third of the actual targets of clinically used drugs. Following the progress made in the field of GPCRs structural determination, docking-based screening for novel potent and selective ligands is becoming an increa...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci400532b

    authors: Sabbadin D,Ciancetta A,Moro S

    update_date:2014-01-27 00:00:00

  • Assessment of the Cruzain Cysteine Protease Reversible and Irreversible Covalent Inhibition Mechanism.

    abstract::Reversible and irreversible covalent ligands are advanced cysteine protease inhibitors in the drug development pipeline. K777 is an irreversible inhibitor of cruzain, a necessary enzyme for the survival of the Trypanosoma cruzi (T. cruzi) parasite, the causative agent of Chagas disease. Despite their importance, irrev...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b01138

    authors: Silva JRA,Cianni L,Araujo D,Batista PHJ,de Vita D,Rosini F,Leitão A,Lameira J,Montanari CA

    update_date:2020-03-23 00:00:00

  • Combined Experimental and Molecular Simulation Study of Insulin-Chitosan Complexation Driven by Electrostatic Interactions.

    abstract::Protein-polysaccharide complexes constructed via self-assembly methods are often used to develop novel biomaterials for a wide range of applications in biomedicine, food, and biotechnology. The objective of this work was to investigate theoretically and to demonstrate via constant-pH Monte Carlo simulations that the c...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00814

    authors: Prudkin-Silva C,Pérez OE,Martínez KD,Barroso da Silva FL

    update_date:2020-02-24 00:00:00

  • Structure-based discovery of novel non-nucleosidic DNA alkyltransferase inhibitors: virtual screening and in vitro and in vivo activities.

    abstract::The human DNA-repair O (6)-alkylguanine DNA alkyltransferase (MGMT or hAGT) protein protects DNA from environmental alkylating agents and also plays an important role in tumor resistance to chemotherapy treatment. Available inhibitors, based on pseudosubstrate analogs, have been shown to induce substantial bone marrow...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci700447r

    authors: Ruiz FM,Gil-Redondo R,Morreale A,Ortiz AR,Fábrega C,Bravo J

    update_date:2008-04-01 00:00:00

  • Identification of Enzyme Genes Using Chemical Structure Alignments of Substrate-Product Pairs.

    abstract::Although there are several databases that contain data on many metabolites and reactions in biochemical pathways, there is still a big gap in the numbers between experimentally identified enzymes and metabolites. It is supposed that many catalytic enzyme genes are still unknown. Although there are previous studies tha...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.5b00216

    authors: Moriya Y,Yamada T,Okuda S,Nakagawa Z,Kotera M,Tokimatsu T,Kanehisa M,Goto S

    update_date:2016-03-28 00:00:00

  • Exploring Topological Pharmacophore Graphs for Scaffold Hopping.

    abstract::The primary goal of ligand-based virtual screening is to identify active compounds consisting of a core scaffold that is not found in the current active compound pool. Scaffold hopping is the term used for this purpose. In the present study, topological representations of pharmacophore features on chemical graphs were...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c00098

    authors: Nakano H,Miyao T,Funatsu K

    update_date:2020-04-27 00:00:00

  • Selective Fusion of Heterogeneous Classifiers for Predicting Substrates of Membrane Transporters.

    abstract::Membrane transporters play a crucial role in determining fate of administered drugs in a biological system. Early identification of plausible transporters for a drug molecule can provide insights into its therapeutic, pharmacokinetic, and toxicological profiles. In the present study, predictive models for classifying ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.6b00508

    authors: Shaikh N,Sharma M,Garg P

    update_date:2017-03-27 00:00:00

  • Adaptive BP-Dock: An Induced Fit Docking Approach for Full Receptor Flexibility.

    abstract::We present an induced fit docking approach called Adaptive BP-Dock that integrates perturbation response scanning (PRS) with the flexible docking protocol of RosettaLigand in an adaptive manner. We first perturb the binding pocket residues of a receptor and obtain a new conformation based on the residue response fluct...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.5b00587

    authors: Bolia A,Ozkan SB

    update_date:2016-04-25 00:00:00

  • Modeling compound-target interaction network of traditional Chinese medicines for type II diabetes mellitus: insight for polypharmacology and drug design.

    abstract::In this study, in order to elucidate the action mechanism of traditional Chinese medicines (TCMs) that exhibit clinical efficacy for type II diabetes mellitus (T2DM), an integrated protocol that combines molecular docking and pharmacophore mapping was employed to find the potential inhibitors from TCM for the T2DM-rel...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci400146u

    authors: Tian S,Li Y,Li D,Xu X,Wang J,Zhang Q,Hou T

    update_date:2013-07-22 00:00:00

  • Phytochemical informatics of traditional Chinese medicine and therapeutic relevance.

    abstract::Distribution patterns of 8411 compounds from 240 Chinese herbs were analyzed in relation to the herbal categories of traditional Chinese medicine (TCM), using Random Forest (RF) and self-organizing maps (SOM). RF was used first to construct TCM profiles of individual compounds, which describe their affinities for 28 m...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci700155t

    authors: Ehrman TM,Barlow DJ,Hylands PJ

    update_date:2007-11-01 00:00:00

  • Predicting the DNA Conductance Using a Deep Feedforward Neural Network Model.

    abstract::Double-stranded DNA (dsDNA) has been established as an efficient medium for charge migration, bringing it to the forefront of the field of molecular electronics and biological research. The charge migration rate is controlled by the electronic couplings between the two nucleobases of DNA/RNA. These electronic coupling...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c01072

    authors: Aggarwal A,Vinayak V,Bag S,Bhattacharyya C,Waghmare UV,Maiti PK

    update_date:2021-01-25 00:00:00

  • Molecular Dynamics Simulations of Supramolecular Anticancer Nanotubes.

    abstract::We report here on long-time all-atomistic molecular dynamics simulations of functional supramolecular nanotubes composed by the self-assembly of peptide-drug amphiphiles (DAs). These DAs have been shown to possess an inherently high drug loading of the hydrophobic anticancer drug camptothecin. We probe the self-assemb...

    journal_title:Journal of chemical information and modeling

    pub_type: Letter

    doi:10.1021/acs.jcim.8b00193

    authors: Kang M,Chakraborty K,Loverde SM

    update_date:2018-06-25 00:00:00

  • Benchmark performance of MultiCASE Inc. software in Ames mutagenicity set.

    abstract::The predictive performances of MC4PC were evaluated using its learning machine functionality. Its superior characteristics are demonstrated in this following up study using the newly published Ames mutagenicity benchmark set. ...

    journal_title:Journal of chemical information and modeling

    pub_type: Comment, Letter

    doi:10.1021/ci1000899

    authors: Saiakhov RD,Klopman G

    update_date:2010-09-27 00:00:00

  • Dependence of QSAR models on the selection of trial descriptor sets: a demonstration using nanotoxicity endpoints of decorated nanotubes.

    abstract::Little attention has been given to the selection of trial descriptor sets when designing a QSAR analysis even though a great number of descriptor classes, and often a greater number of descriptors within a given class, are now available. This paper reports an effort to explore interrelationships between QSAR models an...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci3005308

    authors: Shao CY,Chen SZ,Su BH,Tseng YJ,Esposito EX,Hopfinger AJ

    update_date:2013-01-28 00:00:00

  • Prediction of the Favorable Hydration Sites in a Protein Binding Pocket and Its Application to Scoring Function Formulation.

    abstract::The important role of water molecules in protein-ligand binding energetics has attracted wide attention in recent years. A range of computational methods has been developed to predict the favorable locations of water molecules in a protein binding pocket. Most of the current methods are based on extensive molecular dy...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00619

    authors: Li Y,Gao Y,Holloway MK,Wang R

    update_date:2020-09-28 00:00:00

  • Mechanisms for Flavin-Mediated Oxidation: Hydride or Hydrogen-Atom Transfer?

    abstract::Flavins are versatile biological cofactors which catalyze proton-coupled electron transfers (PCET) with varying number and coupling of electrons. Flavin-mediated oxidations of nicotinamide adenine dinucleotide (NADH) and of succinate, initial redox reactions in cellular respiration, were examined here with multiconfig...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c00945

    authors: Curtolo F,Arantes GM

    update_date:2020-12-28 00:00:00