Abstract:
Finding a canonical ordering of the atoms in a molecule is a prerequisite for generating a unique representation of the molecule. The canonicalization of a molecule is usually accomplished by applying some sort of graph relaxation algorithm, the most common of which is the Morgan algorithm. There are known issues with that algorithm that lead to noncanonical atom orderings as well as problems when it is applied to large molecules such as proteins. Furthermore, each cheminformatics toolkit or software provides its own version of a canonical ordering, most based on unpublished algorithms, which also complicates the generation of a universal unique identifier for molecules. We present an alternative canonicalization approach that uses a standard stable-sorting algorithm instead of a Morgan-like index. Two new invariants that allow canonical ordering of molecules with dependent chirality as well as those with highly symmetrical cyclic graphs have been developed. The new approach proved to be robust and fast when tested on the 1.45 million compounds of the ChEMBL 20 data set in different scenarios like random renumbering of input atoms or SMILES round tripping. Our new algorithm is able to generate a canonical order of the atoms of protein molecules within a few milliseconds. The novel algorithm is implemented in the open-source cheminformatics toolkit RDKit. With this paper, we provide a reference Python implementation of the algorithm that could easily be integrated into any cheminformatics toolkit. This provides a first step toward a common standard for canonical atom ordering to generate a universal unique identifier for molecules other than InChI.
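The general idea of ranking atoms by sorting invariants can be illustrated with a small sketch. This is not the published algorithm or the RDKit implementation: it is a simplified toy that assumes a molecule is given as a list of element symbols plus a list of bonds, uses only (element, degree) as the starting invariant, leaves ties between symmetric atoms unbroken, and ignores chirality. The names `rank_by_key` and `refine_ranks` are hypothetical.

```python
from typing import List, Sequence, Tuple


def rank_by_key(keys: Sequence[tuple]) -> List[int]:
    """Rank items by key; items with equal keys share the same rank."""
    order = sorted(range(len(keys)), key=keys.__getitem__)  # sorted() is stable
    ranks = [0] * len(keys)
    current = 0
    for pos, idx in enumerate(order):
        if pos > 0 and keys[idx] != keys[order[pos - 1]]:
            current = pos  # a new class starts at this sorted position
        ranks[idx] = current
    return ranks


def refine_ranks(symbols: Sequence[str], bonds: Sequence[Tuple[int, int]]) -> List[int]:
    """Return one rank per atom; symmetric atoms keep identical ranks."""
    n = len(symbols)
    neighbors: List[List[int]] = [[] for _ in range(n)]
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    # Starting invariant (an assumption of this sketch): element symbol and degree.
    ranks = rank_by_key([(symbols[i], len(neighbors[i])) for i in range(n)])
    while True:
        # Refine each atom by its own rank plus the sorted ranks of its neighbors.
        keys = [(ranks[i], tuple(sorted(ranks[j] for j in neighbors[i])))
                for i in range(n)]
        new_ranks = rank_by_key(keys)
        if len(set(new_ranks)) == len(set(ranks)):
            return new_ranks  # no class was split, so the partition is stable
        ranks = new_ranks


# Heavy atoms of ethanol, numbered two different ways.
print(refine_ranks(["C", "C", "O"], [(0, 1), (1, 2)]))
print(refine_ranks(["O", "C", "C"], [(0, 1), (1, 2)]))
```

Because each key depends only on the atom's environment and not on its input index, renumbering the atoms permutes the ranks consistently. A full canonicalization would additionally need tie breaking and stereochemistry-aware invariants, which this sketch omits.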
journal_name:J Chem Inf Model
journal_title:Journal of chemical information and modeling
authors:Schneider N, Sayle RA, Landrum GA
doi:10.1021/acs.jcim.5b00543
subject:Has Abstract
pub_date:2015-10-26 00:00:00
pages:2111-20
issue:10
eissn:1549-9596
issn:1549-960X
journal_volume:55
pub_type: Journal Article
abstract::The concept of chemoisosterism of protein environments is introduced as the complementary property to bioisosterism of chemical fragments. In the same way that two chemical fragments are considered bioisosteric if they can bind to the same protein environment, two protein environments will be considered chemoisosteric...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci3002974
update_date:2013-02-25 00:00:00
abstract::One of the largest commercial applications of enzymes and surfactants is as main components in modern detergents. The high concentration of surfactant compounds usually present in detergents can, however, negatively affect the enzymatic activity. To remedy this drawback, it is of great importance to characterize the i...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.8b00857
update_date:2019-05-28 00:00:00
abstract::Representative molecules from 10 classes of prohibited substances were taken from the World Anti-Doping Agency (WADA) list, augmented by molecules from corresponding activity classes found in the MDDR database. Together with some explicitly allowed compounds, these formed a set of 5245 molecules. Five types of fingerp...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci0601160
update_date:2006-11-01 00:00:00
abstract::We present our newly developed and highly efficient lossless compression algorithm for trajectories of atom positions and volumetric data. The algorithm is designed as a two-step approach. In the first step, efficient polynomial extrapolation schemes reduce the information entropy of the data by exploiting both spatia...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.8b00501
update_date:2018-10-22 00:00:00
abstract::Sterol 14α-demethylase (CYP51) is the main drug target for the treatment of fungal infections. The discovery of new efficient fungal CYP51 inhibitors requires an understanding of the structural requirements for selectivity for the fungal over the human ortholog. In this study, a binding mode of the pyridylethanol(phen...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci500556k
update_date:2014-12-22 00:00:00
abstract::Umami or the taste of monosodium glutamate represents one of the major attractive taste modalities in humans. Therefore, knowledge about biophysical and biochemical properties of the umami taste is important for both scientific research and the food industry. Experimental approaches for predicting umami peptides are l...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.0c00707
update_date:2020-12-28 00:00:00
abstract::In this study, we have developed a two-model system to mimic the active and inactive states of a G-protein coupled receptor, specifically the alpha1A adrenergic receptor. We have docked two agonists, epinephrine (phenylamine type) and oxymetazoline (imidazoline type), as well as two antagonists, prazosin and 5-methylur...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci700026v
update_date:2007-09-01 00:00:00
abstract::Novel statistical potentials derived from known protein structures are presented. They are designed to describe cation-pi and amino-pi interactions between a positively charged amino acid or an amino acid carrying a partially charged amino group and an aromatic moiety. These potentials are based on the propensity of r...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci050395b
update_date:2006-03-01 00:00:00
abstract::We introduce the statistics behind a novel type of SAR analysis named "nonadditivity analysis". On the basis of all pairs of matched pairs within a given data set, the approach analyzes whether the same transformations between related molecules have the same effect, i.e., whether they are additive. Assuming that the e...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.9b00631
update_date:2019-09-23 00:00:00
abstract::Prediction of compound properties from structure via quantitative structure-activity relationship and machine-learning approaches is an important computational chemistry task in small-molecule drug research. Though many such properties are dependent on three-dimensional structures or even conformer ensembles, the majo...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.8b00151
update_date:2018-05-29 00:00:00
abstract::G-protein coupled receptors (GPCRs) are highly relevant drug targets. Four GPCRs with known crystal structure were analyzed with docking (AutoDock4) and postdocking (MM-PBSA) in order to evaluate the ability to recognize known antagonists from a larger database of molecular decoys and to predict correct binding modes....
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci4000745
update_date:2013-04-22 00:00:00
abstract::The metabolism of xenobiotics--and more specifically drugs--in the liver is a critical process controlling their half-life. Although there exist experimental methods, which measure the metabolic stability of xenobiotics and identify their metabolites, developing higher throughput predictive methods is an avenue of res...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci3003073
update_date:2012-09-24 00:00:00
abstract::We present a succession of structural changes involved in hormone peptide activation of a prototypical GPCR. Microsecond molecular dynamics simulation generated conformational ensembles reveal propagation of structural changes through key "microswitches" within human AT1R bound to native hormone. The endocrine octa-pe...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.8b00583
update_date:2019-01-28 00:00:00
abstract::The binding affinity and relative maximal efficacy of human A3 adenosine receptor (AR) agonists were each subjected to ligand-based three-dimensional quantitative structure-activity relationship analysis. Comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) used a...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci600501z
update_date:2007-05-01 00:00:00
abstract::MHC class II molecules bind peptides derived from extracellular proteins that have been ingested by antigen-presenting cells and display them to the immune system. Peptide loading occurs within the antigen-presenting cell and is facilitated by HLA-DM. HLA-DM stabilizes the open conformation of the MHCII binding groove...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.9b00104
update_date:2019-06-24 00:00:00
abstract::Among the photophysical parameters that underpin Förster resonance energy transfer (FRET), perhaps the least explored is the spectral overlap term (J). While by definition J increases linearly with acceptor molar absorption coefficient (ε(A) in M⁻¹ cm⁻¹), is proportional to wavelength (λ⁴), and depends on the degree ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.8b00753
update_date:2019-02-25 00:00:00
abstract::The partitioning of solute molecules between immiscible solvents with significantly different polarities is of great importance. The polarization between the solute and solvent molecules plays an essential role in determining the solubility of the solute, which makes computational studies utilizing molecular mechanics...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.7b00001
update_date:2017-10-23 00:00:00
abstract::Traditional herbal medicine has been an inseparable part of the traditional medical science in many countries throughout history. Nowadays, the popularity of using herbal medicines in daily life, as well as clinical practices, has gradually expanded to numerous Western countries with positive impacts and acceptance. T...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.9b00826
update_date:2020-03-23 00:00:00
abstract::Different forms of synaptic plasticity in the cerebellum expressed at the synapses onto Purkinje cells (PCs) are mediated by membrane metabotropic glutamate receptors (mGluRs). There are three main mGluR groups with a total of 8 subtypes. Although mGluRs are also found at the climbing fiber (CF) to PC synapses, the di...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci050161s
update_date:2005-11-01 00:00:00
abstract::The quantitative structure-activity relationship (QSAR) approach has been used to model a wide range of chemical-induced biological responses. However, it had not been utilized to model chemical-induced genomewide gene expression changes until very recently, owing to the complexity of training and evaluating a very la...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.7b00281
update_date:2017-09-25 00:00:00
abstract::Protein-protein interactions are central to many biological processes, from intracellular communication to cytoskeleton assembly, and therefore represent an important class of targets for new therapeutics. The most common secondary structure in natural proteins is an α-helix. Small molecules seem to be attractive cand...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci200424a
update_date:2012-02-27 00:00:00
abstract::An essential feature of all practical de novo molecule generating programs is the ability to focus the potential combinatorial explosion of grown molecules on a desired chemical space. It is a daunting task to balance the generation of new molecules with limitations on growth that produce desired features such as stab...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci9000458
update_date:2009-07-01 00:00:00
abstract::With a rapid increase in the number of high-resolution protein-ligand structures, the known protein-ligand structures can be used to gain insight into ligand-binding modes in a target protein. On the basis of the fact that the structurally similar binding sites share information about their ligands, we have developed ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci300178e
update_date:2012-10-22 00:00:00
abstract::Factor Xa inhibitors are innovative anticoagulant agents that provide a better safety/efficacy profile compared to other anticoagulative drugs. A chemical feature-based modeling approach was applied to identify crucial pharmacophore patterns from 3D crystal structures of inhibitors bound to human factor Xa (Pdb entrie...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci049778k
update_date:2005-01-01 00:00:00
abstract::The determination of the validity of a QSAR model when applied to new compounds is an important concern in the field of QSAR and QSPR modeling. Various scoring techniques can be applied to specific types of models. We present a technique with which we can state whether a new compound will be well predicted by a previo...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci0497511
update_date:2005-01-01 00:00:00
abstract::We report a new classification method for pyranose ring conformations called Best-fit, Four-Membered Plane (BFMP), which describes pyranose ring conformations based on reference planes defined by four atoms. The method is able to characterize all asymmetrical and symmetrical shapes of a pyran ring, is readily automate...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci500325b
update_date:2014-10-27 00:00:00
abstract::The viral NS5B RNA-dependent RNA-polymerase (RdRp) is one of the best-studied and promising targets for the development of novel therapeutics against hepatitis C virus (HCV). Allosteric inhibition of this enzyme has emerged as a viable strategy toward blocking replication of viral RNA in cell based systems. Herein, we...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci9004749
update_date:2010-04-26 00:00:00
abstract::The early stages of drug discovery rely on hit-to-lead programs, where initial hits undergo partial optimization to improve binding affinities for their biological target. This is an expensive and time-consuming process, requiring multiple iterations of trial and error designs, an ideal scenario for applying computer ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.9b00938
update_date:2020-03-23 00:00:00
abstract::A new radial space-filling method for visualizing cluster hierarchies is presented. The method, referred to as a radial clustergram, arranges the clusters into a series of layers, each representing a different level of the tree. It uses adjacency of nodes instead of links to represent parent-child relationships and al...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci600427x
update_date:2007-01-01 00:00:00
abstract::Deep learning has demonstrated significant potential in advancing state of the art in many problem domains, especially those benefiting from automated feature extraction. Yet, the methodology has seen limited adoption in the field of ligand-based virtual screening (LBVS) as traditional approaches typically require lar...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.0c00622
update_date:2020-10-26 00:00:00