ChemSchematicResolver: A Toolkit to Decode 2D Chemical Diagrams with Labels and R-Groups into Annotated Chemical Named Entities.

Abstract:

The number of journal articles in the scientific domain has grown to the point where it has become impossible for researchers to capitalize on all findings in their relevant discipline. Information is stored in these articles in a number of ways, including figures that describe important results. In organic chemistry, these figures often present chemical schematic diagrams that graphically define the structures of carbon-based compounds. These diagrams are intuitive for an expert to comprehend, but they are not designed for machines. This work presents ChemSchematicResolver, a software tool that can be used to identify chemical schematic diagrams within the figure of a document, resolve any R-group substituents within them, and convert the resulting diagrams to a machine-readable format in a high-throughput, autonomous fashion. The tool includes a new algorithm that is used to identify relevant diagrams and a mechanism that combines these data with contextual information from the rest of the document for the creation of highly relational databases. It includes support for a variety of general R-group structures, the first time such support has been available in any open-source chemical schematic diagram extraction tool. It is presented alongside a self-generated evaluation set, on which the most important assessment metric, precision, reached 83-100% for all assessed areas. The ChemSchematicResolver tool is released under the MIT license and is available to download from www.chemschematicresolver.org.

journal_name

J Chem Inf Model

authors

Beard EJ,Cole JM

doi

10.1021/acs.jcim.0c00042

subject

Has Abstract

pub_date

2020-04-27 00:00:00

pages

2059-2072

issue

4

eissn

1549-960X

issn

1549-9596

journal_volume

60

pub_type

Journal Article
  • Growth of ligand-target interaction data in ChEMBL is associated with increasing and activity measurement-dependent compound promiscuity.

    abstract::Compounds with high-confidence target annotations and activity measurements in the original and current release of the ChEMBL database have been compared to better understand how the growth of compound activity data might influence the spectrum of ligand-target interactions and the degree of target promiscuity among a...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci3003304

    authors: Hu Y,Bajorath J

    update_date:2012-10-22 00:00:00

  • Molecular Oxygen Binding in the Mitochondrial Electron Transfer Flavoprotein.

    abstract::Reactive oxygen species such as superoxide are potentially harmful byproducts of the aerobic metabolism in the inner mitochondrial membrane, and complexes I, II, III of the electron transport chain have been identified as primary sources. The mitochondrial fatty acid β-oxidation pathway may also play a yet uncharacter...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00702

    authors: Husen P,Nielsen C,Martino CF,Solov'yov IA

    update_date:2019-11-25 00:00:00

  • Develop and test a solvent accessible surface area-based model in conformational entropy calculations.

    abstract::It is of great interest in modern drug design to accurately calculate the free energies of protein-ligand or nucleic acid-ligand binding. MM-PBSA (molecular mechanics Poisson-Boltzmann surface area) and MM-GBSA (molecular mechanics generalized Born surface area) have gained popularity in this field. For both methods, ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci300064d

    authors: Wang J,Hou T

    update_date:2012-05-25 00:00:00

  • Probing the Binding Pathway of BRACO19 to a Parallel-Stranded Human Telomeric G-Quadruplex Using Molecular Dynamics Binding Simulation with AMBER DNA OL15 and Ligand GAFF2 Force Fields.

    abstract::Human telomeric DNA G-quadruplex has been identified as a good therapeutic target in cancer treatment. G-quadruplex-specific ligands that stabilize the G-quadruplex have great potential to be developed as anticancer agents. Two crystal structures (an apo form of parallel stranded human telomeric G-quadruplex and its h...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00287

    authors: Machireddy B,Kalra G,Jonnalagadda S,Ramanujachary K,Wu C

    update_date:2017-11-27 00:00:00

  • Retrospect and Prospect of Single Particle Cryo-Electron Microscopy: The Class of Integral Membrane Proteins as an Example.

    abstract::A giant technological leap in the field of cryo-electron microscopy (cryo-EM) has assured the achievement of near-atomic resolution structures of biological macromolecules. As a recognition of this accomplishment, the Nobel Prize in Chemistry was awarded in 2017 to Jacques Dubochet, Joachim Frank, and Richard Henderso...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b01015

    authors: Akbar S,Mozumder S,Sengupta J

    update_date:2020-05-26 00:00:00

  • Synergistic use of compound properties and docking scores in neural network modeling of CYP2D6 binding: predicting affinity and conformational sampling.

    abstract::Cytochrome P450 2D6 (CYP2D6) is used to develop an approach for predicting affinity and relevant binding conformation(s) for highly flexible binding sites. The approach combines the use of docking scores and compound properties as attributes in building a neural network (NN) model. It begins by identifying segments of...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci600267k

    authors: Bazeley PS,Prithivi S,Struble CA,Povinelli RJ,Sem DS

    update_date:2006-11-01 00:00:00

  • On three-electron bonds and hydrogen bonds in the open-shell complexes [H2X2]+ for X = F, Cl, and Br.

    abstract::The [H2X2]+ (X = Cl, Br) formula could refer to two possible stable structures, namely, the hydrogen-bonded complex and the three-electron-bonded one. Contrary to the results published by other authors, we claim that for the F-type structures the hydrogen-bonded form is the only possible one and the [HFFH]+ complex...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci600355g

    authors: Bil A,Berski S,Latajka Z

    update_date:2007-05-01 00:00:00

  • RosENet: Improving Binding Affinity Prediction by Leveraging Molecular Mechanics Energies with an Ensemble of 3D Convolutional Neural Networks.

    abstract::The worldwide increase and proliferation of drug resistant microbes, coupled with the lag in new drug development, represents a major threat to human health. In order to reduce the time and cost for exploring the chemical search space, drug discovery increasingly relies on computational biology approaches. One key ste...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c00075

    authors: Hassan-Harrirou H,Zhang C,Lemmin T

    update_date:2020-06-22 00:00:00

  • Allosteric Modulation of Human Hsp90α Conformational Dynamics.

    abstract::Central to Hsp90's biological function is its ability to interconvert between various conformational states. Drug targeting of Hsp90's regulatory mechanisms, including its modulation by cochaperone association, presents as an attractive therapeutic strategy for Hsp90 associated pathologies. In this study, we utilized ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00630

    authors: Penkler DL,Atilgan C,Tastan Bishop Ö

    update_date:2018-02-26 00:00:00

  • Systematic analysis of enzyme-catalyzed reaction patterns and prediction of microbial biodegradation pathways.

    abstract::The roles of chemical compounds in biological systems are now systematically analyzed by high-throughput experimental technologies. To automate the processing and interpretation of large-scale data it is necessary to develop bioinformatics methods to extract information from the chemical structures of these small mole...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci700006f

    authors: Oh M,Yamada T,Hattori M,Goto S,Kanehisa M

    update_date:2007-07-01 00:00:00

  • Prediction of synthetic accessibility based on commercially available compound databases.

    abstract::A compound's synthetic accessibility (SA) is an important aspect of drug design, since in some cases computer-designed compounds cannot be synthesized. There have been several reports on SA prediction, most of which have focused on the difficulties of synthetic reactions based on retro-synthesis analyses, reaction dat...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci500568d

    authors: Fukunishi Y,Kurosawa T,Mikami Y,Nakamura H

    update_date:2014-12-22 00:00:00

  • Flexophore, a new versatile 3D pharmacophore descriptor that considers molecular flexibility.

    abstract::A novel pharmacophore descriptor Flexophore is presented, which considers molecular flexibility when comparing descriptor similarities. The descriptor is a complete reduced graph of the underlying molecule. Its nodes are represented by enhanced MM2 atom types, while the edge descriptions encode the molecular flexibili...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci700359j

    authors: von Korff M,Freyss J,Sander T

    update_date:2008-04-01 00:00:00

  • Use of 3D QSAR models for database screening: a feasibility study.

    abstract::The applicability and scope of 3D QSAR methods (CoMFA, CoMSIA) to screen databases are examined. A protocol requiring minimal user intervention has been established to align training and test set molecules using FlexS. As model system isozymes of human carbonic anhydrase (hCA) are used, all results are exemplified stu...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci7002945

    authors: Hillebrecht A,Klebe G

    update_date:2008-02-01 00:00:00

  • Structure-activity relationships in non-ligand binding pocket (non-LBP) diarylhydrazide antiandrogens.

    abstract::We report the synthesis and a study of the structure-activity relationships of a new series of diarylhydrazides as potential selective non-ligand binding pocket androgen receptor antagonists. Their biological activity as antiandrogens in the context of the development of treatments for castration resistant prostate ca...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci400189m

    authors: Caboni L,Egan B,Kelly B,Blanco F,Fayne D,Meegan MJ,Lloyd DG

    update_date:2013-08-26 00:00:00

  • Equally Weighted Multiscale Elastic Network Model and Its Comparison with Traditional and Parameter-Free Models.

    abstract::Dynamical properties of proteins play an essential role in their function exertion. The elastic network model (ENM) is an effective and efficient tool in characterizing the intrinsic dynamical properties encoded in biomacromolecule structures. The Gaussian network model (GNM) and anisotropic network model (ANM) are th...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c01178

    authors: Gong W,Liu Y,Zhao Y,Wang S,Han Z,Li C

    update_date:2021-01-26 00:00:00

  • Rigidity Strengthening: A Mechanism for Protein-Ligand Binding.

    abstract::Protein-ligand binding is essential to almost all life processes. The understanding of protein-ligand interactions is fundamentally important to rational drug and protein design. Based on large scale data sets, we show that protein rigidity strengthening or flexibility reduction is a mechanism in protein-ligand bindin...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00226

    authors: Nguyen DD,Xiao T,Wang M,Wei GW

    update_date:2017-07-24 00:00:00

  • Turbocharging Matched Molecular Pair Analysis: Optimizing the Identification and Analysis of Pairs.

    abstract::We have applied the two most commonly used methods for automatic matched pair identification, obtained the optimum settings, and discovered that the two methods are synergistic. A turbocharging approach to matched pair analysis is advocated in which a first round (a conservative categorical approach that uses an analo...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00335

    authors: Lukac I,Zarnecka J,Griffen EJ,Dossetter AG,St-Gallay SA,Enoch SJ,Madden JC,Leach AG

    update_date:2017-10-23 00:00:00

  • Characterization of DNA primary sequences by a new similarity/diversity measure based on the partial ordering.

    abstract::The similarity/diversity measures play a fundamental role in library searching, virtual screening, and quantitative structure-activity relationship/quantitative structure-property relationship modeling as well as in genomics and proteomics. In this paper, a new similarity/diversity measure is proposed as a new approac...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci060099e

    authors: Todeschini R,Consonni V,Mauri A,Ballabio D

    update_date:2006-09-01 00:00:00

  • Benchmark Sets for Binding Hot Spot Identification in Fragment-Based Ligand Discovery.

    abstract::Binding hot spots are regions of proteins that, due to their potentially high contribution to the binding free energy, have high propensity to bind small molecules. We present benchmark sets for testing computational methods for the identification of binding hot spots with emphasis on fragment-based ligand discovery. ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c00877

    authors: Wakefield AE,Yueh C,Beglov D,Castilho MS,Kozakov D,Keserű GM,Whitty A,Vajda S

    update_date:2020-12-28 00:00:00

  • Insights on the facet specific adsorption of amino acids and peptides toward platinum.

    abstract::Engineering shape-controlled bionanomaterials requires comprehensive understanding of interactions between biomolecules and inorganic surfaces. We explore the origin of facet-selective binding of peptides adsorbed onto Pt(100) and Pt(111) crystallographic planes. Using molecular dynamics simulations, we show that upon...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci400630d

    authors: Ramakrishnan SK,Martin M,Cloitre T,Firlej L,Cuisinier FJ,Gergely C

    update_date:2013-12-23 00:00:00

  • Flux (1): a virtual synthesis scheme for fragment-based de novo design.

    abstract::It is demonstrated that the fragmentation of druglike molecules by applying simplistic pseudo-retrosynthesis results in a stock of chemically meaningful building blocks for de novo molecule generation. A stochastic search algorithm in conjunction with ligand-based similarity scoring (Flux: fragment-based ligand builde...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci0503560

    authors: Fechner U,Schneider G

    update_date:2006-03-01 00:00:00

  • Three-dimensional quantitative structure-activity relationship of nucleosides acting at the A3 adenosine receptor: analysis of binding and relative efficacy.

    abstract::The binding affinity and relative maximal efficacy of human A3 adenosine receptor (AR) agonists were each subjected to ligand-based three-dimensional quantitative structure-activity relationship analysis. Comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) used a...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci600501z

    authors: Kim SK,Jacobson KA

    update_date:2007-05-01 00:00:00

  • Interpretation of Quantitative Structure-Activity Relationship Models: Past, Present, and Future.

    abstract::This paper is an overview of the most significant and impactful interpretation approaches of quantitative structure-activity relationship (QSAR) models, their development, and application. The evolution of the interpretation paradigm from "model → descriptors → (structure)" to "model → structure" is indicated. The lat...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article, Review

    doi:10.1021/acs.jcim.7b00274

    authors: Polishchuk P

    update_date:2017-11-27 00:00:00

  • 3D QSAR methods: Phase and Catalyst compared.

    abstract::The programs Phase and Catalyst HypoGen are compared for their performance in determining three-dimensional quantitative structure-activity relationships. Eight sets of compounds with measured activity were collected from the public literature and partitioned into suitable training and test sets by an automated proced...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci7000082

    authors: Evans DA,Doman TN,Thorner DA,Bodkin MJ

    update_date:2007-05-01 00:00:00

  • ANN multiscale model of anti-HIV drugs activity vs AIDS prevalence in the US at county level based on information indices of molecular graphs and social networks.

    abstract::This work is aimed at describing the workflow for a methodology that combines chemoinformatics and pharmacoepidemiology methods and at reporting the first predictive model developed with this methodology. The new model is able to predict complex networks of AIDS prevalence in the US counties, taking into consideration...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci400716y

    authors: González-Díaz H,Herrera-Ibatá DM,Duardo-Sánchez A,Munteanu CR,Orbegozo-Medina RA,Pazos A

    update_date:2014-03-24 00:00:00

  • Radial clustergrams: visualizing the aggregate properties of hierarchical clusters.

    abstract::A new radial space-filling method for visualizing cluster hierarchies is presented. The method, referred to as a radial clustergram, arranges the clusters into a series of layers, each representing a different level of the tree. It uses adjacency of nodes instead of links to represent parent-child relationships and al...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci600427x

    authors: Agrafiotis DK,Bandyopadhyay D,Farnum M

    update_date:2007-01-01 00:00:00

  • Simulation of 2D NMR Spectra of Carbohydrates Using GODESS Software.

    abstract::Glycan Optimized Dual Empirical Spectrum Simulation (GODESS) is a web service, which has been recently shown to be one of the most accurate tools for simulation of ¹H and ¹³C 1D NMR spectra of natural carbohydrates and their derivatives. The new version of GODESS supports visualization of the simulated ¹H and (1...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.6b00083

    authors: Kapaev RR,Toukach PV

    update_date:2016-06-27 00:00:00

  • Effect of data standardization on chemical clustering and similarity searching.

    abstract::Standardization is used to ensure that the variables in a similarity calculation make an equal contribution to the computed similarity value. This paper compares the use of seven different methods that have been suggested previously for the standardization of integer-valued or real-valued data, comparing the results w...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci800224h

    authors: Chu CW,Holliday JD,Willett P

    update_date:2009-02-01 00:00:00

  • Ligand-based molecular modeling study on a chemically diverse series of cholecystokinin-B/gastrin receptor antagonists: generation of predictive model.

    abstract::Pharmacophore hypotheses were developed for six structurally diverse series of cholecystokinin-B/gastrin receptor (CCK-BR) antagonists. A training set consisting of 33 compounds was carefully selected. The activity spread of the training set molecules was from 0.1 to 2100 nM. The most predictive pharmacophore model (h...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci050257m

    authors: Chopra M,Mishra AK

    update_date:2005-11-01 00:00:00

  • Efficient Strategy for the Calculation of Solvation Free Energies in Water and Chloroform at the Quantum Mechanical/Molecular Mechanical Level.

    abstract::The partitioning of solute molecules between immiscible solvents with significantly different polarities is of great importance. The polarization between the solute and solvent molecules plays an essential role in determining the solubility of the solute, which makes computational studies utilizing molecular mechanics...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00001

    authors: Wang M,Li P,Jia X,Liu W,Shao Y,Hu W,Zheng J,Brooks BR,Mei Y

    update_date:2017-10-23 00:00:00