Abstract:
The number of journal articles in the scientific domain has grown to the point where it has become impossible for researchers to capitalize on all findings in their relevant discipline. Information is stored in these articles in a number of ways, including figures that describe important results. In organic chemistry, these figures often present chemical schematic diagrams that graphically define the structures of carbon-based compounds. These diagrams are intuitive for an expert to comprehend, but they are not designed for machines. This work presents ChemSchematicResolver, a software tool that can be used to identify chemical schematic diagrams within the figure of a document, resolve any R-group substituents within them, and convert the resulting diagrams to a machine-readable format in a high-throughput, autonomous fashion. The tool includes a new algorithm that is used to identify relevant diagrams and a mechanism that combines these data with contextual information from the rest of the document for the creation of highly relational databases. It includes support for a variety of general R-group structures, the first time such support has been available in any open-source chemical schematic diagram extraction tool. It is presented alongside a self-generated evaluation set, on which the most important assessment metric, precision, achieved 83-100% for all assessed areas. The ChemSchematicResolver tool is released under the MIT license and is available to download from www.chemschematicresolver.org.
journal_name: J Chem Inf Model
journal_title: Journal of chemical information and modeling
authors: Beard EJ, Cole JM
doi: 10.1021/acs.jcim.0c00042
subject: Has Abstract
pub_date: 2020-04-27 00:00:00
pages: 2059-2072
issue: 4
eissn: 1549-9596
issn: 1549-960X
journal_volume: 60
pub_type: Journal Article
abstract::Compounds with high-confidence target annotations and activity measurements in the original and current release of the ChEMBL database have been compared to better understand how the growth of compound activity data might influence the spectrum of ligand-target interactions and the degree of target promiscuity among a...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci3003304
update_date:2012-10-22 00:00:00
abstract::Reactive oxygen species such as superoxide are potentially harmful byproducts of the aerobic metabolism in the inner mitochondrial membrane, and complexes I, II, III of the electron transport chain have been identified as primary sources. The mitochondrial fatty acid β-oxidation pathway may also play a yet uncharacter...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.9b00702
update_date:2019-11-25 00:00:00
abstract::It is of great interest in modern drug design to accurately calculate the free energies of protein-ligand or nucleic acid-ligand binding. MM-PBSA (molecular mechanics Poisson-Boltzmann surface area) and MM-GBSA (molecular mechanics generalized Born surface area) have gained popularity in this field. For both methods, ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci300064d
update_date:2012-05-25 00:00:00
abstract::Human telomeric DNA G-quadruplex has been identified as a good therapeutic target in cancer treatment. G-quadruplex-specific ligands that stabilize the G-quadruplex have great potential to be developed as anticancer agents. Two crystal structures (an apo form of parallel stranded human telomeric G-quadruplex and its h...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.7b00287
update_date:2017-11-27 00:00:00
abstract::A giant technological leap in the field of cryo-electron microscopy (cryo-EM) has assured the achievement of near-atomic resolution structures of biological macromolecules. As a recognition of this accomplishment, the Nobel Prize in Chemistry was awarded in 2017 to Jacques Dubochet, Joachim Frank, and Richard Henderso...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.9b01015
update_date:2020-05-26 00:00:00
abstract::Cytochrome P450 2D6 (CYP2D6) is used to develop an approach for predicting affinity and relevant binding conformation(s) for highly flexible binding sites. The approach combines the use of docking scores and compound properties as attributes in building a neural network (NN) model. It begins by identifying segments of...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci600267k
update_date:2006-11-01 00:00:00
abstract::The [H2X2]+ (X = Cl, Br) formula could refer to two possible stable structures, namely, the hydrogen-bonded complex and the three-electron-bonded one. Contrary to the results published by other authors, we claim that for the F-type structures the hydrogen-bonded form is the only possible one and the [HFFH]+ complex...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci600355g
update_date:2007-05-01 00:00:00
abstract::The worldwide increase and proliferation of drug resistant microbes, coupled with the lag in new drug development, represents a major threat to human health. In order to reduce the time and cost for exploring the chemical search space, drug discovery increasingly relies on computational biology approaches. One key ste...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.0c00075
update_date:2020-06-22 00:00:00
abstract::Central to Hsp90's biological function is its ability to interconvert between various conformational states. Drug targeting of Hsp90's regulatory mechanisms, including its modulation by cochaperone association, presents as an attractive therapeutic strategy for Hsp90 associated pathologies. In this study, we utilized ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.7b00630
update_date:2018-02-26 00:00:00
abstract::The roles of chemical compounds in biological systems are now systematically analyzed by high-throughput experimental technologies. To automate the processing and interpretation of large-scale data it is necessary to develop bioinformatics methods to extract information from the chemical structures of these small mole...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci700006f
update_date:2007-07-01 00:00:00
abstract::A compound's synthetic accessibility (SA) is an important aspect of drug design, since in some cases computer-designed compounds cannot be synthesized. There have been several reports on SA prediction, most of which have focused on the difficulties of synthetic reactions based on retro-synthesis analyses, reaction dat...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci500568d
update_date:2014-12-22 00:00:00
abstract::A novel pharmacophore descriptor Flexophore is presented, which considers molecular flexibility when comparing descriptor similarities. The descriptor is a complete reduced graph of the underlying molecule. Its nodes are represented by enhanced MM2 atom types, while the edge descriptions encode the molecular flexibili...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci700359j
update_date:2008-04-01 00:00:00
abstract::The applicability and scope of 3D QSAR methods (CoMFA, CoMSIA) to screen databases are examined. A protocol requiring minimal user intervention has been established to align training and test set molecules using FlexS. As model system isozymes of human carbonic anhydrase (hCA) are used, all results are exemplified stu...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci7002945
update_date:2008-02-01 00:00:00
abstract::We report the synthesis and a study of the structure-activity relationships of a new series of diarylhydrazides as potential selective non-ligand binding pocket androgen receptor antagonists. Their biological activity as antiandrogens in the context of the development of treatments for castration resistant prostate ca...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci400189m
update_date:2013-08-26 00:00:00
abstract::Dynamical properties of proteins play an essential role in their function exertion. The elastic network model (ENM) is an effective and efficient tool in characterizing the intrinsic dynamical properties encoded in biomacromolecule structures. The Gaussian network model (GNM) and anisotropic network model (ANM) are th...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.0c01178
update_date:2021-01-26 00:00:00
abstract::Protein-ligand binding is essential to almost all life processes. The understanding of protein-ligand interactions is fundamentally important to rational drug and protein design. Based on large scale data sets, we show that protein rigidity strengthening or flexibility reduction is a mechanism in protein-ligand bindin...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.7b00226
update_date:2017-07-24 00:00:00
abstract::We have applied the two most commonly used methods for automatic matched pair identification, obtained the optimum settings, and discovered that the two methods are synergistic. A turbocharging approach to matched pair analysis is advocated in which a first round (a conservative categorical approach that uses an analo...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.7b00335
update_date:2017-10-23 00:00:00
abstract::The similarity/diversity measures play a fundamental role in library searching, virtual screening, and quantitative structure-activity relationship/quantitative structure-property relationship modeling as well as in genomics and proteomics. In this paper, a new similarity/diversity measure is proposed as a new approac...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci060099e
update_date:2006-09-01 00:00:00
abstract::Binding hot spots are regions of proteins that, due to their potentially high contribution to the binding free energy, have high propensity to bind small molecules. We present benchmark sets for testing computational methods for the identification of binding hot spots with emphasis on fragment-based ligand discovery. ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.0c00877
update_date:2020-12-28 00:00:00
abstract::Engineering shape-controlled bionanomaterials requires comprehensive understanding of interactions between biomolecules and inorganic surfaces. We explore the origin of facet-selective binding of peptides adsorbed onto Pt(100) and Pt(111) crystallographic planes. Using molecular dynamics simulations, we show that upon...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci400630d
update_date:2013-12-23 00:00:00
abstract::It is demonstrated that the fragmentation of druglike molecules by applying simplistic pseudo-retrosynthesis results in a stock of chemically meaningful building blocks for de novo molecule generation. A stochastic search algorithm in conjunction with ligand-based similarity scoring (Flux: fragment-based ligand builde...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci0503560
update_date:2006-03-01 00:00:00
abstract::The binding affinity and relative maximal efficacy of human A3 adenosine receptor (AR) agonists were each subjected to ligand-based three-dimensional quantitative structure-activity relationship analysis. Comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) used a...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci600501z
update_date:2007-05-01 00:00:00
abstract::This paper is an overview of the most significant and impactful interpretation approaches of quantitative structure-activity relationship (QSAR) models, their development, and application. The evolution of the interpretation paradigm from "model → descriptors → (structure)" to "model → structure" is indicated. The lat...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article, Review
doi:10.1021/acs.jcim.7b00274
update_date:2017-11-27 00:00:00
abstract::The programs Phase and Catalyst HypoGen are compared for their performance in determining three-dimensional quantitative structure-activity relationships. Eight sets of compounds with measured activity were collected from the public literature and partitioned into suitable training and test sets by an automated proced...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci7000082
update_date:2007-05-01 00:00:00
abstract::This work is aimed at describing the workflow for a methodology that combines chemoinformatics and pharmacoepidemiology methods and at reporting the first predictive model developed with this methodology. The new model is able to predict complex networks of AIDS prevalence in the US counties, taking into consideration...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci400716y
update_date:2014-03-24 00:00:00
abstract::A new radial space-filling method for visualizing cluster hierarchies is presented. The method, referred to as a radial clustergram, arranges the clusters into a series of layers, each representing a different level of the tree. It uses adjacency of nodes instead of links to represent parent-child relationships and al...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci600427x
update_date:2007-01-01 00:00:00
abstract::Glycan Optimized Dual Empirical Spectrum Simulation (GODESS) is a web service, which has been recently shown to be one of the most accurate tools for simulation of ¹H and ¹³C 1D NMR spectra of natural carbohydrates and their derivatives. The new version of GODESS supports visualization of the simulated ¹H and (1...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.6b00083
update_date:2016-06-27 00:00:00
abstract::Standardization is used to ensure that the variables in a similarity calculation make an equal contribution to the computed similarity value. This paper compares the use of seven different methods that have been suggested previously for the standardization of integer-valued or real-valued data, comparing the results w...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci800224h
update_date:2009-02-01 00:00:00
abstract::Pharmacophore hypotheses were developed for six structurally diverse series of cholecystokinin-B/gastrin receptor (CCK-BR) antagonists. A training set consisting of 33 compounds was carefully selected. The activity spread of the training set molecules was from 0.1 to 2100 nM. The most predictive pharmacophore model (h...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci050257m
update_date:2005-11-01 00:00:00
abstract::The partitioning of solute molecules between immiscible solvents with significantly different polarities is of great importance. The polarization between the solute and solvent molecules plays an essential role in determining the solubility of the solute, which makes computational studies utilizing molecular mechanics...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.7b00001
update_date:2017-10-23 00:00:00