Machine Learning Enhanced Spectrum Recognition Based on Computer Vision (SRCV) for Intelligent NMR Data Extraction.

Abstract:

A machine learning enhanced spectrum recognition system called spectrum recognition based on computer vision (SRCV) has been developed for data extraction from previously analyzed 13C and 1H NMR spectra. The intelligent system was designed with four function modules to extract data from three areas of NMR images, including 13C and 1H chemical shifts, the integral, and the range of the shift values. During this study, three machine learning models were pretrained for number recognition, which is the key procedure for NMR data extraction. The k nearest neighbor (kNN) method was selected with optimized k (k = 4), which displayed a 100% recognition rate. Subsequently, the performance of SRCV was tested and validated to have high accuracy with a short processing time (11-21 s) for each NMR spectral image. Our spectrum recognizer enables high-throughput 13C and 1H NMR data extraction from abundant spectra in the literature and has the potential to be used for spectral database construction. In addition, the system could be developed further to import data into computer-assisted structure elucidation systems, which would automate this procedure significantly. SRCV can be accessed on GitHub (https://github.com/WJmodels/SRCV).
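The kNN number recognition step mentioned in the abstract can be illustrated with a minimal sketch. This is not the SRCV implementation: the function name `knn_predict`, the 3x3 toy bitmaps, and the squared Euclidean distance are assumptions made for illustration; only the choice k = 4 comes from the abstract.

```python
from collections import Counter

def knn_predict(train, query, k=4):
    """Classify `query` by majority vote among its k nearest training samples.

    `train` is a list of (feature_vector, label) pairs. k = 4 follows the
    value reported in the abstract; everything else here is hypothetical.
    """
    def dist2(a, b):
        # Squared Euclidean distance between two equal-length vectors.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Keep the k training samples closest to the query.
    neighbors = sorted(train, key=lambda item: dist2(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Toy 3x3 glyphs flattened row by row (hypothetical data, not from SRCV).
TRAIN = [
    ([0, 1, 0, 0, 1, 0, 0, 1, 0], "1"),
    ([0, 1, 0, 0, 1, 0, 0, 1, 1], "1"),
    ([1, 1, 0, 0, 1, 0, 0, 1, 0], "1"),
    ([1, 1, 1, 1, 0, 1, 1, 1, 1], "0"),
    ([1, 1, 1, 1, 0, 1, 1, 1, 0], "0"),
    ([0, 1, 1, 1, 0, 1, 1, 1, 1], "0"),
]

noisy_one = [0, 1, 0, 0, 1, 0, 1, 1, 0]   # a "1" with one flipped pixel
noisy_zero = [1, 1, 1, 1, 0, 1, 0, 1, 1]  # a "0" with one flipped pixel

print(knn_predict(TRAIN, noisy_one))
print(knn_predict(TRAIN, noisy_zero))
```

In a real pipeline the feature vectors would come from digit crops segmented out of the spectrum image, and a library classifier would likely replace the hand-written vote.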

journal_name

J Chem Inf Model

authors

Jia W,Yang Z,Yang M,Cheng L,Lei Z,Wang X

doi

10.1021/acs.jcim.0c01046

subject

Has Abstract

pub_date

2021-01-25 00:00:00

pages

21-25

issue

1

eissn

1549-960X

issn

1549-9596

journal_volume

61

pub_type

Journal Article
  • Role of halogen bonds in thyroid hormone receptor selectivity: pharmacophore-based 3D-QSSR studies.

    abstract::Most physiological effects of thyroid hormones are mediated by the two thyroid hormone receptor subtypes, TRalpha and TRbeta. Several pharmacological effects mediated by TRbeta might be beneficial in important medical conditions such as obesity, hypercholesterolemia and diabetes, and selective TRbeta activation may el...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci900316e

    authors: Valadares NF,Salum LB,Polikarpov I,Andricopulo AD,Garratt RC

    update_date:2009-11-01 00:00:00

  • SMIfp (SMILES fingerprint) chemical space for virtual screening and visualization of large databases of organic molecules.

    abstract::SMIfp (SMILES fingerprint) is defined here as a scalar fingerprint describing organic molecules by counting the occurrences of 34 different symbols in their SMILES strings, which creates a 34-dimensional chemical space. Ligand-based virtual screening using the city-block distance CBD(SMIfp) as similarity measure provi...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci400206h

    authors: Schwartz J,Awale M,Reymond JL

    update_date:2013-08-26 00:00:00

  • Molecular Dynamics Simulations of Membrane-Bound STIM1 to Investigate Conformational Changes during STIM1 Activation upon Calcium Release.

    abstract::Calcium is involved in important intracellular processes, such as intracellular signaling from cell membrane receptors to the nucleus. Typically, calcium levels are kept at less than 100 nM in the nucleus and cytosol, but some calcium is stored in the endoplasmic reticulum (ER) lumen for rapid release to activate intr...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.6b00475

    authors: Mukherjee S,Karolak A,Debant M,Buscaglia P,Renaudineau Y,Mignen O,Guida WC,Brooks WH

    update_date:2017-02-27 00:00:00

  • Gas-phase and solution conformations of selected dimeric structural units of heparin.

    abstract::The molecular structure of four dimeric units (D-E, E-F, F-G, and G-H) of the DEFGH structural unit of heparin, their anionic forms, and their sodium salts have been studied using the B3LYP/6-31+G(d) method. The optimized geometries indicate that the most stable structure of these dimeric units in neutral state is sta...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci060060+

    authors: Remko M,von der Lieth CW

    update_date:2006-07-01 00:00:00

  • The ensemble performance index: an improved measure for assessing ensemble pose prediction performance.

    abstract::We present a theoretical study on the performance of ensemble docking methodologies considering multiple protein structures. We perform a theoretical analysis of pose prediction experiments which is completely unbiased, as we make no assumptions about specific scoring functions, search paradigms, protein structures, o...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci2002796

    authors: Korb O,McCabe P,Cole J

    update_date:2011-11-28 00:00:00

  • Protein Solvent Shell Structure Provides Rapid Analysis of Hydration Dynamics.

    abstract::The solvation layer surrounding a protein is clearly an intrinsic part of protein structure-dynamics-function, and our understanding of how the hydration dynamics influences protein function is emerging. We have recently reported simulations indicating a correlation between regional hydration dynamics and the structur...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00009

    authors: Dahanayake JN,Shahryari E,Roberts KM,Heikes ME,Kasireddy C,Mitchell-Koch KR

    update_date:2019-05-28 00:00:00

  • Viscosity Prediction of Lubricants by a General Feed-Forward Neural Network.

    abstract::Modern industrial lubricants are often blended with an assortment of chemical additives to improve the performance of the base stock. Machine learning-based predictive models allow fast and veracious derivation of material properties and facilitate novel and innovative material designs. In this study, we outline the d...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b01068

    authors: Loh GC,Lee HC,Tee XY,Chow PS,Zheng JW

    update_date:2020-03-23 00:00:00

  • Reliable and Performant Identification of Low-Energy Conformers in the Gas Phase and Water.

    abstract::Prediction of compound properties from structure via quantitative structure-activity relationship and machine-learning approaches is an important computational chemistry task in small-molecule drug research. Though many such properties are dependent on three-dimensional structures or even conformer ensembles, the majo...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.8b00151

    authors: Cavasin AT,Hillisch A,Uellendahl F,Schneckener S,Göller AH

    update_date:2018-05-29 00:00:00

  • Identification of Enzyme Genes Using Chemical Structure Alignments of Substrate-Product Pairs.

    abstract::Although there are several databases that contain data on many metabolites and reactions in biochemical pathways, there is still a big gap in the numbers between experimentally identified enzymes and metabolites. It is supposed that many catalytic enzyme genes are still unknown. Although there are previous studies tha...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.5b00216

    authors: Moriya Y,Yamada T,Okuda S,Nakagawa Z,Kotera M,Tokimatsu T,Kanehisa M,Goto S

    update_date:2016-03-28 00:00:00

  • Pharmacophore-based virtual screening and experimental validation of novel inhibitors against cyanobacterial fructose-1,6-/sedoheptulose-1,7-bisphosphatase.

    abstract::Cyanobacterial fructose-1,6-/sedoheptulose-1,7-bisphoshatase (cy-FBP/SBPase) is a potential enzymatic target for screening of novel inhibitors that can combat harmful algal blooms. In the present study, we targeted the substrate binding pocket of cy-FBP/SBPase. A series of novel hit compounds from the SPECs database w...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci4007529

    authors: Sun Y,Zhang R,Li D,Feng L,Wu D,Feng L,Huang P,Ren Y,Feng J,Xiao S,Wan J

    update_date:2014-03-24 00:00:00

  • Open Source Bayesian Models. 1. Application to ADME/Tox and Drug Discovery Datasets.

    abstract::On the order of hundreds of absorption, distribution, metabolism, excretion, and toxicity (ADME/Tox) models have been described in the literature in the past decade which are more often than not inaccessible to anyone but their authors. Public accessibility is also an issue with computational models for bioactivity, a...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.5b00143

    authors: Clark AM,Dole K,Coulon-Spektor A,McNutt A,Grass G,Freundlich JS,Reynolds RC,Ekins S

    update_date:2015-06-22 00:00:00

  • In silico drug screening approach for the design of magic bullets: a successful example with anti-HIV fullerene derivatized amino acids.

    abstract::A database has been derived from recently reported [60]fullerene derivatives, and their binding scores with HIV-1 PR have been computed using docking techniques. Computational methods have been used to predict which derivatives may have high binding affinities, and for these compounds biological tests have been perfor...

    journal_title:Journal of chemical information and modeling

    pub_type: Letter

    doi:10.1021/ci900047s

    authors: Durdagi S,Supuran CT,Strom TA,Doostdar N,Kumar MK,Barron AR,Mavromoustakos T,Papadopoulos MG

    update_date:2009-05-01 00:00:00

  • Computational fragment-based approach at PDB scale by protein local similarity.

    abstract::The large volume of protein-ligand structures now available enables innovative and efficient protocols in computational FBDD (Fragment-Based Drug Design) to be proposed based on experimental data. In this work, we build a database of MED-Portions, where a MED-Portion is a new structural object encoding protein-fragmen...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci8003094

    authors: Moriaud F,Doppelt-Azeroual O,Martin L,Oguievetskaia K,Koch K,Vorotyntsev A,Adcock SA,Delfaud F

    update_date:2009-02-01 00:00:00

  • In silico analysis of the thermodynamic stability changes of psychrophilic and mesophilic alpha-amylases upon exhaustive single-site mutations.

    abstract::Identifying sequence modifications that distinguish psychrophilic from mesophilic proteins is important for designing enzymes with different thermodynamic stabilities and to understand the underlying mechanisms. The PoPMuSiC algorithm is used to introduce, in silico, all the single-site mutations in four mesophilic an...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci050473v

    authors: Gilis D

    update_date:2006-05-01 00:00:00

  • Efficient Strategy for the Calculation of Solvation Free Energies in Water and Chloroform at the Quantum Mechanical/Molecular Mechanical Level.

    abstract::The partitioning of solute molecules between immiscible solvents with significantly different polarities is of great importance. The polarization between the solute and solvent molecules plays an essential role in determining the solubility of the solute, which makes computational studies utilizing molecular mechanics...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00001

    authors: Wang M,Li P,Jia X,Liu W,Shao Y,Hu W,Zheng J,Brooks BR,Mei Y

    update_date:2017-10-23 00:00:00

  • Computational Insight Into the Mechanism of SARS-CoV-2 Membrane Fusion.

    abstract::Membrane fusion, a key step in the early stages of virus propagation, allows the release of the viral genome in the host cell cytoplasm. The process is initiated by fusion peptides that are small, hydrophobic components of viral membrane-embedded glycoproteins and are typically conserved within virus families. Here, w...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c01231

    authors: Borkotoky S,Dey D,Banerjee M

    update_date:2021-01-25 00:00:00

  • Enrichment analysis for discovering biological associations in phenotypic screens.

    abstract::A phenotypic screen (PS) is used to identify compounds causing a desired phenotype in a complex biological system where mechanisms and targets are largely unknown. Deconvoluting the mechanism of action of actives and identification of relevant targets and pathways remains a formidable challenge. Current methods fail t...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci400245c

    authors: Polyakov VR,Moorcroft ND,Drawid A

    update_date:2014-02-24 00:00:00

  • Study of chromatographic retention of natural terpenoids by chemoinformatic tools.

    abstract::The study of chromatographic retention of natural products can be used to increase their identification speed in complex biological matrices. In this work, six variables were used to study the retention behavior in reversed phase liquid chromatography of 39 sesquiterpene lactones (SL) from an in-house database using c...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci500581q

    authors: Oliveira TB,Gobbo-Neto L,Schmidt TJ,Da Costa FB

    update_date:2015-01-26 00:00:00

  • GalaxyGPCRloop: Template-Based and Ab Initio Structure Sampling of the Extracellular Loops of G-Protein-Coupled Receptors.

    abstract::The second extracellular loops (ECL2s) of G-protein-coupled receptors (GPCRs) are often involved in GPCR functions, and their structures have important implications in drug discovery. However, structure prediction of ECL2 is difficult because of its long length and the structural diversity among different GPCRs. In th...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.8b00148

    authors: Won J,Lee GR,Park H,Seok C

    update_date:2018-06-25 00:00:00

  • Coordination of Na(+) by monoamine ligands in dopamine, norepinephrine, and serotonin transporters.

    abstract::The reuptake of neurotransmitters by dopamine, norepinephrine, and serotonin transporters during neuronal transmission requires a sodium gradient. An "ionic mode" of binding proposes that aspartate anchors the ligand's positive charge but ignores the direct role of sodium in ligand binding seen in the only representat...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci700255d

    authors: Xhaard H,Backström V,Denessiouk K,Johnson MS

    update_date:2008-07-01 00:00:00

  • Pharmacophore identification, in silico screening, and virtual library design for inhibitors of the human factor Xa.

    abstract::Factor Xa inhibitors are innovative anticoagulant agents that provide a better safety/efficacy profile compared to other anticoagulative drugs. A chemical feature-based modeling approach was applied to identify crucial pharmacophore patterns from 3D crystal structures of inhibitors bound to human factor Xa (Pdb entrie...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci049778k

    authors: Krovat EM,Frühwirth KH,Langer T

    update_date:2005-01-01 00:00:00

  • Searching for recursively defined generic chemical patterns in nonenumerated fragment spaces.

    abstract::Retrieving molecules with specific structural features is a fundamental requirement of today's molecular database technologies. Estimates claim the chemical space relevant for drug discovery to be around 10⁶⁰ molecules. This figure is many orders of magnitude larger than the amount of molecules conventional databases ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci400107k

    authors: Ehrlich HC,Henzler AM,Rarey M

    update_date:2013-07-22 00:00:00

  • Advantages of Relative versus Absolute Data for the Development of Quantitative Structure-Activity Relationship Classification Models.

    abstract::The appropriate selection of a chemical space represented by the data set, the selection of its chemical data representation, the development of a correct modeling process using a robust and reproducible algorithm, and the performance of an exhaustive training and external validation determine the usability and reprod...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00492

    authors: Ruiz IL,Gómez-Nieto MÁ

    update_date:2017-11-27 00:00:00

  • Protein kinases: docking and homology modeling reliability.

    abstract::A database of about 700 high-resolution kinase structures was used to test the reliability of 17 docking procedures (using six docking software packages) by means of self- and cross-docking studies. The analysis of about 80 000 docking calculations suggests that the docking of an unknown ligand into a kinase has a pro...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci100161z

    authors: Tuccinardi T,Botta M,Giordano A,Martinelli A

    update_date:2010-08-23 00:00:00

  • Cross-docking of inhibitors into CDK2 structures. 2.

    abstract::In the preceding paper (Duca, J. S.; Madison, V. S.; Voigt, J. H. J. Chem. Inf. Model. 2008, 48, 659-668), the accuracy of docking and affinity predictions of the Gold and Glide programs were investigated using single protein conformations spanning 150 CDK2/inhibitor crystallographic complexes. High docking accuracy w...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci700428d

    authors: Voigt JH,Elkin C,Madison VS,Duca JS

    update_date:2008-03-01 00:00:00

  • HLA-DM Stabilizes the Empty MHCII Binding Groove: A Model Using Customized Natural Move Monte Carlo.

    abstract::MHC class II molecules bind peptides derived from extracellular proteins that have been ingested by antigen-presenting cells and display them to the immune system. Peptide loading occurs within the antigen-presenting cell and is facilitated by HLA-DM. HLA-DM stabilizes the open conformation of the MHCII binding groove...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00104

    authors: Demharter S,Knapp B,Deane C,Minary P

    update_date:2019-06-24 00:00:00

  • Benchmarking of Semiempirical Quantum-Mechanical Methods on Systems Relevant to Computer-Aided Drug Design.

    abstract::The semiempirical quantum mechanical (SQM) methods used in drug design are commonly parametrized and tested on data sets of systems that may not be representative models for drug-biomolecule interactions in terms of both size and chemical composition. This is addressed here with a new benchmark data set, PLF547, deriv...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b01171

    authors: Kříž K,Řezáč J

    update_date:2020-03-23 00:00:00

  • Structure-based design and screen of novel inhibitors for class II 3-hydroxy-3-methylglutaryl coenzyme A reductase from Streptococcus pneumoniae.

    abstract::3-Hydroxy-3-methylglutaryl coenzyme A reductase (HMGR) is a primary target in the current clinical treatment of hypercholesterolemia with specific inhibitors of "statin" family. Statins are excellent inhibitors of the class I (human) enzyme but relatively poor inhibitors of the class II enzyme, which are well-known as...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci300163v

    authors: Li D,Gui J,Li Y,Feng L,Han X,Sun Y,Sun T,Chen Z,Cao Y,Zhang Y,Zhou L,Hu X,Ren Y,Wan J

    update_date:2012-07-23 00:00:00

  • BiKi Life Sciences: A New Suite for Molecular Dynamics and Related Methods in Drug Discovery.

    abstract::In this paper, we introduce the BiKi Life Sciences suite. This software makes it easy for computational medicinal chemists to run ad hoc molecular dynamics protocols in a novel and task-oriented environment; as a notebook, BiKi (acronym of Binding Kinetics) keeps memory of any activity together with dependencies among...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00680

    authors: Decherchi S,Bottegoni G,Spitaleri A,Rocchia W,Cavalli A

    update_date:2018-02-26 00:00:00

  • Systematics of high-genus fullerenes.

    abstract::In this article, we present a systematic way to classify a family of high-genus fullerenes (HGFs) by decomposing them into two types of necklike structures, which are the negatively curved parts of parent toroidal carbon nanotubes. By replacing the faces of a uniform polyhedron with these necks, an HGF polyhedron corr...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci9001124

    authors: Chuang C,Jin BY

    update_date:2009-07-01 00:00:00