Concept-based semi-automatic classification of drugs.

Abstract:

The anatomical therapeutic chemical (ATC) classification system maintained by the World Health Organization provides a global standard for the classification of medical substances and serves as a source for drug repurposing research. Nevertheless, it lacks several drugs that are major players in the global drug market. To establish classifications for as yet unclassified drugs, this paper presents a newly developed approach based on a combination of information extraction (IE) and machine learning (ML) techniques. Most information about drugs is published in scientific articles. Therefore, an IE-based framework is employed to extract terms from free text that express a drug's chemical, pharmacological, therapeutic, and systemic effects. The extracted terms are used as features within an ML framework to predict putative ATC class labels for unclassified drugs. The system was tested on a portion of the ATC containing drugs indicated for the cardiovascular system. The class prediction was successful, with a best predictive accuracy of 89.47% validated by 100-fold bootstrapping of the training set and an accuracy of 77.12% on an independent test set. The presented concept-based classification system outperformed state-of-the-art classification methods based on chemical structure properties.

journal_name

J Chem Inf Model

authors

Gurulingappa H,Kolárik C,Hofmann-Apitius M,Fluck J

doi

10.1021/ci9000844

subject

Has Abstract

pub_date

2009-08-01 00:00:00

pages

1986-92

issue

8

eissn

1549-960X

issn

1549-9596

journal_volume

49

pub_type

Journal Article
  • Customizable Generation of Synthetically Accessible, Local Chemical Subspaces.

    abstract::Screening large libraries of chemicals has been an efficient strategy to discover bioactive compounds; however a portion of the potential for success is limited to the available libraries. Synergizing combinatorial and computational chemistries has emerged as a time-efficient strategy to explore the chemical space mor...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.6b00648

    authors: Pottel J,Moitessier N

    update_date:2017-03-27 00:00:00

  • Systematic analysis of enzyme-catalyzed reaction patterns and prediction of microbial biodegradation pathways.

    abstract::The roles of chemical compounds in biological systems are now systematically analyzed by high-throughput experimental technologies. To automate the processing and interpretation of large-scale data it is necessary to develop bioinformatics methods to extract information from the chemical structures of these small mole...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci700006f

    authors: Oh M,Yamada T,Hattori M,Goto S,Kanehisa M

    update_date:2007-07-01 00:00:00

  • Advantages of Relative versus Absolute Data for the Development of Quantitative Structure-Activity Relationship Classification Models.

    abstract::The appropriate selection of a chemical space represented by the data set, the selection of its chemical data representation, the development of a correct modeling process using a robust and reproducible algorithm, and the performance of an exhaustive training and external validation determine the usability and reprod...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00492

    authors: Ruiz IL,Gómez-Nieto MÁ

    update_date:2017-11-27 00:00:00

  • Support vector regression scoring of receptor-ligand complexes for rank-ordering and virtual screening of chemical libraries.

    abstract::The community structure-activity resource (CSAR) data sets are used to develop and test a support vector machine-based scoring function in regression mode (SVR). Two scoring functions (SVR-KB and SVR-EP) are derived with the objective of reproducing the trend of the experimental binding affinities provided within the ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci200078f

    authors: Li L,Wang B,Meroueh SO

    update_date:2011-09-26 00:00:00

  • Hidden active information in a random compound library: extraction using a pseudo-structure-activity relationship model.

    abstract::We propose a hypothesis that "a model of active compound can be provided by integrating information of compounds high-ranked by docking simulation of a random compound library". In our hypothesis, the inclusion of true active compounds in the high-ranked compound is not necessary. We regard the high-ranked compounds a...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci7003384

    authors: Fukunishi H,Teramoto R,Shimada J

    update_date:2008-03-01 00:00:00

  • Improving protocols for protein mapping through proper comparison to crystallography data.

    abstract::Computational approaches to fragment-based drug design (FBDD) can complement experiments and facilitate the identification of potential hot spots along the protein surface. However, the evaluation of computational methods for mapping binding sites frequently focuses upon the ability to reproduce crystallographic coord...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci300430v

    authors: Lexa KW,Carlson HA

    update_date:2013-02-25 00:00:00

  • Free energy calculations give insight into the stereoselective hydroxylation of α-ionones by engineered cytochrome P450 BM3 mutants.

    abstract::Previously, stereoselective hydroxylation of α-ionone by Cytochrome P450 BM3 mutants M01 A82W and M11 L437N was observed. While both mutants hydroxylate α-ionone in a regioselective manner at the C3 position, M01 A82W catalyzes formation of trans-3-OH-α-ionone products whereas M11 L437N exhibits opposite stereoselecti...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci300243n

    authors: de Beer SB,Venkataraman H,Geerke DP,Oostenbrink C,Vermeulen NP

    update_date:2012-08-27 00:00:00

  • First Multitarget Chemo-Bioinformatic Model To Enable the Discovery of Antibacterial Peptides against Multiple Gram-Positive Pathogens.

    abstract::Antimicrobial peptides (AMPs) have emerged as promising therapeutic alternatives to fight against the diverse infections caused by different pathogenic microorganisms. In this context, theoretical approaches in bioinformatics have paved the way toward the creation of several in silico models capable of predicting anti...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.5b00630

    authors: Speck-Planche A,Kleandrova VV,Ruso JM,Cordeiro MN

    update_date:2016-03-28 00:00:00

  • Chemoinformatics-based classification of prohibited substances employed for doping in sport.

    abstract::Representative molecules from 10 classes of prohibited substances were taken from the World Anti-Doping Agency (WADA) list, augmented by molecules from corresponding activity classes found in the MDDR database. Together with some explicitly allowed compounds, these formed a set of 5245 molecules. Five types of fingerp...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci0601160

    authors: Cannon EO,Bender A,Palmer DS,Mitchell JB

    update_date:2006-11-01 00:00:00

  • SERAPhiC: a benchmark for in silico fragment-based drug design.

    abstract::Our main objective was to compile a data set of high-quality protein-fragment complexes and make it publicly available. Once assembled, the data set was challenged using docking procedures to address the following questions: (i) Can molecular docking correctly reproduce the experimentally solved structures? (ii) How t...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci2003363

    authors: Favia AD,Bottegoni G,Nobeli I,Bisignano P,Cavalli A

    update_date:2011-11-28 00:00:00

  • Structural and Sequence Similarity Makes a Significant Impact on Machine-Learning-Based Scoring Functions for Protein-Ligand Interactions.

    abstract::The prediction of protein-ligand binding affinity has recently been improved remarkably by machine-learning-based scoring functions. For example, using a set of simple descriptors representing the atomic distance counts, the RF-Score improves the Pearson correlation coefficient to about 0.8 on the core set of the PDBb...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00049

    authors: Li Y,Yang J

    update_date:2017-04-24 00:00:00

  • Consensus adaptation of fields for molecular comparison (AFMoC) models incorporate ligand and receptor conformational variability into tailor-made scoring functions.

    abstract::Taking into account dynamical behavior and/or structural inaccuracies of receptor-ligand systems becomes increasingly important in structure-based drug design. Here, we describe the development of consensus Adaptation of Fields for Molecular Comparison (AFMoC) (abbreviated as AFMoCcon) models that account for multiple...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci7002472

    authors: Breu B,Silber K,Gohlke H

    update_date:2007-11-01 00:00:00

  • Computational evidence for the role of Arabidopsis thaliana UVR8 as UV-B photoreceptor and identification of its chromophore amino acids.

    abstract::A homology model of the Arabidopsis thaliana UV resistance locus 8 (UVR8) protein is presented herein, showing a seven-bladed β-propeller conformation similar to the globular structure of RCC1. The UVR8 amino acid sequence contains a very high amount of conserved tryptophans, and the homology model shows that seven of...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci200017f

    authors: Wu M,Grahn E,Eriksson LA,Strid A

    update_date:2011-06-27 00:00:00

  • Impact of template choice on homology model efficiency in virtual screening.

    abstract::Homology modeling is a reliable method of predicting the three-dimensional structures of proteins that lack NMR or X-ray crystallographic data. It employs the assumption that a structural resemblance exists between closely related proteins. Despite the availability of many crystal structures of possible templates, onl...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci500001f

    authors: Rataj K,Witek J,Mordalski S,Kosciolek T,Bojarski AJ

    update_date:2014-06-23 00:00:00

  • Pathway analysis for drug repositioning based on public database mining.

    abstract::Sixteen FDA-approved drugs were investigated to elucidate their mechanisms of action (MOAs) and clinical functions by pathway analysis based on retrieved drug targets interacting with or affected by the investigated drugs. Protein and gene targets and associated pathways were obtained by data-mining of public database...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci4005354

    authors: Pan Y,Cheng T,Wang Y,Bryant SH

    update_date:2014-02-24 00:00:00

  • Chemical Topic Modeling: Exploring Molecular Data Sets Using a Common Text-Mining Approach.

    abstract::Big data is one of the key transformative factors which increasingly influences all aspects of modern life. Although this transformation brings vast opportunities it also generates novel challenges, not the least of which is organizing and searching this data deluge. The field of medicinal chemistry is not different: ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00249

    authors: Schneider N,Fechner N,Landrum GA,Stiefl N

    update_date:2017-08-28 00:00:00

  • Identification of Enzyme Genes Using Chemical Structure Alignments of Substrate-Product Pairs.

    abstract::Although there are several databases that contain data on many metabolites and reactions in biochemical pathways, there is still a big gap in the numbers between experimentally identified enzymes and metabolites. It is supposed that many catalytic enzyme genes are still unknown. Although there are previous studies tha...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.5b00216

    authors: Moriya Y,Yamada T,Okuda S,Nakagawa Z,Kotera M,Tokimatsu T,Kanehisa M,Goto S

    update_date:2016-03-28 00:00:00

  • In vitro drug sensitivity-gene expression correlations involve a tissue of origin dependency.

    abstract::A major concern of chemogenomics is to associate drug activity with biological variables. Several reports have clustered cell line drug activity profiles as well as drug activity-gene expression correlation profiles and noted that the resulting groupings differ but still reflect mechanism of action. The present paper ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci060073n

    authors: Andersson CR,Fryknäs M,Rickardson L,Larsson R,Isaksson A,Gustafsson MG

    update_date:2007-01-01 00:00:00

  • Enrichment factor analyses on G-protein coupled receptors with known crystal structure.

    abstract::G-protein coupled receptors (GPCRs) are highly relevant drug targets. Four GPCRs with known crystal structure were analyzed with docking (AutoDock4) and postdocking (MM-PBSA) in order to evaluate the ability to recognize known antagonists from a larger database of molecular decoys and to predict correct binding modes....

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci4000745

    authors: Anighoro A,Rastelli G

    update_date:2013-04-22 00:00:00

  • HackaMol: An Object-Oriented Modern Perl Library for Molecular Hacking on Multiple Scales.

    abstract::HackaMol is an open source, object-oriented toolkit written in Modern Perl that organizes atoms within molecules and provides chemically intuitive attributes and methods. The library consists of two components: HackaMol, the core that contains classes for storing and manipulating molecular information, and HackaMol::X...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci500359e

    authors: Riccardi D,Parks JM,Johs A,Smith JC

    update_date:2015-04-27 00:00:00

  • CoNTub v2.0--algorithms for constructing C3-symmetric models of three-nanotube junctions.

    abstract::Here, a method is described for easily building three-carbon nanotube junctions. It allows the geometry to be found and bond connectivity of C(3) symmetric nanotube junctions to be established. Such junctions may present a variable degree of pyramidalization and are composed of three identical carbon nanotubes with ar...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci200056p

    authors: Melchor S,Martin-Martinez FJ,Dobado JA

    update_date:2011-06-27 00:00:00

  • Evaluation of Generalized Born Models for Large Scale Affinity Prediction of Cyclodextrin Host-Guest Complexes.

    abstract::Binding affinity prediction with implicit solvent models remains a challenge in virtual screening for drug discovery. In order to assess the predictive power of implicit solvent models in docking techniques with Amber scoring, three generalized Born models (GBHCT, GBOBCI, and GBOBCII) available in Dock 6.7 were utiliz...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.6b00418

    authors: Zhang H,Yin C,Yan H,van der Spoel D

    update_date:2016-10-24 00:00:00

  • Molecular Dynamics Simulation of the Conformational Preferences of Pseudouridine Derivatives: Improving the Distribution in the Glycosidic Torsion Space.

    abstract::There are only four derivatives of pseudouridine (Ψ) that are known to occur naturally in RNA as post-transcriptional modifications. We have studied the conformational consequences of pseudouridylation and further modifications using replica exchange molecular dynamics simulations at the nucleoside level, and the simu...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c00369

    authors: Dutta N,Sarzynska J,Lahiri A

    update_date:2020-10-26 00:00:00

  • Descriptor Data Bank (DDB): A Cloud Platform for Multiperspective Modeling of Protein-Ligand Interactions.

    abstract::Protein-ligand (PL) interactions play a key role in many life processes such as molecular recognition, molecular binding, signal transmission, and cell metabolism. Examples of interaction forces include hydrogen bonding, hydrophobic effects, steric clashes, electrostatic contacts, and van der Waals attractions. Curren...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00310

    authors: Ashtawy HM,Mahapatra NR

    update_date:2018-01-22 00:00:00

  • GESSE: Predicting Drug Side Effects from Drug-Target Relationships.

    abstract::The in silico prediction of unwanted side effects (SEs) caused by the promiscuous behavior of drugs and their targets is highly relevant to the pharmaceutical industry. Considerable effort is now being put into computational and experimental screening of several suspected off-target proteins in the hope that SEs might...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.5b00120

    authors: Pérez-Nueno VI,Souchet M,Karaboga AS,Ritchie DW

    update_date:2015-09-28 00:00:00

  • Random Forest Refinement of Pairwise Potentials for Protein-Ligand Decoy Detection.

    abstract::An accurate scoring function is expected to correctly select the most stable structure from a set of pose candidates. One can hypothesize that a scoring function's ability to identify the most stable structure might be improved by emphasizing the most relevant atom pairwise interactions. However, it is hard to evaluat...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00356

    authors: Pei J,Zheng Z,Kim H,Song LF,Walworth S,Merz MR,Merz KM Jr

    update_date:2019-07-22 00:00:00

  • Benchmark Sets for Binding Hot Spot Identification in Fragment-Based Ligand Discovery.

    abstract::Binding hot spots are regions of proteins that, due to their potentially high contribution to the binding free energy, have high propensity to bind small molecules. We present benchmark sets for testing computational methods for the identification of binding hot spots with emphasis on fragment-based ligand discovery. ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c00877

    authors: Wakefield AE,Yueh C,Beglov D,Castilho MS,Kozakov D,Keserű GM,Whitty A,Vajda S

    update_date:2020-12-28 00:00:00

  • Improved Prediction of Drug-Target Interactions Using Self-Paced Learning with Collaborative Matrix Factorization.

    abstract::Identifying drug-target interactions (DTIs) plays an important role in the field of drug discovery, drug side-effects, and drug repositioning. However, in vivo or biochemical experimental methods for identifying new DTIs are extremely expensive and time-consuming. Recently, in silico or various computational methods h...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00408

    authors: Xia LY,Yang ZY,Zhang H,Liang Y

    update_date:2019-07-22 00:00:00

  • Fragment-Based Computational Method for Designing GPCR Ligands.

    abstract::G protein-coupled receptors (GPCRs) are the largest family of cell surface receptors, which is arguably the most important family of drug target. With the technology breakthroughs in X-ray crystallography and cryo-electron microscopy, more than 300 GPCR-ligand complex structures have been publicly reported since 2007,...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00699

    authors: Li Y,Sun Y,Song Y,Dai D,Zhao Z,Zhang Q,Zhong W,Hu LA,Ma Y,Li X,Wang R

    update_date:2020-09-28 00:00:00

  • Develop and test a solvent accessible surface area-based model in conformational entropy calculations.

    abstract::It is of great interest in modern drug design to accurately calculate the free energies of protein-ligand or nucleic acid-ligand binding. MM-PBSA (molecular mechanics Poisson-Boltzmann surface area) and MM-GBSA (molecular mechanics generalized Born surface area) have gained popularity in this field. For both methods, ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci300064d

    authors: Wang J,Hou T

    update_date:2012-05-25 00:00:00