Abstract:
:The anatomical therapeutic chemical (ATC) classification system maintained by the World Health Organization provides a global standard for the classification of medical substances and serves as a source for drug repurposing research. Nevertheless, it lacks several drugs that are major players in the global drug market. In order to establish classifications for yet unclassified drugs, this paper presents a newly developed approach based on a combination of information extraction (IE) and machine learning (ML) techniques. Most of the information about drugs is published in the scientific articles. Therefore, an IE-based framework is employed to extract terms from free text that express drug's chemical, pharmacological, therapeutic, and systemic effects. The extracted terms are used as features within a ML framework to predict putative ATC class labels for unclassified drugs. The system was tested on a portion of ATC containing drugs with an indication on the cardiovascular system. The class prediction turned out to be successful with the best predictive accuracy of 89.47% validated by a 100-fold bootstrapping of the training set and an accuracy of 77.12% on an independent test set. The presented concept-based classification system outperformed state-of-the-art classification methods based on chemical structure properties.
journal_name
J Chem Inf Modeljournal_title
Journal of chemical information and modelingauthors
Gurulingappa H,Kolárik C,Hofmann-Apitius M,Fluck Jdoi
10.1021/ci9000844subject
Has Abstractpub_date
2009-08-01 00:00:00pages
1986-92issue
8eissn
1549-9596issn
1549-960Xjournal_volume
49pub_type
杂志文章abstract::Screening large libraries of chemicals has been an efficient strategy to discover bioactive compounds; however a portion of the potential for success is limited to the available libraries. Synergizing combinatorial and computational chemistries has emerged as a time-efficient strategy to explore the chemical space mor...
journal_title:Journal of chemical information and modeling
pub_type: 杂志文章
doi:10.1021/acs.jcim.6b00648
更新日期:2017-03-27 00:00:00
abstract::The roles of chemical compounds in biological systems are now systematically analyzed by high-throughput experimental technologies. To automate the processing and interpretation of large-scale data it is necessary to develop bioinformatics methods to extract information from the chemical structures of these small mole...
journal_title:Journal of chemical information and modeling
pub_type: 杂志文章
doi:10.1021/ci700006f
更新日期:2007-07-01 00:00:00
abstract::The appropriate selection of a chemical space represented by the data set, the selection of its chemical data representation, the development of a correct modeling process using a robust and reproducible algorithm, and the performance of an exhaustive training and external validation determine the usability and reprod...
journal_title:Journal of chemical information and modeling
pub_type: 杂志文章
doi:10.1021/acs.jcim.7b00492
更新日期:2017-11-27 00:00:00
abstract::The community structure-activity resource (CSAR) data sets are used to develop and test a support vector machine-based scoring function in regression mode (SVR). Two scoring functions (SVR-KB and SVR-EP) are derived with the objective of reproducing the trend of the experimental binding affinities provided within the ...
journal_title:Journal of chemical information and modeling
pub_type: 杂志文章
doi:10.1021/ci200078f
更新日期:2011-09-26 00:00:00
abstract::We propose a hypothesis that "a model of active compound can be provided by integrating information of compounds high-ranked by docking simulation of a random compound library". In our hypothesis, the inclusion of true active compounds in the high-ranked compound is not necessary. We regard the high-ranked compounds a...
journal_title:Journal of chemical information and modeling
pub_type: 杂志文章
doi:10.1021/ci7003384
更新日期:2008-03-01 00:00:00
abstract::Computational approaches to fragment-based drug design (FBDD) can complement experiments and facilitate the identification of potential hot spots along the protein surface. However, the evaluation of computational methods for mapping binding sites frequently focuses upon the ability to reproduce crystallographic coord...
journal_title:Journal of chemical information and modeling
pub_type: 杂志文章
doi:10.1021/ci300430v
更新日期:2013-02-25 00:00:00
abstract::Previously, stereoselective hydroxylation of α-ionone by Cytochrome P450 BM3 mutants M01 A82W and M11 L437N was observed. While both mutants hydroxylate α-ionone in a regioselective manner at the C3 position, M01 A82W catalyzes formation of trans-3-OH-α-ionone products whereas M11 L437N exhibits opposite stereoselecti...
journal_title:Journal of chemical information and modeling
pub_type: 杂志文章
doi:10.1021/ci300243n
更新日期:2012-08-27 00:00:00
abstract::Antimicrobial peptides (AMPs) have emerged as promising therapeutic alternatives to fight against the diverse infections caused by different pathogenic microorganisms. In this context, theoretical approaches in bioinformatics have paved the way toward the creation of several in silico models capable of predicting anti...
journal_title:Journal of chemical information and modeling
pub_type: 杂志文章
doi:10.1021/acs.jcim.5b00630
更新日期:2016-03-28 00:00:00
abstract::Representative molecules from 10 classes of prohibited substances were taken from the World Anti-Doping Agency (WADA) list, augmented by molecules from corresponding activity classes found in the MDDR database. Together with some explicitly allowed compounds, these formed a set of 5245 molecules. Five types of fingerp...
journal_title:Journal of chemical information and modeling
pub_type: 杂志文章
doi:10.1021/ci0601160
更新日期:2006-11-01 00:00:00
abstract::Our main objective was to compile a data set of high-quality protein-fragment complexes and make it publicly available. Once assembled, the data set was challenged using docking procedures to address the following questions: (i) Can molecular docking correctly reproduce the experimentally solved structures? (ii) How t...
journal_title:Journal of chemical information and modeling
pub_type: 杂志文章
doi:10.1021/ci2003363
更新日期:2011-11-28 00:00:00
abstract::The prediction of protein-ligand binding affinity has recently been improved remarkably by machine-learning-based scoring functions. For example, using a set of simple descriptors representing the atomic distance counts, the RF-Score improves the Pearson correlation coefficient to about 0.8 on the core set of the PDBb...
journal_title:Journal of chemical information and modeling
pub_type: 杂志文章
doi:10.1021/acs.jcim.7b00049
更新日期:2017-04-24 00:00:00
abstract::Taking into account dynamical behavior and/or structural inaccuracies of receptor-ligand systems becomes increasingly important in structure-based drug design. Here, we describe the development of consensus Adaptation of Fields for Molecular Comparison (AFMoC) (abbreviated as AFMoCcon) models that account for multiple...
journal_title:Journal of chemical information and modeling
pub_type: 杂志文章
doi:10.1021/ci7002472
更新日期:2007-11-01 00:00:00
abstract::A homology model of the Arabidopsis thaliana UV resistance locus 8 (UVR8) protein is presented herein, showing a seven-bladed β-propeller conformation similar to the globular structure of RCC1. The UVR8 amino acid sequence contains a very high amount of conserved tryptophans, and the homology model shows that seven of...
journal_title:Journal of chemical information and modeling
pub_type: 杂志文章
doi:10.1021/ci200017f
更新日期:2011-06-27 00:00:00
abstract::Homology modeling is a reliable method of predicting the three-dimensional structures of proteins that lack NMR or X-ray crystallographic data. It employs the assumption that a structural resemblance exists between closely related proteins. Despite the availability of many crystal structures of possible templates, onl...
journal_title:Journal of chemical information and modeling
pub_type: 杂志文章
doi:10.1021/ci500001f
更新日期:2014-06-23 00:00:00
abstract::Sixteen FDA-approved drugs were investigated to elucidate their mechanisms of action (MOAs) and clinical functions by pathway analysis based on retrieved drug targets interacting with or affected by the investigated drugs. Protein and gene targets and associated pathways were obtained by data-mining of public database...
journal_title:Journal of chemical information and modeling
pub_type: 杂志文章
doi:10.1021/ci4005354
更新日期:2014-02-24 00:00:00
abstract::Big data is one of the key transformative factors which increasingly influences all aspects of modern life. Although this transformation brings vast opportunities it also generates novel challenges, not the least of which is organizing and searching this data deluge. The field of medicinal chemistry is not different: ...
journal_title:Journal of chemical information and modeling
pub_type: 杂志文章
doi:10.1021/acs.jcim.7b00249
更新日期:2017-08-28 00:00:00
abstract::Although there are several databases that contain data on many metabolites and reactions in biochemical pathways, there is still a big gap in the numbers between experimentally identified enzymes and metabolites. It is supposed that many catalytic enzyme genes are still unknown. Although there are previous studies tha...
journal_title:Journal of chemical information and modeling
pub_type: 杂志文章
doi:10.1021/acs.jcim.5b00216
更新日期:2016-03-28 00:00:00
abstract::A major concern of chemogenomics is to associate drug activity with biological variables. Several reports have clustered cell line drug activity profiles as well as drug activity-gene expression correlation profiles and noted that the resulting groupings differ but still reflect mechanism of action. The present paper ...
journal_title:Journal of chemical information and modeling
pub_type: 杂志文章
doi:10.1021/ci060073n
更新日期:2007-01-01 00:00:00
abstract::G-protein coupled receptors (GPCRs) are highly relevant drug targets. Four GPCRs with known crystal structure were analyzed with docking (AutoDock4) and postdocking (MM-PBSA) in order to evaluate the ability to recognize known antagonists from a larger database of molecular decoys and to predict correct binding modes....
journal_title:Journal of chemical information and modeling
pub_type: 杂志文章
doi:10.1021/ci4000745
更新日期:2013-04-22 00:00:00
abstract::HackaMol is an open source, object-oriented toolkit written in Modern Perl that organizes atoms within molecules and provides chemically intuitive attributes and methods. The library consists of two components: HackaMol, the core that contains classes for storing and manipulating molecular information, and HackaMol::X...
journal_title:Journal of chemical information and modeling
pub_type: 杂志文章
doi:10.1021/ci500359e
更新日期:2015-04-27 00:00:00
abstract::Here, a method is described for easily building three-carbon nanotube junctions. It allows the geometry to be found and bond connectivity of C(3) symmetric nanotube junctions to be established. Such junctions may present a variable degree of pyramidalization and are composed of three identical carbon nanotubes with ar...
journal_title:Journal of chemical information and modeling
pub_type: 杂志文章
doi:10.1021/ci200056p
更新日期:2011-06-27 00:00:00
abstract::Binding affinity prediction with implicit solvent models remains a challenge in virtual screening for drug discovery. In order to assess the predictive power of implicit solvent models in docking techniques with Amber scoring, three generalized Born models (GBHCT, GBOBCI, and GBOBCII) available in Dock 6.7 were utiliz...
journal_title:Journal of chemical information and modeling
pub_type: 杂志文章
doi:10.1021/acs.jcim.6b00418
更新日期:2016-10-24 00:00:00
abstract::There are only four derivatives of pseudouridine (Ψ) that are known to occur naturally in RNA as post-transcriptional modifications. We have studied the conformational consequences of pseudouridylation and further modifications using replica exchange molecular dynamics simulations at the nucleoside level, and the simu...
journal_title:Journal of chemical information and modeling
pub_type: 杂志文章
doi:10.1021/acs.jcim.0c00369
更新日期:2020-10-26 00:00:00
abstract::Protein-ligand (PL) interactions play a key role in many life processes such as molecular recognition, molecular binding, signal transmission, and cell metabolism. Examples of interaction forces include hydrogen bonding, hydrophobic effects, steric clashes, electrostatic contacts, and van der Waals attractions. Curren...
journal_title:Journal of chemical information and modeling
pub_type: 杂志文章
doi:10.1021/acs.jcim.7b00310
更新日期:2018-01-22 00:00:00
abstract::The in silico prediction of unwanted side effects (SEs) caused by the promiscuous behavior of drugs and their targets is highly relevant to the pharmaceutical industry. Considerable effort is now being put into computational and experimental screening of several suspected off-target proteins in the hope that SEs might...
journal_title:Journal of chemical information and modeling
pub_type: 杂志文章
doi:10.1021/acs.jcim.5b00120
更新日期:2015-09-28 00:00:00
abstract::An accurate scoring function is expected to correctly select the most stable structure from a set of pose candidates. One can hypothesize that a scoring function's ability to identify the most stable structure might be improved by emphasizing the most relevant atom pairwise interactions. However, it is hard to evaluat...
journal_title:Journal of chemical information and modeling
pub_type: 杂志文章
doi:10.1021/acs.jcim.9b00356
更新日期:2019-07-22 00:00:00
abstract::Binding hot spots are regions of proteins that, due to their potentially high contribution to the binding free energy, have high propensity to bind small molecules. We present benchmark sets for testing computational methods for the identification of binding hot spots with emphasis on fragment-based ligand discovery. ...
journal_title:Journal of chemical information and modeling
pub_type: 杂志文章
doi:10.1021/acs.jcim.0c00877
更新日期:2020-12-28 00:00:00
abstract::Identifying drug-target interactions (DTIs) plays an important role in the field of drug discovery, drug side-effects, and drug repositioning. However, in vivo or biochemical experimental methods for identifying new DTIs are extremely expensive and time-consuming. Recently, in silico or various computational methods h...
journal_title:Journal of chemical information and modeling
pub_type: 杂志文章
doi:10.1021/acs.jcim.9b00408
更新日期:2019-07-22 00:00:00
abstract::G protein-coupled receptors (GPCRs) are the largest family of cell surface receptors, which is arguably the most important family of drug target. With the technology breakthroughs in X-ray crystallography and cryo-electron microscopy, more than 300 GPCR-ligand complex structures have been publicly reported since 2007,...
journal_title:Journal of chemical information and modeling
pub_type: 杂志文章
doi:10.1021/acs.jcim.9b00699
更新日期:2020-09-28 00:00:00
abstract::It is of great interest in modern drug design to accurately calculate the free energies of protein-ligand or nucleic acid-ligand binding. MM-PBSA (molecular mechanics Poisson-Boltzmann surface area) and MM-GBSA (molecular mechanics generalized Born surface area) have gained popularity in this field. For both methods, ...
journal_title:Journal of chemical information and modeling
pub_type: 杂志文章
doi:10.1021/ci300064d
更新日期:2012-05-25 00:00:00