In silico target predictions: defining a benchmarking data set and comparison of performance of the multiclass Naïve Bayes and Parzen-Rosenblatt window.

Abstract:

In this study, two probabilistic machine-learning algorithms were compared for in silico target prediction of bioactive molecules, namely the well-established Laplacian-modified Naïve Bayes classifier (NB) and the more recently introduced (to cheminformatics) Parzen-Rosenblatt Window. Both classifiers were trained in conjunction with circular fingerprints on a large data set of bioactive compounds extracted from ChEMBL, covering 894 human protein targets with more than 155,000 ligand-protein pairs. This data set is also provided as a benchmark data set for future target prediction methods due to its size as well as the number of bioactivity classes it contains. In addition to evaluating the methods, different performance measures were explored. This is not as straightforward as in binary classification settings, due to the number of classes, the possibility of multiple class memberships, and the need to translate model scores into "yes/no" predictions for assessing model performance. Both algorithms achieved a recall of correct targets that exceeds 80% in the top 1% of predictions. Performance depends significantly on the underlying diversity and size of a given class of bioactive compounds, with small classes and low structural similarity affecting both algorithms to different degrees. When tested on an external test set extracted from WOMBAT covering more than 500 targets by excluding all compounds with Tanimoto similarity above 0.8 to compounds from the ChEMBL data set, the current methodologies achieved a recall of 63.3% and 66.6% among the top 1% for Naïve Bayes and Parzen-Rosenblatt Window, respectively. While those numbers seem to indicate lower performance, they are also more realistic for settings where protein targets need to be established for novel chemical substances.
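
The "recall in the top 1% of predictions" measure mentioned in the abstract can be illustrated with a small sketch. This is one plausible reading, not the authors' published procedure: it assumes each compound has a score for every target, that targets are ranked per compound, and that a known compound-target pair counts as recalled when its target falls within the top fraction of that ranking. The function name `recall_at_top_fraction` and the data layout are hypothetical.

```python
import math

def recall_at_top_fraction(scores, true_targets, fraction=0.01):
    """Share of known compound-target pairs whose target is ranked within
    the top `fraction` of all targets for that compound (assumed reading).

    scores: list of dicts, one per compound, mapping target id -> model score
    true_targets: list of sets with the known targets of each compound
    """
    hits = 0
    total = 0
    for compound_scores, known in zip(scores, true_targets):
        # Number of top-ranked targets to inspect, never fewer than one.
        k = max(1, math.ceil(fraction * len(compound_scores)))
        # Rank target ids from highest to lowest score.
        ranked = sorted(compound_scores, key=compound_scores.get, reverse=True)
        hits += len(set(ranked[:k]) & known)
        total += len(known)
    return hits / total if total else 0.0

# Toy data: two compounds, four targets (illustrative values only).
toy_scores = [
    {"A": 0.9, "B": 0.1, "C": 0.5, "D": 0.2},
    {"A": 0.1, "B": 0.8, "C": 0.3, "D": 0.7},
]
toy_truth = [{"A"}, {"B", "D"}]
print(recall_at_top_fraction(toy_scores, toy_truth, fraction=0.25))
```

Under this reading, with 894 targets a 1% cutoff corresponds to inspecting the nine highest-scoring targets per compound. Multi-target compounds contribute one pair per known target, which is one way to handle the multiple class memberships the abstract mentions.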

journal_name

J Chem Inf Model

authors

Koutsoukas A,Lowe R,Kalantarmotamedi Y,Mussa HY,Klaffke W,Mitchell JB,Glen RC,Bender A

doi

10.1021/ci300435j

subject

Has Abstract

pub_date

2013-08-26 00:00:00

pages

1957-66

issue

8

eissn

1549-960X

issn

1549-9596

journal_volume

53

pub_type

Journal Article
  • Rigorous Computational Study Reveals What Docking Overlooks: Double Trouble from Membrane Association in Protein Kinase C Modulators.

    abstract::Increasing protein kinase C (PKC) activity is of potential therapeutic value. Its activation involves an interaction between the C1 domain and diacylglycerol (DAG) at intracellular membrane surfaces; DAG mimetics hold promise as new drugs. We previously developed the isophthalate derivative HMI-1a3, an effective but h...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c00624

    authors: Lautala S,Provenzani R,Koivuniemi A,Kulig W,Talman V,Róg T,Tuominen RK,Yli-Kauhaluoma J,Bunker A

    update_date:2020-11-23 00:00:00

  • Mechanism of Hormone Peptide Activation of a GPCR: Angiotensin II Activated State of AT1R Initiated by van der Waals Attraction.

    abstract::We present a succession of structural changes involved in hormone peptide activation of a prototypical GPCR. Microsecond molecular dynamics simulation generated conformational ensembles reveal propagation of structural changes through key "microswitches" within human AT1R bound to native hormone. The endocrine octa-pe...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.8b00583

    authors: Singh KD,Unal H,Desnoyer R,Karnik SS

    update_date:2019-01-28 00:00:00

  • Phytochemical informatics of traditional Chinese medicine and therapeutic relevance.

    abstract::Distribution patterns of 8411 compounds from 240 Chinese herbs were analyzed in relation to the herbal categories of traditional Chinese medicine (TCM), using Random Forest (RF) and self-organizing maps (SOM). RF was used first to construct TCM profiles of individual compounds, which describe their affinities for 28 m...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci700155t

    authors: Ehrman TM,Barlow DJ,Hylands PJ

    update_date:2007-11-01 00:00:00

  • Coordination of Na(+) by monoamine ligands in dopamine, norepinephrine, and serotonin transporters.

    abstract::The reuptake of neurotransmitters by dopamine, norepinephrine, and serotonin transporters during neuronal transmission requires a sodium gradient. An "ionic mode" of binding proposes that aspartate anchors the ligand's positive charge but ignores the direct role of sodium in ligand binding seen in the only representat...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci700255d

    authors: Xhaard H,Backström V,Denessiouk K,Johnson MS

    update_date:2008-07-01 00:00:00

  • Optimal Measurement Network of Pairwise Differences.

    abstract::When both the difference between two quantities and their individual values can be measured or computationally predicted, multiple quantities can be determined from the measurements or predictions of select individual quantities and select pairwise differences. These measurements and predictions form a network connect...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00528

    authors: Xu H

    update_date:2019-11-25 00:00:00

  • Applicability Domain ANalysis (ADAN): a robust method for assessing the reliability of drug property predictions.

    abstract::We report a novel method called ADAN (Applicability Domain ANalysis) for assessing the reliability of drug property predictions obtained by in silico methods. The assessment provided by ADAN is based on the comparison of the query compound with the training set, using six diverse similarity criteria. For every criteri...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci500172z

    authors: Carrió P,Pinto M,Ecker G,Sanz F,Pastor M

    update_date:2014-05-27 00:00:00

  • Discovery of Inhibitors of Four Bromodomains by Fragment-Anchored Ligand Docking.

    abstract::The high-throughput docking protocol called ALTA-VS (anchor-based library tailoring approach for virtual screening) was developed in 2005 for the efficient in silico screening of large libraries of compounds by preselection of only those molecules that have optimal fragments (anchors) for the protein target. Here we p...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00336

    authors: Marchand JR,Dalle Vedove A,Lolli G,Caflisch A

    update_date:2017-10-23 00:00:00

  • Simulation-Based Algorithm for Two-Dimensional Chemical Structure Diagram Generation of Complex Molecules and Ligand-Protein Interactions.

    abstract::Computer programs for structure diagram generation (SDG) are indispensable cheminformatic tools that translate one- or three-dimensional (1D or 3D) chemical structure data stored in electronic formats to human-readable 2D depictions. Although many such programs are known, only a moderate part of chemical space can be ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.6b00391

    authors: Frączek T

    update_date:2016-12-27 00:00:00

  • Structure-Based Kinase Profiling To Understand the Polypharmacological Behavior of Therapeutic Molecules.

    abstract::Several drugs elicit their therapeutic efficacy by modulating multiple cellular targets and possess varied polypharmacological actions. The identification of the molecular targets of a potent bioactive molecule is essential in determining its overall polypharmacological profile. Experimental procedures are expensive a...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00227

    authors: Dutta D,Das R,Mandal C,Mandal C

    update_date:2018-01-22 00:00:00

  • Heuristics from Modeling of Spectral Overlap in Förster Resonance Energy Transfer (FRET).

    abstract::Among the photophysical parameters that underpin Förster resonance energy transfer (FRET), perhaps the least explored is the spectral overlap term (J). While by definition J increases linearly with acceptor molar absorption coefficient (ε(A) in M⁻¹ cm⁻¹), is proportional to wavelength (λ⁴), and depends on the degree ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.8b00753

    authors: Qi Q,Taniguchi M,Lindsey JS

    update_date:2019-02-25 00:00:00

  • Discovery and Evaluation of Anti-Fibrinolytic Plasmin Inhibitors Derived from 5-(4-Piperidyl)isoxazol-3-ol (4-PIOL).

    abstract::Inhibition of plasmin has been found to effectively reduce fibrinolysis and to avoid hemorrhage. This can be achieved by addressing its kringle 1 domain with the known drug and lysine analogue tranexamic acid. Guided by shape similarities toward a previously discovered lead compound, 5-(4-piperidyl)isoxazol-3-ol, a se...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00255

    authors: Schmidt TC,Eriksson PO,Gustafsson D,Cosgrove D,Frølund B,Boström J

    update_date:2017-07-24 00:00:00

  • Computational simulations of the interactions between acetyl-coenzyme-A carboxylase and clodinafop: resistance mechanism due to active and nonactive site mutations.

    abstract::Grass weed populations resistant to acetyl-CoA carboxylase-inhibiting (ACCase; EC 6.4.1.2) herbicides represent a major problem for the sustainable development of modern agriculture. In the present study, extensive computational simulations, including homology modeling, molecular dynamics (MD) simulations, and molecul...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci900174d

    authors: Zhu XL,Ge-Fei H,Zhan CG,Yang GF

    update_date:2009-08-01 00:00:00

  • Free energy calculations give insight into the stereoselective hydroxylation of α-ionones by engineered cytochrome P450 BM3 mutants.

    abstract::Previously, stereoselective hydroxylation of α-ionone by Cytochrome P450 BM3 mutants M01 A82W and M11 L437N was observed. While both mutants hydroxylate α-ionone in a regioselective manner at the C3 position, M01 A82W catalyzes formation of trans-3-OH-α-ionone products whereas M11 L437N exhibits opposite stereoselecti...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci300243n

    authors: de Beer SB,Venkataraman H,Geerke DP,Oostenbrink C,Vermeulen NP

    update_date:2012-08-27 00:00:00

  • Jaqpot Quattro: A Novel Computational Web Platform for Modeling and Analysis in Nanoinformatics.

    abstract::Engineered nanomaterials (ENMs) are increasingly infiltrating our lives as a result of their applications across multiple fields. However, ENM formulations may result in the modulation of pathways and mechanisms of toxic action that endanger human health and the environment. Alternative testing methods such as in sili...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00223

    authors: Chomenidis C,Drakakis G,Tsiliki G,Anagnostopoulou E,Valsamis A,Doganis P,Sopasakis P,Sarimveis H

    update_date:2017-09-25 00:00:00

  • Viscosity Prediction of Lubricants by a General Feed-Forward Neural Network.

    abstract::Modern industrial lubricants are often blended with an assortment of chemical additives to improve the performance of the base stock. Machine learning-based predictive models allow fast and veracious derivation of material properties and facilitate novel and innovative material designs. In this study, we outline the d...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b01068

    authors: Loh GC,Lee HC,Tee XY,Chow PS,Zheng JW

    update_date:2020-03-23 00:00:00

  • Generalized topological indices. Modeling gas-phase rate coefficients of atmospheric relevance.

    abstract::We develop the idea that the use of ad hoc molecular descriptors in QSAR/QSPR studies is not an optimal solution. Instead, we propose to optimize these descriptors for the specific properties under study. In the case of topological indices (TIs) we propose the use of the generalized topological indices (GTIs), which a...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci600448b

    authors: Estrada E,Matamala AR

    update_date:2007-05-01 00:00:00

  • Structural insight into the unique binding properties of pyridylethanol(phenylethyl)amine inhibitor in human CYP51.

    abstract::Sterol 14α-demethylase (CYP51) is the main drug target for the treatment of fungal infections. The discovery of new efficient fungal CYP51 inhibitors requires an understanding of the structural requirements for selectivity for the fungal over the human ortholog. In this study, a binding mode of the pyridylethanol(phen...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci500556k

    authors: Zelenko U,Hodošček M,Rozman D,Golič Grdadolnik S

    update_date:2014-12-22 00:00:00

  • Simulation of 2D NMR Spectra of Carbohydrates Using GODESS Software.

    abstract::Glycan Optimized Dual Empirical Spectrum Simulation (GODESS) is a web service, which has been recently shown to be one of the most accurate tools for simulation of (1)H and (13)C 1D NMR spectra of natural carbohydrates and their derivatives. The new version of GODESS supports visualization of the simulated (1)H and (1...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.6b00083

    authors: Kapaev RR,Toukach PV

    update_date:2016-06-27 00:00:00

  • The normal-mode entropy in the MM/GBSA method: effect of system truncation, buffer region, and dielectric constant.

    abstract::We have performed a systematic study of the entropy term in the MM/GBSA (molecular mechanics combined with generalized Born and surface-area solvation) approach to calculate ligand-binding affinities. The entropies are calculated by a normal-mode analysis of harmonic frequencies from minimized snapshots of molecular d...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci3001919

    authors: Genheden S,Kuhn O,Mikulskis P,Hoffmann D,Ryde U

    update_date:2012-08-27 00:00:00

  • Three-dimensional quantitative structure-activity relationship of nucleosides acting at the A3 adenosine receptor: analysis of binding and relative efficacy.

    abstract::The binding affinity and relative maximal efficacy of human A3 adenosine receptor (AR) agonists were each subjected to ligand-based three-dimensional quantitative structure-activity relationship analysis. Comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) used a...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci600501z

    authors: Kim SK,Jacobson KA

    update_date:2007-05-01 00:00:00

  • Prediction of synthetic accessibility based on commercially available compound databases.

    abstract::A compound's synthetic accessibility (SA) is an important aspect of drug design, since in some cases computer-designed compounds cannot be synthesized. There have been several reports on SA prediction, most of which have focused on the difficulties of synthetic reactions based on retro-synthesis analyses, reaction dat...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci500568d

    authors: Fukunishi Y,Kurosawa T,Mikami Y,Nakamura H

    update_date:2014-12-22 00:00:00

  • BFMP: a method for discretizing and visualizing pyranose conformations.

    abstract::We report a new classification method for pyranose ring conformations called Best-fit, Four-Membered Plane (BFMP), which describes pyranose ring conformations based on reference planes defined by four atoms. The method is able to characterize all asymmetrical and symmetrical shapes of a pyran ring, is readily automate...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci500325b

    authors: Makeneni S,Foley BL,Woods RJ

    update_date:2014-10-27 00:00:00

  • Sensitivity of Folding Molecular Dynamics Simulations to Even Minor Force Field Changes.

    abstract::We examine the sensitivity of folding molecular dynamics simulations on the choice between three variants of the same force field (the AMBER99SB force field and its ILDN, NMR-ILDN, and STAR-ILDN variants). Using two different peptide systems (a marginally stable helical peptide and a β-hairpin) and a grand total of mo...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.6b00493

    authors: Serafeim AP,Salamanos G,Patapati KK,Glykos NM

    update_date:2016-10-24 00:00:00

  • Conformational determinants of the activity of antiproliferative factor glycopeptide.

    abstract::The antiproliferative factor (APF) involved in interstitial cystitis is a glycosylated nonapeptide (TVPAAVVVA) containing a sialylated core 1 α-O-disaccharide linked to the N-terminal threonine. The chemical structure of APF was deduced using spectroscopic techniques and confirmed using total synthesis. The synthetic ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci400147s

    authors: Mallajosyula SS,Adams KM,Barchi JJ,MacKerell AD

    update_date:2013-05-24 00:00:00

  • Predicting the DNA Conductance Using a Deep Feedforward Neural Network Model.

    abstract::Double-stranded DNA (dsDNA) has been established as an efficient medium for charge migration, bringing it to the forefront of the field of molecular electronics and biological research. The charge migration rate is controlled by the electronic couplings between the two nucleobases of DNA/RNA. These electronic coupling...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c01072

    authors: Aggarwal A,Vinayak V,Bag S,Bhattacharyya C,Waghmare UV,Maiti PK

    update_date:2021-01-25 00:00:00

  • Coarse-Grained Prediction of Peptide Binding to G-Protein Coupled Receptors.

    abstract::In this study, we used the Martini Coarse-Grained model with no applied restraints to predict the binding mode of some peptides to G-Protein Coupled Receptors (GPCRs). Both the Neurotensin-1 and the chemokine CXCR4 receptors were used as test cases. Their ligands, NTS8-13 and CVX15 peptides, respectively, were initial...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.6b00503

    authors: Delort B,Renault P,Charlier L,Raussin F,Martinez J,Floquet N

    update_date:2017-03-27 00:00:00

  • Growth of ligand-target interaction data in ChEMBL is associated with increasing and activity measurement-dependent compound promiscuity.

    abstract::Compounds with high-confidence target annotations and activity measurements in the original and current release of the ChEMBL database have been compared to better understand how the growth of compound activity data might influence the spectrum of ligand-target interactions and the degree of target promiscuity among a...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci3003304

    authors: Hu Y,Bajorath J

    update_date:2012-10-22 00:00:00

  • Searching for coordinated activity cliffs using particle swarm optimization.

    abstract::Activity cliffs are formed by structurally similar compounds having large potency differences. Coordinated activity cliffs evolve when compounds within groups of structural neighbors form multiple cliffs with different partners, giving rise to local networks of cliffs in a data set. Using particle swarm optimization, ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci3000503

    authors: Namasivayam V,Bajorath J

    update_date:2012-04-23 00:00:00

  • RosENet: Improving Binding Affinity Prediction by Leveraging Molecular Mechanics Energies with an Ensemble of 3D Convolutional Neural Networks.

    abstract::The worldwide increase and proliferation of drug resistant microbes, coupled with the lag in new drug development, represents a major threat to human health. In order to reduce the time and cost for exploring the chemical search space, drug discovery increasingly relies on computational biology approaches. One key ste...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c00075

    authors: Hassan-Harrirou H,Zhang C,Lemmin T

    update_date:2020-06-22 00:00:00

  • Critical Assessment of the Hildebrand and Hansen Solubility Parameters for Polymers.

    abstract::Solubility parameter models are widely used to select suitable solvents/nonsolvents for polymers in a variety of processing and engineering applications. In this study, we focus on two well-established models, namely, the Hildebrand and Hansen solubility parameter models. Both models are built on the basis of the noti...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00656

    authors: Venkatram S,Kim C,Chandrasekaran A,Ramprasad R

    update_date:2019-10-28 00:00:00