Ranking chemical structures for drug discovery: a new machine learning approach.

Abstract:

With chemical libraries increasingly containing millions of compounds or more, there is a fast-growing need for computational methods that can rank or prioritize compounds for screening. Machine learning methods have shown considerable promise for this task; indeed, classification methods such as support vector machines (SVMs), together with their variants, have been used in virtual screening to distinguish active compounds from inactive ones, while regression methods such as partial least-squares (PLS) and support vector regression (SVR) have been used in quantitative structure-activity relationship (QSAR) analysis for predicting biological activities of compounds. Recently, a new class of machine learning methods, namely ranking methods designed to directly optimize ranking performance, has been developed for ranking tasks such as web search that arise in information retrieval (IR) and other applications. Here we report the application of these new ranking methods in machine learning to the task of ranking chemical structures. Our experiments show that the new ranking methods give better ranking performance than both classification-based methods in virtual screening and regression methods in QSAR analysis. We also make some interesting connections between ranking performance measures used in cheminformatics and those used in IR studies.
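
The abstract does not say which ranking algorithm the authors used, so the following is only an illustrative sketch of what "directly optimizing ranking performance" can mean: a linear scoring function trained with a pairwise hinge loss (in the spirit of RankSVM), which penalizes every pair of compounds whose scores are ordered against their measured activities. The function names, hyperparameters, descriptors, and activity values are all hypothetical and chosen for illustration.

```python
import random

def train_pairwise_ranker(features, activities, epochs=200, lr=0.1, reg=0.01, seed=0):
    """Fit a linear scoring function w.x with a pairwise hinge loss.

    For every pair (i, j) with activities[i] > activities[j], the loss
    max(0, 1 - (w.x_i - w.x_j)) penalizes scoring i below, or too close to, j.
    """
    rng = random.Random(seed)
    dim = len(features[0])
    w = [0.0] * dim
    # All ordered pairs in which compound i is more active than compound j.
    pairs = [(i, j)
             for i in range(len(features))
             for j in range(len(features))
             if activities[i] > activities[j]]
    for _ in range(epochs):
        rng.shuffle(pairs)
        for i, j in pairs:
            diff = [a - b for a, b in zip(features[i], features[j])]
            margin = sum(wk * dk for wk, dk in zip(w, diff))
            # L2 shrinkage, plus a hinge subgradient step when the margin is violated.
            for k in range(dim):
                grad = reg * w[k] - (diff[k] if margin < 1.0 else 0.0)
                w[k] -= lr * grad
    return w

def score(w, x):
    """Linear score used to sort compounds (higher means ranked earlier)."""
    return sum(wk * xk for wk, xk in zip(w, x))

def pairwise_accuracy(w, features, activities):
    """Fraction of pairs with different activities that the scores order correctly."""
    correct = total = 0
    for i in range(len(features)):
        for j in range(len(features)):
            if activities[i] > activities[j]:
                total += 1
                correct += score(w, features[i]) > score(w, features[j])
    return correct / total

# Made-up descriptors (two per compound) and made-up activity values.
X = [[0.1, 1.0], [0.4, 0.8], [0.5, 0.3], [0.9, 0.2]]
y = [4.2, 5.0, 6.1, 7.3]

w = train_pairwise_ranker(X, y)
ranking = sorted(range(len(X)), key=lambda i: score(w, X[i]), reverse=True)
```

Unlike a classifier (which only separates actives from inactives) or a regressor (which fits activity values), this objective depends only on the relative order of compounds, which is the quantity a screening prioritization ultimately uses.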

journal_name

J Chem Inf Model

authors

Agarwal S,Dugar D,Sengupta S

doi

10.1021/ci9003865

subject

Has Abstract

pub_date

2010-05-24 00:00:00

pages

716-31

issue

5

eissn

1549-960X

issn

1549-9596

journal_volume

50

pub_type

Journal Article
  • Potent Human Telomerase Inhibitors: Molecular Dynamic Simulations, Multiple Pharmacophore-Based Virtual Screening, and Biochemical Assays.

    abstract::Telomere maintenance is a universal cancer hallmark, and small molecules that disrupt telomere maintenance generally have anticancer properties. Since the vast majority of cancer cells utilize telomerase activity for telomere maintenance, the enzyme has been considered as an anticancer drug target. Recently, rational ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.5b00336

    authors: Shirgahi Talari F,Bagherzadeh K,Golestanian S,Jarstfer M,Amanlou M

    update_date:2015-12-28 00:00:00

  • Heteroaromatic π-stacking energy landscapes.

    abstract::In this study we investigate π-stacking interactions of a variety of aromatic heterocycles with benzene using dispersion corrected density functional theory. We calculate extensive potential energy surfaces for parallel-displaced interaction geometries. We find that dispersion contributes significantly to the interact...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci500183u

    authors: Huber RG,Margreiter MA,Fuchs JE,von Grafenstein S,Tautermann CS,Liedl KR,Fox T

    update_date:2014-05-27 00:00:00

  • Identification of ligand templates using local structure alignment for structure-based drug design.

    abstract::With a rapid increase in the number of high-resolution protein-ligand structures, the known protein-ligand structures can be used to gain insight into ligand-binding modes in a target protein. On the basis of the fact that the structurally similar binding sites share information about their ligands, we have developed ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci300178e

    authors: Lee HS,Im W

    update_date:2012-10-22 00:00:00

  • Improved Prediction of Drug-Target Interactions Using Self-Paced Learning with Collaborative Matrix Factorization.

    abstract::Identifying drug-target interactions (DTIs) plays an important role in the field of drug discovery, drug side-effects, and drug repositioning. However, in vivo or biochemical experimental methods for identifying new DTIs are extremely expensive and time-consuming. Recently, in silico or various computational methods h...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00408

    authors: Xia LY,Yang ZY,Zhang H,Liang Y

    update_date:2019-07-22 00:00:00

  • COSMOsar3D: molecular field analysis based on local COSMO σ-profiles.

    abstract::The COSMO surface polarization charge density σ resulting from quantum chemical calculations combined with a virtual conductor embedding has been widely proven to be a very suitable descriptor for the quantification of interactions of molecules in liquids. In a preceding paper, grid-based local histograms of σ have be...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci300231t

    authors: Klamt A,Thormann M,Wichmann K,Tosco P

    update_date:2012-08-27 00:00:00

  • Jaqpot Quattro: A Novel Computational Web Platform for Modeling and Analysis in Nanoinformatics.

    abstract::Engineered nanomaterials (ENMs) are increasingly infiltrating our lives as a result of their applications across multiple fields. However, ENM formulations may result in the modulation of pathways and mechanisms of toxic action that endanger human health and the environment. Alternative testing methods such as in sili...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00223

    authors: Chomenidis C,Drakakis G,Tsiliki G,Anagnostopoulou E,Valsamis A,Doganis P,Sopasakis P,Sarimveis H

    update_date:2017-09-25 00:00:00

  • Informatics-Aided Density Functional Theory Study on the Li Ion Transport of Tavorite-Type LiMTO4F (M(3+)-T(5+), M(2+)-T(6+)).

    abstract::The ongoing search for fast Li-ion conducting solid electrolytes has driven the deployment surge on density functional theory (DFT) computation and materials informatics for exploring novel chemistries before actual experimental testing. Existing structure prototypes can now be readily evaluated beforehand not only to...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci500752n

    authors: Jalem R,Kimura M,Nakayama M,Kasuga T

    update_date:2015-06-22 00:00:00

  • Benchmark data set for in silico prediction of Ames mutagenicity.

    abstract::Up to now, publicly available data sets to build and evaluate Ames mutagenicity prediction tools have been very limited in terms of size and chemical space covered. In this report we describe a new unique public Ames mutagenicity data set comprising about 6500 nonconfidential compounds (available as SMILES strings and...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci900161g

    authors: Hansen K,Mika S,Schroeter T,Sutter A,ter Laak A,Steger-Hartmann T,Heinrich N,Müller KR

    update_date:2009-09-01 00:00:00

  • Database of Nuclear Independent Chemical Shifts (NICS) versus NICSZZ of Polycyclic Aromatic Hydrocarbons (PAHs).

    abstract::In the present contribution, we have developed a database, called the FAR-database, where the acronym FAR stands for Fused Aromatic Rings, which presents the results of nuclear independent chemical shifts calculations, NICS(0), NICS(1), NICS(0)ZZ, and NICS(1)ZZ, of 660 neutral benzenoid-PAHs and cyclopenta-fused PAHs....

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00909

    authors: Alvarez-Ramírez F,Ruiz-Morales Y

    update_date:2020-02-24 00:00:00

  • Search for novel aminoglycosides by combining fragment-based virtual screening and 3D-QSAR scoring.

    abstract::Aminoglycosides are antibiotics targeting the 16S RNA A site of the bacterial ribosome. There have been many efforts directed toward design of their synthetic derivatives, however with only few successes. As RNA binders, aminoglycosides are also a difficult target for computational drug design, since most of the exist...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci800361a

    authors: Setny P,Trylska J

    update_date:2009-02-01 00:00:00

  • Simulation-Based Algorithm for Two-Dimensional Chemical Structure Diagram Generation of Complex Molecules and Ligand-Protein Interactions.

    abstract::Computer programs for structure diagram generation (SDG) are indispensable cheminformatic tools that translate one- or three-dimensional (1D or 3D) chemical structure data stored in electronic formats to human-readable 2D depictions. Although many such programs are known, only a moderate part of chemical space can be ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.6b00391

    authors: Frączek T

    update_date:2016-12-27 00:00:00

  • GDP Release from the Open Conformation of Gα Requires Allosteric Signaling from the Agonist-Bound Human β2 Adrenergic Receptor.

    abstract::G-protein-coupled receptors (GPCRs) transmit signals into the cell in response to ligand binding at its extracellular domain, which is characterized by the coupling of agonist-induced receptor conformational change to guanine nucleotide (GDP) exchange with guanosine triphosphate on a heterotrimeric (αβγ) guanine nucle...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c00432

    authors: Kumar V,Hoag H,Sader S,Scorese N,Liu H,Wu C

    update_date:2020-08-24 00:00:00

  • CoMFA, CoMSIA, and molecular hologram QSAR studies of novel neuronal nAChRs ligands-open ring analogues of 3-pyridyl ether.

    abstract::3-Pyridyl ethers are excellent nAChRs ligands, which show high subtype selectivity and binding affinity to alpha4beta2 nAChR. Although the quantitative structure-activity relationship (QSAR) of nAChRs ligands has been widely investigated using various classes of compounds, the open ring analogues of 3-pyridyl ethers h...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci0498113

    authors: Zhang H,Li H,Liu C

    update_date:2005-03-01 00:00:00

  • Toward high throughput 3D virtual screening using spherical harmonic surface representations.

    abstract::Searching chemical databases for possible drug leads is often one of the main activities conducted during the early stages of a drug development project. This article shows that spherical harmonic molecular shape representations provide a powerful way to search and cluster small-molecule databases rapidly and accurate...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci7001507

    authors: Mavridis L,Hudson BD,Ritchie DW

    update_date:2007-09-01 00:00:00

  • Protein Solvent Shell Structure Provides Rapid Analysis of Hydration Dynamics.

    abstract::The solvation layer surrounding a protein is clearly an intrinsic part of protein structure-dynamics-function, and our understanding of how the hydration dynamics influences protein function is emerging. We have recently reported simulations indicating a correlation between regional hydration dynamics and the structur...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00009

    authors: Dahanayake JN,Shahryari E,Roberts KM,Heikes ME,Kasireddy C,Mitchell-Koch KR

    update_date:2019-05-28 00:00:00

  • A critical assessment of combined ligand- and structure-based approaches to HERG channel blocker modeling.

    abstract::Blockade of human ether-à-go-go related gene (hERG) channel prolongs the duration of the cardiac action potential and is a common reason for drug failure in preclinical safety trials. Therefore, it is of great importance to develop robust in silico tools to predict potential hERG blockers in the early stages of drug d...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci200271d

    authors: Du-Cuny L,Chen L,Zhang S

    update_date:2011-11-28 00:00:00

  • Factors affecting d-block metal-ligand bond lengths: toward an automated library of molecular geometry for metal complexes.

    abstract::Metal-ligand (M-L) bond lengths for a range of ligands (carboxylates, chlorides, pyridines, water, tertiary phosphines, and alkenes) and a variety of metals have been retrieved from the Cambridge Structural Database, CSD. Analysis of the factors which affect M-L bond lengths (for example, ligand coordination mode, oxi...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci0500785

    authors: Harris SE,Orpen AG,Bruno IJ,Taylor R

    update_date:2005-11-01 00:00:00

  • Random forest models to predict aqueous solubility.

    abstract::Random Forest regression (RF), Partial-Least-Squares (PLS) regression, Support Vector Machines (SVM), and Artificial Neural Networks (ANN) were used to develop QSPR models for the prediction of aqueous solubility, based on experimental data for 988 organic molecules. The Random Forest regression model predicted aqueou...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci060164k

    authors: Palmer DS,O'Boyle NM,Glen RC,Mitchell JB

    update_date:2007-01-01 00:00:00

  • Identifying promising compounds in drug discovery: genetic algorithms and some new statistical techniques.

    abstract::Throughout the drug discovery process, discovery teams are compelled to use statistics for making decisions using data from a variety of inputs. For instance, teams are asked to prioritize compounds for subsequent stages of the drug discovery process, given results from multiple screens. To assist in the prioritizatio...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci600556v

    authors: Mandal A,Johnson K,Wu CF,Bornemeier D

    update_date:2007-05-01 00:00:00

  • Partitioning of pi-electrons in rings for Clar structures of benzenoid hydrocarbons.

    abstract::Resonance structures of polycyclic aromatic hydrocarbons can be associated with numerical formulas by assigning pi-electrons of C=C double bonds to individual benzenoid rings. Each C=C double bond in a resonance structure assigns two pi-electrons to a ring in a fused-benzenoid system if it is not shared by adjacent ri...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci050196s

    authors: Randić M,Balaban AT

    update_date:2006-01-01 00:00:00

  • Holistic Approach to Partial Covalent Interactions in Protein Structure Prediction and Design with Rosetta.

    abstract::Partial covalent interactions (PCIs) in proteins, which include hydrogen bonds, salt bridges, cation-π, and π-π interactions, contribute to thermodynamic stability and facilitate interactions with other biomolecules. Several score functions have been developed within the Rosetta protein modeling framework that identif...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00398

    authors: Combs SA,Mueller BK,Meiler J

    update_date:2018-05-29 00:00:00

  • Advantages of Relative versus Absolute Data for the Development of Quantitative Structure-Activity Relationship Classification Models.

    abstract::The appropriate selection of a chemical space represented by the data set, the selection of its chemical data representation, the development of a correct modeling process using a robust and reproducible algorithm, and the performance of an exhaustive training and external validation determine the usability and reprod...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00492

    authors: Ruiz IL,Gómez-Nieto MÁ

    update_date:2017-11-27 00:00:00

  • Cross-docking of inhibitors into CDK2 structures. 2.

    abstract::In the preceding paper (Duca, J. S.; Madison, V. S.; Voigt, J. H. J. Chem. Inf. Model. 2008, 48, 659-668), the accuracy of docking and affinity predictions of the Gold and Glide programs were investigated using single protein conformations spanning 150 CDK2/inhibitor crystallographic complexes. High docking accuracy w...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci700428d

    authors: Voigt JH,Elkin C,Madison VS,Duca JS

    update_date:2008-03-01 00:00:00

  • Virtual drug screen schema based on multiview similarity integration and ranking aggregation.

    abstract::The current drug virtual screen (VS) methods mainly include two categories, i.e., ligand/target structure-based virtual screen and that utilizing protein-ligand interaction fingerprint information based on the large number of complex structures. Since the former one focuses on the one-side information while the later...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci200481c

    authors: Kang H,Sheng Z,Zhu R,Huang Q,Liu Q,Cao Z

    update_date:2012-03-26 00:00:00

  • Binding of Cytotoxic Aβ25-35 Peptide to the Dimyristoylphosphatidylcholine Lipid Bilayer.

    abstract::Aβ25-35 is a short, cytotoxic, and naturally occurring fragment of the Alzheimer's Aβ peptide. To map the molecular mechanism of Aβ25-35 binding to the zwitterionic dimyristoylphosphatidylcholine (DMPC) bilayer, we have performed replica exchange with solute tempering molecular dynamics simulations using all-atom expl...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.8b00045

    authors: Smith AK,Klimov DK

    update_date:2018-05-29 00:00:00

  • Receptor-based virtual ligand screening for the identification of novel CDC25 phosphatase inhibitors.

    abstract::CDC25 phosphatases play critical roles in cell cycle regulation and are attractive targets for anticancer therapies. Several small non-peptide molecules are known to inhibit CDC25, but many of them appear to form a covalent bond with the enzyme or act through oxidation of the thiolate group of the catalytic cysteine. ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci700313e

    authors: Montes M,Braud E,Miteva MA,Goddard ML,Mondésert O,Kolb S,Brun MP,Ducommun B,Garbay C,Villoutreix BO

    update_date:2008-01-01 00:00:00

  • Molecular Mechanism underlying PRMT1 Dimerization for SAM Binding and Methylase Activity.

    abstract::Protein arginine methyltransferases (PRMTs) catalyze the posttranslational methylation of arginine, which is important in a range of biological processes, including epigenetic regulation, signal transduction, and cancer progression. Although previous studies of PRMT1 mutants suggest that the dimerization arm and the N...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.5b00454

    authors: Zhou R,Xie Y,Hu H,Hu G,Patel VS,Zhang J,Yu K,Huang Y,Jiang H,Liang Z,Zheng YG,Luo C

    update_date:2015-12-28 00:00:00

  • Two model system of the alpha1A-adrenoceptor docked with selected ligands.

    abstract::In this study, we have developed a two model system to mimic the active and inactive states of a G-protein coupled receptor specifically the alpha1A adrenergic receptor. We have docked two agonists, epinephrine (phenylamine type) and oxymetazoline (imidazoline type), as well as two antagonists, prazosin and 5-methylur...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci700026v

    authors: Asher WB,Hoskins SN,Slasor LA,Morris DH,Cook EM,Bautista DL

    update_date:2007-09-01 00:00:00

  • Pharmacophore-based virtual screening and experimental validation of novel inhibitors against cyanobacterial fructose-1,6-/sedoheptulose-1,7-bisphosphatase.

    abstract::Cyanobacterial fructose-1,6-/sedoheptulose-1,7-bisphosphatase (cy-FBP/SBPase) is a potential enzymatic target for screening of novel inhibitors that can combat harmful algal blooms. In the present study, we targeted the substrate binding pocket of cy-FBP/SBPase. A series of novel hit compounds from the SPECs database w...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci4007529

    authors: Sun Y,Zhang R,Li D,Feng L,Wu D,Feng L,Huang P,Ren Y,Feng J,Xiao S,Wan J

    update_date:2014-03-24 00:00:00

  • The ensemble performance index: an improved measure for assessing ensemble pose prediction performance.

    abstract::We present a theoretical study on the performance of ensemble docking methodologies considering multiple protein structures. We perform a theoretical analysis of pose prediction experiments which is completely unbiased, as we make no assumptions about specific scoring functions, search paradigms, protein structures, o...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci2002796

    authors: Korb O,McCabe P,Cole J

    update_date:2011-11-28 00:00:00