Ensemble feature selection: consistent descriptor subsets for multiple QSAR models.

Abstract:

:Selecting a small subset of descriptors from a large pool to build a predictive quantitative structure-activity relationship (QSAR) model is an important step in the QSAR modeling process. In general, subset selection is very hard to solve, even approximately, with guaranteed performance bounds. Traditional approaches employ deterministic or stochastic methods to obtain a descriptor subset that leads to an optimal model of a single type (such as linear regression or a neural network). With the development of ensemble modeling approaches, multiple models of differing types are individually developed resulting in different descriptor subsets for each model type. However, it is advantageous, from the point of view of developing interpretable QSAR models, to have a single set of descriptors that can be used for different model types. In this paper, we describe an approach to the selection of a single, optimal, subset of descriptors for multiple model types. We apply this approach to three data sets, covering both regression and classification, and show that the constraint of forcing different model types to use the same set of descriptors does not lead to a significant loss in predictive ability for the individual models considered. In addition, interpretations of the individual models developed using this approach indicate that they encode similar structure-activity trends.

journal_name

J Chem Inf Model

authors

Dutta D,Guha R,Wild D,Chen T

doi

10.1021/ci600563w

subject

Has Abstract

pub_date

2007-05-01 00:00:00

pages

989-97

issue

3

eissn

1549-9596

issn

1549-960X

journal_volume

47

pub_type

杂志文章
  • Scores of extended connectivity fingerprint as descriptors in QSPR study of melting point and aqueous solubility.

    abstract::QSPR studies, using scores of SciTegic's extended connectivity fingerprint as raw descriptors, were extended to the prediction of melting points and aqueous solubility of organic compounds. Robust partial least-squares models were developed that perform as well as the best published QSPR models for structurally divers...

    journal_title:Journal of chemical information and modeling

    pub_type: 杂志文章

    doi:10.1021/ci800024c

    authors: Zhou D,Alelyunas Y,Liu R

    更新日期:2008-05-01 00:00:00

  • Learning To Predict Reaction Conditions: Relationships between Solvent, Molecular Structure, and Catalyst.

    abstract::Reaction databases provide a great deal of useful information to assist planning of experiments but do not provide any interpretation or chemical concepts to accompany this information. In this work, reactions are labeled with experimental conditions, and network analysis shows that consistencies within clusters of da...

    journal_title:Journal of chemical information and modeling

    pub_type: 杂志文章

    doi:10.1021/acs.jcim.9b00313

    authors: Walker E,Kammeraad J,Goetz J,Robo MT,Tewari A,Zimmerman PM

    更新日期:2019-09-23 00:00:00

  • Molecular Structure Extraction from Documents Using Deep Learning.

    abstract::Chemical structure extraction from documents remains a hard problem because of both false positive identification of structures during segmentation and errors in the predicted structures. Current approaches rely on handcrafted rules and subroutines that perform reasonably well generally but still routinely encounter s...

    journal_title:Journal of chemical information and modeling

    pub_type: 杂志文章

    doi:10.1021/acs.jcim.8b00669

    authors: Staker J,Marshall K,Abel R,McQuaw CM

    更新日期:2019-03-25 00:00:00

  • Improving protocols for protein mapping through proper comparison to crystallography data.

    abstract::Computational approaches to fragment-based drug design (FBDD) can complement experiments and facilitate the identification of potential hot spots along the protein surface. However, the evaluation of computational methods for mapping binding sites frequently focuses upon the ability to reproduce crystallographic coord...

    journal_title:Journal of chemical information and modeling

    pub_type: 杂志文章

    doi:10.1021/ci300430v

    authors: Lexa KW,Carlson HA

    更新日期:2013-02-25 00:00:00

  • GESSE: Predicting Drug Side Effects from Drug-Target Relationships.

    abstract::The in silico prediction of unwanted side effects (SEs) caused by the promiscuous behavior of drugs and their targets is highly relevant to the pharmaceutical industry. Considerable effort is now being put into computational and experimental screening of several suspected off-target proteins in the hope that SEs might...

    journal_title:Journal of chemical information and modeling

    pub_type: 杂志文章

    doi:10.1021/acs.jcim.5b00120

    authors: Pérez-Nueno VI,Souchet M,Karaboga AS,Ritchie DW

    更新日期:2015-09-28 00:00:00

  • Simulation-Based Algorithm for Two-Dimensional Chemical Structure Diagram Generation of Complex Molecules and Ligand-Protein Interactions.

    abstract::Computer programs for structure diagram generation (SDG) are indispensable cheminformatic tools that translate one- or three-dimensional (1D or 3D) chemical structure data stored in electronic formats to human-readable 2D depictions. Although many such programs are known, only a moderate part of chemical space can be ...

    journal_title:Journal of chemical information and modeling

    pub_type: 杂志文章

    doi:10.1021/acs.jcim.6b00391

    authors: Frączek T

    更新日期:2016-12-27 00:00:00

  • Ligand coordinate analysis of SC-558 from the active site to the surface of COX-2: a molecular dynamics study.

    abstract::We have performed a ligand coordinate analysis to monitor the movement of the inhibitor SC-558 from the active site of the COX-2 protein to the exterior using molecular dynamics techniques. This study provides an insight into the intermolecular interactions formed by the ligand during this journey. The published cryst...

    journal_title:Journal of chemical information and modeling

    pub_type: 杂志文章

    doi:10.1021/ci050142i

    authors: Sai Ram KV,Rambabu G,Sarma JA,Desiraju GR

    更新日期:2006-07-01 00:00:00

  • Novel inhibitors of human histone deacetylase (HDAC) identified by QSAR modeling of known inhibitors, virtual screening, and experimental validation.

    abstract::Inhibitors of histone deacetylases (HDACIs) have emerged as a new class of drugs for the treatment of human cancers and other diseases because of their effects on cell growth, differentiation, and apoptosis. In this study we have developed several quantitative structure-activity relationship (QSAR) models for 59 chemi...

    journal_title:Journal of chemical information and modeling

    pub_type: 杂志文章

    doi:10.1021/ci800366f

    authors: Tang H,Wang XS,Huang XP,Roth BL,Butler KV,Kozikowski AP,Jung M,Tropsha A

    更新日期:2009-02-01 00:00:00

  • H274Y's Effect on Oseltamivir Resistance: What Happens Before the Drug Enters the Binding Site.

    abstract::Increased reports of oseltamivir (OTV)-resistant strains of the influenza virus, such as the H274Y mutation on its neuraminidase (NA), have created some cause for concern. Many studies have been conducted in the attempt to uncover the mechanism of OTV resistance in H274Y NA. However, most of the reported studies on H2...

    journal_title:Journal of chemical information and modeling

    pub_type: 杂志文章

    doi:10.1021/acs.jcim.5b00331

    authors: Yusuf M,Mohamed N,Mohamad S,Janezic D,Damodaran KV,Wahab HA

    更新日期:2016-01-25 00:00:00

  • Effect of input differences on the results of docking calculations.

    abstract::The sensitivity of docking calculations to the geometry of the input ligand was studied. It was found that even small changes in the ligand input conformation can lead to large differences in the geometries and scores of the resulting docked poses. The accuracy of docked poses produced from different ligand input stru...

    journal_title:Journal of chemical information and modeling

    pub_type: 杂志文章

    doi:10.1021/ci9000629

    authors: Feher M,Williams CI

    更新日期:2009-07-01 00:00:00

  • Role of halogen bonds in thyroid hormone receptor selectivity: pharmacophore-based 3D-QSSR studies.

    abstract::Most physiological effects of thyroid hormones are mediated by the two thyroid hormone receptor subtypes, TRalpha and TRbeta. Several pharmacological effects mediated by TRbeta might be beneficial in important medical conditions such as obesity, hypercholesterolemia and diabetes, and selective TRbeta activation may el...

    journal_title:Journal of chemical information and modeling

    pub_type: 杂志文章

    doi:10.1021/ci900316e

    authors: Valadares NF,Salum LB,Polikarpov I,Andricopulo AD,Garratt RC

    更新日期:2009-11-01 00:00:00

  • Assessment of the Sampling Performance of Multiple-Copy Dynamics versus a Unique Trajectory.

    abstract::The goal of the present study was to ascertain the differential performance of a long molecular dynamics trajectory versus several shorter ones starting from different points in the phase space and covering the same sampling time. For this purpose, we selected the 16-mer peptide Bak16BH3 as a model for study and carri...

    journal_title:Journal of chemical information and modeling

    pub_type: 杂志文章

    doi:10.1021/acs.jcim.6b00347

    authors: Perez JJ,Tomas MS,Rubio-Martinez J

    更新日期:2016-10-24 00:00:00

  • Predicting Toxicities of Diverse Chemical Pesticides in Multiple Avian Species Using Tree-Based QSAR Approaches for Regulatory Purposes.

    abstract::A comprehensive safety evaluation of chemicals should require toxicity assessment in both the aquatic and terrestrial test species. Due to the application practices and nature of chemical pesticides, the avian toxicity testing is considered as an essential requirement in the risk assessment process. In this study, tre...

    journal_title:Journal of chemical information and modeling

    pub_type: 杂志文章

    doi:10.1021/acs.jcim.5b00139

    authors: Basant N,Gupta S,Singh KP

    更新日期:2015-07-27 00:00:00

  • Force Field Benchmark of Amino Acids. 2. Partition Coefficients between Water and Organic Solvents.

    abstract::The partitioning of amino acids between water and apolar environments is of vital importance in protein function and drug delivery. Here we present an extensive benchmark for octanol/water (log Poct), chloroform/water (log Pclf), and cyclohexane/water (log Pchx) partition coefficients of neutral amino acid side chain ...

    journal_title:Journal of chemical information and modeling

    pub_type: 杂志文章

    doi:10.1021/acs.jcim.8b00493

    authors: Zhang H,Jiang Y,Cui Z,Yin C

    更新日期:2018-08-27 00:00:00

  • FOG: Fragment Optimized Growth algorithm for the de novo generation of molecules occupying druglike chemical space.

    abstract::An essential feature of all practical de novo molecule generating programs is the ability to focus the potential combinatorial explosion of grown molecules on a desired chemical space. It is a daunting task to balance the generation of new molecules with limitations on growth that produce desired features such as stab...

    journal_title:Journal of chemical information and modeling

    pub_type: 杂志文章

    doi:10.1021/ci9000458

    authors: Kutchukian PS,Lou D,Shakhnovich EI

    更新日期:2009-07-01 00:00:00

  • RosENet: Improving Binding Affinity Prediction by Leveraging Molecular Mechanics Energies with an Ensemble of 3D Convolutional Neural Networks.

    abstract::The worldwide increase and proliferation of drug resistant microbes, coupled with the lag in new drug development, represents a major threat to human health. In order to reduce the time and cost for exploring the chemical search space, drug discovery increasingly relies on computational biology approaches. One key ste...

    journal_title:Journal of chemical information and modeling

    pub_type: 杂志文章

    doi:10.1021/acs.jcim.0c00075

    authors: Hassan-Harrirou H,Zhang C,Lemmin T

    更新日期:2020-06-22 00:00:00

  • Consensus adaptation of fields for molecular comparison (AFMoC) models incorporate ligand and receptor conformational variability into tailor-made scoring functions.

    abstract::Taking into account dynamical behavior and/or structural inaccuracies of receptor-ligand systems becomes increasingly important in structure-based drug design. Here, we describe the development of consensus Adaptation of Fields for Molecular Comparison (AFMoC) (abbreviated as AFMoCcon) models that account for multiple...

    journal_title:Journal of chemical information and modeling

    pub_type: 杂志文章

    doi:10.1021/ci7002472

    authors: Breu B,Silber K,Gohlke H

    更新日期:2007-11-01 00:00:00

  • Computational and conformational evaluation of FTase alternative substrates: insight into a novel enzyme binding pocket.

    abstract::Protein farnesyltransferase (FTase) is an important anticancer drug target. In an effort to develop isoprenoid diphosphate-based FTase inhibitors, striking variations have been observed in the ability of conservatively modified analogues to bind to the enzyme. For example, 2Z-GGPP is an alternative substrate with high...

    journal_title:Journal of chemical information and modeling

    pub_type: 杂志文章

    doi:10.1021/ci0496550

    authors: Henriksen BS,Zahn TJ,Evanseck JD,Firestine SM,Gibbs RA

    更新日期:2005-07-01 00:00:00

  • Alanine Scanning Effects on the Biochemical and Biophysical Properties of Intrinsically Disordered Proteins: A Case Study of the Histidine to Alanine Mutations in Amyloid-β42.

    abstract::Alanine scanning is a tool in molecular biology that is commonly used to evaluate the contribution of a specific amino acid residue to the stability and function of a protein. Additionally, this tool is also used to understand whether the side chain of a specific amino acid residue plays a role in the protein's bioact...

    journal_title:Journal of chemical information and modeling

    pub_type: 杂志文章

    doi:10.1021/acs.jcim.8b00926

    authors: Coskuner-Weber O,Uversky VN

    更新日期:2019-02-25 00:00:00

  • Comparison of Implicit and Explicit Solvation Models for Iota-Cyclodextrin Conformation Analysis from Replica Exchange Molecular Dynamics.

    abstract::Large ring cyclodextrins have become increasingly important for drug delivery applications. In this work, we have performed replica-exchange molecular dynamics simulations using both implicit and explicit water solvation models to study the conformational diversity of iota-cyclodextrin containing 14 α-1,4 glycosidic l...

    journal_title:Journal of chemical information and modeling

    pub_type: 杂志文章

    doi:10.1021/acs.jcim.6b00595

    authors: Khuntawee W,Kunaseth M,Rungnim C,Intagorn S,Wolschann P,Kungwan N,Rungrotmongkol T,Hannongbua S

    更新日期:2017-04-24 00:00:00

  • Computational simulations of the interactions between acetyl-coenzyme-A carboxylase and clodinafop: resistance mechanism due to active and nonactive site mutations.

    abstract::Grass weed populations resistant to acetyl-CoA carboxylase-inhibiting (ACCase; EC 6.4.1.2) herbicides represent a major problem for the sustainable development of modern agriculture. In the present study, extensive computational simulations, including homology modeling, molecular dynamics (MD) simulations, and molecul...

    journal_title:Journal of chemical information and modeling

    pub_type: 杂志文章

    doi:10.1021/ci900174d

    authors: Zhu XL,Ge-Fei H,Zhan CG,Yang GF

    更新日期:2009-08-01 00:00:00

  • Chemical Topic Modeling: Exploring Molecular Data Sets Using a Common Text-Mining Approach.

    abstract::Big data is one of the key transformative factors which increasingly influences all aspects of modern life. Although this transformation brings vast opportunities it also generates novel challenges, not the least of which is organizing and searching this data deluge. The field of medicinal chemistry is not different: ...

    journal_title:Journal of chemical information and modeling

    pub_type: 杂志文章

    doi:10.1021/acs.jcim.7b00249

    authors: Schneider N,Fechner N,Landrum GA,Stiefl N

    更新日期:2017-08-28 00:00:00

  • First Multitarget Chemo-Bioinformatic Model To Enable the Discovery of Antibacterial Peptides against Multiple Gram-Positive Pathogens.

    abstract::Antimicrobial peptides (AMPs) have emerged as promising therapeutic alternatives to fight against the diverse infections caused by different pathogenic microorganisms. In this context, theoretical approaches in bioinformatics have paved the way toward the creation of several in silico models capable of predicting anti...

    journal_title:Journal of chemical information and modeling

    pub_type: 杂志文章

    doi:10.1021/acs.jcim.5b00630

    authors: Speck-Planche A,Kleandrova VV,Ruso JM,Cordeiro MN

    更新日期:2016-03-28 00:00:00

  • DiSCuS: an open platform for (not only) virtual screening results management.

    abstract::DiSCuS, a "Database System for Compound Selection", has been developed. The primary goal of DiSCuS is to aid researchers in the steps subsequent to generating high-throughput virtual screening (HTVS) results, such as selection of compounds for further study, purchase, or synthesis. To do so, DiSCuS provides (1) a stor...

    journal_title:Journal of chemical information and modeling

    pub_type: 杂志文章

    doi:10.1021/ci400587f

    authors: Wójcikowski M,Zielenkiewicz P,Siedlecki P

    更新日期:2014-01-27 00:00:00

  • Substituted 4,5'-Bithiazoles as Catalytic Inhibitors of Human DNA Topoisomerase IIα.

    abstract::Human type II topoisomerases, molecular motors that alter the DNA topology, are a major target of modern chemotherapy. Groups of catalytic inhibitors represent a new approach to overcome the known limitations of topoisomerase II poisons such as cardiotoxicity and induction of secondary tumors. Here, we present a class...

    journal_title:Journal of chemical information and modeling

    pub_type: 杂志文章

    doi:10.1021/acs.jcim.0c00202

    authors: Bergant Loboda K,Janežič M,Štampar M,Žegura B,Filipič M,Perdih A

    更新日期:2020-07-27 00:00:00

  • Molecular Structure-Based Large-Scale Prediction of Chemical-Induced Gene Expression Changes.

    abstract::The quantitative structure-activity relationship (QSAR) approach has been used to model a wide range of chemical-induced biological responses. However, it had not been utilized to model chemical-induced genomewide gene expression changes until very recently, owing to the complexity of training and evaluating a very la...

    journal_title:Journal of chemical information and modeling

    pub_type: 杂志文章

    doi:10.1021/acs.jcim.7b00281

    authors: Liu R,AbdulHameed MDM,Wallqvist A

    更新日期:2017-09-25 00:00:00

  • GPCR-Bench: A Benchmarking Set and Practitioners' Guide for G Protein-Coupled Receptor Docking.

    abstract::Virtual screening is routinely used to discover new ligands and in particular new ligand chemotypes for G protein-coupled receptors (GPCRs). To prepare for a virtual screen, we often tailor a docking protocol that will enable us to select the best candidates for further screening. To aid this, we created GPCR-Bench, a...

    journal_title:Journal of chemical information and modeling

    pub_type: 杂志文章

    doi:10.1021/acs.jcim.5b00660

    authors: Weiss DR,Bortolato A,Tehan B,Mason JS

    更新日期:2016-04-25 00:00:00

  • Multifingerprint based similarity searches for targeted class compound selection.

    abstract::Molecular fingerprints are widely used for similarity-based virtual screening in drug discovery projects. In this paper we discuss the performance and the complementarity of nine two-dimensional fingerprints (Daylight, Unity, AlFi, Hologram, CATS, TRUST, Molprint 2D, ChemGPS, and ALOGP) in retrieving active molecules ...

    journal_title:Journal of chemical information and modeling

    pub_type: 杂志文章

    doi:10.1021/ci0504723

    authors: Kogej T,Engkvist O,Blomberg N,Muresan S

    更新日期:2006-05-01 00:00:00

  • Adaptive BP-Dock: An Induced Fit Docking Approach for Full Receptor Flexibility.

    abstract::We present an induced fit docking approach called Adaptive BP-Dock that integrates perturbation response scanning (PRS) with the flexible docking protocol of RosettaLigand in an adaptive manner. We first perturb the binding pocket residues of a receptor and obtain a new conformation based on the residue response fluct...

    journal_title:Journal of chemical information and modeling

    pub_type: 杂志文章

    doi:10.1021/acs.jcim.5b00587

    authors: Bolia A,Ozkan SB

    更新日期:2016-04-25 00:00:00

  • Dependence of QSAR models on the selection of trial descriptor sets: a demonstration using nanotoxicity endpoints of decorated nanotubes.

    abstract::Little attention has been given to the selection of trial descriptor sets when designing a QSAR analysis even though a great number of descriptor classes, and often a greater number of descriptors within a given class, are now available. This paper reports an effort to explore interrelationships between QSAR models an...

    journal_title:Journal of chemical information and modeling

    pub_type: 杂志文章

    doi:10.1021/ci3005308

    authors: Shao CY,Chen SZ,Su BH,Tseng YJ,Esposito EX,Hopfinger AJ

    更新日期:2013-01-28 00:00:00