Abstract:
Selecting a small subset of descriptors from a large pool to build a predictive quantitative structure-activity relationship (QSAR) model is an important step in the QSAR modeling process. In general, subset selection is very hard to solve, even approximately, with guaranteed performance bounds. Traditional approaches employ deterministic or stochastic methods to obtain a descriptor subset that leads to an optimal model of a single type (such as linear regression or a neural network). With the development of ensemble modeling approaches, multiple models of differing types are developed individually, resulting in a different descriptor subset for each model type. However, from the point of view of developing interpretable QSAR models, it is advantageous to have a single set of descriptors that can be used for different model types. In this paper, we describe an approach to the selection of a single, optimal subset of descriptors for multiple model types. We apply this approach to three data sets, covering both regression and classification, and show that the constraint of forcing different model types to use the same set of descriptors does not lead to a significant loss in predictive ability for the individual models considered. In addition, interpretations of the individual models developed using this approach indicate that they encode similar structure-activity trends.
journal_name:J Chem Inf Model
journal_title:Journal of chemical information and modeling
authors:Dutta D,Guha R,Wild D,Chen T
doi:10.1021/ci600563w
subject:Has Abstract
pub_date:2007-05-01 00:00:00
pages:989-97
issue:3
eissn:1549-960X
issn:1549-9596
journal_volume:47
pub_type: Journal Article
abstract::The cannabinoid receptor subtype 2 (CB2) is a promising therapeutic target for blood cancer, pain relief, osteoporosis, and immune system disease. The recent withdrawal of Rimonabant, which targets another closely related cannabinoid receptor (CB1), accentuates the importance of selectivity for the development of CB2 ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci3003914
update_date:2013-01-28 00:00:00
abstract::Membrane-bound protein receptors are a primary biological drug target, but the computational analysis of membrane proteins has been limited. In order to improve molecular mechanics Poisson-Boltzmann surface area (MMPBSA) binding free energy calculations for membrane protein-ligand systems, we have optimized a new hete...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.9b00363
update_date:2019-06-24 00:00:00
abstract::We investigate unexpectedly short non-covalent distances (<85% of the sum of van der Waals radii) in X-ray crystal structures of proteins. We curate over 11 000 high-quality protein crystal structures and an ultra-high-resolution (1.2 Å or better) subset containing >900 structures. Although our non-covalent distance c...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.9b00144
update_date:2019-05-28 00:00:00
abstract::Increased reports of oseltamivir (OTV)-resistant strains of the influenza virus, such as the H274Y mutation on its neuraminidase (NA), have created some cause for concern. Many studies have been conducted in the attempt to uncover the mechanism of OTV resistance in H274Y NA. However, most of the reported studies on H2...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.5b00331
update_date:2016-01-25 00:00:00
abstract::The solvation layer surrounding a protein is clearly an intrinsic part of protein structure-dynamics-function, and our understanding of how the hydration dynamics influences protein function is emerging. We have recently reported simulations indicating a correlation between regional hydration dynamics and the structur...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.9b00009
update_date:2019-05-28 00:00:00
abstract::Computational approaches to fragment-based drug design (FBDD) can complement experiments and facilitate the identification of potential hot spots along the protein surface. However, the evaluation of computational methods for mapping binding sites frequently focuses upon the ability to reproduce crystallographic coord...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci300430v
update_date:2013-02-25 00:00:00
abstract::New molecular descriptors, RED (Renyi entropy descriptors), based on the generalized entropies introduced by Renyi are presented. Topological descriptors based on molecular features have proven to be useful for describing molecular profiles. Renyi entropy is used as a variability measure to contract a feature-pair dis...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci900275w
update_date:2009-11-01 00:00:00
abstract::An essential feature of all practical de novo molecule generating programs is the ability to focus the potential combinatorial explosion of grown molecules on a desired chemical space. It is a daunting task to balance the generation of new molecules with limitations on growth that produce desired features such as stab...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci9000458
update_date:2009-07-01 00:00:00
abstract::The ligand binding determinants for the angiotensin II type 1 receptor (AT1R), a G protein-coupled receptor (GPCR), have been characterized by means of computer simulations. As a first step, a pharmacophore model of various known AT1R ligands exhibiting a wide range of binding affinities was generated. Second, a struc...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci400400m
update_date:2013-11-25 00:00:00
abstract::Predictive metabolism methods can be used in drug discovery projects to enhance the understanding of structure-metabolism relationships. The present study uses data mining methods to exploit biotransformation data that have been recorded in the MDL Metabolite database. Reacting center fingerprints were derived from a ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci600376q
update_date:2007-03-01 00:00:00
abstract::SIRT2, which is a NAD+ (nicotinamide adenine dinucleotide) dependent deacetylase, has been demonstrated to play an important role in the occurrence and development of a variety of diseases such as cancer, ischemia-reperfusion, and neurodegenerative diseases. Small molecule inhibitors of SIRT2 are thought to be potenti...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.6b00714
update_date:2017-04-24 00:00:00
abstract::Our main objective was to compile a data set of high-quality protein-fragment complexes and make it publicly available. Once assembled, the data set was challenged using docking procedures to address the following questions: (i) Can molecular docking correctly reproduce the experimentally solved structures? (ii) How t...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci2003363
update_date:2011-11-28 00:00:00
abstract::Extending the original training data with simulated unobserved data points has proven powerful to increase both the generalization ability of predictive models and their robustness against changes in the structure of data (e.g., systematic drifts in the response variable) in diverse areas such as the analysis of spect...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.5b00570
update_date:2015-12-28 00:00:00
abstract::The purpose of this investigation is to contribute to the development of new anticonvulsant drugs to treat patients with refractory epilepsy. We applied a virtual screening protocol that involved the search into molecular databases of new compounds and known drugs to find small molecules that interact with the open co...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.7b00721
update_date:2018-07-23 00:00:00
abstract::In this account, a rapid retrosynthesis-based scoring method for the assessment of synthetic accessibility of drug-like molecules, called RASA (Retrosynthesis-based Assessment of Synthetic Accessibility) is devised. RASA first constructs a synthesis tree for the target molecule based on retrosynthetic analysis; in thi...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci100216g
update_date:2011-10-24 00:00:00
abstract::Finding a canonical ordering of the atoms in a molecule is a prerequisite for generating a unique representation of the molecule. The canonicalization of a molecule is usually accomplished by applying some sort of graph relaxation algorithm, the most common of which is the Morgan algorithm. There are known issues with...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.5b00543
update_date:2015-10-26 00:00:00
abstract::Large ring cyclodextrins have become increasingly important for drug delivery applications. In this work, we have performed replica-exchange molecular dynamics simulations using both implicit and explicit water solvation models to study the conformational diversity of iota-cyclodextrin containing 14 α-1,4 glycosidic l...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.6b00595
update_date:2017-04-24 00:00:00
abstract::The applicability and scope of 3D QSAR methods (CoMFA, CoMSIA) to screen databases are examined. A protocol requiring minimal user intervention has been established to align training and test set molecules using FlexS. As model system isozymes of human carbonic anhydrase (hCA) are used, all results are exemplified stu...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci7002945
update_date:2008-02-01 00:00:00
abstract::The importance of thorough analyses of the secondary structures in proteins as basic structural units cannot be overemphasized. Although recent computational methods have achieved reasonably high accuracy for predicting secondary structures from amino acid sequences, a simple and fundamental empirical approach to char...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci900452z
update_date:2010-04-26 00:00:00
abstract::Traditionally, a drug potency is expressed in terms of thermodynamic quantities, mostly Kd, and empirical IC50 values. Although binding affinity as an estimate of drug activity remains relevant, it is increasingly clear that it is also important to include (un)binding kinetic parameters in the characterization of pote...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.8b00518
update_date:2018-11-26 00:00:00
abstract::This article presents the computation of both inter- and intramolecular hydrogen bond strengths from first-principles. Quantum chemical calculations conducted at the dispersion-corrected density functional theory level including free energy and solvation contributions are conducted for (i) one-to-one hydrogen-bonded c...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.9b00132
update_date:2019-09-23 00:00:00
abstract::The d-Ala:d-Lac ligase, VanA, plays a critical role in the resistance of vancomycin. Indeed, it is involved in the synthesis of a peptidoglycan precursor, to which vancomycin cannot bind. The reaction catalyzed by VanA requires the opening of the so-called "ω-loop", so that the substrates can enter the active site. He...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.6b00211
update_date:2016-09-26 00:00:00
abstract::We present a theoretical study on the performance of ensemble docking methodologies considering multiple protein structures. We perform a theoretical analysis of pose prediction experiments which is completely unbiased, as we make no assumptions about specific scoring functions, search paradigms, protein structures, o...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci2002796
update_date:2011-11-28 00:00:00
abstract::In this study, we tried to establish a general scheme to create a model that could predict the affinity of small compounds to their target proteins. This scheme consists of a search for ligand-binding sites on a protein, a generation of bound conformations (poses) of ligands in each of the sites by docking, identifica...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci800313h
update_date:2009-04-01 00:00:00
abstract::A giant technological leap in the field of cryo-electron microscopy (cryo-EM) has assured the achievement of near-atomic resolution structures of biological macromolecules. As a recognition of this accomplishment, the Nobel Prize in Chemistry was awarded in 2017 to Jacques Dubochet, Joachim Frank, and Richard Henderso...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.9b01015
update_date:2020-05-26 00:00:00
abstract::The consistent handling of molecules is probably the most basic and important requirement in the field of cheminformatics. Reliable results can only be obtained if the underlying calculations are independent of the specific way molecules are represented in the input data. However, ensuring consistency is a complex tas...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci400724v
update_date:2014-03-24 00:00:00
abstract::The important role of water molecules in protein-ligand binding energetics has attracted wide attention in recent years. A range of computational methods has been developed to predict the favorable locations of water molecules in a protein binding pocket. Most of the current methods are based on extensive molecular dy...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.9b00619
update_date:2020-09-28 00:00:00
abstract::The modeling of nonlinear descriptor-target relationships is a topic of considerable interest in drug discovery. We, herein, continue reporting the use of the self-organizing map-a nonlinear, topology-preserving pattern recognition technique that exhibits considerable promise in modeling and decoding these relationshi...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci0500841
update_date:2006-01-01 00:00:00
abstract::Various cages are constructed by using three types of caps: f-cap (derived from spherical fullerenes by deleting zones of various size), kf-cap (obtainable by cutting off the polar ring, of size k), and t-cap ("tubercule"-cap). Building ways are presented, some of them being possible isomerization routes in the real c...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci049738g
update_date:2005-03-01 00:00:00
abstract::With the emergence of large collections of protein-ligand complexes complemented by binding data, as found in PDBbind or BindingMOAD, new opportunities for parametrizing and evaluating scoring functions have arisen. With huge data collections available, it becomes feasible to fit scoring functions in a QSAR style, i.e...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci100264e
update_date:2010-11-22 00:00:00