Relationships between Molecular Complexity, Biological Activity, and Structural Diversity.

Abstract:

Following the theoretical model by Hann et al., moderately complex structures are preferable lead compounds since they lead to specific binding events involving the complete ligand molecule. To make this concept usable in practice for library design, we studied the relationship between several complexity measures and the biological activity of ligand molecules. We used the historical IC50/EC50 summary data of 160 assays run at Novartis covering a diverse range of targets, among them kinases, proteases, GPCRs, and protein-protein interactions, and compared them against a background of "inactive" compounds that have been screened for 2 years but have never shown any activity in any primary screen. As complexity measures we used the number of structural features present in various molecular fingerprints and descriptors. We found that, in general, the average complexity of the ligands increased with increasing activity, and we could therefore establish a minimum number of structural features in each descriptor needed for biological activity. Especially well suited in this context were the Similog keys and circular substructure fingerprints. These are the same descriptors that perform especially well in the identification of bioactive compounds by similarity search, suggesting that the structural features encoded in these descriptors have a high relevance for bioactivity. Since the number of features correlates with the number of atoms present in the molecule, the number of atoms also serves as a reasonable complexity measure, and larger molecules have, in general, higher activities. Due to the relationship between feature counts and densities on the one hand and biological activity on the other, the size bias present in almost all similarity coefficients becomes especially important. Diversity selections using these coefficients can influence the overall complexity of the resulting set of molecules, which has an impact on the biological activity that they exhibit.
When sphere-exclusion-based diversity selection methods such as OptiSim are used together with the Tanimoto dissimilarity, the average feature count distribution of the resulting selections is shifted toward lower complexity than that of the original set, particularly when tight diversity constraints are applied. This size bias reduces the fraction of molecules in the subsets that have the complexity required for high, submicromolar activity. None of the diversity selection methods studied, namely OptiSim, divisive K-means clustering, and self-organizing maps, yielded subsets covering the activity space of the IC50 summary data set better than subsets selected randomly.
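
The size bias described in the abstract can be illustrated with a small sketch. This is not the authors' code and it is not OptiSim: it is a plain greedy sphere-exclusion pass over toy data, written only to show why a Tanimoto-based dissimilarity constraint tends to favor molecules with few features. Fingerprints are modeled as sets of integer feature ids, and the names `feature_count`, `tanimoto`, `sphere_exclusion`, and `mean_feature_count`, as well as the toy library, are assumptions made for illustration.

```python
from typing import FrozenSet, List, Sequence

# Assumption: a fingerprint is modeled as a set of integer feature ids.
Fingerprint = FrozenSet[int]


def feature_count(fp: Fingerprint) -> int:
    """Complexity measure in the sense of the abstract: number of features present."""
    return len(fp)


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto similarity: shared features divided by features in either set."""
    union = len(a | b)
    return 1.0 if union == 0 else len(a & b) / union


def sphere_exclusion(fps: Sequence[Fingerprint], min_dissimilarity: float) -> List[int]:
    """Greedy sphere exclusion (illustrative, not OptiSim).

    Walk the library in order and keep a molecule only if its Tanimoto
    dissimilarity to every molecule already kept is at least the threshold.
    """
    selected: List[int] = []
    for i, fp in enumerate(fps):
        if all(1.0 - tanimoto(fp, fps[j]) >= min_dissimilarity for j in selected):
            selected.append(i)
    return selected


def mean_feature_count(fps: Sequence[Fingerprint], indices: Sequence[int]) -> float:
    """Average feature count over the molecules at the given indices."""
    return sum(feature_count(fps[i]) for i in indices) / len(indices)


# Toy library: three "large" molecules with 20 features each, where neighbors
# overlap in 10 features, and three "small" molecules with 2 features each.
large = [frozenset(range(start, start + 20)) for start in (0, 10, 20)]
small = [frozenset({100, 101}), frozenset({102, 103}), frozenset({104, 105})]
library = large + small

picked = sphere_exclusion(library, min_dissimilarity=0.8)
mean_all = mean_feature_count(library, range(len(library)))
mean_picked = mean_feature_count(library, picked)
print(picked, mean_all, mean_picked)
```

In this toy set the large molecules that share features fall inside each other's exclusion spheres, while the small molecules share nothing and are all kept, so the mean feature count of the selection drops below that of the full library. Real fingerprints and real libraries will behave less cleanly; the sketch only shows the direction of the effect.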

journal_name

J Chem Inf Model

authors

Schuffenhauer A,Brown N,Selzer P,Ertl P,Jacoby E

doi

10.1021/ci0503558

keywords:

subject

Has Abstract

pub_date

2006-03-01 00:00:00

pages

525-35

issue

2

eissn

1549-960X

issn

1549-9596

journal_volume

46

pub_type

Journal Article
  • Accurate Hit Estimation for Iterative Screening Using Venn-ABERS Predictors.

    abstract::Iterative screening has emerged as a promising approach to increase the efficiency of high-throughput screening (HTS) campaigns in drug discovery. By learning from a subset of the compound library, inferences on what compounds to screen next can be made by predictive models. One of the challenges of iterative screenin...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.8b00724

    authors: Buendia R,Kogej T,Engkvist O,Carlsson L,Linusson H,Johansson U,Toccaceli P,Ahlberg E

    update_date:2019-03-25 00:00:00

  • FragPELE: Dynamic Ligand Growing within a Binding Site. A Novel Tool for Hit-To-Lead Drug Design.

    abstract::The early stages of drug discovery rely on hit-to-lead programs, where initial hits undergo partial optimization to improve binding affinities for their biological target. This is an expensive and time-consuming process, requiring multiple iterations of trial and error designs, an ideal scenario for applying computer ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00938

    authors: Perez C,Soler D,Soliva R,Guallar V

    update_date:2020-03-23 00:00:00

  • Protein flexibility in virtual screening: the BACE-1 case study.

    abstract::Simulating protein flexibility is a major issue in the docking-based drug-design process for which a single methodological solution does not exist. In our search of new anti-Alzheimer ligands, we were faced with the challenge of including receptor plasticity in a virtual screening campaign aimed at finding new β-secre...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci300390h

    authors: Cosconati S,Marinelli L,Di Leva FS,La Pietra V,De Simone A,Mancini F,Andrisano V,Novellino E,Goodsell DS,Olson AJ

    update_date:2012-10-22 00:00:00

  • Ligand binding determinants for angiotensin II type 1 receptor from computer simulations.

    abstract::The ligand binding determinants for the angiotensin II type 1 receptor (AT1R), a G protein-coupled receptor (GPCR), have been characterized by means of computer simulations. As a first step, a pharmacophore model of various known AT1R ligands exhibiting a wide range of binding affinities was generated. Second, a struc...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci400400m

    authors: Matsoukas MT,Cordomí A,Ríos S,Pardo L,Tselios T

    update_date:2013-11-25 00:00:00

  • Prediction of pH-dependent aqueous solubility of druglike molecules.

    abstract::In the present work, the Henderson-Hasselbalch (HH) equation has been employed for the development of a tool for the prediction of pH-dependent aqueous solubility of drugs and drug candidates. A new prediction method for the intrinsic solubility was developed, based on artificial neural networks that have been trained...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci600292q

    authors: Hansen NT,Kouskoumvekaki I,Jørgensen FS,Brunak S,Jónsdóttir SO

    update_date:2006-11-01 00:00:00

  • Consensus QSAR models: do the benefits outweigh the complexity?

    abstract::This study has assessed the use of consensus regression, as compared to single multiple linear regression, models for the development of quantitative structure-activity relationships (QSARs). To provide a comparison, four data sets of varying size and complexity were analyzed: silastic membrane flux, toxicity of pheno...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci700016d

    authors: Hewitt M,Cronin MT,Madden JC,Rowe PH,Johnson C,Obi A,Enoch SJ

    update_date:2007-07-01 00:00:00

  • Identifying biologically active compound classes using phenotypic screening data and sampling statistics.

    abstract::Scoring the activity of compounds in phenotypic high-throughput assays presents a unique challenge because of the limited resolution and inherent measurement error of these assays. Techniques that leverage the structural similarity of compounds within an assay can be used to improve the hit-recovery rate from screenin...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci050087d

    authors: Klekota J,Brauner E,Schreiber SL

    update_date:2005-11-01 00:00:00

  • Use of 3D QSAR models for database screening: a feasibility study.

    abstract::The applicability and scope of 3D QSAR methods (CoMFA, CoMSIA) to screen databases are examined. A protocol requiring minimal user intervention has been established to align training and test set molecules using FlexS. As model system isozymes of human carbonic anhydrase (hCA) are used, all results are exemplified stu...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci7002945

    authors: Hillebrecht A,Klebe G

    update_date:2008-02-01 00:00:00

  • Inner and Outer Recursive Neural Networks for Chemoinformatics Applications.

    abstract::Deep learning methods applied to problems in chemoinformatics often require the use of recursive neural networks to handle data with graphical structure and variable size. We present a useful classification of recursive neural network approaches into two classes, the inner and outer approach. The inner approach uses r...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00384

    authors: Urban G,Subrahmanya N,Baldi P

    update_date:2018-02-26 00:00:00

  • Support vector regression scoring of receptor-ligand complexes for rank-ordering and virtual screening of chemical libraries.

    abstract::The community structure-activity resource (CSAR) data sets are used to develop and test a support vector machine-based scoring function in regression mode (SVR). Two scoring functions (SVR-KB and SVR-EP) are derived with the objective of reproducing the trend of the experimental binding affinities provided within the ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci200078f

    authors: Li L,Wang B,Meroueh SO

    update_date:2011-09-26 00:00:00

  • Leave-cluster-out cross-validation is appropriate for scoring functions derived from diverse protein data sets.

    abstract::With the emergence of large collections of protein-ligand complexes complemented by binding data, as found in PDBbind or BindingMOAD, new opportunities for parametrizing and evaluating scoring functions have arisen. With huge data collections available, it becomes feasible to fit scoring functions in a QSAR style, i.e...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci100264e

    authors: Kramer C,Gedeck P

    update_date:2010-11-22 00:00:00

  • Homology modeling and docking evaluation of aminergic G protein-coupled receptors.

    abstract::We report the development of homology models of dopamine (D(2), D(3), and D(4)), serotonin (5-HT(1B), 5-HT(2A), 5-HT(2B), and 5-HT(2C)), histamine (H(1)), and muscarinic (M(1)) receptors, based on the high-resolution structure of the beta(2)-adrenergic receptor. The homology models were built and refined using Prime. ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci900444q

    authors: McRobb FM,Capuano B,Crosby IT,Chalmers DK,Yuriev E

    update_date:2010-04-26 00:00:00

  • BFMP: a method for discretizing and visualizing pyranose conformations.

    abstract::We report a new classification method for pyranose ring conformations called Best-fit, Four-Membered Plane (BFMP), which describes pyranose ring conformations based on reference planes defined by four atoms. The method is able to characterize all asymmetrical and symmetrical shapes of a pyran ring, is readily automate...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci500325b

    authors: Makeneni S,Foley BL,Woods RJ

    update_date:2014-10-27 00:00:00

  • Flexophore, a new versatile 3D pharmacophore descriptor that considers molecular flexibility.

    abstract::A novel pharmacophore descriptor Flexophore is presented, which considers molecular flexibility when comparing descriptor similarities. The descriptor is a complete reduced graph of the underlying molecule. Its nodes are represented by enhanced MM2 atom types, while the edge descriptions encode the molecular flexibili...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci700359j

    authors: von Korff M,Freyss J,Sander T

    update_date:2008-04-01 00:00:00

  • CoNTub v2.0--algorithms for constructing C3-symmetric models of three-nanotube junctions.

    abstract::Here, a method is described for easily building three-carbon nanotube junctions. It allows the geometry to be found and bond connectivity of C(3) symmetric nanotube junctions to be established. Such junctions may present a variable degree of pyramidalization and are composed of three identical carbon nanotubes with ar...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci200056p

    authors: Melchor S,Martin-Martinez FJ,Dobado JA

    update_date:2011-06-27 00:00:00

  • Direct Observation of β-Barrel Intermediates in the Self-Assembly of Toxic SOD1(28-38) and Absence in Nontoxic Glycine Mutants.

    abstract::Soluble low-molecular-weight oligomers formed during the early stage of amyloid aggregation are considered the major toxic species in amyloidosis. The structure-function relationship between oligomeric assemblies and the cytotoxicity in amyloid diseases are still elusive due to the heterogeneous and transient nature o...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c01319

    authors: Sun Y,Huang J,Duan X,Ding F

    update_date:2021-01-14 00:00:00

  • Binding Interactions of Ergotamine and Dihydroergotamine to 5-Hydroxytryptamine Receptor 1B (5-HT1b) Using Molecular Dynamics Simulations and Dynamic Network Analysis.

    abstract::Ergotamine (ERG) and dihydroergotamine (DHE), common migraine drugs, have small structural differences but lead to clinically important distinctions in their pharmacological profiles. For example, DHE is less potent than ERG by about 10-fold at the 5-hydroxytryptamine receptor 1B (5-HT1B). Although the high-resolution ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b01082

    authors: Sullivan HJ,Tursi A,Moore K,Campbell A,Floyd C,Wu C

    update_date:2020-03-23 00:00:00

  • TAMkin: a versatile package for vibrational analysis and chemical kinetics.

    abstract::TAMkin is a program for the calculation and analysis of normal modes, thermochemical properties and chemical reaction rates. At present, the output from the frequently applied software programs ADF, CHARMM, CPMD, CP2K, Gaussian, Q-Chem, and VASP can be analyzed. The normal-mode analysis can be performed using a broad ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci100099g

    authors: Ghysels A,Verstraelen T,Hemelsoet K,Waroquier M,Van Speybroeck V

    update_date:2010-09-27 00:00:00

  • Fragment-Based Computational Method for Designing GPCR Ligands.

    abstract::G protein-coupled receptors (GPCRs) are the largest family of cell surface receptors and arguably the most important family of drug targets. With the technology breakthroughs in X-ray crystallography and cryo-electron microscopy, more than 300 GPCR-ligand complex structures have been publicly reported since 2007,...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00699

    authors: Li Y,Sun Y,Song Y,Dai D,Zhao Z,Zhang Q,Zhong W,Hu LA,Ma Y,Li X,Wang R

    update_date:2020-09-28 00:00:00

  • Enrichment factor analyses on G-protein coupled receptors with known crystal structure.

    abstract::G-protein coupled receptors (GPCRs) are highly relevant drug targets. Four GPCRs with known crystal structure were analyzed with docking (AutoDock4) and postdocking (MM-PBSA) in order to evaluate the ability to recognize known antagonists from a larger database of molecular decoys and to predict correct binding modes....

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci4000745

    authors: Anighoro A,Rastelli G

    update_date:2013-04-22 00:00:00

  • Combined 3D-QSAR modeling and molecular docking study on indolinone derivatives as inhibitors of 3-phosphoinositide-dependent protein kinase-1.

    abstract::3-Phosphoinositide-dependent protein kinase-1 (PDK1) is a promising target for developing novel anticancer drugs. In order to understand the structure-activity correlation of indolinone-based PDK1 inhibitors, we have carried out a combined molecular docking and three-dimensional quantitative structure-activity relatio...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci800147v

    authors: AbdulHameed MD,Hamza A,Liu J,Zhan CG

    update_date:2008-09-01 00:00:00

  • Allosteric Modulation of Human Hsp90α Conformational Dynamics.

    abstract::Central to Hsp90's biological function is its ability to interconvert between various conformational states. Drug targeting of Hsp90's regulatory mechanisms, including its modulation by cochaperone association, presents as an attractive therapeutic strategy for Hsp90 associated pathologies. In this study, we utilized ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00630

    authors: Penkler DL,Atilgan C,Tastan Bishop Ö

    update_date:2018-02-26 00:00:00

  • Computational Prediction and Biochemical Analyses of New Inverse Agonists for the CB1 Receptor.

    abstract::Human cannabinoid type 1 (CB1) G-protein coupled receptor is a potential therapeutic target for obesity. The previously predicted and experimentally validated ensemble of ligand-free conformations of CB1 [Scott, C. E. et al. Protein Sci. 2013 , 22 , 101 - 113 ; Ahn, K. H. et al. Proteins 2013 , 81 , 1304 - 1317] are u...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.5b00581

    authors: Scott CE,Ahn KH,Graf ST,Goddard WA 3rd,Kendall DA,Abrol R

    update_date:2016-01-25 00:00:00

  • Pathway analysis for drug repositioning based on public database mining.

    abstract::Sixteen FDA-approved drugs were investigated to elucidate their mechanisms of action (MOAs) and clinical functions by pathway analysis based on retrieved drug targets interacting with or affected by the investigated drugs. Protein and gene targets and associated pathways were obtained by data-mining of public database...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci4005354

    authors: Pan Y,Cheng T,Wang Y,Bryant SH

    update_date:2014-02-24 00:00:00

  • Molecular Dynamics Simulations of Supramolecular Anticancer Nanotubes.

    abstract::We report here on long-time all-atomistic molecular dynamics simulations of functional supramolecular nanotubes composed by the self-assembly of peptide-drug amphiphiles (DAs). These DAs have been shown to possess an inherently high drug loading of the hydrophobic anticancer drug camptothecin. We probe the self-assemb...

    journal_title:Journal of chemical information and modeling

    pub_type: Letter

    doi:10.1021/acs.jcim.8b00193

    authors: Kang M,Chakraborty K,Loverde SM

    update_date:2018-06-25 00:00:00

  • Force Field Benchmark of Amino Acids. 2. Partition Coefficients between Water and Organic Solvents.

    abstract::The partitioning of amino acids between water and apolar environments is of vital importance in protein function and drug delivery. Here we present an extensive benchmark for octanol/water (log Poct), chloroform/water (log Pclf), and cyclohexane/water (log Pchx) partition coefficients of neutral amino acid side chain ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.8b00493

    authors: Zhang H,Jiang Y,Cui Z,Yin C

    update_date:2018-08-27 00:00:00

  • Imputation of Assay Bioactivity Data Using Deep Learning.

    abstract::We describe a novel deep learning neural network method and its application to impute assay pIC50 values. Unlike conventional machine learning approaches, this method is trained on sparse bioactivity data as input, typical of that found in public and commercial databases, enabling it to learn directly from correlation...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.8b00768

    authors: Whitehead TM,Irwin BWJ,Hunt P,Segall MD,Conduit GJ

    update_date:2019-03-25 00:00:00

  • Long-range effects of a peripheral mutation on the enzymatic activity of cytochrome P450 1A2.

    abstract::The human cytochrome P450 1A2 is an important drug metabolizing and procarcinogen activating enzyme. An experimental study found that a peripheral mutation, F186L, at ∼26 Å away from the enzyme's active site, caused a significant reduction in the enzymatic activity of 1A2 deethylation reactions. In this paper, we expl...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci200112b

    authors: Zhang T,Liu LA,Lewis DF,Wei DQ

    update_date:2011-06-27 00:00:00

  • Geometric accuracy of three-dimensional molecular overlays.

    abstract::This study examines the dependence of molecular alignment accuracy on a variety of factors including the choice of molecular template, alignment method, conformational flexibility, and type of protein target. We used eight test systems for which X-ray data on 145 ligand-protein complexes were available. The use of X-r...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci060134h

    authors: Chen Q,Higgs RE,Vieth M

    update_date:2006-09-01 00:00:00

  • Dihedral-based segment identification and classification of biopolymers I: proteins.

    abstract::A new structure classification scheme for biopolymers is introduced, which is solely based on main-chain dihedral angles. It is shown that by dividing a biopolymer into segments containing two central residues, a local classification can be performed. The method is referred to as DISICL, short for Dihedral-based Segme...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci400541d

    authors: Nagy G,Oostenbrink C

    update_date:2014-01-27 00:00:00