Effect of data standardization on chemical clustering and similarity searching.

Abstract:

Standardization is used to ensure that the variables in a similarity calculation make an equal contribution to the computed similarity value. This paper compares the use of seven different methods that have been suggested previously for the standardization of integer-valued or real-valued data, comparing the results with unstandardized data. Sets of structures from the MDL Drug Data Report and IDAlert databases and represented by Pipeline Pilot physicochemical parameters, molecular holograms and Molconn-Z parameters are clustered using the k-means and Ward's clustering methods. The resulting classifications are evaluated in terms of the degree of clustering of active compounds selected from eleven different biological activity classes, with these classes also being used in similarity searches. It is shown that there is no consistent pattern when the various standardization methods are ranked in order of decreasing effectiveness and that there is no obvious performance benefit (when compared to unstandardized data) that is likely to be obtained from the use of any particular standardization method.
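As a rough illustration of what "equal contribution" means here, the sketch below compares how much each variable contributes to a squared Euclidean distance before and after standardization. The abstract does not name the seven methods studied, so the two shown (z-score and range scaling) are assumed, commonly used examples rather than the paper's own procedures; the descriptor values and the helper names `z_score`, `range_scale` and `contribution` are hypothetical.

```python
import numpy as np

def z_score(X):
    """Standardize each column to zero mean and unit standard deviation."""
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # constant columns are only centered, not rescaled
    return (X - X.mean(axis=0)) / std

def range_scale(X):
    """Rescale each column to the interval [0, 1]."""
    X = np.asarray(X, dtype=float)
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0  # constant columns map to 0
    return (X - X.min(axis=0)) / span

def contribution(X, i, j):
    """Fraction of the squared Euclidean distance between rows i and j
    that is due to each column (assumes the two rows differ)."""
    d2 = (X[i] - X[j]) ** 2
    return d2 / d2.sum()

# Hypothetical descriptors: column 0 = molecular weight, column 1 = logP
X = np.array([[250.0, 1.0],
              [450.0, 4.0],
              [300.0, 2.0],
              [400.0, 3.0]])

raw = contribution(X, 0, 1)               # unstandardized data
z = contribution(z_score(X), 0, 1)        # z-score standardization
r = contribution(range_scale(X), 0, 1)    # range standardization

print("raw:  ", raw)
print("z:    ", z)
print("range:", r)
```

On this toy data the unstandardized distance is determined almost entirely by the molecular-weight column, while both standardized versions give the two columns roughly comparable weight. Whether that rebalancing improves clustering or similarity searching in practice is the question the paper examines.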

journal_name

J Chem Inf Model

authors

Chu CW,Holliday JD,Willett P

doi

10.1021/ci800224h

subject

Has Abstract

pub_date

2009-02-01 00:00:00

pages

155-61

issue

2

eissn

1549-960X

issn

1549-9596

pii

10.1021/ci800224h

journal_volume

49

pub_type

Journal Article
  • Protein flexibility in virtual screening: the BACE-1 case study.

    abstract::Simulating protein flexibility is a major issue in the docking-based drug-design process for which a single methodological solution does not exist. In our search of new anti-Alzheimer ligands, we were faced with the challenge of including receptor plasticity in a virtual screening campaign aimed at finding new β-secre...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci300390h

    authors: Cosconati S,Marinelli L,Di Leva FS,La Pietra V,De Simone A,Mancini F,Andrisano V,Novellino E,Goodsell DS,Olson AJ

    update_date:2012-10-22 00:00:00

  • In vitro drug sensitivity-gene expression correlations involve a tissue of origin dependency.

    abstract::A major concern of chemogenomics is to associate drug activity with biological variables. Several reports have clustered cell line drug activity profiles as well as drug activity-gene expression correlation profiles and noted that the resulting groupings differ but still reflect mechanism of action. The present paper ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci060073n

    authors: Andersson CR,Fryknäs M,Rickardson L,Larsson R,Isaksson A,Gustafsson MG

    update_date:2007-01-01 00:00:00

  • SARANEA: a freely available program to mine structure-activity and structure-selectivity relationship information in compound data sets.

    abstract::We introduce SARANEA, an open-source Java application for interactive exploration of structure-activity relationship (SAR) and structure-selectivity relationship (SSR) information in compound sets of any source. SARANEA integrates various SAR and SSR analysis functions and utilizes a network-like similarity graph data...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci900416a

    authors: Lounkine E,Wawer M,Wassermann AM,Bajorath J

    update_date:2010-01-01 00:00:00

  • Ligand coordinate analysis of SC-558 from the active site to the surface of COX-2: a molecular dynamics study.

    abstract::We have performed a ligand coordinate analysis to monitor the movement of the inhibitor SC-558 from the active site of the COX-2 protein to the exterior using molecular dynamics techniques. This study provides an insight into the intermolecular interactions formed by the ligand during this journey. The published cryst...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci050142i

    authors: Sai Ram KV,Rambabu G,Sarma JA,Desiraju GR

    update_date:2006-07-01 00:00:00

  • COSMOsar3D: molecular field analysis based on local COSMO σ-profiles.

    abstract::The COSMO surface polarization charge density σ resulting from quantum chemical calculations combined with a virtual conductor embedding has been widely proven to be a very suitable descriptor for the quantification of interactions of molecules in liquids. In a preceding paper, grid-based local histograms of σ have be...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci300231t

    authors: Klamt A,Thormann M,Wichmann K,Tosco P

    update_date:2012-08-27 00:00:00

  • iUmami-SCM: A Novel Sequence-Based Predictor for Prediction and Analysis of Umami Peptides Using a Scoring Card Method with Propensity Scores of Dipeptides.

    abstract::Umami or the taste of monosodium glutamate represents one of the major attractive taste modalities in humans. Therefore, knowledge about biophysical and biochemical properties of the umami taste is important for both scientific research and the food industry. Experimental approaches for predicting umami peptides are l...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c00707

    authors: Charoenkwan P,Yana J,Nantasenamat C,Hasan MM,Shoombuatong W

    update_date:2020-12-28 00:00:00

  • ReFlex3D: Refined Flexible Alignment of Molecules Using Shape and Electrostatics.

    abstract::We present an algorithm, ReFlex3D, for the refinement of flexible molecular alignments based on their three-dimensional shape and electrostatic properties. The algorithm is designed to be used with fast conformer generators to refine an initial overlay between two molecules and thus to obtain improved overlaps as judg...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00618

    authors: Schmidt TC,Cosgrove DA,Boström J

    update_date:2018-04-23 00:00:00

  • TAMkin: a versatile package for vibrational analysis and chemical kinetics.

    abstract::TAMkin is a program for the calculation and analysis of normal modes, thermochemical properties and chemical reaction rates. At present, the output from the frequently applied software programs ADF, CHARMM, CPMD, CP2K, Gaussian, Q-Chem, and VASP can be analyzed. The normal-mode analysis can be performed using a broad ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci100099g

    authors: Ghysels A,Verstraelen T,Hemelsoet K,Waroquier M,Van Speybroeck V

    update_date:2010-09-27 00:00:00

  • Structure-based CoMFA as a predictive model - CYP2C9 inhibitors as a test case.

    abstract::In this study, we tried to establish a general scheme to create a model that could predict the affinity of small compounds to their target proteins. This scheme consists of a search for ligand-binding sites on a protein, a generation of bound conformations (poses) of ligands in each of the sites by docking, identifica...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci800313h

    authors: Yasuo K,Yamaotsu N,Gouda H,Tsujishita H,Hirono S

    update_date:2009-04-01 00:00:00

  • Exploring Tunable Hyperparameters for Deep Neural Networks with Industrial ADME Data Sets.

    abstract::Deep learning has drawn significant attention in different areas including drug discovery. It has been proposed that it could outperform other machine learning algorithms, especially with big data sets. In the field of pharmaceutical industry, machine learning models are built to understand quantitative structure-acti...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.8b00671

    authors: Zhou Y,Cahya S,Combs SA,Nicolaou CA,Wang J,Desai PV,Shen J

    update_date:2019-03-25 00:00:00

  • Customizable Generation of Synthetically Accessible, Local Chemical Subspaces.

    abstract::Screening large libraries of chemicals has been an efficient strategy to discover bioactive compounds; however a portion of the potential for success is limited to the available libraries. Synergizing combinatorial and computational chemistries has emerged as a time-efficient strategy to explore the chemical space mor...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.6b00648

    authors: Pottel J,Moitessier N

    update_date:2017-03-27 00:00:00

  • RDChiral: An RDKit Wrapper for Handling Stereochemistry in Retrosynthetic Template Extraction and Application.

    abstract::There is a renewed interest in computer-aided synthesis planning, where the vast majority of approaches require the application of retrosynthetic reaction templates. Here we introduce RDChiral, an open-source Python wrapper for RDKit designed to provide consistent handling of stereochemical information in applying ret...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00286

    authors: Coley CW,Green WH,Jensen KF

    update_date:2019-06-24 00:00:00

  • Comparative Dynamics and Functional Mechanisms of the CYP17A1 Tunnels Regulated by Ligand Binding.

    abstract::As an important member of cytochrome P450 (CYP) enzymes, CYP17A1 is a dual-function monooxygenase with a critical role in the synthesis of many human steroid hormones, making it an attractive therapeutic target. The emerging structural information about CYP17A1 and the growing number of inhibitors for these enzymes ca...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c00447

    authors: Xiao F,Song X,Tian P,Gan M,Verkhivker GM,Hu G

    update_date:2020-07-27 00:00:00

  • 3D-QSAR and docking studies of selective GSK-3beta inhibitors. Comparison with a thieno[2,3-b]pyrrolizinone derivative, a new potential lead for GSK-3beta ligands.

    abstract::The three-dimensional structures of 3-anilino-4-arylmaleimides, selective GSK-3beta inhibitors, were correlated to their biological affinities by 3D-QSAR studies (CoMFA method). The cocrystallographic data of GSK-3beta vs 3-anilino-4-arylmaleimide allowed us to compare 3D-QSAR results to experimental intermolecular in...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci050008y

    authors: Lescot E,Bureau R,Sopkova-de Oliveira Santos J,Rochais C,Lisowski V,Lancelot JC,Rault S

    update_date:2005-05-01 00:00:00

  • Adaptive configuring of radial basis function network by hybrid particle swarm algorithm for QSAR studies of organic compounds.

    abstract::The configuring of a radial basis function network (RBFN) consists of selecting the network parameters (centers and widths in RBF units and weights between the hidden and output layers) and network architecture. The issues of suboptimum and overfitting, however, often occur in RBFN configuring. This paper presented a ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci600218d

    authors: Zhou YP,Jiang JH,Lin WQ,Zou HY,Wu HL,Shen GL,Yu RQ

    update_date:2006-11-01 00:00:00

  • In silico target predictions: defining a benchmarking data set and comparison of performance of the multiclass Naïve Bayes and Parzen-Rosenblatt window.

    abstract::In this study, two probabilistic machine-learning algorithms were compared for in silico target prediction of bioactive molecules, namely the well-established Laplacian-modified Naïve Bayes classifier (NB) and the more recently introduced (to Cheminformatics) Parzen-Rosenblatt Window. Both classifiers were trained in ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci300435j

    authors: Koutsoukas A,Lowe R,Kalantarmotamedi Y,Mussa HY,Klaffke W,Mitchell JB,Glen RC,Bender A

    update_date:2013-08-26 00:00:00

  • Computational Design of Biologically Active Anticancer Peptides and Their Interactions with Heterogeneous POPC/POPS Lipid Membranes.

    abstract::Over the last few decades, anticancer peptides (ACPs) have turned into potential warheads against cancer. Apart from small molecules and monoclonal antibodies, ACPs have been proven to be effective against cancer cells. ACPs are small cationic peptides that selectively bind to the negatively charged cancer cell membra...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00348

    authors: Singh M,Kumar V,Sikka K,Thakur R,Harioudh MK,Mishra DP,Ghosh JK,Siddiqi MI

    update_date:2020-01-27 00:00:00

  • Systematic analysis of enzyme-catalyzed reaction patterns and prediction of microbial biodegradation pathways.

    abstract::The roles of chemical compounds in biological systems are now systematically analyzed by high-throughput experimental technologies. To automate the processing and interpretation of large-scale data it is necessary to develop bioinformatics methods to extract information from the chemical structures of these small mole...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci700006f

    authors: Oh M,Yamada T,Hattori M,Goto S,Kanehisa M

    update_date:2007-07-01 00:00:00

  • Homology model-guided 3D-QSAR studies of HIV-1 integrase inhibitors.

    abstract::In the present study, we report the exploration of binding modes of potent HIV-1 integrase (IN) inhibitors MK-0518 (raltegravir) and GS-9137 (elvitegravir) as well as chalcone and related amide IN inhibitors we recently synthesized and the development of 3D-QSAR models for integrase inhibition. Homology models of DNA-...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci200485a

    authors: Sharma H,Cheng X,Buolamwini JK

    update_date:2012-02-27 00:00:00

  • Structural insight into the unique binding properties of pyridylethanol(phenylethyl)amine inhibitor in human CYP51.

    abstract::Sterol 14α-demethylase (CYP51) is the main drug target for the treatment of fungal infections. The discovery of new efficient fungal CYP51 inhibitors requires an understanding of the structural requirements for selectivity for the fungal over the human ortholog. In this study, a binding mode of the pyridylethanol(phen...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci500556k

    authors: Zelenko U,Hodošček M,Rozman D,Golič Grdadolnik S

    update_date:2014-12-22 00:00:00

  • FOG: Fragment Optimized Growth algorithm for the de novo generation of molecules occupying druglike chemical space.

    abstract::An essential feature of all practical de novo molecule generating programs is the ability to focus the potential combinatorial explosion of grown molecules on a desired chemical space. It is a daunting task to balance the generation of new molecules with limitations on growth that produce desired features such as stab...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci9000458

    authors: Kutchukian PS,Lou D,Shakhnovich EI

    update_date:2009-07-01 00:00:00

  • A Grid Map Based Approach to Identify Nonobvious Ligand Design Opportunities in 3D Protein Structure Ensembles.

    abstract::Three-dimensional protein structures are a key requisite for structure-based drug discovery. For many highly relevant targets, medicinal chemists are confronted with large numbers of target structures in their apo-forms or in complex with a wealth of different ligands. To exploit the full potential of such structure e...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c00051

    authors: Schmalhorst PS,Bergner A

    update_date:2020-04-27 00:00:00

  • Structural basis for the mutation-induced dysfunction of human CYP2J2: a computational study.

    abstract::Arachidonic acid is an essential fatty acid in cells, acting as a key inflammatory intermediate in inflammatory reactions. In cardiac tissues, CYP2J2 can adopt arachidonic acid as a major substrate to produce epoxyeicosatrienoic acids (EETs), which can protect endothelial cells from ischemic or hypoxic injuries and ha...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci400003p

    authors: Cong S,Ma XT,Li YX,Wang JF

    update_date:2013-06-24 00:00:00

  • Multifingerprint based similarity searches for targeted class compound selection.

    abstract::Molecular fingerprints are widely used for similarity-based virtual screening in drug discovery projects. In this paper we discuss the performance and the complementarity of nine two-dimensional fingerprints (Daylight, Unity, AlFi, Hologram, CATS, TRUST, Molprint 2D, ChemGPS, and ALOGP) in retrieving active molecules ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci0504723

    authors: Kogej T,Engkvist O,Blomberg N,Muresan S

    update_date:2006-05-01 00:00:00

  • Long-range effects of a peripheral mutation on the enzymatic activity of cytochrome P450 1A2.

    abstract::The human cytochrome P450 1A2 is an important drug metabolizing and procarcinogen activating enzyme. An experimental study found that a peripheral mutation, F186L, at ∼26 Å away from the enzyme's active site, caused a significant reduction in the enzymatic activity of 1A2 deethylation reactions. In this paper, we expl...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci200112b

    authors: Zhang T,Liu LA,Lewis DF,Wei DQ

    update_date:2011-06-27 00:00:00

  • BiKi Life Sciences: A New Suite for Molecular Dynamics and Related Methods in Drug Discovery.

    abstract::In this paper, we introduce the BiKi Life Sciences suite. This software makes it easy for computational medicinal chemists to run ad hoc molecular dynamics protocols in a novel and task-oriented environment; as a notebook, BiKi (acronym of Binding Kinetics) keeps memory of any activity together with dependencies among...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00680

    authors: Decherchi S,Bottegoni G,Spitaleri A,Rocchia W,Cavalli A

    update_date:2018-02-26 00:00:00

  • Evaluation and Characterization of Trk Kinase Inhibitors for the Treatment of Pain: Reliable Binding Affinity Predictions from Theory and Computation.

    abstract::Optimization of ligand binding affinity to the target protein of interest is a primary objective in small-molecule drug discovery. Until now, the prediction of binding affinities by computational methods has not been widely applied in the drug discovery process, mainly because of its lack of accuracy and reproducibili...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.6b00780

    authors: Wan S,Bhati AP,Skerratt S,Omoto K,Shanmugasundaram V,Bagal SK,Coveney PV

    update_date:2017-04-24 00:00:00

  • In silico analysis of the thermodynamic stability changes of psychrophilic and mesophilic alpha-amylases upon exhaustive single-site mutations.

    abstract::Identifying sequence modifications that distinguish psychrophilic from mesophilic proteins is important for designing enzymes with different thermodynamic stabilities and to understand the underlying mechanisms. The PoPMuSiC algorithm is used to introduce, in silico, all the single-site mutations in four mesophilic an...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci050473v

    authors: Gilis D

    update_date:2006-05-01 00:00:00

  • Modeling compound-target interaction network of traditional Chinese medicines for type II diabetes mellitus: insight for polypharmacology and drug design.

    abstract::In this study, in order to elucidate the action mechanism of traditional Chinese medicines (TCMs) that exhibit clinical efficacy for type II diabetes mellitus (T2DM), an integrated protocol that combines molecular docking and pharmacophore mapping was employed to find the potential inhibitors from TCM for the T2DM-rel...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci400146u

    authors: Tian S,Li Y,Li D,Xu X,Wang J,Zhang Q,Hou T

    update_date:2013-07-22 00:00:00

  • Critical Assessment of the Hildebrand and Hansen Solubility Parameters for Polymers.

    abstract::Solubility parameter models are widely used to select suitable solvents/nonsolvents for polymers in a variety of processing and engineering applications. In this study, we focus on two well-established models, namely, the Hildebrand and Hansen solubility parameter models. Both models are built on the basis of the noti...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00656

    authors: Venkatram S,Kim C,Chandrasekaran A,Ramprasad R

    update_date:2019-10-28 00:00:00