Descriptor collision and confusion: toward the design of descriptors to mask chemical structures.

Abstract:

We examined "descriptor collision" for several chemical fingerprint systems (MDL 320, Daylight, SMDL) and for a 2D-based descriptor set. For large databases (ChemNavigator and WOMBAT), the smallest collision rate remains around 5%. We systematically increased the "descriptor collision" rate (here termed "descriptor confusion") in order to design a set of "descriptors to mask chemical structures" (DMCS). If effective, a DMCS system would not allow third parties to determine the original chemical structures used to derive the DMCS set (i.e., it would prevent reverse engineering). Using SMDL keys, the "confusion" rate was increased to 45.6% by eliminating those keys that have a low frequency of occurrence in WOMBAT structures. We applied an automated PLS engine, WB-PLS [Olah et al., J. Comput. Aided Mol. Des., 18 (2004) 437], to 1277 series of structures from 948 targets in WOMBAT, in order to validate the biological relevance of the SMDL descriptors as a potential DMCS set. The "reduced set" of SMDL descriptors shows a small loss of modeling power (around 20%) compared to the initial descriptor set, while the collision rate is significantly increased. These results indicate that the development of an effective DMCS is possible. If well documented, DMCS systems would encourage private sector data release (e.g., related to water solubility) and directly benefit public sector science.
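The idea of collision and of raising it by removing rare keys can be illustrated with a minimal Python sketch. It is not the authors' procedure. It assumes that a fingerprint is a set of key indices, that the collision rate is the fraction of structures sharing an identical fingerprint with at least one other structure, and that "low frequency" means a key set in fewer than `min_count` structures. The function names, the `min_count` parameter, and the toy data are all hypothetical.

```python
from collections import Counter


def collision_rate(fingerprints):
    """Fraction of structures whose fingerprint is identical to that of
    at least one other structure (assumed definition of collision)."""
    counts = Counter(fingerprints)
    collided = sum(n for n in counts.values() if n > 1)
    return collided / len(fingerprints)


def drop_rare_keys(fingerprints, min_count):
    """Remove every key that is set in fewer than min_count structures."""
    key_counts = Counter(key for fp in fingerprints for key in fp)
    kept = {key for key, n in key_counts.items() if n >= min_count}
    return [frozenset(fp & kept) for fp in fingerprints]


# Toy data: each frozenset lists the keys set for one hypothetical structure.
toy = [
    frozenset({1, 2, 3}),
    frozenset({1, 2, 3}),
    frozenset({1, 2, 4}),
    frozenset({1, 2, 5}),
    frozenset({2, 6}),
]

base_rate = collision_rate(toy)
reduced_rate = collision_rate(drop_rare_keys(toy, min_count=2))
print(f"collision rate, full key set:    {base_rate:.2f}")
print(f"collision rate, reduced key set: {reduced_rate:.2f}")
```

Dropping the keys that occur only once makes previously distinct toy fingerprints identical, which is the effect the abstract calls "descriptor confusion". The toy rates bear no relation to the 5% and 45.6% figures reported for the real databases.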

journal_name

J Comput Aided Mol Des

authors

Bologa C,Allu TK,Olah M,Kappler MA,Oprea TI

doi

10.1007/s10822-005-9020-4

subject

Has Abstract

pub_date

2005-09-01 00:00:00

pages

625-35

issue

9-10

eissn

1573-4951

issn

0920-654X

journal_volume

19

pub_type

Journal Article
  • Improving small molecule force fields by identifying and characterizing small molecules with inconsistent parameters.

    abstract::Many molecular simulation methods use force fields to help model and simulate molecules and their behavior in various environments. Force fields are sets of functions and parameters used to calculate the potential energy of a chemical system as a function of the atomic coordinates. Despite the widespread use of force ...

    journal_title:Journal of computer-aided molecular design

    pub_type: Journal Article

    doi:10.1007/s10822-020-00367-1

    authors: Ehrman JN,Lim VT,Bannan CC,Thi N,Kyu DY,Mobley DL

    updated:2021-01-28 00:00:00

  • Automated site-directed drug design: searches of the Cambridge Structural Database for bond lengths in molecular fragments to be used for automated structure assembly.

    abstract::In this paper a database of small frequently occurring molecular fragments is used for the determination of fragment bond lengths from the Cambridge Structural Database. A large number of bond types are described that have not been reported previously. ...

    journal_title:Journal of computer-aided molecular design

    pub_type: Journal Article

    doi:10.1007/BF00125946

    authors: Chau PL,Dean PM

    updated:1992-08-01 00:00:00

  • New designs for MRI contrast agents.

    abstract::New designs for Magnetic Resonance Imaging contrast agents are presented. Essentially, they all are host-guest inclusion complexes between γ-cyclodextrins and polyazamacrocycles of gadolinium (III) ion. Substitutions have been made to the host to optimise the host-guest association. Molecular mechanics calculations ha...

    journal_title:Journal of computer-aided molecular design

    pub_type: Journal Article

    doi:10.1023/a:1027347527385

    authors: Fernandes PA,Carvalho AT,Marques AT,Pereira AL,Madeira AP,Ribeiro AS,Carvalho AF,Ricardo ET,Pinto FJ,Santos HA,Mangericão HD,Martins HM,Pinto HD,Santos HR,Moreira IS,Azeredo MJ,Abreu RP,Oliveira RM,Sousa SF,Silva RJ

    updated:2003-07-01 00:00:00

  • Property distribution of drug-related chemical databases.

    abstract::The process of compound selection and prioritization is crucial for both combinatorial chemistry (CBC) and high throughput screening (HTS). Compound libraries have to be screened for unwanted chemical structures, as well as for unwanted chemical properties. Property extrema can be eliminated by using property filters,...

    journal_title:Journal of computer-aided molecular design

    pub_type: Journal Article

    doi:10.1023/a:1008130001697

    authors: Oprea TI

    updated:2000-03-01 00:00:00

  • 3D-QSAR and docking studies on 4-anilinoquinazoline and 4-anilinoquinoline epidermal growth factor receptor (EGFR) tyrosine kinase inhibitors.

    abstract::The overexpression and/or mutation of the epidermal growth factor receptor (EGFR) tyrosine kinase has been observed in many human solid tumors, and is under intense investigation as a novel anticancer molecular target. Comparative 3D-QSAR analyses using different alignments were undertaken employing comparative molecu...

    journal_title:Journal of computer-aided molecular design

    pub_type: Journal Article

    doi:10.1023/b:jcam.0000004622.13865.4f

    authors: Assefa H,Kamath S,Buolamwini JK

    updated:2003-08-01 00:00:00

  • QSPR ensemble modelling of the 1:1 and 1:2 complexation of Co²⁺, Ni²⁺, and Cu²⁺ with organic ligands: relationships between stability constants.

    abstract::Quantitative structure-property relationship (QSPR) modeling of stability constants for the metal:ligand ratio 1:1 (logK) and 1:2 (logβ2) complexes of 3 transition metal ions with diverse organic ligands in aqueous solution was performed using ensemble multiple linear regression analysis and substructural molecular fr...

    journal_title:Journal of computer-aided molecular design

    pub_type: Journal Article

    doi:10.1007/s10822-014-9741-3

    authors: Solov'ev V,Varnek A,Tsivadze A

    updated:2014-05-01 00:00:00

  • Can we separate active from inactive conformations?

    abstract::Molecular modeling methodologies such as molecular docking, pharmacophore modeling, and 3D-QSAR, rely on conformational searches of small molecules as a starting point. All of these methodologies seek conformations of the small molecules as they bind to target proteins, i.e., their active conformations. Thus the quest...

    journal_title:Journal of computer-aided molecular design

    pub_type: Journal Article

    doi:10.1023/a:1016320106741

    authors: Diller DJ,Merz KM Jr

    updated:2002-02-01 00:00:00

  • SAMPL6: calculation of macroscopic pKa values from ab initio quantum mechanical free energies.

    abstract::Macroscopic pKa values were calculated for all compounds in the SAMPL6 blind prediction challenge, based on quantum chemical calculations with a continuum solvation model and a linear correction derived from a small training set. Microscopic pKa values were derived from the gas-phase free energy difference between pro...

    journal_title:Journal of computer-aided molecular design

    pub_type: Journal Article

    doi:10.1007/s10822-018-0138-6

    authors: Selwa E,Kenney IM,Beckstein O,Iorga BI

    updated:2018-10-01 00:00:00

  • Comparative molecular field analysis and energy interaction studies of thrombin-inhibitor complexes.

    abstract::A Comparative Molecular Field Analysis (CoMFA) and an interaction energy-based method were applied on a database holding the 3D structures of 29 thrombin-inhibitor complexes. Several parameters were optimized in both methods in order to obtain the best correlation between theoretical and experimentally determined bind...

    journal_title:Journal of computer-aided molecular design

    pub_type: Journal Article

    doi:10.1023/a:1008010016362

    authors: Bursi R,Grootenhuis PD

    updated:1999-05-01 00:00:00

  • Simple knowledge-based descriptors to predict protein-ligand interactions. methodology and validation.

    abstract::A new type of shape descriptor is proposed to describe the spatial orientation for non-covalent interactions. It is built from simple, anisotropic Gaussian contributions that are parameterised by 10 adjustable values. The descriptors have been used to fit propensity distributions derived from scatter data stored in th...

    journal_title:Journal of computer-aided molecular design

    pub_type: Journal Article

    doi:10.1023/a:1008109717641

    authors: Nissink JWM,Verdonk ML,Klebe G

    updated:2000-11-01 00:00:00

  • Using a pharmacophore representation concept to elucidate molecular similarity of dopamine antagonists.

    abstract::The pharmacophoric concept plays an important role in ligand-based drug design methods to describe the similarity and diversity of molecules, and could also be exploited as a molecular representation scheme. A three-point pharmacophore method was used as a molecular representation perception. This procedure was implem...

    journal_title:Journal of computer-aided molecular design

    pub_type: Journal Article

    doi:10.1007/s10822-007-9110-6

    authors: Atlamazoglou V,Thireou T,Eliopoulos E

    updated:2007-05-01 00:00:00

  • Molecular modeling of the intestinal bile acid carrier: a comparative molecular field analysis study.

    abstract::A structure-binding activity relationship for the intestinal bile acid transporter has been developed using data from a series of bile acid analogs in a comparative molecular field analysis (CoMFA). The studied compounds consisted of a series of bile acid-peptide conjugates, with modifications at the 24 position of th...

    journal_title:Journal of computer-aided molecular design

    pub_type: Journal Article

    doi:10.1023/a:1007919704457

    authors: Swaan PW,Szoka FC Jr,Oie S

    updated:1997-11-01 00:00:00

  • Computational studies of new potential antimalarial compounds--stereoelectronic complementarity with the receptor.

    abstract::One of the most important pharmacological mechanisms of antimalarial action is the inhibition of the aggregation of hematin into hemozoin. We present a group of new potential antimalarial molecules for which we have performed a DFT study of their stereoelectronic properties. Additionally, the same calculations were ca...

    journal_title:Journal of computer-aided molecular design

    pub_type: Journal Article

    doi:10.1023/b:jcam.0000005754.24588.a0

    authors: Portela C,Afonso CM,Pinto MM,Ramos MJ

    updated:2003-09-01 00:00:00

  • First virtual screening and experimental validation of inhibitors targeting GES-5 carbapenemase.

    abstract::The worldwide spread of beta-lactamases with hydrolytic activity extended to last resort carbapenems is aggravating the antibiotic resistance problem and endangers the successful antimicrobial treatment of clinically relevant pathogens. As recently highlighted by the World Health Organization, new strategies to contai...

    journal_title:Journal of computer-aided molecular design

    pub_type: Journal Article

    doi:10.1007/s10822-018-0182-2

    authors: Spyrakis F,Bellio P,Quotadamo A,Linciano P,Benedetti P,D'Arrigo G,Baroni M,Cendron L,Celenza G,Tondi D

    updated:2019-02-01 00:00:00

  • Protein-ligand docking using FFT based sampling: D3R case study.

    abstract::Fast Fourier transform (FFT) based approaches have been successful in application to modeling of relatively rigid protein-protein complexes. Recently, we have been able to adapt the FFT methodology to treatment of flexible protein-peptide interactions. Here, we report our latest attempt to expand the capabilities of t...

    journal_title:Journal of computer-aided molecular design

    pub_type: Journal Article

    doi:10.1007/s10822-017-0069-7

    authors: Padhorny D,Hall DR,Mirzaei H,Mamonov AB,Moghadasi M,Alekseenko A,Beglov D,Kozakov D

    updated:2018-01-01 00:00:00

  • Geometry optimization method versus predictive ability in QSPR modeling for ionic liquids.

    abstract::Computational techniques, such as Quantitative Structure-Property Relationship (QSPR) modeling, are very useful in predicting physicochemical properties of various chemicals. Building QSPR models requires calculating molecular descriptors and the proper choice of the geometry optimization method, which will be dedicat...

    journal_title:Journal of computer-aided molecular design

    pub_type: Journal Article

    doi:10.1007/s10822-016-9894-3

    authors: Rybinska A,Sosnowska A,Barycki M,Puzyn T

    updated:2016-02-01 00:00:00

  • Human topoisomerase I poisoning: docking protoberberines into a structure-based binding site model.

    abstract::Using the X-ray crystal structure of the human topoisomerase I (top1) - DNA cleavable complex and the Sybyl software package, we have developed a general model for the ternary cleavable complex formed with four protoberberine alkaloids differing in the substitution on the terminal phenyl rings and covering a broad ran...

    journal_title:Journal of computer-aided molecular design

    pub_type: Journal Article

    doi:10.1007/s10822-004-7878-1

    authors: Kettmann V,Kost'álová D,Höltje HD

    updated:2004-12-01 00:00:00

  • A proposed common spatial pharmacophore and the corresponding active conformations of some peptide leukotriene receptor antagonists.

    abstract::Molecular modeling studies were carried out by a combined use of conformational analysis and 3D-QSAR methods to identify molecular features common to a series of hydroxyacetophenone (HAP) and non-hydroxyacetophenone (non-HAP) peptide leukotriene (pLT) receptor antagonists. In attempts to develop a ligand-binding model...

    journal_title:Journal of computer-aided molecular design

    pub_type: Journal Article

    doi:10.1007/BF00124498

    authors: Hariprasad V,Kulkarni VM

    updated:1996-08-01 00:00:00

  • The IUPAC aqueous and non-aqueous experimental pKa data repositories of organic acids and bases.

    abstract::Accurate and well-curated experimental pKa data of organic acids and bases in both aqueous and non-aqueous media are invaluable in many areas of chemical research, including pharmaceutical, agrochemical, specialty chemical and property prediction research. In pharmaceutical research, pKa data are relevant in ligand de...

    journal_title:Journal of computer-aided molecular design

    pub_type: Journal Article

    doi:10.1007/s10822-014-9764-9

    authors: Slater AM

    updated:2014-10-01 00:00:00

  • Computational study on mechanism of G-quartet oligonucleotide T40214 selectively targeting Stat3.

    abstract::Mounting evidence has shown that signal transducer and activator of transcription 3 (Stat3) is a critical target for cancer therapy. Recently, we developed a G-quartet oligonucleotide T40214 as a novel and potent Stat3 inhibitor. T40214 specifically inhibited DNA-binding activity of Stat3 and significantly suppr...

    journal_title:Journal of computer-aided molecular design

    pub_type: Journal Article

    doi:10.1007/s10822-007-9147-6

    authors: Zhu Q,Jing N

    updated:2007-10-01 00:00:00

  • The impact of data integrity on decision making in early lead discovery.

    abstract::Data driven decision making is a key element of today's pharmaceutical research, including early drug discovery. It comprises questions like which target to pursue, which chemical series to pursue, which compound to make next, or which compound to select for advanced profiling and promotion to pre-clinical development...

    journal_title:Journal of computer-aided molecular design

    pub_type: Journal Article

    doi:10.1007/s10822-015-9871-2

    authors: Beck B,Seeliger D,Kriegl JM

    updated:2015-09-01 00:00:00

  • Cavity search: an algorithm for the isolation and display of cavity-like binding regions.

    abstract::A set of algorithms designed to enhance the display of protein binding cavities is presented. These algorithms, collectively entitled CAVITY SEARCH, allow the user to isolate and fully define the extent of a particular cavity. Solid modeling techniques are employed to produce a detailed cast of the active site region,...

    journal_title:Journal of computer-aided molecular design

    pub_type: Journal Article

    doi:10.1007/BF00117400

    authors: Ho CM,Marshall GR

    updated:1990-12-01 00:00:00

  • The structure-activity relationship of inhibitors of serotonin uptake and receptor binding.

    abstract::An analysis of five different datasets of inhibitors of serotonin uptake has yielded quantitative structure/activity relationships (QSARs) which delineate the role of steric and hydrophobic properties essential for inhibition by phenylethylamine-type analogues. ...

    journal_title:Journal of computer-aided molecular design

    pub_type: Journal Article

    doi:10.1007/BF00125664

    authors: Hansch C,Caldwell J

    updated:1991-10-01 00:00:00

  • Intermediate states in the binding process of folic acid to folate receptor α: insights by molecular dynamics and metadynamics.

    abstract::Folate receptor α (FRα) is a cell surface, glycophosphatidylinositol-anchored protein which has focussed attention as a therapeutic target and as a marker for the diagnosis of cancer. It has a high affinity for the dietary supplemented folic acid (FOL), carrying out endocytic transport across the cell membrane and del...

    journal_title:Journal of computer-aided molecular design

    pub_type: Journal Article

    doi:10.1007/s10822-014-9801-8

    authors: Della-Longa S,Arcovito A

    updated:2015-01-01 00:00:00

  • Functionality map analysis of the active site cleft of human thrombin.

    abstract::The Multiple Copy Simultaneous Search methodology has been used to construct functionality maps for an extended region of human thrombin, including the active site. This method allows the determination of energetically favorable positions and orientations for functional groups defined by the user on the three-dimensio...

    journal_title:Journal of computer-aided molecular design

    pub_type: Journal Article

    doi:10.1007/BF00124460

    authors: Grootenhuis PD,Karplus M

    updated:1996-02-01 00:00:00

  • Identification of novel target sites and an inhibitor of the dengue virus E protein.

    abstract::Dengue and related flaviviruses represent a significant global health threat. The envelope glycoprotein E mediates virus attachment to a host cell and the subsequent fusion of viral and host cell membranes. The fusion process is driven by conformational changes in the E protein and is an essential step in the virus li...

    journal_title:Journal of computer-aided molecular design

    pub_type: Journal Article

    doi:10.1007/s10822-009-9263-6

    authors: Yennamalli R,Subbarao N,Kampmann T,McGeary RP,Young PR,Kobe B

    updated:2009-06-01 00:00:00

  • Toward the discovery of inhibitors of babesipain-1, a Babesia bigemina cysteine protease: in vitro evaluation, homology modeling and molecular docking studies.

    abstract::Babesia bigemina is a protozoan parasite that causes babesiosis, a disease with a world-wide distribution in mammals, principally affecting cattle and man. The unveiling of the genome of B. bigemina is a project in active progress that has already revealed a number of new targets with potential interest for the design...

    journal_title:Journal of computer-aided molecular design

    pub_type: Journal Article

    doi:10.1007/s10822-013-9682-2

    authors: Pérez B,Antunes S,Gonçalves LM,Domingos A,Gomes JR,Gomes P,Teixeira C

    updated:2013-09-01 00:00:00

  • Modelling of carbohydrate-aromatic interactions: ab initio energetics and force field performance.

    abstract::Aromatic amino acid residues are often present in carbohydrate-binding sites of proteins. These binding sites are characterized by a placement of a carbohydrate moiety in a stacking orientation to an aromatic ring. This arrangement is an example of CH/pi interactions. Ab initio interaction energies for 20 carbohydrate...

    journal_title:Journal of computer-aided molecular design

    pub_type: Journal Article

    doi:10.1007/s10822-005-9033-z

    authors: Spiwok V,Lipovová P,Skálová T,Vondrácková E,Dohnálek J,Hasek J,Králová B

    updated:2005-12-01 00:00:00

  • The computational design of test compounds with potentially specific biological activity: histamine-H2 agonists derived from 5-HT/H2 antagonists.

    abstract::The previously proposed models for the recognition and activation of 5-HT and histamine-H2 receptors, which were employed to explain the antagonist activity of LSD at both of these receptors, as well as the selective antagonism for H2 receptors by SKF-10856 and 9,10-dihydro-LSD, are used herein to design a compound to...

    journal_title:Journal of computer-aided molecular design

    pub_type: Journal Article

    doi:10.1007/BF00124342

    authors: Topiol S,Sabio M

    updated:1991-06-01 00:00:00

  • Why relevant chemical information cannot be exchanged without disclosing structures.

    abstract::Both society and industry are interested in increasing the safety of pharmaceuticals. Potentially dangerous compounds could be filtered out at early stages of R&D by computer prediction of biological activity and ADMET characteristics. Accuracy of such predictions strongly depends on the quality & quantity of informat...

    journal_title:Journal of computer-aided molecular design

    pub_type: Journal Article

    doi:10.1007/s10822-005-9014-2

    authors: Filimonov D,Poroikov V

    updated:2005-09-01 00:00:00