Abstract:
Deep learning has demonstrated significant potential in advancing the state of the art in many problem domains, especially those benefiting from automated feature extraction. Yet the methodology has seen limited adoption in the field of ligand-based virtual screening (LBVS), as traditional approaches typically require large, target-specific training sets, which limits their value in most prospective applications. Here, we report the development of a neural network architecture and a learning framework designed to yield a generally applicable tool for LBVS. Our approach uses the molecular graph as input and involves learning a representation that places compounds of similar biological profiles in close proximity within a hyperdimensional feature space; this is achieved by simultaneously leveraging historical screening data against a multitude of targets during training. Cosine distance between molecules in this space becomes a general similarity metric and can readily be used to rank-order database compounds in LBVS workflows. We demonstrate that the resulting model generalizes exceptionally well to compounds and targets not used in its training. In three commonly employed LBVS benchmarks, our method outperforms popular fingerprinting algorithms without the need for any target-specific training. Moreover, we show that the learned representation yields superior performance in scaffold hopping tasks and is largely orthogonal to existing fingerprints. In summary, we have developed and validated a framework for learning a molecular representation that is applicable to LBVS in a target-agnostic fashion, with as few as one query compound. Our approach can also enable organizations to generate additional value from large screening data repositories, and to this end we are making its implementation freely available at https://github.com/totient-bio/gatnn-vs.
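The ranking step described in the abstract (cosine distance between a query compound and database compounds in the learned feature space) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `rank_by_cosine_distance` and the three-dimensional toy vectors are hypothetical, and real embeddings would come from the trained network rather than being written by hand.

```python
import numpy as np


def rank_by_cosine_distance(query, database):
    """Rank database rows by cosine distance to the query (closest first).

    `query` is a 1-D embedding and `database` a 2-D array with one embedding
    per row. Both are hypothetical stand-ins for learned representations.
    """
    # Normalize to unit length so the dot product equals cosine similarity.
    q = query / np.linalg.norm(query)
    db = database / np.linalg.norm(database, axis=1, keepdims=True)
    # Cosine distance = 1 - cosine similarity.
    distances = 1.0 - db @ q
    order = np.argsort(distances, kind="stable")
    return order, distances[order]


# Toy embeddings (assumed values, for illustration only).
query = np.array([1.0, 0.0, 0.0])
database = np.array([
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.0, 0.0],  # same direction as the query
    [1.0, 1.0, 0.0],  # 45 degrees from the query
])

order, dists = rank_by_cosine_distance(query, database)
print(order)
print(dists)
```

Because the vectors are normalized before comparison, only the direction of an embedding affects the ranking, not its magnitude; with a single query compound this yields a full ordering of the database, matching the one-query use case the abstract mentions.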
journal_name: J Chem Inf Model
journal_title: Journal of chemical information and modeling
authors: Stojanović L, Popović M, Tijanić N, Rakočević G, Kalinić M
doi: 10.1021/acs.jcim.0c00622
subject: Has Abstract
pub_date: 2020-10-26 00:00:00
pages: 4629-4639
issue: 10
eissn: 1549-9596
issn: 1549-960X
journal_volume: 60
pub_type: journal article
abstract::Halogen bonds (XBs) are attracting increasing attention in biological systems. Protein Data Bank (PDB) archives experimentally determined XBs in biological macromolecules. However, no software for structure refinement in X-ray crystallography takes into account XBs, which might result in the weakening or even vanishin...
journal_title:Journal of chemical information and modeling
pub_type: journal article
doi:10.1021/acs.jcim.7b00235
update_date:2017-07-24 00:00:00
abstract::Thermoplastic polyurethanes (TPUs) are designed using a large variety of basic building blocks but are only synthesized in a limited number of solvent systems. Understanding the behavior of the copolymers in a selected solvent system is of particular interest to tune the intricate balance of microphase separation/mixi...
journal_title:Journal of chemical information and modeling
pub_type: journal article
doi:10.1021/acs.jcim.8b00781
update_date:2019-05-28 00:00:00
abstract::Methods that rapidly evaluate molecular complexity and synthetic feasibility are becoming increasingly important for in silico chemistry. We propose a new metric based on relative atomic electronegativities and bond parameters that evaluate both synthetic and molecular complexity (SMCM) starting from chemical structur...
journal_title:Journal of chemical information and modeling
pub_type: journal article
doi:10.1021/ci0501387
update_date:2005-09-01 00:00:00
abstract::Fragment-based methods have emerged in the last two decades as alternatives to traditional high throughput screenings for the identification of chemical starting points in drug discovery. One arguable yet popular assumption about fragment-based design is that the fragment binding mode remains conserved upon chemical e...
journal_title:Journal of chemical information and modeling
pub_type: journal article
doi:10.1021/ci300355p
update_date:2012-12-21 00:00:00
abstract::With the hope of discovering effective analgesics with fewer side effects, attention has recently shifted to allosteric modulators of the opioid receptors. In the past two years, the first chemotypes of positive or silent allosteric modulators (PAMs or SAMs, respectively) of μ- and δ-opioid receptor types have been re...
journal_title:Journal of chemical information and modeling
pub_type: journal article
doi:10.1021/acs.jcim.5b00388
update_date:2015-09-28 00:00:00
abstract::This study examines the dependence of molecular alignment accuracy on a variety of factors including the choice of molecular template, alignment method, conformational flexibility, and type of protein target. We used eight test systems for which X-ray data on 145 ligand-protein complexes were available. The use of X-r...
journal_title:Journal of chemical information and modeling
pub_type: journal article
doi:10.1021/ci060134h
update_date:2006-09-01 00:00:00
abstract::Molecular fingerprints are widely used for similarity-based virtual screening in drug discovery projects. In this paper we discuss the performance and the complementarity of nine two-dimensional fingerprints (Daylight, Unity, AlFi, Hologram, CATS, TRUST, Molprint 2D, ChemGPS, and ALOGP) in retrieving active molecules ...
journal_title:Journal of chemical information and modeling
pub_type: journal article
doi:10.1021/ci0504723
update_date:2006-05-01 00:00:00
abstract::Small-molecule protein docking is an essential tool in drug design and to understand molecular recognition. In the present work we introduce FlexAID, a small-molecule docking algorithm that accounts for target side-chain flexibility and utilizes a soft scoring function, i.e. one that is not highly dependent on specifi...
journal_title:Journal of chemical information and modeling
pub_type: journal article
doi:10.1021/acs.jcim.5b00078
update_date:2015-07-27 00:00:00
abstract::The effects of paclitaxel (PTX) loading fraction and spatial PTX arrangement on poly(γ-glutamyl-glutamate) paclitaxel (PGG-PTX) aggregation were explored using coarse-grained molecular dynamics. Results show that the PTX loading fraction does not significantly impact aggregation, and the spatial PTX arrangement only a...
journal_title:Journal of chemical information and modeling
pub_type: letter
doi:10.1021/ci200214m
update_date:2011-12-27 00:00:00
abstract::Activity cliffs are formed by structurally similar compounds having large potency differences. Coordinated activity cliffs evolve when compounds within groups of structural neighbors form multiple cliffs with different partners, giving rise to local networks of cliffs in a data set. Using particle swarm optimization, ...
journal_title:Journal of chemical information and modeling
pub_type: journal article
doi:10.1021/ci3000503
update_date:2012-04-23 00:00:00
abstract::We introduce SARANEA, an open-source Java application for interactive exploration of structure-activity relationship (SAR) and structure-selectivity relationship (SSR) information in compound sets of any source. SARANEA integrates various SAR and SSR analysis functions and utilizes a network-like similarity graph data...
journal_title:Journal of chemical information and modeling
pub_type: journal article
doi:10.1021/ci900416a
update_date:2010-01-01 00:00:00
abstract::A compound's synthetic accessibility (SA) is an important aspect of drug design, since in some cases computer-designed compounds cannot be synthesized. There have been several reports on SA prediction, most of which have focused on the difficulties of synthetic reactions based on retro-synthesis analyses, reaction dat...
journal_title:Journal of chemical information and modeling
pub_type: journal article
doi:10.1021/ci500568d
update_date:2014-12-22 00:00:00
abstract::Membrane fusion, a key step in the early stages of virus propagation, allows the release of the viral genome in the host cell cytoplasm. The process is initiated by fusion peptides that are small, hydrophobic components of viral membrane-embedded glycoproteins and are typically conserved within virus families. Here, w...
journal_title:Journal of chemical information and modeling
pub_type: journal article
doi:10.1021/acs.jcim.0c01231
update_date:2021-01-25 00:00:00
abstract::The interaction between small molecules and proteins is one of the major concerns for structure-based drug design because the principles of protein-ligand interactions and molecular recognition are not thoroughly understood. Fortunately, the analysis of protein-ligand complexes in the Protein Data Bank (PDB) enables u...
journal_title:Journal of chemical information and modeling
pub_type: journal article
doi:10.1021/ci100386y
update_date:2011-04-25 00:00:00
abstract::We have applied the two most commonly used methods for automatic matched pair identification, obtained the optimum settings, and discovered that the two methods are synergistic. A turbocharging approach to matched pair analysis is advocated in which a first round (a conservative categorical approach that uses an analo...
journal_title:Journal of chemical information and modeling
pub_type: journal article
doi:10.1021/acs.jcim.7b00335
update_date:2017-10-23 00:00:00
abstract::The primary goal of this project was to evaluate the performance of the Standard and Enforced Geometry Optimization (SEGO) method which we have recently developed. The SEGO method has been designed for an automatic location of multiple minima on the molecular Potential Energy Surface (PES), and its usefulness has been...
journal_title:Journal of chemical information and modeling
pub_type: journal article
doi:10.1021/acs.jcim.9b00352
update_date:2019-08-26 00:00:00
abstract::Advances in the development of high-throughput screening and automated chemistry have rapidly accelerated the production of chemical and biological data, much of them freely accessible through literature aggregator services such as ChEMBL and PubChem. Here, we explore how to use this comprehensive mapping of chemical ...
journal_title:Journal of chemical information and modeling
pub_type: journal article
doi:10.1021/acs.jcim.9b00526
update_date:2019-11-25 00:00:00
abstract::Generalization of an earlier algorithm has led to the development of new local structural alignment algorithms for prediction of protein-protein binding sites. The algorithms use maximum cliques on protein graphs to define structurally similar protein regions. The search for structural neighbors in the new algorithms ...
journal_title:Journal of chemical information and modeling
pub_type: journal article
doi:10.1021/ci100265x
update_date:2010-10-25 00:00:00
abstract::Cytochrome P450 3A4 metabolizes nearly 50% of the drugs currently in clinical use with a broad range of substrate specificity. Early prediction of metabolites of xenobiotic compounds is crucial for cost efficient drug discovery and development. We developed a new combined model, MLite, for the prediction of regioselec...
journal_title:Journal of chemical information and modeling
pub_type: journal article
doi:10.1021/ci7003576
update_date:2008-03-01 00:00:00
abstract::Throughout the drug discovery process, discovery teams are compelled to use statistics for making decisions using data from a variety of inputs. For instance, teams are asked to prioritize compounds for subsequent stages of the drug discovery process, given results from multiple screens. To assist in the prioritizatio...
journal_title:Journal of chemical information and modeling
pub_type: journal article
doi:10.1021/ci600556v
update_date:2007-05-01 00:00:00
abstract::The Torsion Library contains hundreds of rules for small molecule conformations which have been derived from the Cambridge Structural Database (CSD) and are curated by molecular design experts. The torsion rules are encoded as SMARTS patterns and categorize rotatable bonds via a traffic light coloring scheme. We have ...
journal_title:Journal of chemical information and modeling
pub_type: journal article
doi:10.1021/acs.jcim.5b00522
update_date:2016-01-25 00:00:00
abstract::As a key player in cell adhesion, the glycoprotein fibronectin is involved in the complex mechanobiology of the extracellular matrix. Although the function of many modules in the fibronectin molecule has already been understood, the structure and biological relevance of the C-terminal cross-linked region (CTXL) still ...
journal_title:Journal of chemical information and modeling
pub_type: journal article
doi:10.1021/acs.jcim.9b00555
update_date:2019-10-28 00:00:00
abstract::Big data is one of the key transformative factors which increasingly influences all aspects of modern life. Although this transformation brings vast opportunities it also generates novel challenges, not the least of which is organizing and searching this data deluge. The field of medicinal chemistry is not different: ...
journal_title:Journal of chemical information and modeling
pub_type: journal article
doi:10.1021/acs.jcim.7b00249
update_date:2017-08-28 00:00:00
abstract::The antiproliferative factor (APF) involved in interstitial cystitis is a glycosylated nonapeptide (TVPAAVVVA) containing a sialylated core 1 α-O-disaccharide linked to the N-terminal threonine. The chemical structure of APF was deduced using spectroscopic techniques and confirmed using total synthesis. The synthetic ...
journal_title:Journal of chemical information and modeling
pub_type: journal article
doi:10.1021/ci400147s
update_date:2013-05-24 00:00:00
abstract::The solvation layer surrounding a protein is clearly an intrinsic part of protein structure-dynamics-function, and our understanding of how the hydration dynamics influences protein function is emerging. We have recently reported simulations indicating a correlation between regional hydration dynamics and the structur...
journal_title:Journal of chemical information and modeling
pub_type: journal article
doi:10.1021/acs.jcim.9b00009
update_date:2019-05-28 00:00:00
abstract::The configuring of a radial basis function network (RBFN) consists of selecting the network parameters (centers and widths in RBF units and weights between the hidden and output layers) and network architecture. The issues of suboptimum and overfitting, however, often occur in RBFN configuring. This paper presented a ...
journal_title:Journal of chemical information and modeling
pub_type: journal article
doi:10.1021/ci600218d
update_date:2006-11-01 00:00:00
abstract::The SZMAP method computes binding free energies and the corresponding thermodynamic components for water molecules in the binding site of a protein structure [SZMAP, 1.0.0; OpenEye Scientific Software Inc.: Santa Fe, NM, USA, 2011]. In this work, the ability of SZMAP to predict water structure and thermodynamic s...
journal_title:Journal of chemical information and modeling
pub_type: journal article
doi:10.1021/ci500746d
update_date:2015-08-24 00:00:00
abstract::The momentum gained by research on biologics has not been met yet with equal thrust on the informatics side. There is a noticeable lack of software for data management that empowers the bench scientists working on the development of biologic therapeutics. SARvision|Biologics is a tool to analyze data associated with b...
journal_title:Journal of chemical information and modeling
pub_type: journal article
doi:10.1021/ci400333x
update_date:2013-10-28 00:00:00
abstract::We introduce TICRA (transplant-insert-constrain-relax-assemble), a method for modeling the structure of unknown protein-ligand complexes using the X-ray crystal structures of homologous proteins and ligands with known activity. We present results from modeling the structures of protein kinase-inhibitor complexes using...
journal_title:Journal of chemical information and modeling
pub_type: journal article
doi:10.1021/ci100256u
update_date:2011-01-24 00:00:00
abstract::Our recent studies show that the single Tyr residue in the sequence of amyloid-β42 (Aβ42) is reactive toward various ligands, including metals and adenosine triphosphate (see: Coskuner, O. J. Biol. Inorg. Chem. 2016, 21, 957-973 and Coskuner, O.; Murray, I. V. J. J. Alzheimer's Dis. 2014, 41, 561-574). Ho...
journal_title:Journal of chemical information and modeling
pub_type: journal article
doi:10.1021/acs.jcim.6b00761
update_date:2017-06-26 00:00:00