Abstract:
Hundreds of absorption, distribution, metabolism, excretion, and toxicity (ADME/Tox) models have been described in the literature in the past decade, yet most are inaccessible to anyone but their authors. Public accessibility is also an issue for computational models of bioactivity, and the inability to share such models remains a major challenge limiting drug discovery. We describe the creation of a reference implementation of a Bayesian model-building software module, which we have released as an open source component that is now included in the Chemistry Development Kit (CDK) project and implemented in the CDD Vault and in several mobile apps. We use this implementation to build an array of Bayesian models for ADME/Tox, in vitro and in vivo bioactivity, and other physicochemical properties. We show that these models possess cross-validation receiver operating characteristic (ROC) values comparable to those reported in prior publications using alternative tools. We describe how implementing Bayesian models with FCFP6 descriptors generated in the CDD Vault enables the rapid production of robust machine learning models from public data or the user's own datasets. The current study sets the stage for generating models in proprietary software (such as CDD) and exporting them in a format that can be run in open source software using CDK components. This work also demonstrates that we can enable biocomputation across distributed private or public datasets to enhance drug discovery.
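The abstract does not spell out the estimator, so the following is only a minimal sketch of how a fingerprint-based Bayesian model of this kind can work, assuming the commonly used Laplacian-corrected naive Bayes form. Plain sets of integers stand in for FCFP6 feature hashes (a real workflow would obtain them from a cheminformatics toolkit), and the function names `train_laplacian_bayes`, `score`, and `roc_auc` are hypothetical, not taken from the CDK or CDD Vault.

```python
import math
from collections import defaultdict


def train_laplacian_bayes(fingerprints, labels):
    """Return a per-feature log contribution table.

    fingerprints: list of sets of integer feature hashes (stand-ins for FCFP6).
    labels: list of booleans, True for active compounds.
    Assumed estimator: log((A_f + 1) / (N_f * P + 1)), where N_f is the number
    of compounds containing feature f, A_f the number of those that are
    active, and P the overall fraction of actives.
    """
    total = len(labels)
    base_rate = sum(1 for y in labels if y) / total
    n_with = defaultdict(int)  # compounds containing each feature
    a_with = defaultdict(int)  # active compounds containing each feature
    for fp, is_active in zip(fingerprints, labels):
        for feature in fp:
            n_with[feature] += 1
            if is_active:
                a_with[feature] += 1
    return {
        f: math.log((a_with[f] + 1.0) / (n_with[f] * base_rate + 1.0))
        for f in n_with
    }


def score(contributions, fingerprint):
    """Sum the contributions of the features present; unseen features add 0."""
    return sum(contributions.get(f, 0.0) for f in fingerprint)


def roc_auc(scores, labels):
    """ROC area as the fraction of active/inactive pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: features 1-3 occur in actives, 4-6 in inactives.
toy_fps = [{1, 2}, {1, 3}, {4, 5}, {4, 6}]
toy_labels = [True, True, False, False]
toy_model = train_laplacian_bayes(toy_fps, toy_labels)
toy_scores = [score(toy_model, fp) for fp in toy_fps]
```

Scores from this form are unnormalized sums of log ratios, so they are useful for ranking rather than as probabilities. A cross-validated ROC value of the kind the abstract reports would come from applying `roc_auc` to scores of held-out compounds, not to the training set as in this toy example.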
journal_name:J Chem Inf Model
journal_title:Journal of chemical information and modeling
authors:Clark AM, Dole K, Coulon-Spektor A, McNutt A, Grass G, Freundlich JS, Reynolds RC, Ekins S
doi:10.1021/acs.jcim.5b00143
subject:Has Abstract
pub_date:2015-06-22 00:00:00
pages:1231-45
issue:6
eissn:1549-9596
issn:1549-960X
journal_volume:55
pub_type: Journal Article
abstract::The solvation layer surrounding a protein is clearly an intrinsic part of protein structure-dynamics-function, and our understanding of how the hydration dynamics influences protein function is emerging. We have recently reported simulations indicating a correlation between regional hydration dynamics and the structur...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.9b00009
update_date:2019-05-28 00:00:00
abstract::In this DFT study, activities of 11 different N2O4, N2O3, and NO2 core containing Zr(IV) complexes, 4,13-diaza-18-crown-6 (I'N2O4), 1,4,10-trioxa-7,13-diazacyclopentadecane (I'N2O3), and 2-(2-methoxy)ethanol (I'NO2), respectively, and their analogues in peptide hydrolysis have been investigated. Based on the experimen...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.6b00781
update_date:2017-05-22 00:00:00
abstract::The homodimeric catabolite activator protein (CAP) regulates the transcription of several bacterial genes based on the cellular concentration of cyclic adenosine monophosphate (cAMP). The binding of cAMP to CAP triggers allosteric communication between the cAMP binding domains (CBD) and DNA binding domains (DBD) of CA...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.0c00617
update_date:2020-12-28 00:00:00
abstract::The accurate prediction of the adsorption energies of unsaturated molecules on graphene in the presence of water is essential for the design of molecules that can modify its properties and that can aid its processability. We here show that a semiempirical MO method corrected for dispersive interactions (PM6-DH2) can p...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci5003729
update_date:2014-08-25 00:00:00
abstract::Docking into multiple receptor conformations ("ensemble docking") has been proposed, and employed, in the hope that it may account for receptor flexibility in virtual screening and thus provide higher enrichments than docking into single rigid receptor structures. The statistical analyses presented in this paper provi...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci900407c
update_date:2010-04-26 00:00:00
abstract::Protein-ligand binding is essential to almost all life processes. The understanding of protein-ligand interactions is fundamentally important to rational drug and protein design. Based on large scale data sets, we show that protein rigidity strengthening or flexibility reduction is a mechanism in protein-ligand bindin...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.7b00226
update_date:2017-07-24 00:00:00
abstract::Virtual screening is a powerful methodology to search for new small molecule inhibitors against a desired molecular target. Usually, it involves evaluating thousands of compounds (derived from large databases) in order to select a set of potential binders that will be tested in the wet-lab. The number of tested compou...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.7b00241
update_date:2017-08-28 00:00:00
abstract::We present our newly developed and highly efficient lossless compression algorithm for trajectories of atom positions and volumetric data. The algorithm is designed as a two-step approach. In the first step, efficient polynomial extrapolation schemes reduce the information entropy of the data by exploiting both spatia...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.8b00501
update_date:2018-10-22 00:00:00
abstract::Advances in the development of high-throughput screening and automated chemistry have rapidly accelerated the production of chemical and biological data, much of them freely accessible through literature aggregator services such as ChEMBL and PubChem. Here, we explore how to use this comprehensive mapping of chemical ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.9b00526
update_date:2019-11-25 00:00:00
abstract::Cathepsin A is a mammalian lysosomal enzyme that catalyzes the hydrolysis of the carboxy-terminal amino acids of polypeptides and also regulates beta-galactosidase and neuraminidase-1 activities through the formation of a multienzymic complex in lysosomes. Human cathepsin A (hCathA), yeast carboxypeptidase (CPY), and ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci060093p
update_date:2006-09-01 00:00:00
abstract::A comprehensive data set of aligned ligands with highly similar binding pockets from the Protein Data Bank has been built. Based on this data set, a scoring function for recognizing good alignment poses for small molecules has been developed. This function is based on atoms and hydrogen-bond projected features. The co...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci100227h
update_date:2010-09-27 00:00:00
abstract::Metal-ligand (M-L) bond lengths for a range of ligands (carboxylates, chlorides, pyridines, water, tertiary phosphines, and alkenes) and a variety of metals have been retrieved from the Cambridge Structural Database, CSD. Analysis of the factors which affect M-L bond lengths (for example, ligand coordination mode, oxi...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci0500785
update_date:2005-11-01 00:00:00
abstract::Sixteen FDA-approved drugs were investigated to elucidate their mechanisms of action (MOAs) and clinical functions by pathway analysis based on retrieved drug targets interacting with or affected by the investigated drugs. Protein and gene targets and associated pathways were obtained by data-mining of public database...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci4005354
update_date:2014-02-24 00:00:00
abstract::Curvularia lunata is a dark pigmented fungus that is the causative agent of several diseases in plants and in both immunodeficient and immunocompetent patients. 1,8-Dihydroxynaphthalene-melanin is found in the cell wall of C. lunata and is believed to be the important virulence factor of dematiaceous fungi. Trihydroxy...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci2001499
update_date:2011-07-25 00:00:00
abstract::The programs Phase and Catalyst HypoGen are compared for their performance in determining three-dimensional quantitative structure-activity relationships. Eight sets of compounds with measured activity were collected from the public literature and partitioned into suitable training and test sets by an automated proced...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci7000082
update_date:2007-05-01 00:00:00
abstract::Spin diffusion is a formidable problem when interpreting NMR data of chemical compounds. We developed a method to reconstruct the conformational ensemble of flexible molecules displaying spin diffusion, which minimizes the subjective bias in the interpretation of experimental data and which can be used routinely to ob...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.9b00259
update_date:2019-06-24 00:00:00
abstract::Dynamical properties of proteins play an essential role in their function exertion. The elastic network model (ENM) is an effective and efficient tool in characterizing the intrinsic dynamical properties encoded in biomacromolecule structures. The Gaussian network model (GNM) and anisotropic network model (ANM) are th...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.0c01178
update_date:2021-01-26 00:00:00
abstract::The applicability and scope of 3D QSAR methods (CoMFA, CoMSIA) to screen databases are examined. A protocol requiring minimal user intervention has been established to align training and test set molecules using FlexS. As model system isozymes of human carbonic anhydrase (hCA) are used, all results are exemplified stu...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci7002945
update_date:2008-02-01 00:00:00
abstract::As computational drug design becomes increasingly reliant on virtual screening and on high-throughput 3D modeling, the need for fast, robust, and reliable methods for sampling molecular conformations has become greater than ever. Furthermore, chemical novelty is at a premium, forcing medicinal chemists to explore more...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci900238a
update_date:2009-10-01 00:00:00
abstract::The knowledge of the capacity of a data set to be modeled in the first stages of the building of quantitative structure-activity relationship (QSAR) prediction models is an important issue because it might reduce the effort and time necessary to select or reject data sets and in refining the data set's composition. Th...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.8b00188
update_date:2018-09-24 00:00:00
abstract::Big data is one of the key transformative factors which increasingly influences all aspects of modern life. Although this transformation brings vast opportunities it also generates novel challenges, not the least of which is organizing and searching this data deluge. The field of medicinal chemistry is not different: ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.7b00249
update_date:2017-08-28 00:00:00
abstract::This paper is an overview of the most significant and impactful interpretation approaches of quantitative structure-activity relationship (QSAR) models, their development, and application. The evolution of the interpretation paradigm from "model → descriptors → (structure)" to "model → structure" is indicated. The lat...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article, Review
doi:10.1021/acs.jcim.7b00274
update_date:2017-11-27 00:00:00
abstract::Reversible covalent inhibitors have drawn increasing attention in drug design, as they are likely more potent than noncovalent inhibitors and less toxic than covalent inhibitors. Despite those advantages, the computational prediction of reversible covalent binding presents a formidable challenge because the binding pr...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.8b00959
update_date:2019-05-28 00:00:00
abstract::In this study, two probabilistic machine-learning algorithms were compared for in silico target prediction of bioactive molecules, namely the well-established Laplacian-modified Naïve Bayes classifier (NB) and the more recently introduced (to Cheminformatics) Parzen-Rosenblatt Window. Both classifiers were trained in ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci300435j
update_date:2013-08-26 00:00:00
abstract::Saturated acyclic alkanes show steric strain if they are highly branched and, in extreme cases, fall apart rapidly at room temperature. Consequently, attempts to count the number of isomeric forms for a given molecular formula that neglect this physical consideration will inevitably overestimate the size of the availa...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci700246b
update_date:2007-11-01 00:00:00
abstract::One of the largest commercial applications of enzymes and surfactants is as main components in modern detergents. The high concentration of surfactant compounds usually present in detergents can, however, negatively affect the enzymatic activity. To remedy this drawback, it is of great importance to characterize the i...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.8b00857
update_date:2019-05-28 00:00:00
abstract::Flavins are versatile biological cofactors which catalyze proton-coupled electron transfers (PCET) with varying number and coupling of electrons. Flavin-mediated oxidations of nicotinamide adenine dinucleotide (NADH) and of succinate, initial redox reactions in cellular respiration, were examined here with multiconfig...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.0c00945
update_date:2020-12-28 00:00:00
abstract::Protein-protein interactions are central to many biological processes, from intracellular communication to cytoskeleton assembly, and therefore represent an important class of targets for new therapeutics. The most common secondary structure in natural proteins is an α-helix. Small molecules seem to be attractive cand...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci200424a
update_date:2012-02-27 00:00:00
abstract::Glycan Optimized Dual Empirical Spectrum Simulation (GODESS) is a web service, which has been recently shown to be one of the most accurate tools for simulation of (1)H and (13)C 1D NMR spectra of natural carbohydrates and their derivatives. The new version of GODESS supports visualization of the simulated (1)H and (1...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.6b00083
update_date:2016-06-27 00:00:00
abstract::Deep learning has demonstrated significant potential in advancing state of the art in many problem domains, especially those benefiting from automated feature extraction. Yet, the methodology has seen limited adoption in the field of ligand-based virtual screening (LBVS) as traditional approaches typically require lar...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.0c00622
update_date:2020-10-26 00:00:00