Abstract:
Retrieving molecules with specific structural features is a fundamental requirement of today's molecular database technologies. Estimates put the chemical space relevant for drug discovery at around 10⁶⁰ molecules. This figure is many orders of magnitude larger than the number of molecules conventional databases hold today and will store in the future. An elegant description of such a large chemical space is provided by the concept of fragment spaces. A fragment space comprises fragments, which are molecules with open valences, together with rules describing how to connect these fragments into products. Due to the combinatorial nature of fragment spaces, a complete enumeration of their products is intractable. We present an algorithm to search fragment spaces for generic chemical patterns as expressed in the SMARTS chemical pattern language. Our method allows specification of the chemical surrounding of an atom in a query and therefore enables a chemically intuitive search. During the search, the costly enumeration of products is avoided. The result is a fragment space that exactly describes all possible molecules that contain the user-defined pattern. We evaluated the algorithm in three different drug development use cases and performed a large-scale statistical analysis with 738 SMARTS patterns on three publicly available fragment spaces. Our results show the ability of the algorithm to explore the chemical space around known active molecules, to analyze fragment spaces for the presence of likely toxic molecules, and to identify complex macromolecular structures under additional structural constraints. By searching the fragment space in its nonenumerated form, spaces covering up to 10¹⁹ molecules can be examined in times ranging between 47 s and 19 min, depending on the complexity of the query pattern.
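To make concrete why a fragment space can be analyzed without enumerating its products, the following minimal Python sketch counts the products of a toy fragment space by dynamic programming and cross-checks the count against explicit enumeration. It is not the authors' SMARTS search algorithm. The fragment names, the link types `L1` and `L2`, the single connection rule, and the depth limit are all hypothetical choices made for illustration, and attachments that are symmetric duplicates are counted separately.

```python
from functools import lru_cache
from itertools import product as cartesian
from math import prod

# Hypothetical toy fragment space: fragment name -> link types of its open valences.
FRAGMENTS = {
    "core": ("L1", "L1"),
    "linker": ("L2", "L1"),
    "capA": ("L2",),
    "capB": ("L2",),
    "capC": ("L1",),
}
# Hypothetical connection rule: an L1 link may only bond to an L2 link.
RULES = {("L1", "L2")}


def compatible(a: str, b: str) -> bool:
    """Return True if two link types may be bonded under RULES (symmetric)."""
    return (a, b) in RULES or (b, a) in RULES


@lru_cache(maxsize=None)
def count_subtrees(link_type: str, depth: int) -> int:
    """Count ways to saturate one open link using at most `depth` fragment layers."""
    if depth == 0:
        return 0  # no fragments left to attach, so the link stays open
    total = 0
    for links in FRAGMENTS.values():
        for i, own in enumerate(links):
            if compatible(link_type, own):
                # Every remaining open link of the attached fragment must be saturated too.
                total += prod(
                    count_subtrees(other, depth - 1)
                    for j, other in enumerate(links)
                    if j != i
                )
    return total


def count_products(root: str, depth: int) -> int:
    """Count fully saturated products grown from `root` without building them."""
    return prod(count_subtrees(link, depth) for link in FRAGMENTS[root])


def enumerate_subtrees(link_type: str, depth: int) -> list:
    """Brute-force counterpart of count_subtrees: build every subtree explicitly."""
    if depth == 0:
        return []
    results = []
    for name, links in FRAGMENTS.items():
        for i, own in enumerate(links):
            if compatible(link_type, own):
                options = [
                    enumerate_subtrees(other, depth - 1)
                    for j, other in enumerate(links)
                    if j != i
                ]
                for children in cartesian(*options):
                    results.append((name, i, children))
    return results


def enumerate_products(root: str, depth: int) -> list:
    """Brute-force counterpart of count_products."""
    options = [enumerate_subtrees(link, depth) for link in FRAGMENTS[root]]
    return list(cartesian(*options))
```

Calling `count_products("core", depth)` touches each (link type, depth) pair only once thanks to memoization, whereas `enumerate_products` materializes every product. The same contrast, on a far larger scale and with real chemistry, is what makes searching a fragment space in its nonenumerated form attractive.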
journal_name: J Chem Inf Model
journal_title: Journal of chemical information and modeling
authors: Ehrlich HC, Henzler AM, Rarey M
doi: 10.1021/ci400107k
subject: Has Abstract
pub_date: 2013-07-22 00:00:00
pages: 1676-88
issue: 7
eissn: 1549-9596
issn: 1549-960X
journal_volume: 53
pub_type: Journal Article

abstract::A generic chemical transformation may often be achieved under various synthetic conditions. However, for any specific reagents, only one or a few among the reported synthetic protocols may be successful. For example, Michael β-addition reactions may proceed under different choices of solvent (e.g., hydrophobic, aproti...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci500698a
update_date:2015-02-23 00:00:00
abstract::With a rapid increase in the number of high-resolution protein-ligand structures, the known protein-ligand structures can be used to gain insight into ligand-binding modes in a target protein. On the basis of the fact that the structurally similar binding sites share information about their ligands, we have developed ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci300178e
update_date:2012-10-22 00:00:00
abstract::Following the theoretical model by Hann et al. moderately complex structures are preferable lead compounds since they lead to specific binding events involving the complete ligand molecule. To make this concept usable in practice for library design, we studied several complexity measures on the biological activity of ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci0503558
update_date:2006-03-01 00:00:00
abstract::The second extracellular loops (ECL2s) of G-protein-coupled receptors (GPCRs) are often involved in GPCR functions, and their structures have important implications in drug discovery. However, structure prediction of ECL2 is difficult because of its long length and the structural diversity among different GPCRs. In th...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.8b00148
update_date:2018-06-25 00:00:00
abstract::Deep learning has drawn significant attention in different areas including drug discovery. It has been proposed that it could outperform other machine learning algorithms, especially with big data sets. In the field of pharmaceutical industry, machine learning models are built to understand quantitative structure-acti...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.8b00671
update_date:2019-03-25 00:00:00
abstract::Molecular docking can account for receptor flexibility by combining the docking score over multiple rigid receptor conformations, such as snapshots from a molecular dynamics simulation. Here, we evaluate a number of common snapshot selection strategies using a quality metric from stratified sampling, the efficiency of...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.8b00314
update_date:2018-09-24 00:00:00
abstract::The ONIOM scheme is one of the most popular QM/MM approaches, but its extended application has been so far hindered by the limited availability of force fields in most practical implementations. This paper describes a simple software code to overcome this limitation, and its application to three representative chemica...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.8b00332
update_date:2018-09-24 00:00:00
abstract::There is a renewed interest in computer-aided synthesis planning, where the vast majority of approaches require the application of retrosynthetic reaction templates. Here we introduce RDChiral, an open-source Python wrapper for RDKit designed to provide consistent handling of stereochemical information in applying ret...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.9b00286
update_date:2019-06-24 00:00:00
abstract::Small molecules targeting peripheral CB1 receptors have therapeutic potential in a variety of disorders including obesity-related, hormonal, and metabolic abnormalities, while avoiding the psychoactive effects in the central nervous system. We applied our in-house algorithm, iterative stochastic elimination, to produc...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.9b00577
update_date:2019-09-23 00:00:00
abstract::While critically reviewing the current status of what is known about C28H14 and C30H14 benzenoid isomers, which are ubiquitous pyrolytic constituents, some new insights will be presented. Representative isomers belonging to these benzenoid hydrocarbons are at the crossroads to homologous series that extend to infinite...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci050298i
update_date:2006-03-01 00:00:00
abstract::The roles of chemical compounds in biological systems are now systematically analyzed by high-throughput experimental technologies. To automate the processing and interpretation of large-scale data it is necessary to develop bioinformatics methods to extract information from the chemical structures of these small mole...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci700006f
update_date:2007-07-01 00:00:00
abstract::Side-chain modeling is critical for protein structure prediction since the uniqueness of the protein structure is largely determined by its side-chain packing conformation. In this paper, differing from most approaches that rely on rotamer library sampling, we first propose a novel side-chain rotamer prediction method...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.0c00951
update_date:2020-12-28 00:00:00
abstract::The hepatitis C virus (HCV) NS5B RNA-dependent RNA polymerase (RdRP) is a crucial and unique component of the HCV RNA replication machinery and a validated target for drug discovery. Multiple crystal structures of NS5B inhibitor complexes have facilitated the identification of novel compound scaffolds through in silic...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci400644r
update_date:2014-02-24 00:00:00
abstract::Up to now, publicly available data sets to build and evaluate Ames mutagenicity prediction tools have been very limited in terms of size and chemical space covered. In this report we describe a new unique public Ames mutagenicity data set comprising about 6500 nonconfidential compounds (available as SMILES strings and...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci900161g
update_date:2009-09-01 00:00:00
abstract::The Common Instrument Middleware Architecture (CIMA) aims at Grid-enabling a wide range of scientific instruments and sensors to enable easy access to and sharing and storage of data produced by these instruments and sensors. This paper describes the implementation of CIMA applied to the field of single-crystal X-ray ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci050368l
update_date:2006-05-01 00:00:00
abstract::The modeling of nonlinear descriptor-target relationships is a topic of considerable interest in drug discovery. We, herein, continue reporting the use of the self-organizing map, a nonlinear, topology-preserving pattern recognition technique that exhibits considerable promise in modeling and decoding these relationshi...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci0500841
update_date:2006-01-01 00:00:00
abstract::Protein-protein interactions (PPIs) play vital roles in regulating biological processes, such as cellular and signaling pathways. Hotspots are certain residues located at protein-protein interfaces that contribute more in protein-protein binding than other residues. Research on the mutational effects of hotspots is im...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.0c00966
update_date:2021-01-25 00:00:00
abstract::A model for prediction of percent intestinal absorption (%Abs) of neutral molecules was developed based upon surface charges of the molecule calculated by density functional theory (DFT). The surface charges are decomposed into sigma moments which are correlated to a partition coefficient representing transfer of the ...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci049653f
update_date:2005-09-01 00:00:00
abstract::Flavins are versatile biological cofactors which catalyze proton-coupled electron transfers (PCET) with varying number and coupling of electrons. Flavin-mediated oxidations of nicotinamide adenine dinucleotide (NADH) and of succinate, initial redox reactions in cellular respiration, were examined here with multiconfig...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.0c00945
update_date:2020-12-28 00:00:00
abstract::Compounds with high-confidence target annotations and activity measurements in the original and current release of the ChEMBL database have been compared to better understand how the growth of compound activity data might influence the spectrum of ligand-target interactions and the degree of target promiscuity among a...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci3003304
update_date:2012-10-22 00:00:00
abstract::Metabolic stability is an important property of drug molecules that should, optimally, be taken into account early on in the drug design process. Along with numerous medium- or high-throughput assays being implemented in early drug discovery, a prediction tool for this property could be of high value. However, metabolic...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci700142c
update_date:2008-04-01 00:00:00
abstract::In the preceding paper (Duca, J. S.; Madison, V. S.; Voigt, J. H. J. Chem. Inf. Model. 2008, 48, 659-668), the accuracy of docking and affinity predictions of the Gold and Glide programs were investigated using single protein conformations spanning 150 CDK2/inhibitor crystallographic complexes. High docking accuracy w...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci700428d
update_date:2008-03-01 00:00:00
abstract::In a departure from conventional chemical approaches, data-driven models of chemical reactions have recently been shown to be statistically successful using machine learning. These models, however, are largely black box in character and have not provided the kind of chemical insights that historically advanced the fie...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.9b00721
update_date:2020-03-23 00:00:00
abstract::Growing data sets with increased time for analysis is hampering predictive modeling in drug discovery. Model building can be carried out on high-performance computer clusters, but these can be expensive to purchase and maintain. We have evaluated ligand-based modeling on cloud computing resources where computations ar...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci500580y
update_date:2015-01-26 00:00:00
abstract::Reactive oxygen species such as superoxide are potentially harmful byproducts of the aerobic metabolism in the inner mitochondrial membrane, and complexes I, II, III of the electron transport chain have been identified as primary sources. The mitochondrial fatty acid β-oxidation pathway may also play a yet uncharacter...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.9b00702
update_date:2019-11-25 00:00:00
abstract::Interfacial hydration strongly influences interactions between biomolecules. For example, drug-target complexes are often stabilized by hydration networks formed between hydrophilic residues and water molecules at the interface. Exhaustive exploration of hydration networks is challenging for experimental as well as th...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.5b00638
update_date:2016-01-25 00:00:00
abstract::The in silico prediction of unwanted side effects (SEs) caused by the promiscuous behavior of drugs and their targets is highly relevant to the pharmaceutical industry. Considerable effort is now being put into computational and experimental screening of several suspected off-target proteins in the hope that SEs might...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.5b00120
update_date:2015-09-28 00:00:00
abstract::The possibility of improving the predictive ability of comparative molecular field analysis (CoMFA) by settings optimization has been evaluated to show that CoMFA predictive ability can be improved. Ten different CoMFA settings are evaluated, producing a total of 6120 models. This method has been applied to nine diffe...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci049612j
update_date:2006-01-01 00:00:00
abstract::Calcium and magnesium ions play important roles in many physicochemical processes. To facilitate the investigation of phenomena related to these ions that occur over large length and time scales, a coarse-grained force field (CGFF) is developed for MgCl2 and CaCl2 aqueous solutions. The ions are modeled by CG beads wi...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/acs.jcim.7b00206
update_date:2017-07-24 00:00:00
abstract::Cytochrome P450 3A4 metabolizes nearly 50% of the drugs currently in clinical use with a broad range of substrate specificity. Early prediction of metabolites of xenobiotic compounds is crucial for cost efficient drug discovery and development. We developed a new combined model, MLite, for the prediction of regioselec...
journal_title:Journal of chemical information and modeling
pub_type: Journal Article
doi:10.1021/ci7003576
update_date:2008-03-01 00:00:00