Visualization of Solar Cell Library Space by Dimensionality Reduction Methods.

Abstract:

Visualizing high-dimensional data by projecting them into a two- or three-dimensional space is a popular approach in many scientific fields, including computer-aided drug design and cheminformatics. In contrast, dimensionality reduction techniques have been far less explored for materials informatics. Nevertheless, similar to their usefulness in analyzing the space of, e.g., drug-like molecules, such techniques could provide useful insights on materials space, including an intuitive grasp of the overall distribution of samples, the identification of interesting trends, including the formation of materials clusters and the presence of activity cliffs and outliers, and rational navigation through this space in the search for new materials. Here we present the first application of four dimensionality reduction techniques, namely, principal component analysis (PCA), kernel PCA, Isomap, and diffusion map, to visualize and analyze a part of the materials space populated by solar cells made of metal oxides. Solar cells in general and metal-oxide-based solar cells in particular hold the promise of contributing to the world's search for clean and affordable energy resources. With the exception of PCA, these methods have seldom been used to visualize chemistry space and almost never been used to visualize materials space. For this purpose, we integrated five metal-oxide-based solar cell libraries into a uniform database and subjected it to dimensionality reduction by all four methods, comparing their performances using various criteria such as maintaining the local environment of samples and the clustering structure in the low-dimensional space. We also looked at the number of outliers produced by each method and analyzed common outliers.
We found that PCA performs best in terms of the ability to correctly maintain the local environment of samples, whereas Isomap does the best job of assigning class membership on the basis of the identities of nearest neighbors (i.e., it is the best classifier). We also found that many of the outliers identified by all of the methods could be rationalized. We suggest that the methods used in this work could be extended to study other types of solar cells, thereby setting the ground for further analysis of the photovoltaic (PV) space as well as other regions of materials space.
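
The kind of comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' workflow: the data are synthetic (not the solar cell libraries), scikit-learn's `trustworthiness` score is used here as an assumed stand-in for "maintaining the local environment of samples", and a cross-validated k-nearest-neighbor classifier stands in for assigning class membership from nearest neighbors. The diffusion map is omitted because scikit-learn does not ship one. The function name `compare_embeddings` and all parameter values are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA, KernelPCA
from sklearn.manifold import Isomap, trustworthiness
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler


def compare_embeddings(X, y, n_neighbors=5, random_state=0):
    """Project X to 2D with several methods and score each projection.

    Returns {method: {"trustworthiness": ..., "knn_accuracy": ...}}.
    Both scores are illustrative proxies, not the paper's exact criteria.
    """
    X = StandardScaler().fit_transform(X)
    methods = {
        "PCA": PCA(n_components=2, random_state=random_state),
        "kernel PCA": KernelPCA(n_components=2, kernel="rbf",
                                random_state=random_state),
        "Isomap": Isomap(n_components=2, n_neighbors=10),
    }
    results = {}
    for name, model in methods.items():
        Z = model.fit_transform(X)
        # How well are high-dimensional neighborhoods kept in 2D?
        trust = trustworthiness(X, Z, n_neighbors=n_neighbors)
        # How well do 2D nearest neighbors predict the class label?
        knn = KNeighborsClassifier(n_neighbors=n_neighbors)
        acc = cross_val_score(knn, Z, y, cv=5).mean()
        results[name] = {"trustworthiness": float(trust),
                         "knn_accuracy": float(acc)}
    return results


# Synthetic stand-in data: a 3D latent cloud embedded linearly in 10D,
# with three classes defined by terciles of the first latent coordinate.
rng = np.random.default_rng(0)
latent = rng.normal(size=(300, 3))
X = latent @ rng.normal(size=(3, 10)) + 0.05 * rng.normal(size=(300, 10))
y = np.digitize(latent[:, 0], np.quantile(latent[:, 0], [1 / 3, 2 / 3]))

results = compare_embeddings(X, y)
for name, scores in results.items():
    print(f"{name:>10}: trust={scores['trustworthiness']:.3f} "
          f"kNN acc={scores['knn_accuracy']:.3f}")
```

Which method scores best depends on the data and on the neighborhood sizes chosen, so the ranking on this toy set should not be read as reproducing the paper's findings.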

journal_name

J Chem Inf Model

authors

Kaspi O,Yosipof A,Senderowitz H

doi

10.1021/acs.jcim.8b00552

subject

Has Abstract

pub_date

2018-12-24 00:00:00

pages

2428-2439

issue

12

eissn

1549-960X

issn

1549-9596

journal_volume

58

pub_type

Journal Article
  • Ligand-Based Discovery of a New Scaffold for Allosteric Modulation of the μ-Opioid Receptor.

    abstract::With the hope of discovering effective analgesics with fewer side effects, attention has recently shifted to allosteric modulators of the opioid receptors. In the past two years, the first chemotypes of positive or silent allosteric modulators (PAMs or SAMs, respectively) of μ- and δ-opioid receptor types have been re...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.5b00388

    authors: Bisignano P,Burford NT,Shang Y,Marlow B,Livingston KE,Fenton AM,Rockwell K,Budenholzer L,Traynor JR,Gerritz SW,Alt A,Filizola M

    update date:2015-09-28 00:00:00

  • Similarity perception of reactions catalyzed by oxidoreductases and hydrolases using different classification methods.

    abstract::In this work, the perception of similarity of reactions catalyzed by hydrolases and oxidoreductases on the basis of the overall breaking and making of bonds of reactions is investigated. Six physicochemical properties for the reacting bond in the substrate of each enzymatic reaction were calculated to describe the cha...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci9004833

    authors: Hu X,Yan A,Tan T,Sacher O,Gasteiger J

    update date:2010-06-28 00:00:00

  • Computational Design of Biologically Active Anticancer Peptides and Their Interactions with Heterogeneous POPC/POPS Lipid Membranes.

    abstract::Over the last few decades, anticancer peptides (ACPs) have turned into potential warheads against cancer. Apart from small molecules and monoclonal antibodies, ACPs have been proven to be effective against cancer cells. ACPs are small cationic peptides that selectively bind to the negatively charged cancer cell membra...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00348

    authors: Singh M,Kumar V,Sikka K,Thakur R,Harioudh MK,Mishra DP,Ghosh JK,Siddiqi MI

    update date:2020-01-27 00:00:00

  • RED: a set of molecular descriptors based on Renyi entropy.

    abstract::New molecular descriptors, RED (Renyi entropy descriptors), based on the generalized entropies introduced by Renyi are presented. Topological descriptors based on molecular features have proven to be useful for describing molecular profiles. Renyi entropy is used as a variability measure to contract a feature-pair dis...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci900275w

    authors: Delgado-Soler L,Toral R,Tomás MS,Rubio-Martinez J

    update date:2009-11-01 00:00:00

  • Comparison of Implicit and Explicit Solvation Models for Iota-Cyclodextrin Conformation Analysis from Replica Exchange Molecular Dynamics.

    abstract::Large ring cyclodextrins have become increasingly important for drug delivery applications. In this work, we have performed replica-exchange molecular dynamics simulations using both implicit and explicit water solvation models to study the conformational diversity of iota-cyclodextrin containing 14 α-1,4 glycosidic l...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.6b00595

    authors: Khuntawee W,Kunaseth M,Rungnim C,Intagorn S,Wolschann P,Kungwan N,Rungrotmongkol T,Hannongbua S

    update date:2017-04-24 00:00:00

  • Development of novel statistical potentials describing cation-pi interactions in proteins and comparison with semiempirical and quantum chemistry approaches.

    abstract::Novel statistical potentials derived from known protein structures are presented. They are designed to describe cation-pi and amino-pi interactions between a positively charged amino acid or an amino acid carrying a partially charged amino group and an aromatic moiety. These potentials are based on the propensity of r...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci050395b

    authors: Gilis D,Biot C,Buisine E,Dehouck Y,Rooman M

    update date:2006-03-01 00:00:00

  • Trans and Cis Conformations of the Antihypertensive Drug Valsartan Respectively Lock the Inactive and Active-like States of Angiotensin II Type 1 Receptor: A Molecular Dynamics Study.

    abstract::Angiotensin II type 1 receptor (AT1R) is the principal regulator of blood pressure in humans. The overactivation of AT1R by the stimulation of angiotensin II would result in high blood pressure. To prevent hypertension, nonpeptide "sartan" drugs, such as valsartan (VST), have been developed to competitively block the ...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.8b00364

    authors: Wang L,Yan F

    update date:2018-10-22 00:00:00

  • Combinatorial × computational × cheminformatics (C3) approach to characterization of congeneric libraries of organic pollutants.

    abstract::Congeners are molecules based on the same carbon skeleton but are different by the number of substituents and/or a substitution pattern. Examples are 1-chloronaphthalene, 1,4-dichloronaphthalene, and 1,3,8-trichloronaphthalene. Various persistent organic pollutants (POPs) exist in the environment as families of congen...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci300289b

    authors: Haranczyk M,Urbaszek P,Ng EG,Puzyn T

    update date:2012-11-26 00:00:00

  • Rigorous Computational Study Reveals What Docking Overlooks: Double Trouble from Membrane Association in Protein Kinase C Modulators.

    abstract::Increasing protein kinase C (PKC) activity is of potential therapeutic value. Its activation involves an interaction between the C1 domain and diacylglycerol (DAG) at intracellular membrane surfaces; DAG mimetics hold promise as new drugs. We previously developed the isophthalate derivative HMI-1a3, an effective but h...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c00624

    authors: Lautala S,Provenzani R,Koivuniemi A,Kulig W,Talman V,Róg T,Tuominen RK,Yli-Kauhaluoma J,Bunker A

    update date:2020-11-23 00:00:00

  • Computational evidence for the role of Arabidopsis thaliana UVR8 as UV-B photoreceptor and identification of its chromophore amino acids.

    abstract::A homology model of the Arabidopsis thaliana UV resistance locus 8 (UVR8) protein is presented herein, showing a seven-bladed β-propeller conformation similar to the globular structure of RCC1. The UVR8 amino acid sequence contains a very high amount of conserved tryptophans, and the homology model shows that seven of...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci200017f

    authors: Wu M,Grahn E,Eriksson LA,Strid A

    update date:2011-06-27 00:00:00

  • Scaling predictive modeling in drug development with cloud computing.

    abstract::Growing data sets with increased time for analysis is hampering predictive modeling in drug discovery. Model building can be carried out on high-performance computer clusters, but these can be expensive to purchase and maintain. We have evaluated ligand-based modeling on cloud computing resources where computations ar...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci500580y

    authors: Moghadam BT,Alvarsson J,Holm M,Eklund M,Carlsson L,Spjuth O

    update date:2015-01-26 00:00:00

  • Novel inhibitors of human histone deacetylase (HDAC) identified by QSAR modeling of known inhibitors, virtual screening, and experimental validation.

    abstract::Inhibitors of histone deacetylases (HDACIs) have emerged as a new class of drugs for the treatment of human cancers and other diseases because of their effects on cell growth, differentiation, and apoptosis. In this study we have developed several quantitative structure-activity relationship (QSAR) models for 59 chemi...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci800366f

    authors: Tang H,Wang XS,Huang XP,Roth BL,Butler KV,Kozikowski AP,Jung M,Tropsha A

    update date:2009-02-01 00:00:00

  • Free energy calculations give insight into the stereoselective hydroxylation of α-ionones by engineered cytochrome P450 BM3 mutants.

    abstract::Previously, stereoselective hydroxylation of α-ionone by Cytochrome P450 BM3 mutants M01 A82W and M11 L437N was observed. While both mutants hydroxylate α-ionone in a regioselective manner at the C3 position, M01 A82W catalyzes formation of trans-3-OH-α-ionone products whereas M11 L437N exhibits opposite stereoselecti...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci300243n

    authors: de Beer SB,Venkataraman H,Geerke DP,Oostenbrink C,Vermeulen NP

    update date:2012-08-27 00:00:00

  • Random forest models to predict aqueous solubility.

    abstract::Random Forest regression (RF), Partial-Least-Squares (PLS) regression, Support Vector Machines (SVM), and Artificial Neural Networks (ANN) were used to develop QSPR models for the prediction of aqueous solubility, based on experimental data for 988 organic molecules. The Random Forest regression model predicted aqueou...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci060164k

    authors: Palmer DS,O'Boyle NM,Glen RC,Mitchell JB

    update date:2007-01-01 00:00:00

  • RASA: a rapid retrosynthesis-based scoring method for the assessment of synthetic accessibility of drug-like molecules.

    abstract::In this account, a rapid retrosynthesis-based scoring method for the assessment of synthetic accessibility of drug-like molecules, called RASA (Retrosynthesis-based Assessment of Synthetic Accessibility) is devised. RASA first constructs a synthesis tree for the target molecule based on retrosynthetic analysis; in thi...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci100216g

    authors: Huang Q,Li LL,Yang SY

    update date:2011-10-24 00:00:00

  • Holistic Approach to Partial Covalent Interactions in Protein Structure Prediction and Design with Rosetta.

    abstract::Partial covalent interactions (PCIs) in proteins, which include hydrogen bonds, salt bridges, cation-π, and π-π interactions, contribute to thermodynamic stability and facilitate interactions with other biomolecules. Several score functions have been developed within the Rosetta protein modeling framework that identif...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.7b00398

    authors: Combs SA,Mueller BK,Meiler J

    update date:2018-05-29 00:00:00

  • DiSCuS: an open platform for (not only) virtual screening results management.

    abstract::DiSCuS, a "Database System for Compound Selection", has been developed. The primary goal of DiSCuS is to aid researchers in the steps subsequent to generating high-throughput virtual screening (HTVS) results, such as selection of compounds for further study, purchase, or synthesis. To do so, DiSCuS provides (1) a stor...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci400587f

    authors: Wójcikowski M,Zielenkiewicz P,Siedlecki P

    update date:2014-01-27 00:00:00

  • Exploration of the accessible chemical space of acyclic alkanes.

    abstract::Saturated acyclic alkanes show steric strain if they are highly branched and, in extreme cases, fall apart rapidly at room temperature. Consequently, attempts to count the number of isomeric forms for a given molecular formula that neglect this physical consideration will inevitably overestimate the size of the availa...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci700246b

    authors: Paton RS,Goodman JM

    update date:2007-11-01 00:00:00

  • Locating sweet spots for screening hits and evaluating pan-assay interference filters from the performance analysis of two lead-like libraries.

    abstract::The efficiency of automated compound screening is heavily influenced by the design and the quality of the screening libraries used. We recently reported on the assembly of one diverse and one target-focused lead-like screening library. Using data from 15 enzyme-based screenings conducted using these libraries, their p...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci300382f

    authors: Mok NY,Maxe S,Brenk R

    update date:2013-03-25 00:00:00

  • Cyclohexane-Based Scaffold Molecules Acting as Anion Transport, Anionophores, via Noncovalent Interactions.

    abstract::A theoretical study of a variety of cyclohexane-based anion transporters interacting with the chloride anion has been conducted using density functional theory. The calculations have been performed in the gas phase but also, in order to describe the solvation effects on the interaction, two different solvents-chlorofo...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00154

    authors: Sánchez-Sanz G,Trujillo C

    update date:2019-05-28 00:00:00

  • Impact of template choice on homology model efficiency in virtual screening.

    abstract::Homology modeling is a reliable method of predicting the three-dimensional structures of proteins that lack NMR or X-ray crystallographic data. It employs the assumption that a structural resemblance exists between closely related proteins. Despite the availability of many crystal structures of possible templates, onl...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci500001f

    authors: Rataj K,Witek J,Mordalski S,Kosciolek T,Bojarski AJ

    update date:2014-06-23 00:00:00

  • Simulation of 2D NMR Spectra of Carbohydrates Using GODESS Software.

    abstract::Glycan Optimized Dual Empirical Spectrum Simulation (GODESS) is a web service, which has been recently shown to be one of the most accurate tools for simulation of (1)H and (13)C 1D NMR spectra of natural carbohydrates and their derivatives. The new version of GODESS supports visualization of the simulated (1)H and (1...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.6b00083

    authors: Kapaev RR,Toukach PV

    update date:2016-06-27 00:00:00

  • Machine Learning Enhanced Spectrum Recognition Based on Computer Vision (SRCV) for Intelligent NMR Data Extraction.

    abstract::A machine learning enhanced spectrum recognition system called spectrum recognition based on computer vision (SRCV) for data extraction from previously analyzed 13C and 1H NMR spectra has been developed. The intelligent system was designed with four function modules to extract data from three areas of NMR images, incl...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c01046

    authors: Jia W,Yang Z,Yang M,Cheng L,Lei Z,Wang X

    update date:2021-01-25 00:00:00

  • Multiview Joint Learning-Based Method for Identifying Small-Molecule-Associated MiRNAs by Integrating Pharmacological, Genomics, and Network Knowledge.

    abstract::The emergence of a large amount of pharmacological, genomic, and network knowledge data provides new challenges and opportunities for drug discovery and development. Identification of real small-molecule drug (SM)-miRNA associations is not only important in the development of effective drug repositioning but also cruc...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.0c00244

    authors: Shen C,Luo J,Lai Z,Ding P

    update date:2020-08-24 00:00:00

  • Imputation of Assay Bioactivity Data Using Deep Learning.

    abstract::We describe a novel deep learning neural network method and its application to impute assay pIC50 values. Unlike conventional machine learning approaches, this method is trained on sparse bioactivity data as input, typical of that found in public and commercial databases, enabling it to learn directly from correlation...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.8b00768

    authors: Whitehead TM,Irwin BWJ,Hunt P,Segall MD,Conduit GJ

    update date:2019-03-25 00:00:00

  • In silico prediction of aqueous solubility: the solubility challenge.

    abstract::The dissolution of a chemical into water is a process fundamental to both chemistry and biology. The persistence of a chemical within the environment and the effects of a chemical within the body are dependent primarily upon aqueous solubility. With the well-documented limitations hindering the accurate experimental d...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci900286s

    authors: Hewitt M,Cronin MT,Enoch SJ,Madden JC,Roberts DW,Dearden JC

    update date:2009-11-01 00:00:00

  • Ab Initio Investigation of CO2 Adsorption on 13-Atom 4d Clusters.

    abstract::In this work, we report an ab initio investigation based on density functional theory calculations within van der Waals D3 corrections to investigate the adsorption properties and activation of CO2 on transition-metal (TM) 13-atom clusters (TM = Ru, Rh, Pd, Ag), which is a key step for the development of subnano catal...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b00792

    authors: Batista KEA,Ocampo-Restrepo VK,Soares MD,Quiles MG,Piotrowski MJ,Da Silva JLF

    update date:2020-02-24 00:00:00

  • Ligand binding determinants for angiotensin II type 1 receptor from computer simulations.

    abstract::The ligand binding determinants for the angiotensin II type 1 receptor (AT1R), a G protein-coupled receptor (GPCR), have been characterized by means of computer simulations. As a first step, a pharmacophore model of various known AT1R ligands exhibiting a wide range of binding affinities was generated. Second, a struc...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/ci400400m

    authors: Matsoukas MT,Cordomí A,Ríos S,Pardo L,Tselios T

    update date:2013-11-25 00:00:00

  • Evaluation of Generalized Born Models for Large Scale Affinity Prediction of Cyclodextrin Host-Guest Complexes.

    abstract::Binding affinity prediction with implicit solvent models remains a challenge in virtual screening for drug discovery. In order to assess the predictive power of implicit solvent models in docking techniques with Amber scoring, three generalized Born models (GBHCT, GBOBCI, and GBOBCII) available in Dock 6.7 were utiliz...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.6b00418

    authors: Zhang H,Yin C,Yan H,van der Spoel D

    update date:2016-10-24 00:00:00

  • Assessment of the Cruzain Cysteine Protease Reversible and Irreversible Covalent Inhibition Mechanism.

    abstract::Reversible and irreversible covalent ligands are advanced cysteine protease inhibitors in the drug development pipeline. K777 is an irreversible inhibitor of cruzain, a necessary enzyme for the survival of the Trypanosoma cruzi (T. cruzi) parasite, the causative agent of Chagas disease. Despite their importance, irrev...

    journal_title:Journal of chemical information and modeling

    pub_type: Journal Article

    doi:10.1021/acs.jcim.9b01138

    authors: Silva JRA,Cianni L,Araujo D,Batista PHJ,de Vita D,Rosini F,Leitão A,Lameira J,Montanari CA

    update date:2020-03-23 00:00:00