Multivariate linear QSPR/QSAR models: Rigorous evaluation of variable selection for PLS.

Abstract:

Basic chemometric methods for building empirical regression models for QSPR/QSAR are briefly described from a user's point of view. Emphasis is placed on PLS regression, simple variable selection, and a cautious evaluation of PLS model performance by repeated double cross validation (rdCV). A demonstration example is worked out for QSPR models that predict gas chromatographic retention indices (values between 197 and 504 units) of 209 polycyclic aromatic compounds (PAC) from molecular descriptors generated by Dragon software. The most favorable models were obtained from data sets that also contain descriptors from 3D structures with all H-atoms (computed by Corina software), combined with stepwise variable selection (reducing 2688 descriptors to a subset of 22). The final QSPR model has typical prediction errors for the retention index of ±12 units (95% tolerance interval, for test set objects). Programs and data are provided as supplementary material for the open source R software environment.
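The rdCV idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' R code: to stay dependency-free it substitutes a k-nearest-neighbour regressor for PLS (the number of neighbours k stands in for the number of PLS components), uses synthetic one-descriptor data, and all function names and loop sizes are assumptions chosen for illustration. The structure is what matters: an inner cross validation picks the model complexity from calibration data only, an outer cross validation produces test-set predictions, and the whole procedure is repeated with new random splits.

```python
import random
import statistics


def knn_predict(train, x, k):
    """Predict y at x as the mean y of the k nearest training points (1-D x)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return statistics.fmean(y for _, y in nearest)


def folds(indices, n_folds, rng):
    """Shuffle indices and split them into n_folds roughly equal segments."""
    shuffled = indices[:]
    rng.shuffle(shuffled)
    return [shuffled[i::n_folds] for i in range(n_folds)]


def mse(train, test, k):
    """Mean squared error of k-NN predictions for the test points."""
    return statistics.fmean((y - knn_predict(train, x, k)) ** 2 for x, y in test)


def rdcv(data, k_grid, n_rep=5, n_outer=4, n_inner=5, seed=0):
    """Repeated double CV: return test-set residuals and chosen complexities."""
    rng = random.Random(seed)
    residuals, chosen = [], []
    all_idx = list(range(len(data)))
    for _ in range(n_rep):                              # repetitions: new random splits
        for test_idx in folds(all_idx, n_outer, rng):   # outer loop: test sets
            test_set = set(test_idx)
            calib_idx = [i for i in all_idx if i not in test_set]
            inner = folds(calib_idx, n_inner, rng)

            def inner_error(k):
                # inner loop: validation error estimated from calibration data only
                errs = []
                for val_idx in inner:
                    val_set = set(val_idx)
                    train = [data[i] for i in calib_idx if i not in val_set]
                    val = [data[i] for i in val_idx]
                    errs.append(mse(train, val, k))
                return statistics.fmean(errs)

            best_k = min(k_grid, key=inner_error)
            chosen.append(best_k)
            calib = [data[i] for i in calib_idx]
            for i in test_idx:                          # predict untouched test objects
                x, y = data[i]
                residuals.append(y - knn_predict(calib, x, best_k))
    return residuals, chosen


# Synthetic data (an assumption for illustration): y = 2x + Gaussian noise.
noise = random.Random(1)
data = [(x, 2.0 * x + noise.gauss(0, 1)) for x in (i / 2 for i in range(60))]

residuals, chosen = rdcv(data, k_grid=[1, 3, 5, 7])

# Empirical 95% tolerance interval: 2.5% and 97.5% quantiles of test-set residuals.
cuts = statistics.quantiles(residuals, n=40)
low, high = cuts[0], cuts[-1]
print(len(residuals), "test-set residuals; 95% interval:", low, high)
```

Because every object lands in a test set once per repetition, the pooled residuals give a distribution of test-set errors rather than a single number, which is what allows a tolerance interval such as the ±12 units quoted above to be reported.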

authors

Varmuza K,Filzmoser P,Dehmer M

doi

10.5936/csbj.201302007

pub_date

2013-03-02 00:00:00

pages

e201302007

issn

2001-0370

pii

CSBJ-5-e201302007

journal_volume

5

pub_type

Journal Article, Review
  • Learning distributed representations of RNA and protein sequences and its application for predicting lncRNA-protein interactions.

    abstract::The long noncoding RNAs (lncRNAs) are ubiquitous in organisms and play a crucial role in a variety of biological processes and complex diseases. Emerging evidence suggests that lncRNAs interact with corresponding proteins to perform their regulatory functions. Therefore, identifying interacting lncRNA-protein pairs is t...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article

    doi:10.1016/j.csbj.2019.11.004

    authors: Yi HC,You ZH,Cheng L,Zhou X,Jiang TH,Li X,Wang YB

    update_date:2019-11-30 00:00:00

  • Of mice and men: Dissecting the interaction between Listeria monocytogenes Internalin A and E-cadherin.

    abstract::We report a study of the interaction between internalin A (inlA) and human or murine E-cadherin (Ecad). inlA is used by Listeria monocytogenes to internalize itself into host cells, but the bacterium is unable to invade murine cells, which has been attributed to the difference in sequence between hEcad and mEcad. Using...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article

    doi:10.5936/csbj.201303022

    authors: Genheden S,Eriksson LA

    update_date:2013-12-15 00:00:00

  • Directional Switching Mechanism of the Bacterial Flagellar Motor.

    abstract::Bacteria sense temporal changes in extracellular stimuli via sensory signal transducers and move by rotating flagella towards a favorable environment for their survival. Each flagellum is a supramolecular motility machine consisting of a bi-directional rotary motor, a universal joint and a helical propeller. The ...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article, Review

    doi:10.1016/j.csbj.2019.07.020

    authors: Minamino T,Kinoshita M,Namba K

    update_date:2019-07-31 00:00:00

  • Dissecting complex polyketide biosynthesis.

    abstract::Numerous bioactive natural products are synthesised by modular polyketide synthases. These compounds can be made in high yield by native multienzyme assembly lines. However, formation of analogues by genetically engineered systems is often considerably less efficient. Biochemical studies on intact polyketide synthase ...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article, Review

    doi:10.5936/csbj.201210010

    authors: Caffrey P

    update_date:2012-11-17 00:00:00

  • Ancestral zinc-finger bearing protein MucR in alpha-proteobacteria: A novel xenogeneic silencer?

    abstract::The MucR/Ros family protein is conserved in alpha-proteobacteria and characterized by its zinc-finger motif that has been proposed as the ancestral domain from which the eukaryotic C2H2 zinc-finger structure evolved. In the past decades, accumulated evidence has revealed MucR as a pleiotropic transcriptional regulat...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article, Review

    doi:10.1016/j.csbj.2020.11.028

    authors: Jiao J,Tian CF

    update_date:2020-11-19 00:00:00

  • In Silico Prediction of Large-Scale Microbial Production Performance: Constraints for Getting Proper Data-Driven Models.

    abstract::Industrial bioreactors range from 10,000 to 700,000 L and characteristically show different zones of substrate availabilities, dissolved gas concentrations and pH values reflecting physical, technical and economic constraints of scale-up. Microbial producers are fluctuating inside the bioreactors thereby experiencing ...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article, Review

    doi:10.1016/j.csbj.2018.06.002

    authors: Zieringer J,Takors R

    update_date:2018-07-06 00:00:00

  • Tribology of bio-inspired nanowrinkled films on ultrasoft substrates.

    abstract::Biomimetic design of new materials uses nature as antetype, learning from billions of years of evolution. This work emphasizes the mechanical and tribological properties of skin, combining both hardness and wear resistance of its surface (the stratum corneum) with high elasticity of the bulk (epidermis, dermis, hypode...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article

    doi:10.5936/csbj.201303002

    authors: Lackner JM,Waldhauser W,Major L,Teichert C,Hartmann P

    update_date:2013-05-08 00:00:00

  • Asymmetric Spontaneous Intercalation of Lutein into a Phospholipid Bilayer, a Computational Study.

    abstract::Lutein, a hydroxylated carotenoid, is a pigment synthesised by plants and bacteria. Animals are unable to synthesise lutein; nevertheless, it is present in animal tissues, where its only source is dietary intake. Both in plants and animals, carotenoids are associated mainly with membranes where they carry out importan...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article

    doi:10.1016/j.csbj.2019.04.001

    authors: Makuch K,Markiewicz M,Pasenkiewicz-Gierula M

    update_date:2019-04-06 00:00:00

  • Multi-population cohort meta-analysis of human intestinal microbiota in early life reveals the existence of infant community state types (ICSTs).

    abstract::Appropriate development of the intestinal microbiota during infancy is known to be important for human health. In fact, aberrant alterations of the microbial composition during childhood may cause short- and/or long-term negative health effects. Many factors influence the initial assembly and subsequent progression of...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article

    doi:10.1016/j.csbj.2020.08.028

    authors: Mancabelli L,Tarracchini C,Milani C,Lugli GA,Fontana F,Turroni F,van Sinderen D,Ventura M

    update_date:2020-09-15 00:00:00

  • Structure-based discovery of neoandrographolide as a novel inhibitor of Rab5 to suppress cancer growth.

    abstract::Rab5 is a small GTPase that plays a crucial role in oncogenic signal transduction, which was considered as an attractive target for cancer therapy. Rapid GDP/GTP exchange in the packet of Rab5 sustains its high activity for promoting cancer progression. However, Rab5 currently remains undruggable due to the lack of sp...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article

    doi:10.1016/j.csbj.2020.11.033

    authors: Zhang J,Sun Y,Zhong LY,Yu NN,Ouyang L,Fang RD,Wang Y,He QY

    update_date:2020-11-30 00:00:00

  • SNP2Structure: A Public and Versatile Resource for Mapping and Three-Dimensional Modeling of Missense SNPs on Human Protein Structures.

    abstract::One of the long-standing challenges in biology is to understand how non-synonymous single nucleotide polymorphisms (nsSNPs) change protein structure and further affect their function. While it is impractical to solve all the mutated protein structures experimentally, it is quite feasible to model the mutated structure...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article

    doi:10.1016/j.csbj.2015.09.002

    authors: Wang D,Song L,Singh V,Rao S,An L,Madhavan S

    update_date:2015-09-30 00:00:00

  • Synthesis, LSD1 Inhibitory Activity, and LSD1 Binding Model of Optically Pure Lysine-PCPA Conjugates.

    abstract::Compounds that inhibit the catalytic function of lysine-specific demethylase 1 (LSD1) are interesting as therapeutic agents. Recently, we identified three lysine-phenylcyclopropylamine conjugates, NCD18, NCD25, and NCD41, which are potent LSD1 inactivators. However, in our previous study, because we tested those compo...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article

    doi:10.5936/csbj.201402002

    authors: Itoh Y,Ogasawara D,Ota Y,Mizukami T,Suzuki T

    update_date:2014-02-15 00:00:00

  • The ecogenomics of dsDNA bacteriophages in feces of stabled and feral horses.

    abstract::The viromes of the mammalian lower gut were shown to be heavily dominated by bacteriophages; however, only for humans were the composition and intervariability of the bacteriophage communities studied in depth. Here we present an ecogenomics survey of dsDNA bacteriophage diversity in the feces of horses (Equus caballu...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article

    doi:10.1016/j.csbj.2020.10.036

    authors: Babenko VV,Millard A,Kulikov EE,Spasskaya NN,Letarova MA,Konanov DN,Belalov IS,Letarov AV

    update_date:2020-11-10 00:00:00

  • Transcriptome and proteome analyses reveal the regulatory networks and metabolite biosynthesis pathways during the development of Tolypocladium guangdongense.

    abstract::Tolypocladium guangdongense has a similar metabolite profile to Ophiocordyceps sinensis, a highly regarded fungus used for traditional Chinese medicine with high nutritional and medicinal value. Although the genome sequence of T. guangdongense has been reported, relatively little is known about the regulatory networks...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article

    doi:10.1016/j.csbj.2020.07.014

    authors: Wang G,Li M,Zhang C,Cheng H,Gao Y,Deng W,Li T

    update_date:2020-07-25 00:00:00

  • Assessment of automated analyses of cell migration on flat and nanostructured surfaces.

    abstract::Motility studies of cells often rely on computer software that analyzes time-lapse recorded movies and establishes cell trajectories fully automatically. This raises the question of reproducibility of results, since different programs could yield significantly different results of such automated analysis. The fact tha...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article

    doi:10.5936/csbj.201207004

    authors: Grădinaru C,Lopacińska JM,Huth J,Kestler HA,Flyvbjerg H,Mølhave K

    update_date:2012-11-21 00:00:00

  • Transcriptomics in the tropics: Total RNA-based profiling of Costa Rican bromeliad-associated communities.

    abstract::RNA-Seq was used to examine the microbial, eukaryotic, and viral communities in water catchments ('tanks') formed by tropical bromeliads from Costa Rica. In total, transcripts with taxonomic affiliation to a wide array of bacteria, archaea, and eukaryotes, were observed, as well as RNA-viruses that appeared related to...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article

    doi:10.1016/j.csbj.2014.12.001

    authors: Goffredi SK,Jang GE,Haroon MF

    update_date:2014-12-13 00:00:00

  • A structural overview of GH61 proteins - fungal cellulose degrading polysaccharide monooxygenases.

    abstract::Recent years have witnessed a spurt of activities in the elucidation of the molecular function of a class of proteins with great potential in biomass degradation. GH61 proteins are of fungal origin and were originally classified in family 61 of the glycoside hydrolases. From the beginning they were strongly suspected ...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article, Review

    doi:10.5936/csbj.201209019

    authors: Lo Leggio L,Welner D,De Maria L

    update_date:2012-11-30 00:00:00

  • From plant genomes to protein families: computational tools.

    abstract::The development of new high-throughput sequencing technologies has increased dramatically the number of successful genomic projects. Thus, draft genomic sequences of more than 60 plant species are currently available. Suitable bioinformatics tools are being developed to assemble, annotate and analyze the enormous numb...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article, Review

    doi:10.5936/csbj.201307001

    authors: Martinez M

    update_date:2013-08-14 00:00:00

  • Statistical methods for the analysis of high-throughput metabolomics data.

    abstract::Metabolomics is a relatively new high-throughput technology that aims at measuring all endogenous metabolites within a biological sample in an unbiased fashion. The resulting metabolic profiles may be regarded as functional signatures of the physiological state, and have been shown to comprise effects of genetic regul...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article, Review

    doi:10.5936/csbj.201301009

    authors: Bartel J,Krumsiek J,Theis FJ

    update_date:2013-03-22 00:00:00

  • IKKα Promotes the Progression and Metastasis of Non-Small Cell Lung Cancer Independently of its Subcellular Localization.

    abstract::Lung cancer is the leading worldwide cause of cancer mortality; however, neither curative treatments nor substantial prolonged survival has been achieved, highlighting the need for investigating new proteins responsible for its development and progression. IKKα is an essential protein for cell survival and differentia...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article

    doi:10.1016/j.csbj.2019.02.003

    authors: Page A,Ortega A,Alameda JP,Navarro M,Paramio JM,Saiz-Pardo M,Almeida EI,Hernández P,Fernández-Aceñero MJ,García-Fernández RA,Casanova ML

    update_date:2019-02-07 00:00:00

  • Prediction of Ligand Transport along Hydrophobic Enzyme Nanochannels.

    abstract::Buried active sites of enzymes are connected to the bulk solvent through a network of hydrophobic channels. We developed a discretized model that can accurately predict ligand transport along hydrophobic channels up to six orders of magnitude faster than any other existing method. The non-dimensional nature of the mod...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article

    doi:10.1016/j.csbj.2019.06.001

    authors: Escalante DE,Aksan A

    update_date:2019-06-11 00:00:00

  • Network analysis of human post-mortem microarrays reveals novel genes, microRNAs, and mechanistic scenarios of potential importance in fighting Huntington's disease.

    abstract::Huntington's disease is a progressive neurodegenerative disorder characterized by motor disturbances, cognitive decline, and neuropsychiatric symptoms. In this study, we utilized network-based analysis in an attempt to explore and understand the underlying molecular mechanism and to identify critical molecular players...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article

    doi:10.1016/j.csbj.2016.02.001

    authors: Chandrasekaran S,Bonchev D

    update_date:2016-02-10 00:00:00

  • Ethical issues related to research on genome editing in human embryos.

    abstract::Although the potential advantages of clinical germline genome editing (GGE) over currently available methods are limited, the implementation of GGE in the clinic has been proposed and discussed. Ethical issues related to such an application have been extensively debated; meanwhile, seemingly less attention has been pa...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article, Review

    doi:10.1016/j.csbj.2020.03.014

    authors: Niemiec E,Howard HC

    update_date:2020-03-21 00:00:00

  • Homology modeling in the time of collective and artificial intelligence.

    abstract::Homology modeling is a method for building protein 3D structures using protein primary sequence and utilizing prior knowledge gained from structural similarities with other proteins. The homology modeling process is done in sequential steps where sequence/structure alignment is optimized, then a backbone is built and ...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article, Review

    doi:10.1016/j.csbj.2020.11.007

    authors: Hameduh T,Haddad Y,Adam V,Heger Z

    update_date:2020-11-14 00:00:00

  • The role of protein interaction networks in systems biomedicine.

    abstract::The challenging task of studying and modeling complex dynamics of biological systems in order to describe various human diseases has gathered great interest in recent years. Major biological processes are mediated through protein interactions, hence there is a need to understand the chaotic network that forms these pr...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article, Review

    doi:10.1016/j.csbj.2014.08.008

    authors: Sevimoglu T,Arga KY

    update_date:2014-09-03 00:00:00

  • Stage-specific control of stem cell niche architecture in the Drosophila testis by the posterior Hox gene Abd-B.

    abstract::A fundamental question in biology is how complex structures are maintained after their initial specification. We address this question by reviewing the role of the Hox gene Abd-B in Drosophila testis organogenesis, which proceeds through embryonic, larval and pupal stages to reach maturation in adult stages. The data ...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article, Review

    doi:10.1016/j.csbj.2015.01.001

    authors: Papagiannouli F,Lohmann I

    update_date:2015-01-21 00:00:00

  • Mini-review: Strategies for Variation and Evolution of Bacterial Antigens.

    abstract::Across the eubacteria, antigenic variation has emerged as a strategy to evade host immunity. However, phenotypic variation in some of these antigens also allows the bacteria to exploit variable host niches as well. The specific mechanisms are not shared-derived characters although there is considerable convergent evol...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article, Review

    doi:10.1016/j.csbj.2015.07.002

    authors: Foley J

    update_date:2015-07-26 00:00:00

  • FangNet: Mining herb hidden knowledge from TCM clinical effective formulas using structure network algorithm.

    abstract::The use of herbs to treat various human diseases has been recorded for thousands of years. In Asia's current medical system, numerous herbal formulas have been repeatedly verified to confirm their effectiveness in different periods, which is a great resource for drug innovation and discovery. Through the mining of the...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article

    doi:10.1016/j.csbj.2020.11.036

    authors: Bu D,Xia Y,Zhang J,Cao W,Huo P,Wang Z,He Z,Ding L,Wu Y,Zhang S,Gao K,Yu H,Liu T,Ding X,Gu X,Zhao Y

    update_date:2020-12-04 00:00:00

  • Predicting the impacts of mutations on protein-ligand binding affinity based on molecular dynamics simulations and machine learning methods.

    abstract:Purpose:Mutation-induced variation of protein-ligand binding affinity is the key to many genetic diseases and the emergence of drug resistance, and therefore predicting such mutation impacts is of great importance. In this work, we aim to predict the mutation impacts on protein-ligand binding affinity using efficient s...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article

    doi:10.1016/j.csbj.2020.02.007

    authors: Wang DD,Ou-Yang L,Xie H,Zhu M,Yan H

    update_date:2020-02-20 00:00:00

  • BOG: R-package for Bacterium and virus analysis of Orthologous Groups.

    abstract::BOG (Bacterium and virus analysis of Orthologous Groups) is a package for identifying groups of differentially regulated genes in the light of gene functions for various virus and bacteria genomes. It is designed to identify Clusters of Orthologous Groups (COGs) that are enriched among genes that have gone through sig...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article

    doi:10.1016/j.csbj.2015.05.002

    authors: Park J,Taslim C,Lin S

    update_date:2015-05-21 00:00:00