GlyStruct: glycation prediction using structural properties of amino acid residues.

Abstract:

BACKGROUND:Glycation is one of the post-translational modifications (PTMs) in which sugar molecules are covalently bonded to residues in protein sequences. It has recently become one of the clinically important PTMs because it is implicated in many chronic and age-related complications. Because glycation is a non-enzymatic reaction, predicting it is a great challenge: the sequence motifs show no significant bias. RESULTS:We developed GlyStruct, a classifier based on a support vector machine, to predict glycated and non-glycated lysine residues using structural properties of amino acid residues. The features used were secondary structure, accessible surface area and the local backbone torsion angles. For this work, a benchmark dataset was extracted containing 235 glycated and 303 non-glycated lysine residues. GlyStruct improved performance by approximately 10% over the benchmark method, Gly-PseAAC. For 10-fold cross-validation, GlyStruct achieved a sensitivity, specificity, accuracy and Matthews correlation coefficient of 0.7013, 0.7989, 0.7562 and 0.5065, respectively. CONCLUSION:Glycation has emerged as one of the clinically important PTMs of proteins in recent times. The development of computational tools to predict glycation has therefore become necessary, as such tools could help medical professionals administer drugs and manage patients more effectively. The proposed predictor classifies glycated and non-glycated lysine residues with consistently promising results across various cross-validation schemes and outperforms other state-of-the-art methods.
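As a reading aid, the four metrics reported in the abstract can be computed from the counts of a binary confusion matrix. The sketch below uses the standard textbook definitions; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper or its dataset.

```python
from math import sqrt


def binary_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, accuracy and MCC from confusion-matrix counts.

    tp/fn: positives (e.g. glycated sites) predicted correctly/incorrectly.
    tn/fp: negatives (e.g. non-glycated sites) predicted correctly/incorrectly.
    """
    sensitivity = tp / (tp + fn)          # fraction of positives recovered
    specificity = tn / (tn + fp)          # fraction of negatives recovered
    accuracy = (tp + tn) / (tp + fn + tn + fp)

    # Matthews correlation coefficient; defined as 0 when any marginal is empty.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Illustrative counts only (hypothetical, not from the GlyStruct benchmark).
    for name, value in binary_metrics(tp=40, fn=10, tn=45, fp=5).items():
        print(f"{name}: {value:.4f}")
```

In a k-fold setting such as the 10-fold cross-validation mentioned above, these values are typically computed per fold and then averaged, although the abstract does not state which aggregation was used.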

journal_name

BMC Bioinformatics

journal_title

BMC bioinformatics

authors

Reddy HM,Sharma A,Dehzangi A,Shigemizu D,Chandra AA,Tsunoda T

doi

10.1186/s12859-018-2547-x

subject

Has Abstract

pub_date

2019-02-04 00:00:00

pages

547

issue

Suppl 13

issn

1471-2105

pii

10.1186/s12859-018-2547-x

journal_volume

19

pub_type

Journal Article
  • knnAUC: an open-source R package for detecting nonlinear dependence between one continuous variable and one binary variable.

    abstract:BACKGROUND:Testing the dependence of two variables is one of the fundamental tasks in statistics. In this work, we developed an open-source R package (knnAUC) for detecting nonlinear dependence between one continuous variable X and one binary dependent variable Y (0 or 1). RESULTS:We addressed this problem by using k...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-018-2427-4

    authors: Li Y,Liu X,Ma Y,Wang Y,Zhou W,Hao M,Yuan Z,Liu J,Xiong M,Shugart YY,Wang J,Jin L

    update_date:2018-11-22 00:00:00

  • Optimal neighborhood indexing for protein similarity search.

    abstract:BACKGROUND:Similarity inference, one of the main bioinformatics tasks, has to face an exponential growth of the biological data. A classical approach used to cope with this data flow involves heuristics with large seed indexes. In order to speed up this technique, the index can be enhanced by storing additional informa...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-534

    authors: Peterlongo P,Noé L,Lavenier D,Nguyen VH,Kucherov G,Giraud M

    update_date:2008-12-16 00:00:00

  • GObar: a gene ontology based analysis and visualization tool for gene sets.

    abstract:BACKGROUND:Microarray experiments, as well as other genomic analyses, often result in large gene sets containing up to several hundred genes. The biological significance of such sets of genes is, usually, not readily apparent. Identification of the functions of the genes in the set can help highlight features of intere...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-6-189

    authors: Lee JS,Katari G,Sachidanandam R

    update_date:2005-07-25 00:00:00

  • Time-course analysis of genome-wide gene expression data from hormone-responsive human breast cancer cells.

    abstract:BACKGROUND:Microarray experiments enable simultaneous measurement of the expression levels of virtually all transcripts present in cells, thereby providing a 'molecular picture' of the cell state. On the other hand, the genomic responses to a pharmacological or hormonal stimulus are dynamic molecular processes, where t...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-S2-S12

    authors: Mutarelli M,Cicatiello L,Ferraro L,Grober OM,Ravo M,Facchiano AM,Angelini C,Weisz A

    update_date:2008-03-26 00:00:00

  • Rule-based knowledge aggregation for large-scale protein sequence analysis of influenza A viruses.

    abstract:BACKGROUND:The explosive growth of biological data provides opportunities for new statistical and comparative analyses of large information sets, such as alignments comprising tens of thousands of sequences. In such studies, sequence annotations frequently play an essential role, and reliable results depend on metadata...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-S1-S7

    authors: Miotto O,Tan TW,Brusic V

    update_date:2008-01-01 00:00:00

  • An SVD-based comparison of nine whole eukaryotic genomes supports a coelomate rather than ecdysozoan lineage.

    abstract:BACKGROUND:Eukaryotic whole genome sequences are accumulating at an impressive rate. Effective methods for comparing multiple whole eukaryotic genomes on a large scale are needed. Most attempted solutions involve the production of large scale alignments, and many of these require a high stringency pre-screen for putati...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-5-204

    authors: Stuart GW,Berry MW

    update_date:2004-12-17 00:00:00

  • Assessment of the relationship between pre-chip and post-chip quality measures for Affymetrix GeneChip expression data.

    abstract:BACKGROUND:Gene expression microarray experiments are expensive to conduct and guidelines for acceptable quality control at intermediate steps before and after the samples are hybridised to chips are vague. We conducted an experiment hybridising RNA from human brain to 117 U133A Affymetrix GeneChips and used these data...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-7-211

    authors: Jones L,Goldstein DR,Hughes G,Strand AD,Collin F,Dunnett SB,Kooperberg C,Aragaki A,Olson JM,Augood SJ,Faull RL,Luthi-Carter R,Moskvina V,Hodges AK

    update_date:2006-04-19 00:00:00

  • TGF-beta signaling proteins and the Protein Ontology.

    abstract:BACKGROUND:The Protein Ontology (PRO) is designed as a formal and principled Open Biomedical Ontologies (OBO) Foundry ontology for proteins. The components of PRO extend from a classification of proteins on the basis of evolutionary relationships at the homeomorphic level to the representation of the multiple protein f...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-10-S5-S3

    authors: Arighi CN,Liu H,Natale DA,Barker WC,Drabkin H,Blake JA,Smith B,Wu CH

    update_date:2009-05-06 00:00:00

  • Mapping transcription mechanisms from multimodal genomic data.

    abstract:BACKGROUND:Identification of expression quantitative trait loci (eQTLs) is an emerging area in genomic study. The task requires an integrated analysis of genome-wide single nucleotide polymorphism (SNP) data and gene expression data, raising a new computational challenge due to the tremendous size of data. RESULTS:We ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-S9-S2

    authors: Chang HH,McGeachie M,Alterovitz G,Ramoni MF

    update_date:2010-10-28 00:00:00

  • Phylogenomics and sequence-structure-function relationships in the GmrSD family of Type IV restriction enzymes.

    abstract:BACKGROUND:GmrSD is a modification-dependent restriction endonuclease that specifically targets and cleaves glucosylated hydroxymethylcytosine (glc-HMC) modified DNA. It is encoded either as two separate single-domain GmrS and GmrD proteins or as a single protein carrying both domains. Previous studies suggested that G...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-015-0773-z

    authors: Machnicka MA,Kaminska KH,Dunin-Horkawicz S,Bujnicki JM

    update_date:2015-10-23 00:00:00

  • PuFFIN--a parameter-free method to build nucleosome maps from paired-end reads.

    abstract:BACKGROUND:We introduce a novel method, called PuFFIN, that takes advantage of paired-end short reads to build genome-wide nucleosome maps with larger numbers of detected nucleosomes and higher accuracy than existing tools. In contrast to other approaches that require users to optimize several parameters according to t...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-15-S9-S11

    authors: Polishko A,Bunnik EM,Le Roch KG,Lonardi S

    update_date:2014-01-01 00:00:00

  • NOXclass: prediction of protein-protein interaction types.

    abstract:BACKGROUND:Structural models determined by X-ray crystallography play a central role in understanding protein-protein interactions at the molecular level. Interpretation of these models requires the distinction between non-specific crystal packing contacts and biologically relevant interactions. This has been investiga...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-7-27

    authors: Zhu H,Domingues FS,Sommer I,Lengauer T

    update_date:2006-01-19 00:00:00

  • scDC: single cell differential composition analysis.

    abstract:BACKGROUND:Differences in cell-type composition across subjects and conditions often carry biological significance. Recent advancements in single cell sequencing technologies enable cell-types to be identified at the single cell level, and as a result, cell-type composition of tissues can now be studied in exquisite de...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-3211-9

    authors: Cao Y,Lin Y,Ormerod JT,Yang P,Yang JYH,Lo KK

    update_date:2019-12-24 00:00:00

  • Bayesian neural networks for detecting epistasis in genetic association studies.

    abstract:BACKGROUND:Discovering causal genetic variants from large genetic association studies poses many difficult challenges. Assessing which genetic markers are involved in determining trait status is a computationally demanding task, especially in the presence of gene-gene interactions. RESULTS:A non-parametric Bayesian ap...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-014-0368-0

    authors: Beam AL,Motsinger-Reif A,Doyle J

    update_date:2014-11-21 00:00:00

  • Comparison of methods to detect copy number alterations in cancer using simulated and real genotyping data.

    abstract:BACKGROUND:The detection of genomic copy number alterations (CNA) in cancer based on SNP arrays requires methods that take into account tumour specific factors such as normal cell contamination and tumour heterogeneity. A number of tools have been recently developed but their performance needs yet to be thoroughly asse...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-13-192

    authors: Mosén-Ansorena D,Aransay AM,Rodríguez-Ezpeleta N

    update_date:2012-08-07 00:00:00

  • Nanopore-based kinetics analysis of individual antibody-channel and antibody-antigen interactions.

    abstract:BACKGROUND:The UNO/RIC Nanopore Detector provides a new way to study the binding and conformational changes of individual antibodies. Many critical questions regarding antibody function are still unresolved, questions that can be approached in a new way with the nanopore detector. RESULTS:We present evidence that diff...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-8-S7-S20

    authors: Winters-Hilt S,Morales E,Amin I,Stoyanov A

    update_date:2007-11-01 00:00:00

  • Partition-based optimization model for generative anatomy modeling language (POM-GAML).

    abstract:BACKGROUND:This paper presents a novel approach for Generative Anatomy Modeling Language (GAML). This approach automatically detects the geometric partitions in 3D anatomy that in turn speeds up integrated non-linear optimization model in GAML for 3D anatomy modeling with constraints (e.g. joints). This integrated non-...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-2626-7

    authors: Demirel D,Cetinsaya B,Halic T,Kockara S,Ahmadi S

    update_date:2019-03-14 00:00:00

  • GOAL: a software tool for assessing biological significance of genes groups.

    abstract:BACKGROUND:Modern high throughput experimental techniques such as DNA microarrays often result in large lists of genes. Computational biology tools such as clustering are then used to group together genes based on their similarity in expression profiles. Genes in each group are probably functionally related. The functi...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-229

    authors: Tchagang AB,Gawronski A,Bérubé H,Phan S,Famili F,Pan Y

    update_date:2010-05-06 00:00:00

  • Prediction of MHC class I binding peptides, using SVMHC.

    abstract:BACKGROUND:T-cells are key players in regulating a specific immune response. Activation of cytotoxic T-cells requires recognition of specific peptides bound to Major Histocompatibility Complex (MHC) class I molecules. MHC-peptide complexes are potential tools for diagnosis and treatment of pathogens and cancer, as well...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-3-25

    authors: Dönnes P,Elofsson A

    update_date:2002-09-11 00:00:00

  • CellProfiler Tracer: exploring and validating high-throughput, time-lapse microscopy image data.

    abstract:BACKGROUND:Time-lapse analysis of cellular images is an important and growing need in biology. Algorithms for cell tracking are widely available; what researchers have been missing is a single open-source software package to visualize standard tracking output (from software like CellProfiler) in a way that allows conve...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-015-0759-x

    authors: Bray MA,Carpenter AE

    update_date:2015-11-04 00:00:00

  • Structator: fast index-based search for RNA sequence-structure patterns.

    abstract:BACKGROUND:The secondary structure of RNA molecules is intimately related to their function and often more conserved than the sequence. Hence, the important task of searching databases for RNAs requires to match sequence-structure patterns. Unfortunately, current tools for this task have, in the best case, a running ti...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-12-214

    authors: Meyer F,Kurtz S,Backofen R,Will S,Beckstette M

    update_date:2011-05-27 00:00:00

  • Is EC class predictable from reaction mechanism?

    abstract:BACKGROUND:We investigate the relationships between the EC (Enzyme Commission) class, the associated chemical reaction, and the reaction mechanism by building predictive models using Support Vector Machine (SVM), Random Forest (RF) and k-Nearest Neighbours (kNN). We consider two ways of encoding the reaction mechanism ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-13-60

    authors: Nath N,Mitchell JB

    update_date:2012-04-24 00:00:00

  • Comparison of public peak detection algorithms for MALDI mass spectrometry data analysis.

    abstract:BACKGROUND:In mass spectrometry (MS) based proteomic data analysis, peak detection is an essential step for subsequent analysis. Recently, there has been significant progress in the development of various peak detection algorithms. However, neither a comprehensive survey nor an experimental comparison of these algorith...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-10-4

    authors: Yang C,He Z,Yu W

    update_date:2009-01-06 00:00:00

  • Efficient error correction for next-generation sequencing of viral amplicons.

    abstract:BACKGROUND:Next-generation sequencing allows the analysis of an unprecedented number of viral sequence variants from infected patients, presenting a novel opportunity for understanding virus evolution, drug resistance and immune escape. However, sequencing in bulk is error prone. Thus, the generated data require error ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-13-S10-S6

    authors: Skums P,Dimitrova Z,Campo DS,Vaughan G,Rossi L,Forbi JC,Yokosawa J,Zelikovsky A,Khudyakov Y

    update_date:2012-06-25 00:00:00

  • Argot2: a large scale function prediction tool relying on semantic similarity of weighted Gene Ontology terms.

    abstract:BACKGROUND:Predicting protein function has become increasingly demanding in the era of next generation sequencing technology. The task to assign a curator-reviewed function to every single sequence is impracticable. Bioinformatics tools, easy to use and able to provide automatic and reliable annotations at a genomic sc...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-13-S4-S14

    authors: Falda M,Toppo S,Pescarolo A,Lavezzo E,Di Camillo B,Facchinetti A,Cilia E,Velasco R,Fontana P

    update_date:2012-03-28 00:00:00

  • ConEVA: a toolbox for comprehensive assessment of protein contacts.

    abstract:BACKGROUND:In recent years, successful contact prediction methods and contact-guided ab initio protein structure prediction methods have highlighted the importance of incorporating contact information into protein structure prediction methods. It is also observed that for almost all globular proteins, the quality of co...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-016-1404-z

    authors: Adhikari B,Nowotny J,Bhattacharya D,Hou J,Cheng J

    update_date:2016-12-07 00:00:00

  • metaBEETL: high-throughput analysis of heterogeneous microbial populations from shotgun DNA sequences.

    abstract:Environmental shotgun sequencing (ESS) has potential to give greater insight into microbial communities than targeted sequencing of 16S regions, but requires much higher sequence coverage. The advent of next-generation sequencing has made it feasible for the Human Microbiome Project and other initiatives to generate E...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-14-S5-S2

    authors: Ander C,Schulz-Trieglaff OB,Stoye J,Cox AJ

    update_date:2013-01-01 00:00:00

  • EGNAS: an exhaustive DNA sequence design algorithm.

    abstract:BACKGROUND:The molecular recognition based on the complementary base pairing of deoxyribonucleic acid (DNA) is the fundamental principle in the fields of genetics, DNA nanotechnology and DNA computing. We present an exhaustive DNA sequence design algorithm that allows to generate sets containing a maximum number of seq...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-13-138

    authors: Kick A,Bönsch M,Mertig M

    update_date:2012-06-20 00:00:00

  • Coverage statistics for sequence census methods.

    abstract:BACKGROUND:We study the statistical properties of fragment coverage in genome sequencing experiments. In an extension of the classic Lander-Waterman model, we consider the effect of the length distribution of fragments. We also introduce a coding of the shape of the coverage depth function as a tree and explain how thi...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-430

    authors: Evans SN,Hower V,Pachter L

    update_date:2010-08-18 00:00:00

  • Probe-specific mixed-model approach to detect copy number differences using multiplex ligation-dependent probe amplification (MLPA).

    abstract:BACKGROUND:MLPA method is a potentially useful semi-quantitative method to detect copy number alterations in targeted regions. In this paper, we propose a method for the normalization procedure based on a non-linear mixed-model, as well as a new approach for determining the statistical significance of altered probes ba...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-261

    authors: González JR,Carrasco JL,Armengol L,Villatoro S,Jover L,Yasui Y,Estivill X

    update_date:2008-06-04 00:00:00