Predicting MoRFs in protein sequences using HMM profiles.

Abstract:

BACKGROUND:Intrinsically Disordered Proteins (IDPs) lack an ordered three-dimensional structure and are enriched in various biological processes. The Molecular Recognition Features (MoRFs) are functional regions within IDPs that undergo a disorder-to-order transition on binding to a partner protein. Identifying MoRFs in IDPs using computational methods is a challenging task. METHODS:In this study, we introduce hidden Markov model (HMM) profiles to accurately identify the location of MoRFs in disordered protein sequences. Using windowing technique, HMM profiles are utilised to extract features from protein sequences and support vector machines (SVM) are used to calculate a propensity score for each residue. Two different SVM kernels with high noise tolerance are evaluated with a varying window size and the scores of the SVM models are combined to generate the final propensity score to predict MoRF residues. The SVM models are designed to extract maximal information between MoRF residues, its neighboring regions (Flanks) and the remainder of the sequence (Others). RESULTS:To evaluate the proposed method, its performance was compared to that of other MoRF predictors; MoRFpred and ANCHOR. The results show that the proposed method outperforms these two predictors. CONCLUSIONS:Using HMM profile as a source of feature extraction, the proposed method indicates improvement in predicting MoRFs in disordered protein sequences.

journal_name

BMC Bioinformatics

journal_title

BMC bioinformatics

authors

Sharma R,Kumar S,Tsunoda T,Patil A,Sharma A

doi

10.1186/s12859-016-1375-0

subject

Has Abstract

pub_date

2016-12-22 00:00:00

pages

504

issue

Suppl 19

issn

1471-2105

pii

10.1186/s12859-016-1375-0

journal_volume

17

pub_type

杂志文章
  • Efficient reconstruction of biological networks via transitive reduction on general purpose graphics processors.

    abstract:BACKGROUND:Techniques for reconstruction of biological networks which are based on perturbation experiments often predict direct interactions between nodes that do not exist. Transitive reduction removes such relations if they can be explained by an indirect path of influences. The existing algorithms for transitive re...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-13-281

    authors: Bošnački D,Odenbrett MR,Wijs A,Ligtenberg W,Hilbers P

    更新日期:2012-10-30 00:00:00

  • tacg--a grep for DNA.

    abstract:BACKGROUND:Pattern matching is the core of bioinformatics; it is used in database searching, restriction enzyme mapping, and finding open reading frames. It is done repeatedly over increasingly long sequences, thus codes must be efficient and insensitive to sequence length. Such patterns of interest include simple moti...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-3-8

    authors: Mangalam HJ

    更新日期:2002-01-01 00:00:00

  • Maximizing Kolmogorov Complexity for accurate and robust bright field cell segmentation.

    abstract:BACKGROUND:Analysis of cellular processes with microscopic bright field defocused imaging has the advantage of low phototoxicity and minimal sample preparation. However bright field images lack the contrast and nuclei reporting available with florescent approaches and therefore present a challenge to methods that segme...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-15-32

    authors: Mohamadlou H,Shope JC,Flann NS

    更新日期:2014-01-30 00:00:00

  • Finding sRNA generative locales from high-throughput sequencing data with NiBLS.

    abstract:BACKGROUND:Next-generation sequencing technologies allow researchers to obtain millions of sequence reads in a single experiment. One important use of the technology is the sequencing of small non-coding regulatory RNAs and the identification of the genomic locales from which they originate. Currently, there is a pauci...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-11-93

    authors: MacLean D,Moulton V,Studholme DJ

    更新日期:2010-02-18 00:00:00

  • The reactive metabolite target protein database (TPDB)--a web-accessible resource.

    abstract:BACKGROUND:The toxic effects of many simple organic compounds stem from their biotransformation to chemically reactive metabolites which bind covalently to cellular proteins. To understand the mechanisms of cytotoxic responses it may be important to know which proteins become adducted and whether some may be common tar...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-8-95

    authors: Hanzlik RP,Koen YM,Theertham B,Dong Y,Fang J

    更新日期:2007-03-16 00:00:00

  • A method of predicting changes in human gene splicing induced by genetic variants in context of cis-acting elements.

    abstract:BACKGROUND:Polymorphic variants and mutations disrupting canonical splicing isoforms are among the leading causes of human hereditary disorders. While there is a substantial evidence of aberrant splicing causing Mendelian diseases, the implication of such events in multi-genic disorders is yet to be well understood. We...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-11-22

    authors: Churbanov A,Vorechovský I,Hicks C

    更新日期:2010-01-12 00:00:00

  • Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments.

    abstract:BACKGROUND:High-throughput sequencing technologies, such as the Illumina Genome Analyzer, are powerful new tools for investigating a wide range of biological and medical questions. Statistical and computational methods are key for drawing meaningful and accurate conclusions from the massive and complex datasets generat...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-11-94

    authors: Bullard JH,Purdom E,Hansen KD,Dudoit S

    更新日期:2010-02-18 00:00:00

  • IILLS: predicting virus-receptor interactions based on similarity and semi-supervised learning.

    abstract:BACKGROUND:Viral infectious diseases are the serious threat for human health. The receptor-binding is the first step for the viral infection of hosts. To more effectively treat human viral infectious diseases, the hidden virus-receptor interactions must be discovered. However, current computational methods for predicti...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-019-3278-3

    authors: Yan C,Duan G,Wu FX,Wang J

    更新日期:2019-12-27 00:00:00

  • Relation extraction between bacteria and biotopes from biomedical texts with attention mechanisms and domain-specific contextual representations.

    abstract:BACKGROUND:The Bacteria Biotope (BB) task is a biomedical relation extraction (RE) that aims to study the interaction between bacteria and their locations. This task is considered to pertain to fundamental knowledge in applied microbiology. Some previous investigations conducted the study by applying feature-based mode...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-019-3217-3

    authors: Jettakul A,Wichadakul D,Vateekul P

    更新日期:2019-12-03 00:00:00

  • Efficient use of unlabeled data for protein sequence classification: a comparative study.

    abstract:BACKGROUND:Recent studies in computational primary protein sequence analysis have leveraged the power of unlabeled data. For example, predictive models based on string kernels trained on sequences known to belong to particular folds or superfamilies, the so-called labeled data set, can attain significantly improved acc...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-10-S4-S2

    authors: Kuksa P,Huang PH,Pavlovic V

    更新日期:2009-04-29 00:00:00

  • ProbPS: a new model for peak selection based on quantifying the dependence of the existence of derivative peaks on primary ion intensity.

    abstract:BACKGROUND:The analysis of mass spectra suggests that the existence of derivative peaks is strongly dependent on the intensity of the primary peaks. Peak selection from tandem mass spectrum is used to filter out noise and contaminant peaks. It is widely accepted that a valid primary peak tends to have high intensity an...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-12-346

    authors: Zhang S,Wang Y,Bu D,Zhang H,Sun S

    更新日期:2011-08-17 00:00:00

  • Intestinal microbiota domination under extreme selective pressures characterized by metagenomic read cloud sequencing and assembly.

    abstract:BACKGROUND:Low diversity of the gut microbiome, often progressing to the point of intestinal domination by a single species, has been linked to poor outcomes in patients undergoing hematopoietic cell transplantation (HCT). Our ability to understand how certain organisms attain intestinal domination over others has been...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-019-3073-1

    authors: Kang JB,Siranosian BA,Moss EL,Banaei N,Andermann TM,Bhatt AS

    更新日期:2019-12-02 00:00:00

  • An optimized TOPS+ comparison method for enhanced TOPS models.

    abstract:BACKGROUND:Although methods based on highly abstract descriptions of protein structures, such as VAST and TOPS, can perform very fast protein structure comparison, the results can lack a high degree of biological significance. Previously we have discussed the basic mechanisms of our novel method for structure compariso...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-11-138

    authors: Veeramalai M,Gilbert D,Valiente G

    更新日期:2010-03-17 00:00:00

  • BioIMAX: a Web 2.0 approach for easy exploratory and collaborative access to multivariate bioimage data.

    abstract:BACKGROUND:Innovations in biological and biomedical imaging produce complex high-content and multivariate image data. For decision-making and generation of hypotheses, scientists need novel information technology tools that enable them to visually explore and analyze the data and to discuss and communicate results or f...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-12-297

    authors: Loyek C,Rajpoot NM,Khan M,Nattkemper TW

    更新日期:2011-07-21 00:00:00

  • LAVA: an open-source approach to designing LAMP (loop-mediated isothermal amplification) DNA signatures.

    abstract:BACKGROUND:We developed an extendable open-source Loop-mediated isothermal AMPlification (LAMP) signature design program called LAVA (LAMP Assay Versatile Analysis). LAVA was created in response to limitations of existing LAMP signature programs. RESULTS:LAVA identifies combinations of six primer regions for basic LAM...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-12-240

    authors: Torres C,Vitalis EA,Baker BR,Gardner SN,Torres MW,Dzenitis JM

    更新日期:2011-06-16 00:00:00

  • A Bayesian data fusion based approach for learning genome-wide transcriptional regulatory networks.

    abstract:BACKGROUND:Reverse engineering of transcriptional regulatory networks (TRN) from genomics data has always represented a computational challenge in System Biology. The major issue is modeling the complex crosstalk among transcription factors (TFs) and their target genes, with a method able to handle both the high number...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-020-3510-1

    authors: Sauta E,Demartini A,Vitali F,Riva A,Bellazzi R

    更新日期:2020-05-29 00:00:00

  • PoGO: Prediction of Gene Ontology terms for fungal proteins.

    abstract:BACKGROUND:Automated protein function prediction methods are the only practical approach for assigning functions to genes obtained from model organisms. Many of the previously reported function annotation methods are of limited utility for fungal protein annotation. They are often trained only to one species, are not a...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-11-215

    authors: Jung J,Yi G,Sukno SA,Thon MR

    更新日期:2010-04-29 00:00:00

  • Using expression arrays for copy number detection: an example from E. coli.

    abstract:BACKGROUND:The sequencing of many genomes and tiling arrays consisting of millions of DNA segments spanning entire genomes have made high-resolution copy number analysis possible. Microarray-based comparative genomic hybridization (array CGH) has enabled the high-resolution detection of DNA copy number aberrations. Whi...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-8-203

    authors: Skvortsov D,Abdueva D,Stitzer ME,Finkel SE,Tavaré S

    更新日期:2007-06-14 00:00:00

  • Developing optimal input design strategies in cancer systems biology with applications to microfluidic device engineering.

    abstract:BACKGROUND:Mechanistic models are becoming more and more popular in Systems Biology; identification and control of models underlying biochemical pathways of interest in oncology is a primary goal in this field. Unfortunately the scarce availability of data still limits our understanding of the intrinsic characteristics...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-10-S12-S4

    authors: Menolascina F,Bellomo D,Maiwald T,Bevilacqua V,Ciminelli C,Paradiso A,Tommasi S

    更新日期:2009-10-15 00:00:00

  • Generating confidence intervals on biological networks.

    abstract:BACKGROUND:In the analysis of networks we frequently require the statistical significance of some network statistic, such as measures of similarity for the properties of interacting nodes. The structure of the network may introduce dependencies among the nodes and it will in general be necessary to account for these de...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-8-467

    authors: Thorne T,Stumpf MP

    更新日期:2007-11-30 00:00:00

  • pSLIP: SVM based protein subcellular localization prediction using multiple physicochemical properties.

    abstract:BACKGROUND:Protein subcellular localization is an important determinant of protein function and hence, reliable methods for prediction of localization are needed. A number of prediction algorithms have been developed based on amino acid compositions or on the N-terminal characteristics (signal peptides) of proteins. Ho...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-6-152

    authors: Sarda D,Chua GH,Li KB,Krishnan A

    更新日期:2005-06-17 00:00:00

  • Detection of transposable elements by their compositional bias.

    abstract:BACKGROUND:Transposable elements (TE) are mobile genetic entities present in nearly all genomes. Previous work has shown that TEs tend to have a different nucleotide composition than the host genes, either considering codon usage bias or dinucleotide frequencies. We show here how these compositional differences can be ...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-5-94

    authors: Andrieu O,Fiston AS,Anxolabéhère D,Quesneville H

    更新日期:2004-07-13 00:00:00

  • Correlation analysis reveals the emergence of coherence in the gene expression dynamics following system perturbation.

    abstract::Time course gene expression experiments are a popular means to infer co-expression. Many methods have been proposed to cluster genes or to build networks based on similarity measures of their expression dynamics. In this paper we apply a correlation based approach to network reconstruction to three datasets of time se...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-8-S1-S16

    authors: Neretti N,Remondini D,Tatar M,Sedivy JM,Pierini M,Mazzatti D,Powell J,Franceschi C,Castellani GC

    更新日期:2007-03-08 00:00:00

  • SSWAP: A Simple Semantic Web Architecture and Protocol for semantic web services.

    abstract:BACKGROUND:SSWAP (Simple Semantic Web Architecture and Protocol; pronounced "swap") is an architecture, protocol, and platform for using reasoning to semantically integrate heterogeneous disparate data and services on the web. SSWAP was developed as a hybrid semantic web services technology to overcome limitations foun...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-10-309

    authors: Gessler DD,Schiltz GS,May GD,Avraham S,Town CD,Grant D,Nelson RT

    更新日期:2009-09-23 00:00:00

  • Usability of human Infinium MethylationEPIC BeadChip for mouse DNA methylation studies.

    abstract:BACKGROUND:The advent of array-based genome-wide DNA methylation methods has enabled quantitative measurement of single CpG methylation status at relatively low cost and sample input. Whereas the use of Infinium Human Methylation BeadChips has shown great utility in clinical studies, no equivalent tool is available for...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-017-1870-y

    authors: Needhamsen M,Ewing E,Lund H,Gomez-Cabrero D,Harris RA,Kular L,Jagodic M

    更新日期:2017-11-15 00:00:00

  • A novel similarity-measure for the analysis of genetic data in complex phenotypes.

    abstract:BACKGROUND:Recent technological advances in DNA sequencing and genotyping have led to the accumulation of a remarkable quantity of data on genetic polymorphisms. However, the development of new statistical and computational tools for effective processing of these data has not been equally as fast. In particular, Machin...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-10-S6-S24

    authors: Lagani V,Montesanto A,Di Cianni F,Moreno V,Landi S,Conforti D,Rose G,Passarino G

    更新日期:2009-06-16 00:00:00

  • Survival models with preclustered gene groups as covariates.

    abstract:BACKGROUND:An important application of high dimensional gene expression measurements is the risk prediction and the interpretation of the variables in the resulting survival models. A major problem in this context is the typically large number of genes compared to the number of observations (individuals). Feature selec...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-12-478

    authors: Kammers K,Lang M,Hengstler JG,Schmidt M,Rahnenführer J

    更新日期:2011-12-16 00:00:00

  • Automatic design of decision-tree induction algorithms tailored to flexible-receptor docking data.

    abstract:BACKGROUND:This paper addresses the prediction of the free energy of binding of a drug candidate with enzyme InhA associated with Mycobacterium tuberculosis. This problem is found within rational drug design, where interactions between drug candidates and target proteins are verified through molecular docking simulatio...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-13-310

    authors: Barros RC,Winck AT,Machado KS,Basgalupp MP,de Carvalho AC,Ruiz DD,de Souza ON

    更新日期:2012-11-21 00:00:00

  • Prediction of specificity-determining residues for small-molecule kinase inhibitors.

    abstract:BACKGROUND:Designing small-molecule kinase inhibitors with desirable selectivity profiles is a major challenge in drug discovery. A high-throughput screen for inhibitors of a given kinase will typically yield many compounds that inhibit more than one kinase. A series of chemical modifications are usually required befor...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-9-491

    authors: Caffrey DR,Lunney EA,Moshinsky DJ

    更新日期:2008-11-25 00:00:00

  • Measure of synonymous codon usage diversity among genes in bacteria.

    abstract:BACKGROUND:In many bacteria, intragenomic diversity in synonymous codon usage among genes has been reported. However, no quantitative attempt has been made to compare the diversity levels among different genomes. Here, we introduce a mean dissimilarity-based index (Dmean) for quantifying the level of diversity in synon...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-10-167

    authors: Suzuki H,Saito R,Tomita M

    更新日期:2009-06-01 00:00:00