Pripper: prediction of caspase cleavage sites from whole proteomes.

Abstract:

BACKGROUND:Caspases are a family of proteases that have central functions in programmed cell death (apoptosis) and inflammation. Caspases mediate their effects through aspartate-specific cleavage of their target proteins, and at present almost 400 caspase substrates are known. There are several methods developed to predict caspase cleavage sites from individual proteins, but currently none of them can be used to predict caspase cleavage sites from multiple proteins or entire proteomes, or to use several classifiers in combination. The possibility to create a database from predicted caspase cleavage products for the whole genome could significantly aid in identifying novel caspase targets from tandem mass spectrometry based proteomic experiments. RESULTS:Three different pattern recognition classifiers were developed for predicting caspase cleavage sites from protein sequences. Evaluation of the classifiers with quality measures indicated that all of the three classifiers performed well in predicting caspase cleavage sites, and when combining different classifiers the accuracy increased further. A new tool, Pripper, was developed to utilize the classifiers and predict the caspase cut sites from an arbitrary number of input sequences. A database was constructed with the developed tool, and it was used to identify caspase target proteins from tandem mass spectrometry data from two different proteomic experiments. Both known caspase cleavage products as well as novel cleavage products were identified using the database demonstrating the usefulness of the tool. Pripper is not restricted to predicting only caspase cut sites, but it gives the possibility to scan protein sequences for any given motif(s) and predict cut sites once a suitable cut site prediction model for any other protease has been developed. Pripper is freely available and can be downloaded from http://users.utu.fi/mijopi/Pripper. CONCLUSIONS:We have developed Pripper, a tool for reading an arbitrary number of proteins in FASTA format, predicting their caspase cleavage sites and outputting the cleaved sequences to a new FASTA format sequence file. We show that Pripper is a valuable tool in identifying novel caspase target proteins from modern proteomics experiments.

journal_name

BMC Bioinformatics

journal_title

BMC bioinformatics

authors

Piippo M,Lietzén N,Nevalainen OS,Salmi J,Nyman TA

doi

10.1186/1471-2105-11-320

subject

Has Abstract

pub_date

2010-06-15 00:00:00

pages

320

issn

1471-2105

pii

1471-2105-11-320

journal_volume

11

pub_type

杂志文章
  • Application of protein structure alignments to iterated hidden Markov model protocols for structure prediction.

    abstract:BACKGROUND:One of the most powerful methods for the prediction of protein structure from sequence information alone is the iterative construction of profile-type models. Because profiles are built from sequence alignments, the sequences included in the alignment and the method used to align them will be important to th...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-7-410

    authors: Scheeff ED,Bourne PE

    更新日期:2006-09-14 00:00:00

  • SegCorr a statistical procedure for the detection of genomic regions of correlated expression.

    abstract:BACKGROUND:Detecting local correlations in expression between neighboring genes along the genome has proved to be an effective strategy to identify possible causes of transcriptional deregulation in cancer. It has been successfully used to illustrate the role of mechanisms such as copy number variation (CNV) or epigene...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-017-1742-5

    authors: Delatola EI,Lebarbier E,Mary-Huard T,Radvanyi F,Robin S,Wong J

    更新日期:2017-07-11 00:00:00

  • Attenuating dependence on structural data in computing protein energy landscapes.

    abstract:BACKGROUND:Nearly all cellular processes involve proteins structurally rearranging to accommodate molecular partners. The energy landscape underscores the inherent nature of proteins as dynamic molecules interconverting between structures with varying energies. In principle, reconstructing a protein's energy landscape ...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-019-2822-5

    authors: Morris D,Maximova T,Plaku E,Shehu A

    更新日期:2019-06-06 00:00:00

  • Stepwise kinetic equilibrium models of quantitative polymerase chain reaction.

    abstract:BACKGROUND:Numerous models for use in interpreting quantitative PCR (qPCR) data are present in recent literature. The most commonly used models assume the amplification in qPCR is exponential and fit an exponential model with a constant rate of increase to a select part of the curve. Kinetic theory may be used to model...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-13-203

    authors: Cobbs G

    更新日期:2012-08-16 00:00:00

  • Simultaneous fitting of real-time PCR data with efficiency of amplification modeled as Gaussian function of target fluorescence.

    abstract:BACKGROUND:In real-time PCR, it is necessary to consider the efficiency of amplification (EA) of amplicons in order to determine initial target levels properly. EAs can be deduced from standard curves, but these involve extra effort and cost and may yield invalid EAs. Alternatively, EA can be extracted from individual ...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-9-95

    authors: Batsch A,Noetel A,Fork C,Urban A,Lazic D,Lucas T,Pietsch J,Lazar A,Schömig E,Gründemann D

    更新日期:2008-02-12 00:00:00

  • GOPET: a tool for automated predictions of Gene Ontology terms.

    abstract:BACKGROUND:Vast progress in sequencing projects has called for annotation on a large scale. A Number of methods have been developed to address this challenging task. These methods, however, either apply to specific subsets, or their predictions are not formalised, or they do not provide precise confidence values for th...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-7-161

    authors: Vinayagam A,del Val C,Schubert F,Eils R,Glatting KH,Suhai S,König R

    更新日期:2006-03-20 00:00:00

  • The GMOseek matrix: a decision support tool for optimizing the detection of genetically modified plants.

    abstract:BACKGROUND:Since their first commercialization, the diversity of taxa and the genetic composition of transgene sequences in genetically modified plants (GMOs) are constantly increasing. To date, the detection of GMOs and derived products is commonly performed by PCR-based methods targeting specific DNA sequences introd...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-14-256

    authors: Block A,Debode F,Grohmann L,Hulin J,Taverniers I,Kluga L,Barbau-Piednoir E,Broeders S,Huber I,Van den Bulcke M,Heinze P,Berben G,Busch U,Roosens N,Janssen E,Žel J,Gruden K,Morisset D

    更新日期:2013-08-22 00:00:00

  • Googling DNA sequences on the World Wide Web.

    abstract:BACKGROUND:New web-based technologies provide an excellent opportunity for sharing and accessing information and using web as a platform for interaction and collaboration. Although several specialized tools are available for analyzing DNA sequence information, conventional web-based tools have not been utilized for bio...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-10-S14-S4

    authors: Hajibabaei M,Singer GA

    更新日期:2009-11-10 00:00:00

  • FITBAR: a web tool for the robust prediction of prokaryotic regulons.

    abstract:BACKGROUND:The binding of regulatory proteins to their specific DNA targets determines the accurate expression of the neighboring genes. The in silico prediction of new binding sites in completely sequenced genomes is a key aspect in the deeper understanding of gene regulatory networks. Several algorithms have been des...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-11-554

    authors: Oberto J

    更新日期:2010-11-11 00:00:00

  • Reranking candidate gene models with cross-species comparison for improved gene prediction.

    abstract:BACKGROUND:Most gene finders score candidate gene models with state-based methods, typically HMMs, by combining local properties (coding potential, splice donor and acceptor patterns, etc). Competing models with similar state-based scores may be distinguishable with additional information. In particular, functional and...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-9-433

    authors: Liu Q,Crammer K,Pereira FC,Roos DS

    更新日期:2008-10-14 00:00:00

  • Constructing a meaningful evolutionary average at the phylogenetic center of mass.

    abstract:BACKGROUND:As a consequence of the evolutionary process, data collected from related species tend to be similar. This similarity by descent can obscure subtler signals in the data such as the evidence of constraint on variation due to shared selective pressures. In comparative sequence analysis, for example, sequence s...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-8-222

    authors: Stone EA,Sidow A

    更新日期:2007-06-26 00:00:00

  • Discrimination of cell cycle phases in PCNA-immunolabeled cells.

    abstract:BACKGROUND:Protein function in eukaryotic cells is often controlled in a cell cycle-dependent manner. Therefore, the correct assignment of cellular phenotypes to cell cycle phases is a crucial task in cell biology research. Nuclear proteins whose localization varies during the cell cycle are valuable and frequently use...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-015-0618-9

    authors: Schönenberger F,Deutzmann A,Ferrando-May E,Merhof D

    更新日期:2015-05-29 00:00:00

  • Variable cellular decision-making behavior in a constant synthetic network topology.

    abstract:BACKGROUND:Modules of interacting components arranged in specific network topologies have evolved to perform a diverse array of cellular functions. For a network with a constant topological structure, its function within a cell may still be tuned by changing the number of instances of a particular component (e.g., gene...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-019-2866-6

    authors: Shah NA,Sarkar CA

    更新日期:2019-05-14 00:00:00

  • CONFOLD2: improved contact-driven ab initio protein structure modeling.

    abstract:BACKGROUND:Contact-guided protein structure prediction methods are becoming more and more successful because of the latest advances in residue-residue contact prediction. To support contact-driven structure prediction, effective tools that can quickly build tertiary structural models of good quality from predicted cont...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-018-2032-6

    authors: Adhikari B,Cheng J

    更新日期:2018-01-25 00:00:00

  • DeepCryoPicker: fully automated deep neural network for single protein particle picking in cryo-EM.

    abstract:BACKGROUND:Cryo-electron microscopy (Cryo-EM) is widely used in the determination of the three-dimensional (3D) structures of macromolecules. Particle picking from 2D micrographs remains a challenging early step in the Cryo-EM pipeline due to the diversity of particle shapes and the extremely low signal-to-noise ratio ...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-020-03809-7

    authors: Al-Azzawi A,Ouadou A,Max H,Duan Y,Tanner JJ,Cheng J

    更新日期:2020-11-09 00:00:00

  • SeqVISTA: a graphical tool for sequence feature visualization and comparison.

    abstract:BACKGROUND:Many readers will sympathize with the following story. You are viewing a gene sequence in Entrez, and you want to find whether it contains a particular sequence motif. You reach for the browser's "find in page" button, but those darn spaces every 10 bp get in the way. And what if the motif is on the opposite...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-4-1

    authors: Hu Z,Frith M,Niu T,Weng Z

    更新日期:2003-01-04 00:00:00

  • A note on generalized Genome Scan Meta-Analysis statistics.

    abstract:BACKGROUND:Wise et al. introduced a rank-based statistical technique for meta-analysis of genome scans, the Genome Scan Meta-Analysis (GSMA) method. Levinson et al. recently described two generalizations of the GSMA statistic: (i) a weighted version of the GSMA statistic, so that different studies could be ascribed dif...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-6-32

    authors: Koziol JA,Feng AC

    更新日期:2005-02-17 00:00:00

  • Multi-label literature classification based on the Gene Ontology graph.

    abstract:BACKGROUND:The Gene Ontology is a controlled vocabulary for representing knowledge related to genes and proteins in a computable form. The current effort of manually annotating proteins with the Gene Ontology is outpaced by the rate of accumulation of biomedical knowledge in literature, which urges the development of t...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-9-525

    authors: Jin B,Muller B,Zhai C,Lu X

    更新日期:2008-12-08 00:00:00

  • Identification of properties important to protein aggregation using feature selection.

    abstract:BACKGROUND:Protein aggregation is a significant problem in the biopharmaceutical industry (protein drug stability) and is associated medically with over 40 human diseases. Although a number of computational models have been developed for predicting aggregation propensity and identifying aggregation-prone regions in pro...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-14-314

    authors: Fang Y,Gao S,Tai D,Middaugh CR,Fang J

    更新日期:2013-10-28 00:00:00

  • TCGA2BED: extracting, extending, integrating, and querying The Cancer Genome Atlas.

    abstract:BACKGROUND:Data extraction and integration methods are becoming essential to effectively access and take advantage of the huge amounts of heterogeneous genomics and clinical data increasingly available. In this work, we focus on The Cancer Genome Atlas, a comprehensive archive of tumoral data containing the results of ...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-016-1419-5

    authors: Cumbo F,Fiscon G,Ceri S,Masseroli M,Weitschek E

    更新日期:2017-01-03 00:00:00

  • LSX: automated reduction of gene-specific lineage evolutionary rate heterogeneity for multi-gene phylogeny inference.

    abstract:BACKGROUND:Lineage rate heterogeneity can be a major source of bias, especially in multi-gene phylogeny inference. We had previously tackled this issue by developing LS3, a data subselection algorithm that, by removing fast-evolving sequences in a gene-specific manner, identifies subsets of sequences that evolve at a r...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-019-3020-1

    authors: Rivera-Rivera CJ,Montoya-Burgos JI

    更新日期:2019-08-13 00:00:00

  • Sigma-RF: prediction of the variability of spatial restraints in template-based modeling by random forest.

    abstract:BACKGROUND:In template-based modeling when using a single template, inter-atomic distances of an unknown protein structure are assumed to be distributed by Gaussian probability density functions, whose center peaks are located at the distances between corresponding atoms in the template structure. The width of the Gaus...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-015-0526-z

    authors: Lee J,Lee K,Joung I,Joo K,Brooks BR,Lee J

    更新日期:2015-03-21 00:00:00

  • BIOZON: a system for unification, management and analysis of heterogeneous biological data.

    abstract:BACKGROUND:Integration of heterogeneous data types is a challenging problem, especially in biology, where the number of databases and data types increase rapidly. Amongst the problems that one has to face are integrity, consistency, redundancy, connectivity, expressiveness and updatability. DESCRIPTION:Here we present...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-7-70

    authors: Birkland A,Yona G

    更新日期:2006-02-15 00:00:00

  • The PowerAtlas: a power and sample size atlas for microarray experimental design and research.

    abstract:BACKGROUND:Microarrays permit biologists to simultaneously measure the mRNA abundance of thousands of genes. An important issue facing investigators planning microarray experiments is how to estimate the sample size required for good statistical power. What is the projected sample size or number of replicate chips need...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-7-84

    authors: Page GP,Edwards JW,Gadbury GL,Yelisetti P,Wang J,Trivedi P,Allison DB

    更新日期:2006-02-22 00:00:00

  • Moiety modeling framework for deriving moiety abundances from mass spectrometry measured isotopologues.

    abstract:BACKGROUND:Stable isotope tracing can follow individual atoms through metabolic transformations through the detection of the incorporation of stable isotope within metabolites. This resulting data can be interpreted in terms related to metabolic flux. However, detection of a stable isotope in metabolites by mass spectr...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-019-3096-7

    authors: Jin H,Moseley HNB

    更新日期:2019-10-28 00:00:00

  • Species-specific analysis of protein sequence motifs using mutual information.

    abstract:BACKGROUND:Protein sequence motifs are by definition short fragments of conserved amino acids, often associated with a specific function. Accordingly protein sequence profiles derived from multiple sequence alignments provide an alternative description of functional motifs characterizing families of related sequences. ...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-6-164

    authors: Hummel J,Keshvari N,Weckwerth W,Selbig J

    更新日期:2005-06-29 00:00:00

  • Ab-origin: an enhanced tool to identify the sourcing gene segments in germline for rearranged antibodies.

    abstract:BACKGROUND:In the adaptive immune system, variable regions of immunoglobulin (IG) are encoded by random recombination of variable (V), diversity (D), and joining (J) gene segments in the germline. Partitioning the functional antibody sequences to their sourcing germline gene segments is vital not only for understanding...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-9-S12-S20

    authors: Wang X,Wu D,Zheng S,Sun J,Tao L,Li Y,Cao Z

    更新日期:2008-12-12 00:00:00

  • An automated framework for understanding structural variations in the binding grooves of MHC class II molecules.

    abstract:BACKGROUND:MHC/HLA class II molecules are important components of the immune system and play a critical role in processes such as phagocytosis. Understanding peptide recognition properties of the hundreds of MHC class II alleles is essential to appreciate determinants of antigenicity and ultimately to predict epitopes....

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-11-S1-S55

    authors: Yeturu K,Utriainen T,Kemp GJ,Chandra N

    更新日期:2010-01-18 00:00:00

  • New mini- zincin structures provide a minimal scaffold for members of this metallopeptidase superfamily.

    abstract:BACKGROUND:The Acel_2062 protein from Acidothermus cellulolyticus is a protein of unknown function. Initial sequence analysis predicted that it was a metallopeptidase from the presence of a motif conserved amongst the Asp-zincins, which are peptidases that contain a single, catalytic zinc ion ligated by the histidines ...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-15-1

    authors: Trame CB,Chang Y,Axelrod HL,Eberhardt RY,Coggill P,Punta M,Rawlings ND

    更新日期:2014-01-03 00:00:00

  • Scoredist: a simple and robust protein sequence distance estimator.

    abstract:BACKGROUND:Distance-based methods are popular for reconstructing evolutionary trees thanks to their speed and generality. A number of methods exist for estimating distances from sequence alignments, which often involves some sort of correction for multiple substitutions. The problem is to accurately estimate the number...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-6-108

    authors: Sonnhammer EL,Hollich V

    更新日期:2005-04-27 00:00:00