Pairwise protein expression classifier for candidate biomarker discovery for early detection of human disease prognosis.

Abstract:

BACKGROUND:An approach to molecular classification based on the comparative expression of protein pairs is presented. The method overcomes some of the present limitations in using peptide intensity data for class prediction for problems such as the detection of a disease, disease prognosis, or for predicting treatment response. Data analysis is particularly challenging in these situations due to sample size (typically tens) being much smaller than the large number of peptides (typically thousands). Methods based upon high dimensional statistical models, machine learning or other complex classifiers generate decisions which may be very accurate but can be complex and difficult to interpret in simple or biologically meaningful terms. A classification scheme, called ProtPair, is presented that generates simple decision rules leading to accurate classification which is based on measurement of very few proteins and requires only relative expression values, providing specific targeted hypotheses suitable for straightforward validation. RESULTS:ProtPair has been tested against clinical data from 21 patients following a bone marrow transplant, 13 of which progress to idiopathic pneumonia syndrome (IPS). The approach combines multiple peptide pairs originating from the same set of proteins, with each unique peptide pair providing an independent measure of discriminatory power. The prediction rate of the ProtPair for IPS study as measured by leave-one-out CV is 69.1%, which can be very beneficial for clinical diagnosis as it may flag patients in need of closer monitoring. The "top ranked" proteins provided by ProtPair are known to be associated with the biological processes and pathways intimately associated with known IPS biology based on mouse models. CONCLUSIONS:An approach to biomarker discovery, called ProtPair, is presented. ProtPair is based on the differential expression of pairs of peptides and the associated proteins. Using mass spectrometry data from "bottom up" proteomics methods, functionally related proteins/peptide pairs exhibiting co-ordinated changes expression profile are discovered, which represent a signature for patients progressing to various disease conditions. The method has been tested against clinical data from patients progressing to idiopthatic pneumonia syndrome (IPS) following a bone marrow transplant. The data indicates that patients with improper regulation in the concentration of specific acute phase response proteins at the time of bone marrow transplant are highly likely to develop IPS within few weeks. The results lead to a specific set of protein pairs that can be efficiently verified by investigating the pairwise abundance change in independent cohorts using ELISA or targeted mass spectrometry techniques. This generalized classifier can be extended to other clinical problems in a variety of contexts.

journal_name

BMC Bioinformatics

journal_title

BMC bioinformatics

authors

Kaur P,Schlatzer D,Cooke K,Chance MR

doi

10.1186/1471-2105-13-191

subject

Has Abstract

pub_date

2012-08-07 00:00:00

pages

191

issn

1471-2105

pii

1471-2105-13-191

journal_volume

13

pub_type

杂志文章
  • Predikin and PredikinDB: a computational framework for the prediction of protein kinase peptide specificity and an associated database of phosphorylation sites.

    abstract:BACKGROUND:We have previously described an approach to predicting the substrate specificity of serine-threonine protein kinases. The method, named Predikin, identifies key conserved substrate-determining residues in the kinase catalytic domain that contact the substrate in the region of the phosphorylation site and so ...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-9-245

    authors: Saunders NF,Brinkworth RI,Huber T,Kemp BE,Kobe B

    更新日期:2008-05-26 00:00:00

  • Biomedical word sense disambiguation with ontologies and metadata: automation meets accuracy.

    abstract:BACKGROUND:Ontology term labels can be ambiguous and have multiple senses. While this is no problem for human annotators, it is a challenge to automated methods, which identify ontology terms in text. Classical approaches to word sense disambiguation use co-occurring words or terms. However, most treat ontologies as si...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-10-28

    authors: Alexopoulou D,Andreopoulos B,Dietze H,Doms A,Gandon F,Hakenberg J,Khelif K,Schroeder M,Wächter T

    更新日期:2009-01-21 00:00:00

  • Towards barcode markers in Fungi: an intron map of Ascomycota mitochondria.

    abstract:BACKGROUND:A standardized and cost-effective molecular identification system is now an urgent need for Fungi owing to their wide involvement in human life quality. In particular the potential use of mitochondrial DNA species markers has been taken in account. Unfortunately, a serious difficulty in the PCR and bioinform...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-10-S6-S15

    authors: Santamaria M,Vicario S,Pappadà G,Scioscia G,Scazzocchio C,Saccone C

    更新日期:2009-06-16 00:00:00

  • A computational approach for detecting peptidases and their specific inhibitors at the genome level.

    abstract:BACKGROUND:Peptidases are proteolytic enzymes responsible for fundamental cellular activities in all organisms. Apparently about 2-5% of the genes encode for peptidases, irrespectively of the organism source. The basic peptidase function is "protein digestion" and this can be potentially dangerous in living organisms w...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-8-S1-S3

    authors: Bartoli L,Calabrese R,Fariselli P,Mita DG,Casadio R

    更新日期:2007-03-08 00:00:00

  • A compartmentalized approach to the assembly of physical maps.

    abstract:BACKGROUND:Physical maps have been historically one of the cornerstones of genome sequencing and map-based cloning strategies. They also support marker assisted breeding and EST mapping. The problem of building a high quality physical map is computationally challenging due to unavoidable noise in the input fingerprint ...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-10-217

    authors: Bozdag S,Close TJ,Lonardi S

    更新日期:2009-07-15 00:00:00

  • Finite mixture clustering of human tissues with different levels of IGF-1 splice variants mRNA transcripts.

    abstract:BACKGROUND:This study addresses a recurrent biological problem, that is to define a formal clustering structure for a set of tissues on the basis of the relative abundance of multiple alternatively spliced isoforms mRNAs generated by the same gene. To this aim, we have used a model-based clustering approach, based on a...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-015-0689-7

    authors: Pelosi M,Alfò M,Martella F,Pappalardo E,Musarò A

    更新日期:2015-09-15 00:00:00

  • An integrative method to normalize RNA-Seq data.

    abstract:BACKGROUND:Transcriptome sequencing is a powerful tool for measuring gene expression, but as well as some other technologies, various artifacts and biases affect the quantification. In order to correct some of them, several normalization approaches have emerged, differing both in the statistical strategy employed and i...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-15-188

    authors: Filloux C,Cédric M,Romain P,Lionel F,Christophe K,Dominique R,Abderrahman M,Daniel P

    更新日期:2014-06-14 00:00:00

  • Analysis and prediction of antibacterial peptides.

    abstract:BACKGROUND:Antibacterial peptides are important components of the innate immune system, used by the host to protect itself from different types of pathogenic bacteria. Over the last few decades, the search for new drugs and drug targets has prompted an interest in these antibacterial peptides. We analyzed 486 antibacte...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-8-263

    authors: Lata S,Sharma BK,Raghava GP

    更新日期:2007-07-23 00:00:00

  • Discovering motifs that induce sequencing errors.

    abstract:BACKGROUND:Elevated sequencing error rates are the most predominant obstacle in single-nucleotide polymorphism (SNP) detection, which is a major goal in the bulk of current studies using next-generation sequencing (NGS). Beyond routinely handled generic sources of errors, certain base calling errors relate to specific ...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-14-S5-S1

    authors: Allhoff M,Schönhuth A,Martin M,Costa IG,Rahmann S,Marschall T

    更新日期:2013-01-01 00:00:00

  • A stochastic context free grammar based framework for analysis of protein sequences.

    abstract:BACKGROUND:In the last decade, there have been many applications of formal language theory in bioinformatics such as RNA structure prediction and detection of patterns in DNA. However, in the field of proteomics, the size of the protein alphabet and the complexity of relationship between amino acids have mainly limited...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-10-323

    authors: Dyrka W,Nebel JC

    更新日期:2009-10-08 00:00:00

  • Odyssey: a semi-automated pipeline for phasing, imputation, and analysis of genome-wide genetic data.

    abstract:BACKGROUND:Genome imputation, admixture resolution and genome-wide association analyses are timely and computationally intensive processes with many composite and requisite steps. Analysis time increases further when building and installing the run programs required for these analyses. For scientists that may not be as...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-019-2964-5

    authors: Eller RJ,Janga SC,Walsh S

    更新日期:2019-06-28 00:00:00

  • Methodology capture: discriminating between the "best" and the rest of community practice.

    abstract:BACKGROUND:The methodologies we use both enable and help define our research. However, as experimental complexity has increased the choice of appropriate methodologies has become an increasingly difficult task. This makes it difficult to keep track of available bioinformatics software, let alone the most suitable proto...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-9-359

    authors: Eales JM,Pinney JW,Stevens RD,Robertson DL

    更新日期:2008-09-01 00:00:00

  • Model based analysis of real-time PCR data from DNA binding dye protocols.

    abstract:BACKGROUND:Reverse transcription followed by real-time PCR is widely used for quantification of specific mRNA, and with the use of double-stranded DNA binding dyes it is becoming a standard for microarray data validation. Despite the kinetic information generated by real-time PCR, most popular analysis methods assume c...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-8-85

    authors: Alvarez MJ,Vila-Ortiz GJ,Salibe MC,Podhajcer OL,Pitossi FJ

    更新日期:2007-03-09 00:00:00

  • Visualizing complex feature interactions and feature sharing in genomic deep neural networks.

    abstract:BACKGROUND:Visualization tools for deep learning models typically focus on discovering key input features without considering how such low level features are combined in intermediate layers to make decisions. Moreover, many of these methods examine a network's response to specific input examples that may be insufficien...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-019-2957-4

    authors: Liu G,Zeng H,Gifford DK

    更新日期:2019-07-19 00:00:00

  • Advances in translational bioinformatics facilitate revealing the landscape of complex disease mechanisms.

    abstract::Advances of high-throughput technologies have rapidly produced more and more data from DNAs and RNAs to proteins, especially large volumes of genome-scale data. However, connection of the genomic information to cellular functions and biological behaviours relies on the development of effective approaches at higher sys...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-15-S17-I1

    authors: Yang JY,Dunker A,Liu JS,Qin X,Arabnia HR,Yang W,Niemierko A,Chen Z,Luo Z,Wang L,Liu Y,Xu D,Deng Y,Tong W,Yang M

    更新日期:2014-01-01 00:00:00

  • Partial correlation analysis indicates causal relationships between GC-content, exon density and recombination rate in the human genome.

    abstract:BACKGROUND:Several features are known to correlate with the GC-content in the human genome, including recombination rate, gene density and distance to telomere. However, by testing for pairwise correlation only, it is impossible to distinguish direct associations from indirect ones and to distinguish between causes and...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-10-S1-S66

    authors: Freudenberg J,Wang M,Yang Y,Li W

    更新日期:2009-01-30 00:00:00

  • Survival models with preclustered gene groups as covariates.

    abstract:BACKGROUND:An important application of high dimensional gene expression measurements is the risk prediction and the interpretation of the variables in the resulting survival models. A major problem in this context is the typically large number of genes compared to the number of observations (individuals). Feature selec...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-12-478

    authors: Kammers K,Lang M,Hengstler JG,Schmidt M,Rahnenführer J

    更新日期:2011-12-16 00:00:00

  • Computational analysis of gene expression space associated with metastatic cancer.

    abstract:BACKGROUND:Prostate carcinoma is among the most common types of cancer affecting hundreds of thousands people every year. Once the metastatic form of prostate carcinoma is documented, the majority of patients die from their tumors as opposed to other causes. The key to successful treatment is in the earliest possible d...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-10-S11-S6

    authors: Ptitsyn A

    更新日期:2009-10-08 00:00:00

  • Sample size calculation while controlling false discovery rate for differential expression analysis with RNA-sequencing experiments.

    abstract:BACKGROUND:RNA-Sequencing (RNA-seq) experiments have been popularly applied to transcriptome studies in recent years. Such experiments are still relatively costly. As a result, RNA-seq experiments often employ a small number of replicates. Power analysis and sample size calculation are challenging in the context of dif...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-016-0994-9

    authors: Bi R,Liu P

    更新日期:2016-03-31 00:00:00

  • proTRAC--a software for probabilistic piRNA cluster detection, visualization and analysis.

    abstract:BACKGROUND:Throughout the metazoan lineage, typically gonadal expressed Piwi proteins and their guiding piRNAs (~26-32nt in length) form a protective mechanism of RNA interference directed against the propagation of transposable elements (TEs). Most piRNAs are generated from genomic piRNA clusters. Annotation of experi...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-13-5

    authors: Rosenkranz D,Zischler H

    更新日期:2012-01-10 00:00:00

  • Molecular evolution of dihydrouridine synthases.

    abstract:BACKGROUND:Dihydrouridine (D) is a modified base found in conserved positions in the D-loop of tRNA in Bacteria, Eukaryota, and some Archaea. Despite the abundant occurrence of D, little is known about its biochemical roles in mediating tRNA function. It is assumed that D may destabilize the structure of tRNA and thus ...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-13-153

    authors: Kasprzak JM,Czerwoniec A,Bujnicki JM

    更新日期:2012-06-28 00:00:00

  • Bioinformatics Resource Manager: a systems biology web tool for microRNA and omics data integration.

    abstract:BACKGROUND:The Bioinformatics Resource Manager (BRM) is a web-based tool developed to facilitate identifier conversion and data integration for Homo sapiens (human), Mus musculus (mouse), Rattus norvegicus (rat), Danio rerio (zebrafish), and Macaca mulatta (macaque), as well as perform orthologous conversions among the...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-019-2805-6

    authors: Brown J,Phillips AR,Lewis DA,Mans MA,Chang Y,Tanguay RL,Peterson ES,Waters KM,Tilton SC

    更新日期:2019-05-17 00:00:00

  • μHEM for identification of differentially expressed miRNAs using hypercuboid equivalence partition matrix.

    abstract:BACKGROUND:The miRNAs, a class of short approximately 22-nucleotide non-coding RNAs, often act post-transcriptionally to inhibit mRNA expression. In effect, they control gene expression by targeting mRNA. They also help in carrying out normal functioning of a cell as they play an important role in various cellular proc...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-14-266

    authors: Paul S,Maji P

    更新日期:2013-09-04 00:00:00

  • Visualization methods for statistical analysis of microarray clusters.

    abstract:BACKGROUND:The most common method of identifying groups of functionally related genes in microarray data is to apply a clustering algorithm. However, it is impossible to determine which clustering algorithm is most appropriate to apply, and it is difficult to verify the results of any algorithm due to the lack of a gol...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-6-115

    authors: Hibbs MA,Dirksen NC,Li K,Troyanskaya OG

    更新日期:2005-05-12 00:00:00

  • Prioritization, clustering and functional annotation of MicroRNAs using latent semantic indexing of MEDLINE abstracts.

    abstract:BACKGROUND:The amount of scientific information about MicroRNAs (miRNAs) is growing exponentially, making it difficult for researchers to interpret experimental results. In this study, we present an automated text mining approach using Latent Semantic Indexing (LSI) for prioritization, clustering and functional annotat...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-016-1223-2

    authors: Roy S,Curry BC,Madahian B,Homayouni R

    更新日期:2016-10-06 00:00:00

  • Benchmarking the HLA typing performance of Polysolver and Optitype in 50 Danish parental trios.

    abstract:BACKGROUND:The adaptive immune response intrinsically depends on hypervariable human leukocyte antigen (HLA) genes. Concomitantly, correct HLA phenotyping is crucial for successful donor-patient matching in organ transplantation. The cost and technical limitations of current laboratory techniques, together with advance...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-018-2239-6

    authors: Matey-Hernandez ML,Danish Pan Genome Consortium.,Brunak S,Izarzugaza JMG

    更新日期:2018-06-25 00:00:00

  • MOSBIE: a tool for comparison and analysis of rule-based biochemical models.

    abstract:BACKGROUND:Mechanistic models that describe the dynamical behaviors of biochemical systems are common in computational systems biology, especially in the realm of cellular signaling. The development of families of such models, either by a single research group or by different groups working within the same area, presen...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-15-316

    authors: Wenskovitch JE Jr,Harris LA,Tapia JJ,Faeder JR,Marai GE

    更新日期:2014-09-25 00:00:00

  • Prediction of HIV-1 protease cleavage site using a combination of sequence, structural, and physicochemical features.

    abstract:BACKGROUND:The human immunodeficiency virus type 1 (HIV-1) aspartic protease is an important enzyme owing to its imperative part in viral development and a causative agent of deadliest disease known as acquired immune deficiency syndrome (AIDS). Development of HIV-1 protease inhibitors can help understand the specifici...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-016-1337-6

    authors: Singh O,Su EC

    更新日期:2016-12-23 00:00:00

  • CNV-WebStore: online CNV analysis, storage and interpretation.

    abstract:BACKGROUND:Microarray technology allows the analysis of genomic aberrations at an ever increasing resolution, making functional interpretation of these vast amounts of data the main bottleneck in routine implementation of high resolution array platforms, and emphasising the need for a centralised and easy to use CNV da...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-12-4

    authors: Vandeweyer G,Reyniers E,Wuyts W,Rooms L,Kooy RF

    更新日期:2011-01-05 00:00:00

  • Towards a supervised classification of neocortical interneuron morphologies.

    abstract:BACKGROUND:The challenge of classifying cortical interneurons is yet to be solved. Data-driven classification into established morphological types may provide insight and practical value. RESULTS:We trained models using 217 high-quality morphologies of rat somatosensory neocortex interneurons reconstructed by a single...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-018-2470-1

    authors: Mihaljević B,Larrañaga P,Benavides-Piccione R,Hill S,DeFelipe J,Bielza C

    更新日期:2018-12-17 00:00:00