QPath: a method for querying pathways in a protein-protein interaction network.

Abstract:

BACKGROUND: Sequence comparison is one of the most prominent tools in biological research, and is instrumental in studying gene function and evolution. The rapid development of high-throughput technologies for measuring protein interactions calls for extending this fundamental operation to the level of pathways in protein networks.

RESULTS: We present a comprehensive framework for protein network searches using pathway queries. Given a linear query pathway and a network of interest, our algorithm, QPath, efficiently searches the network for homologous pathways, allowing both insertions and deletions of proteins in the identified pathways. Matched pathways are automatically scored according to their variation from the query pathway in terms of the protein insertions and deletions they employ, the sequence similarity of their constituent proteins to the query proteins, and the reliability of their constituent interactions. We applied QPath to systematically infer protein pathways in fly using an extensive collection of 271 putative pathways from yeast. QPath identified 69 conserved pathways whose members were both functionally enriched and coherently expressed. The resulting pathways tended to preserve the function of the original query pathways, allowing us to derive a first annotated map of conserved protein pathways in fly.

CONCLUSION: Pathway homology searches using QPath provide a powerful approach for identifying biologically significant pathways and inferring their function. The growing amounts of protein interactions in public databases underscore the importance of our network querying framework for mining protein network data.
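
As a rough illustration of the scoring idea described in the abstract, the Python sketch below searches a small weighted network for the best match to a linear query pathway, allowing a bounded number of protein insertions and deletions. It is a brute-force depth-first search over simple paths, not the algorithm from the paper. The function name `best_pathway_match`, the additive score, the penalty values, and the dictionary-based data layout are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical data layout (not from the paper):
#   graph[u][v]          -> reliability score of the interaction u-v
#   similarity[(q, v)]   -> sequence-similarity score between query protein q
#                           and network protein v (absent = not similar enough)
Graph = Dict[str, Dict[str, float]]
Similarity = Dict[Tuple[str, str], float]


def best_pathway_match(
    query: List[str],
    graph: Graph,
    similarity: Similarity,
    ins_penalty: float = -1.0,
    del_penalty: float = -1.0,
    max_ins: int = 1,
    max_del: int = 1,
) -> Tuple[float, List[str]]:
    """Return (score, path) for the best-scoring simple path matching `query`.

    The score adds up similarity scores of matched proteins, reliability
    scores of the interactions used, and penalties for insertions/deletions.
    """
    best: Tuple[float, List[str]] = (float("-inf"), [])
    if not query:
        return best

    def extend(i, path, score, ins, dels, ends_on_match):
        nonlocal best
        if i == len(query):
            # Accept only paths whose last protein is matched to a query protein.
            if ends_on_match and score > best[0]:
                best = (score, list(path))
            return
        # Deletion: skip query protein i without moving in the network.
        if dels < max_del:
            extend(i + 1, path, score + del_penalty, ins, dels + 1, ends_on_match)
        for v, weight in graph.get(path[-1], {}).items():
            if v in path:  # keep the path simple (no repeated proteins)
                continue
            # Match: align query protein i with network protein v.
            sim = similarity.get((query[i], v))
            if sim is not None:
                path.append(v)
                extend(i + 1, path, score + weight + sim, ins, dels, True)
                path.pop()
            # Insertion: pass through v without aligning it to the query.
            if ins < max_ins:
                path.append(v)
                extend(i, path, score + weight + ins_penalty, ins + 1, dels, False)
                path.pop()

    # The first query protein must be matched to some network protein.
    for start in graph:
        sim = similarity.get((query[0], start))
        if sim is not None:
            extend(1, [start], sim, 0, 0, True)
    return best


# Toy network: a - x - c, where x has no counterpart in the query.
toy_graph = {"a": {"x": 1.0}, "x": {"a": 1.0, "c": 1.0}, "c": {"x": 1.0}}
toy_similarity = {("q1", "a"): 2.0, ("q3", "c"): 2.0}
toy_score, toy_path = best_pathway_match(["q1", "q3"], toy_graph, toy_similarity)
print(toy_score, toy_path)
```

The exhaustive search is exponential in path length and is only meant to make the three score components concrete on a toy network.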

journal_name

BMC Bioinformatics

authors

Shlomi T,Segal D,Ruppin E,Sharan R

doi

10.1186/1471-2105-7-199

subject

Has Abstract

pub_date

2006-04-10 00:00:00

pages

199

issn

1471-2105

pii

1471-2105-7-199

journal_volume

7

pub_type

Journal Article
  • A knowledge discovery object model API for Java.

    abstract:BACKGROUND:Biological data resources have become heterogeneous and derive from multiple sources. This introduces challenges in the management and utilization of this data in software development. Although efforts are underway to create a standard format for the transmission and storage of biological data, this objectiv...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-4-51

    authors: Zuyderduyn SD,Jones SJ

    update_date:2003-10-28 00:00:00

  • Filling out the structural map of the NTF2-like superfamily.

    abstract:BACKGROUND:The NTF2-like superfamily is a versatile group of protein domains sharing a common fold. The sequences of these domains are very diverse and they share no common sequence motif. These domains serve a range of different functions within the proteins in which they are found, including both catalytic and non-ca...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-14-327

    authors: Eberhardt RY,Chang Y,Bateman A,Murzin AG,Axelrod HL,Hwang WC,Aravind L

    update_date:2013-11-19 00:00:00

  • Handling missing rows in multi-omics data integration: multiple imputation in multiple factor analysis framework.

    abstract:BACKGROUND:In omics data integration studies, it is common, for a variety of reasons, for some individuals to not be present in all data tables. Missing row values are challenging to deal with because most statistical methods cannot be directly applied to incomplete datasets. To overcome this issue, we propose a multip...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-016-1273-5

    authors: Voillet V,Besse P,Liaubet L,San Cristobal M,González I

    update_date:2016-10-03 00:00:00

  • CLU: a new algorithm for EST clustering.

    abstract:BACKGROUND:The continuous flow of EST data remains one of the richest sources for discoveries in modern biology. The first step in EST data mining is usually associated with EST clustering, the process of grouping of original fragments according to their annotation, similarity to known genomic DNA or each other. Cluste...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-6-S2-S3

    authors: Ptitsyn A,Hide W

    update_date:2005-07-15 00:00:00

  • Analysis of density based and fuzzy c-means clustering methods on lesion border extraction in dermoscopy images.

    abstract:BACKGROUND:Computer-aided segmentation and border detection in dermoscopic images is one of the core components of diagnostic procedures and therapeutic interventions for skin cancer. Automated assessment tools for dermoscopy images have become an important research field mainly because of inter- and intra-observer var...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-S6-S26

    authors: Kockara S,Mete M,Chen B,Aydin K

    update_date:2010-10-07 00:00:00

  • Automating dChip: toward reproducible sharing of microarray data analysis.

    abstract:BACKGROUND:During the past decade, many software packages have been developed for analysis and visualization of various types of microarrays. We have developed and maintained the widely used dChip as a microarray analysis software package accessible to both biologist and data analysts. However, challenges arise when dC...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-231

    authors: Li C

    update_date:2008-05-08 00:00:00

  • A stochastic context free grammar based framework for analysis of protein sequences.

    abstract:BACKGROUND:In the last decade, there have been many applications of formal language theory in bioinformatics such as RNA structure prediction and detection of patterns in DNA. However, in the field of proteomics, the size of the protein alphabet and the complexity of relationship between amino acids have mainly limited...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-10-323

    authors: Dyrka W,Nebel JC

    update_date:2009-10-08 00:00:00

  • Sample size calculation while controlling false discovery rate for differential expression analysis with RNA-sequencing experiments.

    abstract:BACKGROUND:RNA-Sequencing (RNA-seq) experiments have been popularly applied to transcriptome studies in recent years. Such experiments are still relatively costly. As a result, RNA-seq experiments often employ a small number of replicates. Power analysis and sample size calculation are challenging in the context of dif...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-016-0994-9

    authors: Bi R,Liu P

    update_date:2016-03-31 00:00:00

  • Fast max-margin clustering for unsupervised word sense disambiguation in biomedical texts.

    abstract:BACKGROUND:We aim to solve the problem of determining word senses for ambiguous biomedical terms with minimal human effort. METHODS:We build a fully automated system for Word Sense Disambiguation by designing a system that does not require manually-constructed external resources or manually-labeled training examples e...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-10-S3-S4

    authors: Duan W,Song M,Yates A

    update_date:2009-03-19 00:00:00

  • Probe-level linear model fitting and mixture modeling results in high accuracy detection of differential gene expression.

    abstract:BACKGROUND:The identification of differentially expressed genes (DEGs) from Affymetrix GeneChips arrays is currently done by first computing expression levels from the low-level probe intensities, then deriving significance by comparing these expression levels between conditions. The proposed PL-LM (Probe-Level Linear ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-7-391

    authors: Lemieux S

    update_date:2006-08-25 00:00:00

  • Fast and robust group-wise eQTL mapping using sparse graphical models.

    abstract:BACKGROUND:Genome-wide expression quantitative trait loci (eQTL) studies have emerged as a powerful tool to understand the genetic basis of gene expression and complex traits. The traditional eQTL methods focus on testing the associations between individual single-nucleotide polymorphisms (SNPs) and gene expression tra...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-014-0421-z

    authors: Cheng W,Shi Y,Zhang X,Wang W

    update_date:2015-01-16 00:00:00

  • AT excursion: a new approach to predict replication origins in viral genomes by locating AT-rich regions.

    abstract:BACKGROUND:Replication origins are considered important sites for understanding the molecular mechanisms involved in DNA replication. Many computational methods have been developed for predicting their locations in archaeal, bacterial and eukaryotic genomes. However, a prediction method designed for a particular kind o...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-8-163

    authors: Chew DS,Leung MY,Choi KP

    update_date:2007-05-21 00:00:00

  • Integrated olfactory receptor and microarray gene expression databases.

    abstract:BACKGROUND:Gene expression patterns of olfactory receptors (ORs) are an important component of the signal encoding mechanism in the olfactory system since they determine the interactions between odorant ligands and sensory neurons. We have developed the Olfactory Receptor Microarray Database (ORMD) to house OR gene exp...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-8-231

    authors: Liu N,Crasto CJ,Ma M

    update_date:2007-06-30 00:00:00

  • Argot2: a large scale function prediction tool relying on semantic similarity of weighted Gene Ontology terms.

    abstract:BACKGROUND:Predicting protein function has become increasingly demanding in the era of next generation sequencing technology. The task to assign a curator-reviewed function to every single sequence is impracticable. Bioinformatics tools, easy to use and able to provide automatic and reliable annotations at a genomic sc...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-13-S4-S14

    authors: Falda M,Toppo S,Pescarolo A,Lavezzo E,Di Camillo B,Facchinetti A,Cilia E,Velasco R,Fontana P

    update_date:2012-03-28 00:00:00

  • Correction to: Effective machine-learning assembly for next-generation amplicon sequencing with very low coverage.

    abstract:Following publication of the original article [1], the author reported that there are several errors in the original article. ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article, Published Erratum

    doi:10.1186/s12859-019-3318-z

    authors: Ranjard L,Wong TKF,Rodrigo AG

    update_date:2020-01-22 00:00:00

  • Investigating the concordance of Gene Ontology terms reveals the intra- and inter-platform reproducibility of enrichment analysis.

    abstract:BACKGROUND:Reliability and Reproducibility of differentially expressed genes (DEGs) are essential for the biological interpretation of microarray data. The microarray quality control (MAQC) project launched by US Food and Drug Administration (FDA) elucidated that the lists of DEGs generated by intra- and inter-platform...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-14-143

    authors: Zhang L,Zhang J,Yang G,Wu D,Jiang L,Wen Z,Li M

    update_date:2013-04-29 00:00:00

  • Identification of CD8+ T cell epitopes through proteasome cleavage site predictions.

    abstract:BACKGROUND:We previously introduced PCPS (Proteasome Cleavage Prediction Server), a web-based tool to predict proteasome cleavage sites using n-grams. Here, we evaluated the ability of PCPS immunoproteasome cleavage model to discriminate CD8+ T cell epitopes. RESULTS:We first assembled an epitope dataset consisting of...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03782-1

    authors: Gomez-Perosanz M,Ras-Carmona A,Lafuente EM,Reche PA

    update_date:2020-12-14 00:00:00

  • An extensible six-step methodology to automatically generate fuzzy DSSs for diagnostic applications.

    abstract:BACKGROUND:The diagnosis of many diseases can be often formulated as a decision problem; uncertainty affects these problems so that many computerized Diagnostic Decision Support Systems (in the following, DDSSs) have been developed to aid the physician in interpreting clinical data and thus to improve the quality of th...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-14-S1-S4

    authors: d'Acierno A,Esposito M,De Pietro G

    update_date:2013-01-01 00:00:00

  • A semi-supervised learning approach to predict synthetic genetic interactions by combining functional and topological properties of functional gene network.

    abstract:BACKGROUND:Genetic interaction profiles are highly informative and helpful for understanding the functional linkages between genes, and therefore have been extensively exploited for annotating gene functions and dissecting specific pathway structures. However, our understanding is rather limited to the relationship bet...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-343

    authors: You ZH,Yin Z,Han K,Huang DS,Zhou X

    update_date:2010-06-24 00:00:00

  • Identifying module biomarker in type 2 diabetes mellitus by discriminative area of functional activity.

    abstract:BACKGROUND:Identifying diagnosis and prognosis biomarkers from expression profiling data is of great significance for achieving personalized medicine and designing therapeutic strategy in complex diseases. However, the reproducibility of identified biomarkers across tissues and experiments is still a challenge for this...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-015-0519-y

    authors: Zhang X,Gao L,Liu ZP,Chen L

    update_date:2015-03-18 00:00:00

  • RocSampler: regularizing overlapping protein complexes in protein-protein interaction networks.

    abstract:BACKGROUND:In recent years, protein-protein interaction (PPI) networks have been well recognized as important resources to elucidate various biological processes and cellular mechanisms. In this paper, we address the problem of predicting protein complexes from a PPI network. This problem has two difficulties. One is r...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-017-1920-5

    authors: Maruyama O,Kuwahara Y

    update_date:2017-12-06 00:00:00

  • Jaccard/Tanimoto similarity test and estimation methods for biological presence-absence data.

    abstract:BACKGROUND:A survey of presences and absences of specific species across multiple biogeographic units (or bioregions) are used in a broad area of biological studies from ecology to microbiology. Using binary presence-absence data, we evaluate species co-occurrences that help elucidate relationships among organisms and ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-3118-5

    authors: Chung NC,Miasojedow B,Startek M,Gambin A

    update_date:2019-12-24 00:00:00

  • GeneLibrarian: an effective gene-information summarization and visualization system.

    abstract:BACKGROUND:Abundant information about gene products is stored in online searchable databases such as annotation or literature. To efficiently obtain and digest such information, there is a pressing need for automated information-summarization and functional-similarity clustering of genes. RESULTS:We have developed a n...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-7-392

    authors: Chiang JH,Shin JW,Liu HH,Chin CL

    update_date:2006-08-29 00:00:00

  • Revealing hidden information in osteoblast's mechanotransduction through analysis of time patterns of critical events.

    abstract:BACKGROUND:Mechanotransduction in bone cells plays a pivotal role in osteoblast differentiation and bone remodelling. Mechanotransduction provides the link between modulation of the extracellular matrix by mechanical load and intracellular activity. By controlling the balance between the intracellular and extracellular...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-3394-0

    authors: Ascolani G,Skerry TM,Lacroix D,Dall'Ara E,Shuaib A

    update_date:2020-03-18 00:00:00

  • PFClust: a novel parameter free clustering algorithm.

    abstract:BACKGROUND:We present the algorithm PFClust (Parameter Free Clustering), which is able automatically to cluster data and identify a suitable number of clusters to group them into without requiring any parameters to be specified by the user. The algorithm partitions a dataset into a number of clusters that share some co...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-14-213

    authors: Mavridis L,Nath N,Mitchell JB

    update_date:2013-07-03 00:00:00

  • A compartmentalized approach to the assembly of physical maps.

    abstract:BACKGROUND:Physical maps have been historically one of the cornerstones of genome sequencing and map-based cloning strategies. They also support marker assisted breeding and EST mapping. The problem of building a high quality physical map is computationally challenging due to unavoidable noise in the input fingerprint ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-10-217

    authors: Bozdag S,Close TJ,Lonardi S

    update_date:2009-07-15 00:00:00

  • Graph-representation of oxidative folding pathways.

    abstract:BACKGROUND:The process of oxidative folding combines the formation of native disulfide bond with conformational folding resulting in the native three-dimensional fold. Oxidative folding pathways can be described in terms of disulfide intermediate species (DIS) which can also be isolated and characterized. Each DIS corr...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-6-19

    authors: Agoston V,Cemazar M,Kaján L,Pongor S

    update_date:2005-01-27 00:00:00

  • Application of text-mining for updating protein post-translational modification annotation in UniProtKB.

    abstract:BACKGROUND:The annotation of protein post-translational modifications (PTMs) is an important task of UniProtKB curators and, with continuing improvements in experimental methodology, an ever greater number of articles are being published on this topic. To help curators cope with this growing body of information we have...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-14-104

    authors: Veuthey AL,Bridge A,Gobeill J,Ruch P,McEntyre JR,Bougueleret L,Xenarios I

    update_date:2013-03-22 00:00:00

  • 2D electrophoresis image brightness correction based on gradient interval histogram.

    abstract:BACKGROUND:Two-dimensional electrophoresis (2DE) is one of the most widely applied techniques in comparative proteomics. The basic task of 2DE is to identify differential protein expression by quantitative analysis of 2DE images. To reduce the errors of spot quantification in 2DE images, a novel brightness correction m...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-3432-y

    authors: Ou Q,Xiao J,Yu L,Wu K,Xiong B

    update_date:2020-03-19 00:00:00

  • INBIA: a boosting methodology for proteomic network inference.

    abstract:BACKGROUND:The analysis of tissue-specific protein interaction networks and their functional enrichment in pathological and normal tissues provides insights on the etiology of diseases. The Pan-cancer proteomic project, in The Cancer Genome Atlas, collects protein expressions in human cancers and it is a reference reso...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-018-2183-5

    authors: Sardina DS,Micale G,Ferro A,Pulvirenti A,Giugno R

    update_date:2018-07-09 00:00:00