Integrating diverse biological and computational sources for reliable protein-protein interactions.

Abstract:

BACKGROUND: Protein-protein interactions (PPIs) play important roles in various cellular processes. However, the low quality of current PPI data detected by high-throughput screening techniques has diminished the potential usefulness of the data. A method is needed to address the high noise and incompleteness of PPI data, namely, to filter out inaccurate protein interactions (false positives) and to predict putative protein interactions (false negatives).

RESULTS: In this paper, we proposed a novel two-step method to integrate diverse biological and computational sources of supporting evidence for reliable PPIs. The first step, interaction binning or InterBIN, groups PPIs together to more accurately estimate, for each biological or computational evidence source, the likelihood (Bin-Confidence score) that the protein pairs interact. The second step, interaction classification or InterCLASS, integrates the collected Bin-Confidence scores to build classifiers and identify reliable interactions.

CONCLUSIONS: We performed comprehensive experiments on two benchmark yeast PPI datasets. The experimental results showed that our proposed method can effectively eliminate false positives in detected PPIs and identify false negatives by predicting novel yet reliable PPIs. Our proposed method also performed significantly better than using any individual evidence source alone, illustrating the importance of integrating various biological and computational sources of data and evidence.
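The two-step idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how bins are formed or which classifiers are used, so the equal-width binning, the assumption that raw evidence scores lie in [0, 1], the function names (`to_bin`, `bin_confidence`, `classify`), and the mean-threshold rule standing in for a trained classifier are all hypothetical choices made only for illustration.

```python
from collections import defaultdict


def to_bin(score, n_bins=4):
    """Map a raw evidence score in [0, 1] to an equal-width bin index (assumed scheme)."""
    return min(int(score * n_bins), n_bins - 1)


def bin_confidence(scores, labels, n_bins=4):
    """Binning step (sketch): for one evidence source, estimate the fraction of
    known-interacting pairs (label 1) among all labelled pairs in each bin."""
    pos = defaultdict(int)
    tot = defaultdict(int)
    for s, y in zip(scores, labels):
        b = to_bin(s, n_bins)
        tot[b] += 1
        pos[b] += y
    return {b: pos[b] / tot[b] for b in tot}


def classify(pair_scores, bin_tables, n_bins=4, threshold=0.5):
    """Classification step (sketch): look up one bin confidence per evidence source
    and combine them. A plain mean with a threshold stands in for a trained
    classifier; unseen bins default to a confidence of 0.0."""
    feats = [
        bin_tables[src].get(to_bin(s, n_bins), 0.0)
        for src, s in pair_scores.items()
    ]
    return sum(feats) / len(feats) >= threshold


if __name__ == "__main__":
    # Toy labelled data for two hypothetical evidence sources.
    tables = {
        "coexpression": bin_confidence([0.1, 0.2, 0.8, 0.9], [0, 1, 1, 1], n_bins=2),
        "go_similarity": bin_confidence([0.1, 0.3, 0.7, 0.9], [0, 0, 1, 1], n_bins=2),
    }
    print(tables)
    print(classify({"coexpression": 0.9, "go_similarity": 0.7}, tables, n_bins=2))
```

Estimating confidence per bin rather than per pair pools sparse labelled data, which is the motivation the abstract gives for grouping PPIs; in practice the mean-threshold rule would be replaced by a classifier trained on the per-source confidence vectors.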

journal_name

BMC Bioinformatics

authors

Wu M,Li X,Chua HN,Kwoh CK,Ng SK

doi

10.1186/1471-2105-11-S7-S8

pub_date

2010-10-15 00:00:00

pages

S8

issn

1471-2105

pii

1471-2105-11-S7-S8

journal_volume

11 Suppl 7

pub_type

Journal Article
  • LncRNA HOTAIR-mediated Wnt/β-catenin network modeling to predict and validate therapeutic targets for cartilage damage.

    abstract:BACKGROUND:Cartilage damage is a crucial feature involved in several pathological conditions characterized by joint disorders, such as osteoarthritis and rheumatoid arthritis. Accumulated evidence shows that the Wnt/β-catenin pathway plays a role in the pathogenesis of cartilage damage. In addition, it is experimentally ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-2981-4

    authors: Zhou W,He X,Chen Z,Fan D,Wang Y,Feng H,Zhang G,Lu A,Xiao L

    update_date:2019-07-31 00:00:00

  • BiPOm: a rule-based ontology to represent and infer molecule knowledge from a biological process-centered viewpoint.

    abstract:BACKGROUND:Managing and organizing biological knowledge remains a major challenge, due to the complexity of living systems. Recently, systemic representations have been promising in tackling such a challenge at the whole-cell scale. In such representations, the cell is considered as a system composed of interlocked sub...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03637-9

    authors: Henry V,Saïs F,Inizan O,Marchadier E,Dibie J,Goelzer A,Fromion V

    update_date:2020-07-23 00:00:00

  • Challenges in estimating percent inclusion of alternatively spliced junctions from RNA-seq data.

    abstract:Transcript quantification is a long-standing problem in genomics and estimating the relative abundance of alternatively-spliced isoforms from the same transcript is an important special case. Both problems have recently been illuminated by high-throughput RNA sequencing experiments which are quickly generating large a...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-13-S6-S11

    authors: Kakaradov B,Xiong HY,Lee LJ,Jojic N,Frey BJ

    update_date:2012-04-19 00:00:00

  • Modeling lymphocyte homing and encounters in lymph nodes.

    abstract:BACKGROUND:The efficiency of lymph nodes depends on tissue structure and organization, which allow the coordination of lymphocyte traffic. Despite their essential role, our understanding of lymph node specific mechanisms is still incomplete and currently a topic of intense research. RESULTS:In this paper, we present a...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-10-387

    authors: Baldazzi V,Paci P,Bernaschi M,Castiglione F

    update_date:2009-11-25 00:00:00

  • Detection of biological switches using the method of Gröebner bases.

    abstract:BACKGROUND:Bistability and ability to switch between two stable states is the hallmark of cellular responses. Cellular signaling pathways often contain bistable switches that regulate the transmission of the extracellular information to the nucleus where important biological functions are executed. RESULTS:In this wor...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-3155-0

    authors: Arkun Y

    update_date:2019-11-28 00:00:00

  • Current approaches to gene regulatory network modelling.

    abstract:Many different approaches have been developed to model and simulate gene regulatory networks. We proposed the following categories for gene regulatory network models: network parts lists, network topology models, network control logic models, and dynamic models. Here we will describe some examples for each of these ca...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-8-S6-S9

    authors: Schlitt T,Brazma A

    update_date:2007-09-27 00:00:00

  • A multiresolution approach to automated classification of protein subcellular location images.

    abstract:BACKGROUND:Fluorescence microscopy is widely used to determine the subcellular location of proteins. Efforts to determine location on a proteome-wide basis create a need for automated methods to analyze the resulting images. Over the past ten years, the feasibility of using machine learning methods to recognize all maj...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-8-210

    authors: Chebira A,Barbotin Y,Jackson C,Merryman T,Srinivasa G,Murphy RF,Kovacević J

    update_date:2007-06-19 00:00:00

  • Prediction of protein structural class with Rough Sets.

    abstract:BACKGROUND:A new method for the prediction of protein structural classes is constructed based on Rough Sets algorithm, which is a rule-based data mining method. Amino acid compositions and 8 physicochemical properties data are used as conditional attributes for the construction of decision system. After reducing the de...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-7-20

    authors: Cao Y,Liu S,Zhang L,Qin J,Wang J,Tang K

    update_date:2006-01-14 00:00:00

  • Sequence-structure relations of pseudoknot RNA.

    abstract:BACKGROUND:The analysis of sequence-structure relations of RNA is based on a specific notion and folding of RNA structure. The notion of coarse grained structure employed here is that of canonical RNA pseudoknot contact-structures with at most two mutually crossing bonds (3-noncrossing). These structures are folded by ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-10-S1-S39

    authors: Huang FW,Li LY,Reidys CM

    update_date:2009-01-30 00:00:00

  • Genome Projector: zoomable genome map with multiple views.

    abstract:BACKGROUND:Molecular biology data exist on diverse scales, from the level of molecules to -omics. At the same time, the data at each scale can be categorised into multiple layers, such as the genome, transcriptome, proteome, metabolome, and biochemical pathways. Due to the highly multi-layer and multi-dimensional natur...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-10-31

    authors: Arakawa K,Tamaki S,Kono N,Kido N,Ikegami K,Ogawa R,Tomita M

    update_date:2009-01-23 00:00:00

  • Detection of transposable elements by their compositional bias.

    abstract:BACKGROUND:Transposable elements (TE) are mobile genetic entities present in nearly all genomes. Previous work has shown that TEs tend to have a different nucleotide composition than the host genes, either considering codon usage bias or dinucleotide frequencies. We show here how these compositional differences can be ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-5-94

    authors: Andrieu O,Fiston AS,Anxolabéhère D,Quesneville H

    update_date:2004-07-13 00:00:00

  • Sample size calculation while controlling false discovery rate for differential expression analysis with RNA-sequencing experiments.

    abstract:BACKGROUND:RNA-Sequencing (RNA-seq) experiments have been popularly applied to transcriptome studies in recent years. Such experiments are still relatively costly. As a result, RNA-seq experiments often employ a small number of replicates. Power analysis and sample size calculation are challenging in the context of dif...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-016-0994-9

    authors: Bi R,Liu P

    update_date:2016-03-31 00:00:00

  • BioNanoAnalyst: a visualisation tool to assess genome assembly quality using BioNano data.

    abstract:BACKGROUND:Reference genome assemblies are valuable, as they provide insights into gene content, genetic evolution and domestication. The higher the quality of a reference genome assembly the more accurate the downstream analysis will be. During the last few years, major efforts have been made towards improving the qua...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-017-1735-4

    authors: Yuan Y,Bayer PE,Scheben A,Chan CK,Edwards D

    update_date:2017-06-30 00:00:00

  • Connectivity independent protein-structure alignment: a hierarchical approach.

    abstract:BACKGROUND:Protein-structure alignment is a fundamental tool to study protein function, evolution and model building. In the last decade several methods for structure alignment were introduced, but most of them ignore that structurally similar proteins can share the same spatial arrangement of secondary structure eleme...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-7-510

    authors: Kolbeck B,May P,Schmidt-Goenner T,Steinke T,Knapp EW

    update_date:2006-11-21 00:00:00

  • FastGroup: a program to dereplicate libraries of 16S rDNA sequences.

    abstract:BACKGROUND:Ribosomal 16S DNA sequences are an essential tool for identifying and classifying microbes. High-throughput DNA sequencing now makes it economically possible to produce very large datasets of 16S rDNA sequences in short time periods, necessitating new computer tools for analyses. Here we describe FastGroup, ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-2-9

    authors: Seguritan V,Rohwer F

    update_date:2001-01-01 00:00:00

  • Recursive model for dose-time responses in pharmacological studies.

    abstract:BACKGROUND:Clinical studies often track dose-response curves of subjects over time. One can easily model the dose-response curve at each time point with Hill equation, but such a model fails to capture the temporal evolution of the curves. On the other hand, one can use Gompertz equation to model the temporal behaviors...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-2831-4

    authors: Dhruba SR,Rahman A,Rahman R,Ghosh S,Pal R

    update_date:2019-06-20 00:00:00

  • Markov clustering versus affinity propagation for the partitioning of protein interaction graphs.

    abstract:BACKGROUND:Genome scale data on protein interactions are generally represented as large networks, or graphs, where hundreds or thousands of proteins are linked to one another. Since proteins tend to function in groups, or complexes, an important goal has been to reliably identify protein complexes from these graphs. Th...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-10-99

    authors: Vlasblom J,Wodak SJ

    update_date:2009-03-30 00:00:00

  • Domain fusion analysis by applying relational algebra to protein sequence and domain databases.

    abstract:BACKGROUND:Domain fusion analysis is a useful method to predict functionally linked proteins that may be involved in direct protein-protein interactions or in the same metabolic or signaling pathway. As separate domain databases like BLOCKS, PROSITE, Pfam, SMART, PRINTS-S, ProDom, TIGRFAMs, and amalgamated domain datab...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-4-16

    authors: Truong K,Ikura M

    update_date:2003-05-06 00:00:00

  • Survival models with preclustered gene groups as covariates.

    abstract:BACKGROUND:An important application of high dimensional gene expression measurements is the risk prediction and the interpretation of the variables in the resulting survival models. A major problem in this context is the typically large number of genes compared to the number of observations (individuals). Feature selec...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-12-478

    authors: Kammers K,Lang M,Hengstler JG,Schmidt M,Rahnenführer J

    update_date:2011-12-16 00:00:00

  • Using the multi-objective optimization replica exchange Monte Carlo enhanced sampling method for protein-small molecule docking.

    abstract:BACKGROUND:In this study, we extended the replica exchange Monte Carlo (REMC) sampling method to protein-small molecule docking conformational prediction using RosettaLigand. In contrast to the traditional Monte Carlo (MC) and REMC sampling methods, these methods use multi-objective optimization Pareto front informatio...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-017-1733-6

    authors: Wang H,Liu H,Cai L,Wang C,Lv Q

    update_date:2017-07-10 00:00:00

  • Partial mixture model for tight clustering of gene expression time-course.

    abstract:BACKGROUND:Tight clustering arose recently from a desire to obtain tighter and potentially more informative clusters in gene expression studies. Scattered genes with relatively loose correlations should be excluded from the clusters. However, in the literature there is little work dedicated to this area of research. On...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-287

    authors: Yuan Y,Li CT,Wilson R

    update_date:2008-06-18 00:00:00

  • Prioritizing disease genes with an improved dual label propagation framework.

    abstract:BACKGROUND:Prioritizing disease genes is trying to identify potential disease causing genes for a given phenotype, which can be applied to reveal the inherited basis of human diseases and facilitate drug development. Our motivation is inspired by label propagation algorithm and the false positive protein-protein intera...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-018-2040-6

    authors: Zhang Y,Liu J,Liu X,Fan X,Hong Y,Wang Y,Huang Y,Xie M

    update_date:2018-02-08 00:00:00

  • A CoD-based stationary control policy for intervening in large gene regulatory networks.

    abstract:BACKGROUND:One of the most important goals of the mathematical modeling of gene regulatory networks is to alter their behavior toward desirable phenotypes. Therapeutic techniques are derived for intervention in terms of stationary control policies. In large networks, it becomes computationally burdensome to derive an o...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-12-S10-S10

    authors: Ghaffari N,Ivanov I,Qian X,Dougherty ER

    update_date:2011-10-18 00:00:00

  • Evaluating metagenomics tools for genome binning with real metagenomic datasets and CAMI datasets.

    abstract:BACKGROUND:Shotgun metagenomics based on untargeted sequencing can explore the taxonomic profile and the function of unknown microorganisms in samples, and complement the shortage of amplicon sequencing. Binning assembled sequences into individual groups, which represent microbial genomes, is the key step and a major c...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03667-3

    authors: Yue Y,Huang H,Qi Z,Dou HM,Liu XY,Han TF,Chen Y,Song XJ,Zhang YH,Tu J

    update_date:2020-07-28 00:00:00

  • 3DScapeCS: application of three dimensional, parallel, dynamic network visualization in Cytoscape.

    abstract:BACKGROUND:The exponential growth of gigantic biological data from various sources, such as protein-protein interaction (PPI), genome sequences scaffolding, Mass spectrometry (MS) molecular networking and metabolic flux, demands an efficient way for better visualization and interpretation beyond the conventional, two-d...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-14-322

    authors: Wang Q,Tang B,Song L,Ren B,Liang Q,Xie F,Zhuo Y,Liu X,Zhang L

    update_date:2013-11-14 00:00:00

  • Impact of polymorphic transposable elements on transcription in lymphoblastoid cell lines from public data.

    abstract:BACKGROUND:Transposable elements (TEs) are DNA sequences able to mobilize themselves and to increase their copy-number in the host genome. In the past, they have been considered mainly selfish DNA without evident functions. Nevertheless, currently they are believed to have been extensively involved in the evolution of ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-3113-x

    authors: Spirito G,Mangoni D,Sanges R,Gustincich S

    update_date:2019-11-22 00:00:00

  • EGenBio: a data management system for evolutionary genomics and biodiversity.

    abstract:BACKGROUND:Evolutionary genomics requires management and filtering of large numbers of diverse genomic sequences for accurate analysis and inference on evolutionary processes of genomic and functional change. We developed Evolutionary Genomics and Biodiversity (EGenBio; http://egenbio.lsu.edu) to begin to address this....

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-7-S2-S7

    authors: Nahum LA,Reynolds MT,Wang ZO,Faith JJ,Jonna R,Jiang ZJ,Meyer TJ,Pollock DD

    update_date:2006-09-06 00:00:00

  • ICEKAT: an interactive online tool for calculating initial rates from continuous enzyme kinetic traces.

    abstract:BACKGROUND:Continuous enzyme kinetic assays are often used in high-throughput applications, as they allow rapid acquisition of large amounts of kinetic data and increased confidence compared to discontinuous assays. However, data analysis is often rate-limiting in high-throughput enzyme assays, as manual inspection and...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-3513-y

    authors: Olp MD,Kalous KS,Smith BC

    update_date:2020-05-14 00:00:00

  • Improved identification of conserved cassette exons using Bayesian networks.

    abstract:BACKGROUND:Alternative splicing is a major contributor to the diversity of eukaryotic transcriptomes and proteomes. Currently, large scale detection of alternative splicing using expressed sequence tags (ESTs) or microarrays does not capture all alternative splicing events. Moreover, for many species genomic data is be...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-477

    authors: Sinha R,Hiller M,Pudimat R,Gausmann U,Platzer M,Backofen R

    update_date:2008-11-12 00:00:00

  • Gene ontology based transfer learning for protein subcellular localization.

    abstract:BACKGROUND:Prediction of protein subcellular localization generally involves many complex factors, and using only one or two aspects of data information may not tell the true story. For this reason, some recent predictive models are deliberately designed to integrate multiple heterogeneous data sources for exploiting m...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-12-44

    authors: Mei S,Fei W,Zhou S

    update_date:2011-02-02 00:00:00