Abstract:
BACKGROUND: High-throughput methods can directly detect the set of interacting proteins in model species, but the results are often incomplete and exhibit high false positive and false negative rates. A number of researchers have recently presented methods for integrating direct and indirect data for predicting interactions. These methods utilize a common classifier for all pairs. However, due to missing data and high redundancy among the features used, different protein pairs may benefit from different features based on the set of attributes available. In addition, in many cases it is hard to directly determine which of the data sources contributed to a prediction. This information is important for biologists using these predictions in the design of new experiments.
RESULTS: To address these challenges we propose a Mixture-of-Feature-Experts method for protein-protein interaction prediction. We split the features into roughly homogeneous sets of feature experts. The individual experts use logistic regression, and their scores are combined using another logistic regression. When combining the scores, the weighting of each expert depends on the set of input attributes available for that pair. Thus, different experts will have different influence on the prediction depending on the available features.
CONCLUSION: We applied our method to predict the set of interacting proteins in yeast and human cells. Our method improved upon the best previous methods for this task. In addition, the weighting of the experts provides a means to evaluate the prediction based on the high-scoring features.
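The combination step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature grouping, all parameter values, the softmax form of the gate, and the use of the per-expert fraction of available features as the gate input are assumptions made only for illustration. Each expert is a logistic regression over its own feature group, and a gate turns the availability pattern of a protein pair into expert weights.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    z = z - np.max(z)          # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def predict_pair(x, groups, expert_w, expert_b, gate_w, gate_b):
    """Score one protein pair; missing features in x are encoded as NaN."""
    available = ~np.isnan(x)
    x_filled = np.where(available, x, 0.0)   # missing values contribute nothing
    # Assumed gate input: fraction of each expert's features that are present.
    avail = np.array([available[idx].mean() for idx in groups])
    # Each expert is a logistic regression over its own feature group.
    scores = np.array([
        sigmoid(x_filled[idx] @ w + b)
        for idx, w, b in zip(groups, expert_w, expert_b)
    ])
    # Gate weights depend only on which attributes are available.
    gate = softmax(gate_w @ avail + gate_b)
    return float(gate @ scores), gate, scores

# Hypothetical setup: two experts, two features each, hand-picked parameters.
groups = [np.array([0, 1]), np.array([2, 3])]
expert_w = [np.array([1.0, 1.0]), np.array([2.0, -1.0])]
expert_b = [0.0, 0.0]
gate_w = 4.0 * np.eye(2)
gate_b = np.zeros(2)

x_full = np.array([1.0, 0.5, 0.2, 0.1])
x_missing = np.array([1.0, 0.5, np.nan, np.nan])   # second group unobserved

p_full, gate_full, scores_full = predict_pair(
    x_full, groups, expert_w, expert_b, gate_w, gate_b)
p_missing, gate_missing, scores_missing = predict_pair(
    x_missing, groups, expert_w, expert_b, gate_w, gate_b)

print("gate (all features present):", gate_full)
print("gate (second group missing):", gate_missing)
```

In this sketch the gate weights are what would be inspected to see which feature group drove a prediction: when the second group is unobserved, almost all of the weight shifts to the first expert.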
journal_name: BMC Bioinformatics
journal_title: BMC bioinformatics
authors: Qi Y, Klein-Seetharaman J, Bar-Joseph Z
doi: 10.1186/1471-2105-8-S10-S6
subject: Has Abstract
pub_date: 2007-01-01 00:00:00
pages: S6
issn: 1471-2105
pii: 1471-2105-8-S10-S6
journal_volume: 8 Suppl 10
pub_type: Journal Article
abstract:BACKGROUND:The family of voltage-gated potassium channels comprises a functionally diverse group of membrane proteins. They help maintain and regulate the potassium ion-based component of the membrane potential and are thus central to many critical physiological processes. VKCDB (Voltage-gated potassium [K] Channel Dat...
journal_title:BMC bioinformatics
pub_type: Journal Article, Review
doi:10.1186/1471-2105-5-3
update_date:2004-01-09 00:00:00
abstract:BACKGROUND:The Cell Ontology (CL) is an ontology for the representation of in vivo cell types. As biological ontologies such as the CL grow in complexity, they become increasingly difficult to use and maintain. By making the information in the ontology computable, we can use automated reasoners to detect errors and ass...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-12-6
update_date:2011-01-05 00:00:00
abstract:BACKGROUND:Heritability of a phenotypic or molecular trait measures the proportion of variance that is attributable to genotypic variance. It is an important concept in breeding and genetics. Few methods are available for calculating heritability for traits derived from high-throughput sequencing. RESULTS:We propose s...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-017-1539-6
update_date:2017-03-02 00:00:00
abstract:BACKGROUND:Temporal gene expression profiles characterize the time-dynamics of expression of specific genes and are increasingly collected in current gene expression experiments. In the analysis of experiments where gene expression is obtained over the life cycle, it is of interest to relate temporal patterns of gene e...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-9-60
update_date:2008-01-28 00:00:00
abstract:BACKGROUND:PSI-BLAST, an extremely popular tool for sequence similarity search, features the utilization of Position-Specific Scoring Matrix (PSSM) constructed from a multiple sequence alignment (MSA). PSSM allows the detection of more distant homologs than a general amino acid substitution matrix does. An accurate est...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-017-1686-9
update_date:2017-06-02 00:00:00
abstract:BACKGROUND:Several studies demonstrated the feasibility of predicting bacterial antibiotic resistance phenotypes from whole-genome sequences, the prediction process usually amounting to detecting the presence of genes involved in antibiotic resistance mechanisms, or of specific mutations, previously identified from a t...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-018-2403-z
update_date:2018-10-17 00:00:00
abstract:BACKGROUND:Identifying differentially abundant features between different experimental groups is a common goal for many metabolomics and proteomics studies. However, analyzing data from mass spectrometry (MS) is difficult because the data may not be normally distributed and there is often a large fraction of zero value...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-019-3067-z
update_date:2019-10-17 00:00:00
abstract:BACKGROUND:To interpret microarray experiments, several ontological analysis tools have been developed. However, current tools are limited to specific organisms. RESULTS:We developed a bioinformatics system to assign the probe set sequences of any organism to a hierarchical functional classification modelled on KEGG o...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-8-87
update_date:2007-03-12 00:00:00
abstract:BACKGROUND:The rapid pace of bioscience research makes it very challenging to track relevant articles in one's area of interest. MEDLINE, a primary source for biomedical literature, offers access to more than 20 million citations with three-quarters of a million new ones added each year. Thus it is not surprising to se...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-015-0630-0
update_date:2015-06-20 00:00:00
abstract:BACKGROUND:This paper proposes an optimization approach for producing reduced alphabets for peptide classification, using a Genetic Algorithm. The classification task is performed by a multi-classifier system where each classifier (Linear or Radial Basis function Support Vector Machines) is trained using feat...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-9-45
update_date:2008-01-24 00:00:00
abstract:BACKGROUND:Observed levels of gene expression strongly depend on both activity of DNA binding transcription factors (TFs) and chromatin state through different histone modifications (HMs). In order to recover the functional relationship between local chromatin state, TF binding and observed levels of gene expression, r...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-019-3331-2
update_date:2020-01-02 00:00:00
abstract:BACKGROUND:The biological network is highly dynamic. Functional relations between genes can be activated or deactivated depending on the biological conditions. On the genome-scale network, subnetworks that gain or lose local expression consistency may shed light on the regulatory mechanisms related to the changing biol...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-019-3046-4
update_date:2019-12-24 00:00:00
abstract:BACKGROUND:Detection of novel sequence motifs is becoming increasingly essential in computational biology. However, the high computational cost greatly constrains the efficiency of most motif discovery algorithms. RESULTS:In this paper, we accelerate the MEME algorithm, targeting the Intel Many Integrated Core (MIC) Architectur...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-018-2276-1
update_date:2018-08-13 00:00:00
abstract:BACKGROUND:High-throughput screens comparing growth rates of arrays of distinct micro-organism cultures on solid agar are useful, rapid methods of quantifying genetic interactions. Growth rate is an informative phenotype which can be estimated by measuring cell densities at one or more times after inoculation. Precise ...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-11-287
update_date:2010-05-28 00:00:00
abstract:BACKGROUND:The challenge of classifying cortical interneurons is yet to be solved. Data-driven classification into established morphological types may provide insight and practical value. RESULTS:We trained models using 217 high-quality morphologies of rat somatosensory neocortex interneurons reconstructed by a single...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-018-2470-1
update_date:2018-12-17 00:00:00
abstract:BACKGROUND:With the growing availability of entire genome sequences, an increasing number of scientists can exploit oligonucleotide microarrays for genome-scale expression studies. While probe-design is a major research area, relatively little work has been reported on the optimization of microarray protocols. RESULTS...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-12-73
update_date:2011-03-14 00:00:00
abstract:BACKGROUND:The large increase in the size of patent collections has led to the need for efficient search strategies. But the development of advanced text-mining applications dedicated to patents of the biomedical field remains rare, in particular to address the needs of the pharmaceutical & biotech industry, which inten...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-15-S1-S15
update_date:2014-01-01 00:00:00
abstract:BACKGROUND:Most single-stranded RNA (ssRNA) viruses mutate rapidly to generate a large number of strains having highly divergent capsid sequences. Accurate strain recognition in uncharacterized target capsid sequences is essential for epidemiology, diagnostics, and vaccine development. Strain recognition based on similar...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-8-379
update_date:2007-10-10 00:00:00
abstract:BACKGROUND:Phylogenies capture the evolutionary ancestry linking extant species. Correlations and similarities among a set of species are mediated by and need to be understood in terms of the phylogenetic tree. In a similar way it has been argued that biological networks also induce correlations among sets of interacting...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-11-470
update_date:2010-09-20 00:00:00
abstract:BACKGROUND:The recent explosion in biological and other real-world network data has created the need for improved tools for large network analyses. In addition to well established global network properties, several new mathematical techniques for analyzing local structural properties of large networks have been develop...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-9-70
update_date:2008-01-30 00:00:00
abstract:BACKGROUND:Linkage disequilibrium (LD)-the non-random association of alleles at different loci-defines population-specific haplotypes which vary by genomic ancestry. Assessment of allelic frequencies and LD patterns from a variety of ancestral populations enables researchers to better understand population histories as...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-020-3340-1
update_date:2020-01-10 00:00:00
abstract:BACKGROUND:RNA-seq is a powerful tool for measuring transcriptomes, especially for identifying differentially expressed genes or transcripts (DEGs) between sample groups. A number of methods have been developed for this task, and several evaluation studies have also been reported. However, those evaluations so far have...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-015-0794-7
update_date:2015-11-04 00:00:00
abstract:BACKGROUND:Protein structure comparison is a fundamental task in structural biology. While the number of known protein structures has grown rapidly over the last decade, searching a large database of protein structures is still relatively slow using existing methods. There is a need for new techniques which can rapidly...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-11-S1-S46
update_date:2010-01-18 00:00:00
abstract:BACKGROUND:The complexity and dynamics of microbial communities are major factors in the ecology of a system. With the NGS technique, metagenomics data provides a new way to explore microbial interactions. Lotka-Volterra models, which have been widely used to infer animal interactions in dynamic systems, have recently ...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-016-1359-0
update_date:2016-11-25 00:00:00
abstract:BACKGROUND:Ontology term labels can be ambiguous and have multiple senses. While this is no problem for human annotators, it is a challenge to automated methods, which identify ontology terms in text. Classical approaches to word sense disambiguation use co-occurring words or terms. However, most treat ontologies as si...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-10-28
update_date:2009-01-21 00:00:00
abstract:BACKGROUND:Protein solvent accessibility prediction is a pivotal intermediate step towards modeling protein tertiary structures directly from one-dimensional sequences. It also plays an important part in identifying protein folds and domains. Although some methods have been presented to the protein solvent accessibilit...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-015-0851-2
update_date:2016-01-11 00:00:00
abstract:BACKGROUND AND GOAL:The Random Forest (RF) algorithm for regression and classification has considerably gained popularity since its introduction in 2001. Meanwhile, it has grown to a standard classification approach competing with logistic regression in many innovation-friendly scientific fields. RESULTS:In this conte...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-018-2264-5
update_date:2018-07-17 00:00:00
abstract:BACKGROUND:Integrating data from multiple global assays and curated databases is essential to understand the spatio-temporal interactions within cells. Different experiments measure cellular processes at various widths and depths, while databases contain biological information based on established facts or published da...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-9-203
update_date:2008-04-21 00:00:00
abstract:BACKGROUND:The aberrant expression of microRNAs is closely connected to the occurrence and development of many human diseases. To study these diseases, researchers have presented numerous effective computational models. RESULTS:Here, we present a computational framew...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-020-3409-x
update_date:2020-02-18 00:00:00
abstract:BACKGROUND:Because common complex diseases are affected by multiple genes and environmental factors, it is essential to investigate gene-gene and/or gene-environment interactions to understand genetic architecture of complex diseases. After the great success of large scale genome-wide association (GWA) studies using th...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-13-S9-S5
update_date:2012-06-11 00:00:00