Abstract:
BACKGROUND:In recent years, protein-protein interaction (PPI) networks have been well recognized as important resources to elucidate various biological processes and cellular mechanisms. In this paper, we address the problem of predicting protein complexes from a PPI network. This problem has two difficulties. One is related to small complexes, which contains two or three components. It is relatively difficult to identify them due to their simpler internal structure, but unfortunately complexes of such sizes are dominant in major protein complex databases, such as CYC2008. Another difficulty is how to model overlaps between predicted complexes, that is, how to evaluate different predicted complexes sharing common proteins because CYC2008 and other databases include such protein complexes. Thus, it is critical how to model overlaps between predicted complexes to identify them simultaneously. RESULTS:In this paper, we propose a sampling-based protein complex prediction method, RocSampler (Regularizing Overlapping Complexes), which exploits, as part of the whole scoring function, a regularization term for the overlaps of predicted complexes and that for the distribution of sizes of predicted complexes. We have implemented RocSampler in MATLAB and its executable file for Windows is available at the site, http://imi.kyushu-u.ac.jp/~om/software/RocSampler/ . CONCLUSIONS:We have applied RocSampler to five yeast PPI networks and shown that it is superior to other existing methods. This implies that the design of scoring functions including regularization terms is an effective approach for protein complex prediction.
journal_name
BMC Bioinformaticsjournal_title
BMC bioinformaticsauthors
Maruyama O,Kuwahara Ydoi
10.1186/s12859-017-1920-5subject
Has Abstractpub_date
2017-12-06 00:00:00pages
491issue
Suppl 15issn
1471-2105pii
10.1186/s12859-017-1920-5journal_volume
18pub_type
杂志文章abstract:BACKGROUND:Recently copy number variation (CNV) has gained considerable interest as a type of genomic/genetic variation that plays an important role in disease susceptibility. Advances in sequencing technology have created an opportunity for detecting CNVs more accurately. Recently whole exome sequencing (WES) has beco...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-017-1705-x
更新日期:2017-05-31 00:00:00
abstract:BACKGROUND:A cross-correlation (XCorr) score function is one of the most popular score functions utilized to search peptide identifications in databases, and many computer programs, such as SEQUEST, Comet, and Tide, currently use this score function. Recently, the HiXCorr algorithm was developed to speed up this score ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-018-2559-6
更新日期:2018-12-12 00:00:00
abstract:BACKGROUND:Gene duplication events have played a significant role in genome evolution, particularly in plants. Exhaustive searches for all members of a known gene family as well as the identification of new gene families has become increasingly important. Subfunctionalization via changes in regulatory sequences followi...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-7-S2-S19
更新日期:2006-09-06 00:00:00
abstract:BACKGROUND:Structural interaction frequency matrices between all genome loci are now experimentally achievable thanks to high-throughput chromosome conformation capture technologies. This ensues a new methodological challenge for computational biology which consists in objectively extracting from these data the structu...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-017-1616-x
更新日期:2017-04-11 00:00:00
abstract:BACKGROUND:Biomedical processes can provide essential information about the (mal-) functioning of an organism and are thus frequently represented in biomedical terminologies and ontologies, including the GO Biological Process branch. These processes often need to be described and categorised in terms of their attribute...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-13-217
更新日期:2012-08-28 00:00:00
abstract:BACKGROUND:Most gene finders score candidate gene models with state-based methods, typically HMMs, by combining local properties (coding potential, splice donor and acceptor patterns, etc). Competing models with similar state-based scores may be distinguishable with additional information. In particular, functional and...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-9-433
更新日期:2008-10-14 00:00:00
abstract:BACKGROUND:An important step in understanding the conditions that specify gene expression is the recognition of gene regulatory elements. Due to high diversity of different types of transcription factors and their DNA binding preferences, it is a challenging problem to establish an accurate model for recognition of fun...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-7-S4-S27
更新日期:2006-12-12 00:00:00
abstract:BACKGROUND:While technological advances have made it possible to profile the immune system at high resolution, translating high-throughput data into knowledge of immune mechanisms has been challenged by the complexity of the interactions underlying immune processes. Tools to explore the immune network are critical for ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-020-03702-3
更新日期:2020-08-10 00:00:00
abstract:BACKGROUND:The lack of sufficient training data is the limiting factor for many Machine Learning applications in Computational Biology. If data is available for several different but related problem domains, Multitask Learning algorithms can be used to learn a model based on all available information. In Bioinformatics...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-11-S8-S5
更新日期:2010-10-26 00:00:00
abstract:BACKGROUND:Guanine protein-coupled receptors (GPCRs) constitute a eukaryotic transmembrane protein family and function as "molecular switches" in the second messenger cascades and are found in all organisms between yeast and humans. They form the single, biggest drug-target family due to their versatility of action and...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-12-S1-S3
更新日期:2011-02-15 00:00:00
abstract:BACKGROUND:Enhancers are stretches of DNA (100-1000 bp) that play a major role in development gene expression, evolution and disease. It has been recently shown that in high-level eukaryotes enhancers rarely work alone, instead they collaborate by forming clusters of cis-regulatory modules (CRMs). Although the binding ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-016-0980-2
更新日期:2016-03-18 00:00:00
abstract:BACKGROUND:The development of high-throughput sequencing and analysis has accelerated multi-omics studies of thousands of microbial species, metagenomes, and infectious disease pathogens. Omics studies are enabling genotype-phenotype association studies which identify genetic determinants of pathogen virulence and drug...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-018-2580-9
更新日期:2019-01-07 00:00:00
abstract:BACKGROUND:With continuing identification of novel structured noncoding RNAs, there is an increasing need to create schematic diagrams showing the consensus features of these molecules. RNA structural diagrams are typically made either with general-purpose drawing programs like Adobe Illustrator, or with automated or i...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-12-3
更新日期:2011-01-04 00:00:00
abstract:BACKGROUND:Large-scale genomic studies based on transcriptome technologies provide clusters of genes that need to be functionally annotated. The Gene Ontology (GO) implements a controlled vocabulary organised into three hierarchies: cellular components, molecular functions and biological processes. This terminology all...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-7-241
更新日期:2006-05-04 00:00:00
abstract:BACKGROUND:Metabolomics is one of most recent omics technologies. It has been applied on fields such as food science, nutrition, drug discovery and systems biology. For this, gas chromatography-mass spectrometry (GC-MS) has been largely applied and many computational tools have been developed to support the analysis of...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-014-0374-2
更新日期:2014-12-10 00:00:00
abstract:BACKGROUND:Collections of transcription factor binding profiles (Transfac, Jaspar) are essential to identify regulatory elements in DNA sequences. Subsets of highly similar profiles complicate large scale analysis of transcription factor binding sites. RESULTS:We propose to identify and group similar profiles using tw...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-6-237
更新日期:2005-09-28 00:00:00
abstract::We provide a 2007 update on the bioinformatics research in the Asia-Pacific from the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation set up in 1998. From 2002, APBioNet has organized the first International Conference on Bioinformatics (InCoB) bringing together scientists work...
journal_title:BMC bioinformatics
pub_type:
doi:10.1186/1471-2105-9-S1-S1
更新日期:2008-01-01 00:00:00
abstract:BACKGROUND:Phosphorylated histone H2AX, also known as γH2AX, forms μm-sized nuclear foci at the sites of DNA double-strand breaks (DSBs) induced by ionizing radiation and other agents. Due to their specificity and sensitivity, γH2AX immunoassays have become the gold standard for studying DSB induction and repair. One o...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-020-3370-8
更新日期:2020-01-28 00:00:00
abstract:BACKGROUND:Next Generation Sequencing techniques are producing enormous amounts of biological sequence data and analysis becomes a major computational problem. Currently, most analysis, especially the identification of conserved regions, relies heavily on Multiple Sequence Alignment and its various heuristics such as p...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-14-S11-S2
更新日期:2013-01-01 00:00:00
abstract:BACKGROUND:Similarity inference, one of the main bioinformatics tasks, has to face an exponential growth of the biological data. A classical approach used to cope with this data flow involves heuristics with large seed indexes. In order to speed up this technique, the index can be enhanced by storing additional informa...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-9-534
更新日期:2008-12-16 00:00:00
abstract:BACKGROUND:Lung cancer is the leading cause of the largest number of deaths worldwide and lung adenocarcinoma is the most common form of lung cancer. In order to understand the molecular basis of lung adenocarcinoma, integrative analysis have been performed by using genomics, transcriptomics, epigenomics, proteomics an...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-020-03691-3
更新日期:2020-09-30 00:00:00
abstract:BACKGROUND:We present a model for tagging gene and protein mentions from text using the probabilistic sequence tagging framework of conditional random fields (CRFs). Conditional random fields model the probability P(t/o) of a tag sequence given an observation sequence directly, and have previously been employed success...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-6-S1-S6
更新日期:2005-01-01 00:00:00
abstract:BACKGROUND:A common method for presenting and studying biological interaction networks is visualization. Software tools can enhance our ability to explore network visualizations and improve our understanding of biological systems, particularly when these tools offer analysis capabilities. However, most published networ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-10-95
更新日期:2009-03-26 00:00:00
abstract:BACKGROUND:Stable isotope tracing can follow individual atoms through metabolic transformations through the detection of the incorporation of stable isotope within metabolites. This resulting data can be interpreted in terms related to metabolic flux. However, detection of a stable isotope in metabolites by mass spectr...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-019-3096-7
更新日期:2019-10-28 00:00:00
abstract::We developed a new computational technique called Step-Level Differential Response (SLDR) to identify genetic regulatory relationships. Our technique takes advantages of functional genomics data for the same species under different perturbation conditions, therefore complementary to current popular computational techn...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-15-S11-S1
更新日期:2014-01-01 00:00:00
abstract:BACKGROUND:Discovering patterns from gene expression levels is regarded as a classification problem when tissue classes of the samples are given and solved as a discrete-data problem by discretizing the expression levels of each gene into intervals maximizing the interdependence between that gene and the class labels. ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-12-S5-S5
更新日期:2011-01-01 00:00:00
abstract:BACKGROUND:MicroRNAs (miRNAs) are small ~22 nucleotide non-coding RNAs that function as post-transcriptional regulators of messenger RNA (mRNA) through base-pairing to 6-8 nucleotide long target sites, usually located within the mRNA 3' untranslated region. A common approach to validate and probe microRNA-mRNA interact...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-016-1057-y
更新日期:2016-04-27 00:00:00
abstract:BACKGROUND:The improvements of high throughput technologies have produced large amounts of multi-omics experiments datasets. Initial analysis of these data has revealed many concurrent gene alterations within single dataset or/and among multiple omics datasets. Although powerful bioinformatics pipelines have been devel...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-019-3171-0
更新日期:2019-11-08 00:00:00
abstract:BACKGROUND:Antibacterial peptides are one of the effecter molecules of innate immune system. Over the last few decades several antibacterial peptides have successfully approved as drug by FDA, which has prompted an interest in these antibacterial peptides. In our recent study we analyzed 999 antibacterial peptides, whi...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-11-S1-S19
更新日期:2010-01-18 00:00:00
abstract:BACKGROUND:Drug resistance testing is mandatory in antiretroviral therapy in human immunodeficiency virus (HIV) infected patients for successful treatment. The emergence of resistances against antiretroviral agents remains the major obstacle in inhibition of viral replication and thus to control infection. Due to the h...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-016-1179-2
更新日期:2016-08-22 00:00:00