Abstract:
BACKGROUND:Traditional algorithms for hidden Markov model decoding seek to maximize either the probability of a state path or the number of positions of a sequence assigned to the correct state. These algorithms provide only a single answer and in practice do not produce good results. RESULTS:We explore an alternative approach, where we efficiently compute the k paths of highest probability to explain a sequence and then either use those paths to explore alternative explanations for a sequence or to combine them into a single explanation. Our procedure uses an online pruning technique to reduce usage of primary memory. CONCLUSION:Out algorithm uses much less memory than naive approach. For membrane proteins, even simple path combination algorithms give good explanations, and if we look at the paths we are combining, we can give a sense of confidence in the explanation as well. For proteins with two topologies, the k best paths can give insight into both correct explanations of a sequence, a feature lacking from traditional algorithms in this domain.
journal_name
BMC Bioinformaticsjournal_title
BMC bioinformaticsauthors
Brown DG,Golod Ddoi
10.1186/1471-2105-11-S1-S28subject
Has Abstractpub_date
2010-01-18 00:00:00pages
S28issn
1471-2105pii
1471-2105-11-S1-S28journal_volume
11 Suppl 1pub_type
杂志文章abstract:BACKGROUND:Regulation of gene expression, protein synthesis, replication and assembly of many viruses involve RNA-protein interactions. Although some successful computational tools have been reported to recognize RNA binding sites in proteins, the problem of specificity remains poorly investigated. After the nucleotide...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-12-S13-S5
更新日期:2011-01-01 00:00:00
abstract:BACKGROUND:Most known eukaryotic genomes contain mobile copied elements called transposable elements. In some species, these elements account for the majority of the genome sequence. They have been subject to many mutations and other genomic events (copies, deletions, captures) during transposition. The identification ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-11-474
更新日期:2010-09-22 00:00:00
abstract:BACKGROUND:Frequent pattern mining analysis applied on microarray dataset appears to be a promising strategy for identifying relationships between gene expression levels. Unfortunately, too many itemsets (co-expressed genes) are identified by this analysis method since it does not consider the importance of each gene w...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-14-230
更新日期:2013-07-21 00:00:00
abstract:BACKGROUND:In microarray experiments the numbers of replicates are often limited due to factors such as cost, availability of sample or poor hybridization. There are currently few choices for the analysis of a pair of microarrays where N = 1 in each condition. In this paper, we demonstrate the effectiveness of a new al...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-9-489
更新日期:2008-11-21 00:00:00
abstract:BACKGROUND:Multiple co-inertia analysis (mCIA) is a multivariate analysis method that can assess relationships and trends in multiple datasets. Recently it has been used for integrative analysis of multiple high-dimensional -omics datasets. However, its estimated loading vectors are non-sparse, which presents challenge...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-020-3455-4
更新日期:2020-04-15 00:00:00
abstract:BACKGROUND:Patient records contain valuable information regarding explanation of diagnosis, progression of disease, prescription and/or effectiveness of treatment, and more. Automatic recognition of clinically important concepts and the identification of relationships between those concepts in patient records are preli...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-12-S3-S3
更新日期:2011-06-09 00:00:00
abstract:BACKGROUND:Integration of heterogeneous data types is a challenging problem, especially in biology, where the number of databases and data types increase rapidly. Amongst the problems that one has to face are integrity, consistency, redundancy, connectivity, expressiveness and updatability. DESCRIPTION:Here we present...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-7-70
更新日期:2006-02-15 00:00:00
abstract:BACKGROUND:Mass spectrometry (MS) coupled with online separation methods is commonly applied for differential and quantitative profiling of biological samples in metabolomic as well as proteomic research. Such approaches are used for systems biology, functional genomics, and biomarker discovery, among others. An ongoin...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-11-395
更新日期:2010-07-23 00:00:00
abstract:BACKGROUND:DNA methylation changes are associated with a wide array of biological processes. Bisulfite conversion of DNA followed by high-throughput sequencing is increasingly being used to assess genome-wide methylation at single-base resolution. The relative slowness of most commonly used aligners for processing such...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-15-337
更新日期:2014-10-18 00:00:00
abstract:BACKGROUND:High-throughput experiments, such as with DNA microarrays, typically result in hundreds of genes potentially relevant to the process under study, rendering the interpretation of these experiments problematic. Here, we propose and evaluate an approach to find functional associations between large numbers of g...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-8-14
更新日期:2007-01-18 00:00:00
abstract:BACKGROUND:In the past decade, bioinformatics tools have matured enough to reliably perform sophisticated primary data analysis on Next Generation Sequencing (NGS) data, such as mapping, assemblies and variant calling, however, there is still a dire need for improvements in the higher level analysis such as NGS data or...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-14-167
更新日期:2013-05-28 00:00:00
abstract:BACKGROUND:Protein-protein docking is a valuable computational approach for investigating protein-protein interactions. Shape complementarity is the most basic component of a scoring function and plays an important role in protein-protein docking. Despite significant progresses, shape representation remains an open que...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-019-3270-y
更新日期:2019-12-24 00:00:00
abstract:BACKGROUND:MicroRNAs (miRNAs) are a kind of small noncoding RNA molecules that are direct posttranscriptional regulations of mRNA targets. Studies have indicated that miRNAs play key roles in complex diseases by taking part in many biological processes, such as cell growth, cell death and so on. Therefore, in order to ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-020-3426-9
更新日期:2020-03-18 00:00:00
abstract:BACKGROUND:The binding of regulatory proteins to their specific DNA targets determines the accurate expression of the neighboring genes. The in silico prediction of new binding sites in completely sequenced genomes is a key aspect in the deeper understanding of gene regulatory networks. Several algorithms have been des...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-11-554
更新日期:2010-11-11 00:00:00
abstract:BACKGROUND:Phylogenetic trees are an important tool to study the evolutionary relationships among organisms. The huge amount of available taxa poses difficulties in their interactive visualization. This hampers the interaction with the users to provide feedback for the further improvement of the taxonomic framework. R...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-017-1841-3
更新日期:2017-09-30 00:00:00
abstract:BACKGROUND:Pattern matching is the core of bioinformatics; it is used in database searching, restriction enzyme mapping, and finding open reading frames. It is done repeatedly over increasingly long sequences, thus codes must be efficient and insensitive to sequence length. Such patterns of interest include simple moti...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-3-8
更新日期:2002-01-01 00:00:00
abstract:BACKGROUND:Many cancer genomes are extensively rearranged with highly aberrant chromosomal karyotypes. Structural and copy number variations in cancer genomes can be determined via abnormal mapping of sequenced reads to the reference genome. Recently it became possible to reconcile both of these types of large-scale va...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-019-3208-4
更新日期:2019-12-17 00:00:00
abstract:BACKGROUND:As whole-genome sequencing is becoming a routine technique, it is important to identify a cost-effective depth of sequencing for such studies. However, the relationship between sequencing depth and biological results from the aspects of whole-genome coverage, variant discovery power and the quality of varian...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-019-3164-z
更新日期:2019-11-08 00:00:00
abstract:BACKGROUND:In recent years, protein-protein interaction (PPI) networks have been well recognized as important resources to elucidate various biological processes and cellular mechanisms. In this paper, we address the problem of predicting protein complexes from a PPI network. This problem has two difficulties. One is r...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-017-1920-5
更新日期:2017-12-06 00:00:00
abstract:BACKGROUND:We carried out an analysis of intron length conservation across a diverse group of nineteen mammalian species. Motivated by recent research suggesting a role for time delays associated with intron transcription in gene expression oscillations required for early embryonic patterning, we searched for examples ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-12-S9-S16
更新日期:2011-10-05 00:00:00
abstract:BACKGROUND:Similarity inference, one of the main bioinformatics tasks, has to face an exponential growth of the biological data. A classical approach used to cope with this data flow involves heuristics with large seed indexes. In order to speed up this technique, the index can be enhanced by storing additional informa...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-9-534
更新日期:2008-12-16 00:00:00
abstract:BACKGROUND:We previously introduced PCPS (Proteasome Cleavage Prediction Server), a web-based tool to predict proteasome cleavage sites using n-grams. Here, we evaluated the ability of PCPS immunoproteasome cleavage model to discriminate CD8+ T cell epitopes. RESULTS:We first assembled an epitope dataset consisting of...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-020-03782-1
更新日期:2020-12-14 00:00:00
abstract:BACKGROUND:Detection of periodically expressed genes from microarray data without use of known periodic and non-periodic training examples is an important problem, e.g. for identifying genes regulated by the cell-cycle in poorly characterised organisms. Commonly the investigator is only interested in genes expressed at...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-7-63
更新日期:2006-02-09 00:00:00
abstract:BACKGROUND:Knowledge of subcellular localization of proteins is crucial to proteomics, drug target discovery and systems biology since localization and biological function are highly correlated. In recent years, numerous computational prediction methods have been developed. Nevertheless, there is still a need for predi...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-10-274
更新日期:2009-09-01 00:00:00
abstract:BACKGROUND:Microarray co-expression signatures are an important tool for studying gene function and relations between genes. In addition to genuine biological co-expression, correlated signals can result from technical deficiencies like hybridization of reporters with off-target transcripts. An approach that is able to...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-8-461
更新日期:2007-11-26 00:00:00
abstract:BACKGROUND:Recent studies have shown that copy number variations (CNVs) are frequent in higher eukaryotes and associated with a substantial portion of inherited and acquired risk for various human diseases. The increasing availability of high-resolution genome surveillance platforms provides opportunity for rapidly ass...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-11-74
更新日期:2010-02-04 00:00:00
abstract:BACKGROUND:Lung cancer is the leading cause of the largest number of deaths worldwide and lung adenocarcinoma is the most common form of lung cancer. In order to understand the molecular basis of lung adenocarcinoma, integrative analysis have been performed by using genomics, transcriptomics, epigenomics, proteomics an...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-020-03691-3
更新日期:2020-09-30 00:00:00
abstract:BACKGROUND:We investigate the relationships between the EC (Enzyme Commission) class, the associated chemical reaction, and the reaction mechanism by building predictive models using Support Vector Machine (SVM), Random Forest (RF) and k-Nearest Neighbours (kNN). We consider two ways of encoding the reaction mechanism ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-13-60
更新日期:2012-04-24 00:00:00
abstract:BACKGROUND:Multiple data-analytic methods have been proposed for evaluating gene-expression levels in specific biological pathways, assessing differential expression associated with a binary phenotype. Following Goeman and Bühlmann's recent review, we compared statistical performance of three methods, namely Global Tes...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-8-431
更新日期:2007-11-07 00:00:00
abstract:BACKGROUND:Protein sequence alignment analyses have become a crucial step for many bioinformatics studies during the past decades. Multiple sequence alignment (MSA) and pair-wise sequence alignment (PSA) are two major approaches in sequence alignment. Former benchmark studies revealed drawbacks of MSA methods on nucleo...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-018-2524-4
更新日期:2018-12-31 00:00:00