Abstract:
BACKGROUND:Computational discovery of transcription factor binding sites (TFBS) is a challenging but important problem of bioinformatics. In this study, improvement of a Gibbs sampling based technique for TFBS discovery is attempted through an approach that is widely known, but which has never been investigated before: reduction of the effect of local optima. RESULTS:To alleviate the vulnerability of Gibbs sampling to local optima trapping, we propose to combine a thermodynamic method, called simulated tempering, with Gibbs sampling. The resultant algorithm, GibbsST, is then validated using synthetic data and actual promoter sequences extracted from Saccharomyces cerevisiae. It is noteworthy that the marked improvement of the efficiency presented in this paper is attributable solely to the improvement of the search method. CONCLUSION:Simulated tempering is a powerful solution for local optima problems found in pattern discovery. Extended application of simulated tempering for various bioinformatic problems is promising as a robust solution against local optima problems.
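The idea of combining a sampler with simulated tempering can be illustrated with a minimal sketch. This is not the GibbsST implementation: a toy one-dimensional energy landscape stands in for the motif score, a simple Metropolis move stands in for the Gibbs sampling step, and the names `simulated_tempering` and `LANDSCAPE`, the temperature ladder, and the step count are all assumptions chosen for illustration. The temperature itself is treated as a random variable, so the chain can heat up to leave a local optimum and cool down again to refine a solution.

```python
import math
import random

# Hypothetical toy landscape on a ring of states:
# a local minimum at index 2 (energy 1) and the global minimum at index 11 (energy 0).
LANDSCAPE = [3, 2, 1, 2, 3, 4, 5, 4, 3, 2, 1, 0, 1, 2, 3, 4, 5, 4]


def simulated_tempering(energy, n_states, betas, n_steps, seed=0):
    """Toy simulated tempering over states 0..n_states-1 arranged in a ring.

    betas is a ladder of inverse temperatures. Returns the lowest-energy
    state visited during the run.
    """
    rng = random.Random(seed)
    # Log-weights g_k = -log Z_k so that every temperature level is visited
    # about equally often. Z_k is computed exactly here only because the toy
    # state space is tiny; a real application would have to estimate it.
    log_w = [
        -math.log(sum(math.exp(-b * energy(s)) for s in range(n_states)))
        for b in betas
    ]
    x = rng.randrange(n_states)  # current state
    k = 0                        # current temperature level
    best = x
    for _ in range(n_steps):
        # State move: Metropolis step at the current inverse temperature.
        y = (x + rng.choice((-1, 1))) % n_states
        if rng.random() < math.exp(min(0.0, -betas[k] * (energy(y) - energy(x)))):
            x = y
        # Temperature move: propose an adjacent level, accept by Metropolis
        # ratio of the joint target exp(-beta_k * E(x) + g_k).
        j = k + rng.choice((-1, 1))
        if 0 <= j < len(betas):
            log_acc = -(betas[j] - betas[k]) * energy(x) + log_w[j] - log_w[k]
            if rng.random() < math.exp(min(0.0, log_acc)):
                k = j
        if energy(x) < energy(best):
            best = x
    return best


best_state = simulated_tempering(
    lambda s: LANDSCAPE[s], len(LANDSCAPE), betas=[0.1, 0.5, 2.0], n_steps=20000
)
```

At the cold level (large beta) the chain behaves like a greedy local search and would tend to stay in whichever well it started in; the occasional excursions to the hot level let it cross the barrier between the two wells.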
journal_name: BMC Bioinformatics
journal_title: BMC bioinformatics
authors: Shida K
doi: 10.1186/1471-2105-7-486
subject: Has Abstract
pub_date: 2006-11-04 00:00:00
pages: 486
issn: 1471-2105
pii: 1471-2105-7-486
journal_volume: 7
pub_type: Journal Article
abstract:BACKGROUND:Reverse engineering of transcriptional regulatory networks (TRN) from genomics data has always represented a computational challenge in System Biology. The major issue is modeling the complex crosstalk among transcription factors (TFs) and their target genes, with a method able to handle both the high number...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-020-3510-1
update_date:2020-05-29 00:00:00
abstract:BACKGROUND:Temporal gene expression profiles characterize the time-dynamics of expression of specific genes and are increasingly collected in current gene expression experiments. In the analysis of experiments where gene expression is obtained over the life cycle, it is of interest to relate temporal patterns of gene e...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-9-60
update_date:2008-01-28 00:00:00
abstract:BACKGROUND:Advances in sequencing and genotyping technologies are leading to the widespread availability of multi-species variation data, dense genotype data and large-scale resequencing projects. The 1000 Genomes Project and similar efforts in other species are challenging the methods previously used for storage and m...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-11-238
update_date:2010-05-11 00:00:00
abstract:BACKGROUND:The ability to detect nuclei in embryos is essential for studying the development of multicellular organisms. A system of automated nuclear detection has already been tested on a set of four-dimensional (4D) Nomarski differential interference contrast (DIC) microscope images of Caenorhabditis elegans embryos...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-6-125
update_date:2005-05-24 00:00:00
abstract:BACKGROUND:Recent technological advances in DNA sequencing and genotyping have led to the accumulation of a remarkable quantity of data on genetic polymorphisms. However, the development of new statistical and computational tools for effective processing of these data has not been equally as fast. In particular, Machin...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-10-S6-S24
update_date:2009-06-16 00:00:00
abstract:BACKGROUND:Cryo-electron microscopy (Cryo-EM) is widely used in the determination of the three-dimensional (3D) structures of macromolecules. Particle picking from 2D micrographs remains a challenging early step in the Cryo-EM pipeline due to the diversity of particle shapes and the extremely low signal-to-noise ratio ...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-020-03809-7
update_date:2020-11-09 00:00:00
abstract:BACKGROUND:A common feature of microarray experiments is the occurrence of missing gene expression data. These missing values occur for a variety of reasons, in particular, because of the filtering of poor quality spots and the removal of undefined values when a logarithmic transformation is applied to negative backgro...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-6-234
update_date:2005-09-26 00:00:00
abstract:BACKGROUND:The rate of protein structures being deposited in the Protein Data Bank surpasses the capacity to experimentally characterise them and therefore computational methods to analyse these structures have become increasingly important. Identifying the region of the protein most likely to be involved in function i...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-10-379
update_date:2009-11-18 00:00:00
abstract:BACKGROUND:Proton Magnetic Resonance (MR) Spectroscopy (MRS) is a widely available technique for those clinical centres equipped with MR scanners. Unlike the rest of MR-based techniques, MRS yields not images but spectra of metabolites in the tissues. In pathological situations, the MRS profile changes and this has bee...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-11-581
update_date:2010-11-29 00:00:00
abstract:BACKGROUND:Estimation of individual ancestry from genetic data is useful for the analysis of disease association studies, understanding human population history and interpreting personal genomic variation. New, computationally efficient methods are needed for ancestry inference that can effectively utilize existing inf...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-014-0418-7
update_date:2015-01-16 00:00:00
abstract:BACKGROUND:The explosive growth of biological data provides opportunities for new statistical and comparative analyses of large information sets, such as alignments comprising tens of thousands of sequences. In such studies, sequence annotations frequently play an essential role, and reliable results depend on metadata...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-9-S1-S7
update_date:2008-01-01 00:00:00
abstract:BACKGROUND:Protein sequence alignment analyses have become a crucial step for many bioinformatics studies during the past decades. Multiple sequence alignment (MSA) and pair-wise sequence alignment (PSA) are two major approaches in sequence alignment. Former benchmark studies revealed drawbacks of MSA methods on nucleo...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-018-2524-4
update_date:2018-12-31 00:00:00
abstract:BACKGROUND:Global maritime trade plays an important role in the modern transportation industry. It brings significant economic profit along with bioinvasion risk. Species translocate and establish in a non-native area through ballast water and biofouling. Aiming at aquatic bioinvasion issue, people proposed various sug...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-018-2272-5
update_date:2018-08-13 00:00:00
abstract:BACKGROUND:Comparative genomics has become an essential approach for identifying homologous gene candidates and their functions, and for studying genome evolution. There are many tools available for genome comparisons. Unfortunately, most of them are not applicable for the identification of unique genes and the inferen...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-7-S4-S18
update_date:2006-12-12 00:00:00
abstract:BACKGROUND:Because loops connect regular secondary structures, analysis of the former depends directly on the definition of the latter. The numerous assignment methods, however, can offer different definitions. In a previous study, we defined a structural alphabet composed of 16 average protein fragments, which we call...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-5-58
update_date:2004-05-12 00:00:00
abstract:BACKGROUND:When constructing new biomarker or gene signature scores for time-to-event outcomes, the underlying aims are to develop a discrimination model that helps to predict whether patients have a poor or good prognosis and to identify the most influential variables for this task. In practice, this is often done fit...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-016-1149-8
update_date:2016-07-22 00:00:00
abstract:BACKGROUND:Alternative splicing is the critical process in a single gene coding, which removes introns and joins exons, and splicing branchpoints are indicators for the alternative splicing. Wet experiments have identified a great number of human splicing branchpoints, but many branchpoints are still unknown. In order ...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-017-1875-6
update_date:2017-12-01 00:00:00
abstract:BACKGROUND:Identifying the interactions between proteins and long non-coding RNAs (lncRNAs) is of great importance to decipher the functional mechanisms of lncRNAs. However, current experimental techniques for detection of lncRNA-protein interactions are limited and inefficient. Many methods have been proposed to predi...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-018-2390-0
update_date:2018-10-11 00:00:00
abstract:BACKGROUND:Biocatalysis in organic solvents is nowadays a common practice with a large potential in Biotechnology. Several studies report that proteins which are co-crystallized or soaked in organic solvents preserve their fold integrity showing almost identical arrangements when compared to their aqueous forms. Howeve...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-018-2044-2
update_date:2018-01-30 00:00:00
abstract:BACKGROUND:The availability of multiple, essentially complete genome sequences of prokaryotes and eukaryotes spurred both the demand and the opportunity for the construction of an evolutionary classification of genes from these genomes. Such a classification system based on orthologous relationships between genes appea...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-4-41
update_date:2003-09-11 00:00:00
abstract:BACKGROUND:The application of high-throughput sequencing in a broad range of quantitative genomic assays (e.g., DNA-seq, ChIP-seq) has created a high demand for the analysis of large-scale read-count data. Typically, the genome is divided into tiling windows and windowed read-count data is generated for the entire geno...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-018-2077-6
update_date:2018-03-01 00:00:00
abstract:BACKGROUND:Bioluminescent proteins (BLPs) widely exist in many living organisms. As BLPs are featured by the capability of emitting lights, they can be served as biomarkers and easily detected in biomedical research, such as gene expression analysis and signal transduction pathways. Therefore, accurate identification o...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-017-1709-6
update_date:2017-06-05 00:00:00
abstract:BACKGROUND:High throughput experiments have generated a significantly large amount of protein interaction data, which is being used to study protein networks. Studying complete protein networks can reveal more insight about healthy/disease states than studying proteins in isolation. Similarly, a comparative study of pr...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-020-03827-5
update_date:2020-11-04 00:00:00
abstract:BACKGROUND:Nearly all cellular processes involve proteins structurally rearranging to accommodate molecular partners. The energy landscape underscores the inherent nature of proteins as dynamic molecules interconverting between structures with varying energies. In principle, reconstructing a protein's energy landscape ...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-019-2822-5
update_date:2019-06-06 00:00:00
abstract:BACKGROUND:Protein function prediction is an important problem in the post-genomic era. Recent advances in experimental biology have enabled the production of vast amounts of protein-protein interaction (PPI) data. Thus, using PPI data to functionally annotate proteins has been extensively studied. However, most existi...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-14-S12-S4
update_date:2013-01-01 00:00:00
abstract:BACKGROUND:Parsimony and maximum likelihood methods of phylogenetic tree estimation and parsimony methods for genome rearrangements are central to the study of genome evolution yet to date they have largely been pursued in isolation. RESULTS:We present a data structure called a history graph that offers a practical ba...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-15-206
update_date:2014-06-19 00:00:00
abstract:BACKGROUND:Biological data that are well-organized by an ontology, such as Gene Ontology, enables high-throughput availability of the semantic web. It can also be used to facilitate high throughput classification of biomedical information. However, to our knowledge, no evaluation has been published on automating classi...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-8-S3-S7
update_date:2007-05-09 00:00:00
abstract:BACKGROUND:A cross-correlation (XCorr) score function is one of the most popular score functions utilized to search peptide identifications in databases, and many computer programs, such as SEQUEST, Comet, and Tide, currently use this score function. Recently, the HiXCorr algorithm was developed to speed up this score ...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-018-2559-6
update_date:2018-12-12 00:00:00
abstract:BACKGROUND:In recent years, protein-protein interaction (PPI) networks have been well recognized as important resources to elucidate various biological processes and cellular mechanisms. In this paper, we address the problem of predicting protein complexes from a PPI network. This problem has two difficulties. One is r...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-017-1920-5
update_date:2017-12-06 00:00:00
abstract:BACKGROUND:Recently, mass spectrometry data have been mined using a genetic algorithm to produce discriminatory models that distinguish healthy individuals from those with cancer. This algorithm is the basis for claims of 100% sensitivity and specificity in two related publicly available datasets. To date, no detailed ...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-5-180
update_date:2004-11-19 00:00:00