Abstract:
BACKGROUND:Observed levels of gene expression strongly depend on both activity of DNA binding transcription factors (TFs) and chromatin state through different histone modifications (HMs). In order to recover the functional relationship between local chromatin state, TF binding and observed levels of gene expression, regression methods have proven to be useful tools. They have been successfully applied to predict mRNA levels from genome-wide experimental data and they provide insight into context-dependent gene regulatory mechanisms. However, heterogeneity arising from gene-set specific regulatory interactions is often overlooked. RESULTS:We show that regression models that predict gene expression by using experimentally derived ChIP-seq profiles of TFs can be significantly improved by mixture modelling. In order to find biologically relevant gene clusters, we employ a Bayesian allocation procedure which allows us to integrate additional biological information such as three-dimensional nuclear organization of chromosomes and gene function. The data integration procedure involves transforming the additional data into gene similarity values. We propose a generic similarity measure that is especially suitable for situations where the additional data are of both continuous and discrete type, and compare its performance with similar measures in the context of mixture modelling. CONCLUSIONS:We applied the proposed method on a data from mouse embryonic stem cells (ESC). We find that including additional data results in mixture components that exhibit biologically meaningful gene clusters, and provides valuable insight into the heterogeneity of the regulatory interactions.
journal_name
BMC Bioinformaticsjournal_title
BMC bioinformaticsauthors
Aflakparast M,Geeven G,de Gunst MCMdoi
10.1186/s12859-019-3331-2subject
Has Abstractpub_date
2020-01-02 00:00:00pages
3issue
1issn
1471-2105pii
10.1186/s12859-019-3331-2journal_volume
21pub_type
杂志文章abstract:BACKGROUND:Many proteins contain conserved sequence patterns (motifs) that contribute to their functionality. The process of experimentally identifying and validating novel protein motifs can be difficult, expensive, and time consuming. A means for helping to identify in advance the possible function of a novel motif i...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-12-379
更新日期:2011-09-26 00:00:00
abstract:BACKGROUND:During evolution, large-scale genome rearrangements of chromosomes shuffle the order of homologous genome sequences ("synteny blocks") across species. Some years ago, a controversy erupted in genome rearrangement studies over whether rearrangements recur, causing breakpoints to be reused. METHODS:We investi...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-12-S9-S1
更新日期:2011-10-05 00:00:00
abstract:BACKGROUND:Chemical cross-linking is used for protein-protein contacts mapping and for structural analysis. One of the difficulties in cross-linking studies is the analysis of mass-spectrometry data and the assignment of the site of cross-link incorporation. The difficulties are due to higher charges of fragment ions, ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-15-S11-S16
更新日期:2014-01-01 00:00:00
abstract:BACKGROUND:Reverse engineering of transcriptional regulatory networks (TRN) from genomics data has always represented a computational challenge in System Biology. The major issue is modeling the complex crosstalk among transcription factors (TFs) and their target genes, with a method able to handle both the high number...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-020-3510-1
更新日期:2020-05-29 00:00:00
abstract:BACKGROUND:The improvements of high throughput technologies have produced large amounts of multi-omics experiments datasets. Initial analysis of these data has revealed many concurrent gene alterations within single dataset or/and among multiple omics datasets. Although powerful bioinformatics pipelines have been devel...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-019-3171-0
更新日期:2019-11-08 00:00:00
abstract:BACKGROUND:Multidimensional protein identification technology (MudPIT)-based shot-gun proteomics has been proven to be an effective platform for functional proteomics. In particular, the various sample preparation methods and bioinformatics tools can be integrated to improve the proteomics platform for applications lik...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-13-S15-S8
更新日期:2012-01-01 00:00:00
abstract:BACKGROUND:In current comparative proteomics studies, the large number of images generated by 2D gels is currently compared using spot matching algorithms. Unfortunately, differences in gel migration and sample variability make efficient spot alignment very difficult to obtain, and, as consequence most of the software ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-9-460
更新日期:2008-10-28 00:00:00
abstract:BACKGROUND:It has been proposed that future reference genomes should be graph structures in order to better represent the sequence diversity present in a species. However, there is currently no standard method to represent genomic intervals, such as the positions of genes or transcription factor binding sites, on graph...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-017-1678-9
更新日期:2017-05-18 00:00:00
abstract:BACKGROUND:Time- and dose-to-event phenotypes used in basic science and translational studies are commonly measured imprecisely or incompletely due to limitations of the experimental design or data collection schema. For example, drug-induced toxicities are not reported by the actual time or dose triggering the event, ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-019-2899-x
更新日期:2019-05-28 00:00:00
abstract:BACKGROUND:The toxic effects of many simple organic compounds stem from their biotransformation to chemically reactive metabolites which bind covalently to cellular proteins. To understand the mechanisms of cytotoxic responses it may be important to know which proteins become adducted and whether some may be common tar...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-8-95
更新日期:2007-03-16 00:00:00
abstract:BACKGROUND:Protein-protein interactions (PPIs) play several roles in living cells, and computational PPI prediction is a major focus of many researchers. The three-dimensional (3D) structure and binding surface are important for the design of PPI inhibitors. Therefore, rigid body protein-protein docking calculations fo...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-018-2073-x
更新日期:2018-05-08 00:00:00
abstract:BACKGROUND:Activation of naïve B lymphocytes by extracellular ligands, e.g. antigen, lipopolysaccharide (LPS) and CD40 ligand, induces a combination of common and ligand-specific phenotypic changes through complex signal transduction pathways. For example, although all three of these ligands induce proliferation, only ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-7-237
更新日期:2006-05-02 00:00:00
abstract:BACKGROUND:Continuous enzyme kinetic assays are often used in high-throughput applications, as they allow rapid acquisition of large amounts of kinetic data and increased confidence compared to discontinuous assays. However, data analysis is often rate-limiting in high-throughput enzyme assays, as manual inspection and...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-020-3513-y
更新日期:2020-05-14 00:00:00
abstract:BACKGROUND:Transposable elements (TE) are mobile genetic entities present in nearly all genomes. Previous work has shown that TEs tend to have a different nucleotide composition than the host genes, either considering codon usage bias or dinucleotide frequencies. We show here how these compositional differences can be ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-5-94
更新日期:2004-07-13 00:00:00
abstract:BACKGROUND:A common method for presenting and studying biological interaction networks is visualization. Software tools can enhance our ability to explore network visualizations and improve our understanding of biological systems, particularly when these tools offer analysis capabilities. However, most published networ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-10-95
更新日期:2009-03-26 00:00:00
abstract:BACKGROUND:In population genetics, simulation is a fundamental tool for analyzing how basic evolutionary forces such as natural selection, recombination, and mutation shape the genetic landscape of a population. Forward simulation represents the most powerful, but, at the same time, most compute-intensive approach for ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-14-216
更新日期:2013-07-09 00:00:00
abstract:BACKGROUND:Although many of the genic features in Mycobacterium abscessus have been fully validated, a comprehensive understanding of the regulatory elements remains lacking. Moreover, there is little understanding of how the organism regulates its transcriptomic profile, enabling cells to survive in hostile environmen...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-019-3042-8
更新日期:2019-09-10 00:00:00
abstract:BACKGROUND:Journal articles and databases are two major modes of communication in the biological sciences, and thus integrating these critical resources is of urgent importance to increase the pace of discovery. Projects focused on bridging the gap between journals and databases have been on the rise over the last five...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-12-175
更新日期:2011-05-19 00:00:00
abstract:BACKGROUND:The biomedical literature continues to grow at a rapid pace, making the challenge of knowledge retrieval and extraction ever greater. Tools that provide a means to search and mine the full text of literature thus represent an important way by which the efficiency of these processes can be improved. RESULTS:...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-018-2103-8
更新日期:2018-03-09 00:00:00
abstract:BACKGROUND:Discovering patterns from gene expression levels is regarded as a classification problem when tissue classes of the samples are given and solved as a discrete-data problem by discretizing the expression levels of each gene into intervals maximizing the interdependence between that gene and the class labels. ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-12-S5-S5
更新日期:2011-01-01 00:00:00
abstract:BACKGROUND:The Acel_2062 protein from Acidothermus cellulolyticus is a protein of unknown function. Initial sequence analysis predicted that it was a metallopeptidase from the presence of a motif conserved amongst the Asp-zincins, which are peptidases that contain a single, catalytic zinc ion ligated by the histidines ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-15-1
更新日期:2014-01-03 00:00:00
abstract:BACKGROUND:We establish that the occurrence of protein folds among genomes can be accurately described with a Weibull function. Systems which exhibit Weibull character can be interpreted with reliability theory commonly used in engineering analysis. For instance, Weibull distributions are widely used in reliability, ma...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-5-101
更新日期:2004-07-26 00:00:00
abstract:BACKGROUND:As numerous diseases involve errors in signal transduction, modern therapeutics often target proteins involved in cellular signaling. Interpretation of the activity of signaling pathways during disease development or therapeutic intervention would assist in drug development, design of therapy, and target ide...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-7-99
更新日期:2006-02-28 00:00:00
abstract:BACKGROUND:The MAQC project demonstrated that microarrays with comparable content show inter- and intra-platform reproducibility. However, since the content of gene databases still increases, the development of new generations of microarrays covering new content is mandatory. To better understand the potential challeng...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-10-186
更新日期:2009-06-18 00:00:00
abstract:BACKGROUND:The identification of prognostic genes that can distinguish the prognostic risks of cancer patients remains a significant challenge. Previous works have proven that functional gene sets were more reliable for this task than the gene signature. However, few works have considered the cross-talk among functiona...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-019-2674-z
更新日期:2019-02-18 00:00:00
abstract:BACKGROUND:In the adaptive immune system, variable regions of immunoglobulin (IG) are encoded by random recombination of variable (V), diversity (D), and joining (J) gene segments in the germline. Partitioning the functional antibody sequences to their sourcing germline gene segments is vital not only for understanding...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-9-S12-S20
更新日期:2008-12-12 00:00:00
abstract:BACKGROUND:A drug-drug interaction (DDI) occurs when one drug influences the level or activity of another drug. The increasing volume of the scientific literature overwhelms health care professionals trying to be kept up-to-date with all published studies on DDI. METHODS:This paper describes a hybrid linguistic approa...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-12-S2-S1
更新日期:2011-03-29 00:00:00
abstract:BACKGROUND:Microarray technology provides an efficient means for globally exploring physiological processes governed by the coordinated expression of multiple genes. However, identification of genes differentially expressed in microarray experiments is challenging because of their potentially high type I error rate. Me...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-9-142
更新日期:2008-03-06 00:00:00
abstract::Time course gene expression experiments are a popular means to infer co-expression. Many methods have been proposed to cluster genes or to build networks based on similarity measures of their expression dynamics. In this paper we apply a correlation based approach to network reconstruction to three datasets of time se...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-8-S1-S16
更新日期:2007-03-08 00:00:00
abstract:BACKGROUND:Genome and metagenome studies have identified thousands of protein families whose functions are poorly understood and for which techniques for functional characterization provide only partial information. For such proteins, the genome context can give further information about their functional context. RESU...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-12-141
更新日期:2011-05-09 00:00:00