Abstract:
BACKGROUND:Gene expression data can be analyzed by summarizing groups of individual gene expression profiles based on GO annotation information. The mean expression profile per group can then be used to identify interesting GO categories in relation to the experimental settings. However, the expression profiles present in GO classes are often heterogeneous, i.e., there are several different expression profiles within one class. As a result, important experimental findings can be obscured because the summarizing profile does not seem to be of interest. We propose to tackle this problem by finding homogeneous subclasses within GO categories: preclustering. RESULTS:Two microarray datasets are analyzed. First, a selection of genes from a well-known Saccharomyces cerevisiae dataset is used. The GO class "cell wall organization and biogenesis" is shown as a specific example. After preclustering, this term can be associated with different phases in the cell cycle, where it could not be associated with a specific phase previously. Second, a dataset of differentiation of human Mesenchymal Stem Cells (MSC) into osteoblasts is used. For this dataset results are shown in which the GO term "skeletal development" is a specific example of a heterogeneous GO class for which better associations can be made after preclustering. The Intra Cluster Correlation (ICC), a measure of cluster tightness, is applied to identify relevant clusters. CONCLUSIONS:We show that this method leads to an improved interpretability of results in Principal Component Analysis.
journal_name
BMC Bioinformaticsjournal_title
BMC bioinformaticsauthors
De Haan JR,Piek E,van Schaik RC,de Vlieg J,Bauerschmidt S,Buydens LM,Wehrens Rdoi
10.1186/1471-2105-11-158subject
Has Abstractpub_date
2010-03-26 00:00:00pages
158issn
1471-2105pii
1471-2105-11-158journal_volume
11pub_type
杂志文章abstract:BACKGROUND:Data are the evidentiary basis for scientific hypotheses, analyses and publication, for policy formation and for decision-making. They are essential to the evaluation and testing of results by peer scientists both present and future. There is broad consensus in the scientific and conservation communities tha...
journal_title:BMC bioinformatics
pub_type: 指南,杂志文章
doi:10.1186/1471-2105-12-S15-S1
更新日期:2011-01-01 00:00:00
abstract:BACKGROUND:The Triplex cell vaccine is a cancer cellular vaccine that can prevent almost completely the mammary tumor onset in HER-2/neu transgenic mice. In a translational perspective, the activity of the Triplex vaccine was also investigated against lung metastases showing that the vaccine is an effective treatment a...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-11-S7-S13
更新日期:2010-10-15 00:00:00
abstract:BACKGROUND:Determining differentially expressed genes (DEGs) between biological samples is the key to understand how genotype gives rise to phenotype. RNA-seq and microarray are two main technologies for profiling gene expression levels. However, considerable discrepancy has been found between DEGs detected using the t...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-015-0847-y
更新日期:2016-01-11 00:00:00
abstract:BACKGROUND:Gene function annotations, which are associations between a gene and a term of a controlled vocabulary describing gene functional features, are of paramount importance in modern biology. Datasets of these annotations, such as the ones provided by the Gene Ontology Consortium, are used to design novel biologi...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-16-S6-S4
更新日期:2015-01-01 00:00:00
abstract:BACKGROUND:Differentially expressed genes are typically identified by analyzing the variation between replicate measurements. These procedures implicitly assume that there are no systematic errors in the data even though several sources of systematic error are known. RESULTS:OpWise estimates the amount of systematic e...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-7-19
更新日期:2006-01-13 00:00:00
abstract:BACKGROUND:To understand biological processes and diseases, it is crucial to unravel the concerted interplay of transcription factors (TFs), microRNAs (miRNAs) and their targets within regulatory networks and fundamental sub-networks. An integrative computational resource generating a comprehensive view of these regula...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-12-67
更新日期:2011-03-04 00:00:00
abstract:BACKGROUND:Bioinformatics software quality assurance is essential in genomic medicine. Systematic verification and validation of bioinformatics software is difficult because it is often not possible to obtain a realistic "gold standard" for systematic evaluation. Here we apply a technique that originates from the softw...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-15-S16-S15
更新日期:2014-01-01 00:00:00
abstract:BACKGROUND:It is now well established that nearly 20% of human cancers are caused by infectious agents, and the list of human oncogenic pathogens will grow in the future for a variety of cancer types. Whole tumor transcriptome and genome sequencing by next-generation sequencing technologies presents an unparalleled opp...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-13-206
更新日期:2012-08-17 00:00:00
abstract:BACKGROUND:Ontologies are widely used to represent knowledge in biomedicine. Systematic approaches for detecting errors and disagreements are needed for large ontologies with hundreds or thousands of terms and semantic relationships. A recent approach of defining terms using logical definitions is now increasingly bein...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-12-418
更新日期:2011-10-27 00:00:00
abstract:BACKGROUND:With the development of sequencing technologies, more and more sequence variants are available for investigation. Different classes of variants in the human genome have been identified, including single nucleotide substitutions, insertion and deletion, and large structural variations such as duplications and...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-15-5
更新日期:2014-01-09 00:00:00
abstract:BACKGROUND:Correctly identifying genomic regions enriched with histone modifications and transcription factors is key to understanding their regulatory and developmental roles. Conceptually, these regions are divided into two categories, narrow peaks and broad domains, and different algorithms are used to identify each...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-016-0991-z
更新日期:2016-03-24 00:00:00
abstract:BACKGROUND:Although most of the current disease candidate gene identification and prioritization methods depend on functional annotations, the coverage of the gene functional annotations is a limiting factor. In the current study, we describe a candidate gene prioritization method that is entirely based on protein-prot...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-10-73
更新日期:2009-02-27 00:00:00
abstract:BACKGROUND:In population genetics, simulation is a fundamental tool for analyzing how basic evolutionary forces such as natural selection, recombination, and mutation shape the genetic landscape of a population. Forward simulation represents the most powerful, but, at the same time, most compute-intensive approach for ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-14-216
更新日期:2013-07-09 00:00:00
abstract:BACKGROUND:Time course microarray profiles examine the expression of genes over a time domain. They are necessary in order to determine the complete set of genes that are dynamically expressed under given conditions, and to determine the interaction between these genes. Because of cost and resource issues, most time se...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-12-13
更新日期:2011-01-11 00:00:00
abstract:BACKGROUND:Numerous functional genomics approaches have been developed to study the model organism yeast, Saccharomyces cerevisiae, with the aim of systematically understanding the biology of the cell. Some of these techniques are based on yeast growth differences under different conditions, such as those generated by ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-8-117
更新日期:2007-04-04 00:00:00
abstract:BACKGROUND:Protein sequence motifs are by definition short fragments of conserved amino acids, often associated with a specific function. Accordingly protein sequence profiles derived from multiple sequence alignments provide an alternative description of functional motifs characterizing families of related sequences. ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-6-164
更新日期:2005-06-29 00:00:00
abstract:BACKGROUND:Understanding research activity within any given biomedical field is important. Search outputs generated by MEDLINE/PubMed are not well classified and require lengthy manual citation analysis. Automation of citation analytics can be very useful and timesaving for both novices and experts. RESULTS:PubFocus w...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-7-424
更新日期:2006-10-02 00:00:00
abstract:BACKGROUND:The application of high-throughput sequencing in a broad range of quantitative genomic assays (e.g., DNA-seq, ChIP-seq) has created a high demand for the analysis of large-scale read-count data. Typically, the genome is divided into tiling windows and windowed read-count data is generated for the entire geno...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-018-2077-6
更新日期:2018-03-01 00:00:00
abstract:BACKGROUND:High-throughput experiments, such as with DNA microarrays, typically result in hundreds of genes potentially relevant to the process under study, rendering the interpretation of these experiments problematic. Here, we propose and evaluate an approach to find functional associations between large numbers of g...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-8-14
更新日期:2007-01-18 00:00:00
abstract:BACKGROUND:Community structure is ubiquitous in biological networks. There has been an increased interest in unraveling the community structure of biological systems as it may provide important insights into a system's functional components and the impact of local structures on dynamics at a global scale. Choosing an a...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-15-220
更新日期:2014-06-25 00:00:00
abstract:BACKGROUND:The definition of a distance measure plays a key role in the evaluation of different clustering solutions of gene expression profiles. In this empirical study we compare different clustering solutions when using the Mutual Information (MI) measure versus the use of the well known Euclidean distance and Pears...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-8-111
更新日期:2007-03-30 00:00:00
abstract:BACKGROUND:MicroRNAs (miRNA) are regulatory genes that target and repress other RNA molecules via sequence-specific binding. Several biological processes are regulated across many organisms by evolutionarily conserved miRNAs. Plants and invertebrates employ their miRNA in defense against viruses by targeting and degrad...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-14-S2-S3
更新日期:2013-01-01 00:00:00
abstract:BACKGROUND:Accurate somatic mutation-calling is essential for insightful mutation analyses in cancer studies. Several mutation-callers are publicly available and more are likely to appear. Nonetheless, mutation-calling is still challenging and there is unlikely to be one established caller that systematically outperfor...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-15-154
更新日期:2014-05-21 00:00:00
abstract:BACKGROUND:The knowledge base-driven pathway analysis is becoming the first choice for many investigators, in that it not only can reduce the complexity of functional analysis by grouping thousands of genes into just several hundred pathways, but also can increase the explanatory power for the experiment by identifying...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-016-1285-1
更新日期:2016-10-06 00:00:00
abstract:BACKGROUND:Microarray technology provides an efficient means for globally exploring physiological processes governed by the coordinated expression of multiple genes. However, identification of genes differentially expressed in microarray experiments is challenging because of their potentially high type I error rate. Me...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-9-142
更新日期:2008-03-06 00:00:00
abstract:BACKGROUND:For many types of analyses, data about gene structure and locations of non-coding regions of genes are required. Although a vast amount of genomic sequence data is available, precise annotation of genes is lacking behind. Finding the corresponding gene of a given protein sequence by means of conventional too...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-9-278
更新日期:2008-06-13 00:00:00
abstract:BACKGROUND:The cost efficient two-stage design is often used in genome-wide association studies (GWASs) in searching for genetic loci underlying the susceptibility for complex diseases. Replication-based analysis, which considers data from each stage separately, often suffers from loss of efficiency. Joint test that co...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-12-9
更新日期:2011-01-07 00:00:00
abstract:BACKGROUND:Event sequences where different types of events often occur close together arise, e.g., when studying potential transcription factor binding sites (TFBS, events) of certain transcription factors (TF, types) in a DNA sequence. These events tend to occur in bursts: in some genomic regions there are more genes ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-9-336
更新日期:2008-08-08 00:00:00
abstract:BACKGROUND:The Gene Ontology is a controlled vocabulary for representing knowledge related to genes and proteins in a computable form. The current effort of manually annotating proteins with the Gene Ontology is outpaced by the rate of accumulation of biomedical knowledge in literature, which urges the development of t...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-9-525
更新日期:2008-12-08 00:00:00
abstract:BACKGROUND:Recently differential variability has been showed to be valuable in evaluating the association of DNA methylation to the risks of complex human diseases. The statistical tests based on both differential methylation level and differential variability can be more powerful than those based only on differential ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-018-2185-3
更新日期:2018-05-18 00:00:00