High-Throughput GoMiner, an 'industrial-strength' integrative gene ontology tool for interpretation of multiple-microarray experiments, with application to studies of Common Variable Immune Deficiency (CVID).

Abstract:

BACKGROUND:We previously developed GoMiner, an application that organizes lists of 'interesting' genes (for example, under-and overexpressed genes from a microarray experiment) for biological interpretation in the context of the Gene Ontology. The original version of GoMiner was oriented toward visualization and interpretation of the results from a single microarray (or other high-throughput experimental platform), using a graphical user interface. Although that version can be used to examine the results from a number of microarrays one at a time, that is a rather tedious task, and original GoMiner includes no apparatus for obtaining a global picture of results from an experiment that consists of multiple microarrays. We wanted to provide a computational resource that automates the analysis of multiple microarrays and then integrates the results across all of them in useful exportable output files and visualizations. RESULTS:We now introduce a new tool, High-Throughput GoMiner, that has those capabilities and a number of others: It (i) efficiently performs the computationally-intensive task of automated batch processing of an arbitrary number of microarrays, (ii) produces a human-or computer-readable report that rank-orders the multiple microarray results according to the number of significant GO categories, (iii) integrates the multiple microarray results by providing organized, global clustered image map visualizations of the relationships of significant GO categories, (iv) provides a fast form of 'false discovery rate' multiple comparisons calculation, and (v) provides annotations and visualizations for relating transcription factor binding sites to genes and GO categories. CONCLUSION:High-Throughput GoMiner achieves the desired goal of providing a computational resource that automates the analysis of multiple microarrays and integrates results across all of the microarrays. For illustration, we show an application of this new tool to the interpretation of altered gene expression patterns in Common Variable Immune Deficiency (CVID). High-Throughput GoMiner will be useful in a wide range of applications, including the study of time-courses, evaluation of multiple drug treatments, comparison of multiple gene knock-outs or knock-downs, and screening of large numbers of chemical derivatives generated from a promising lead compound.

journal_name

BMC Bioinformatics

journal_title

BMC bioinformatics

authors

Zeeberg BR,Qin H,Narasimhan S,Sunshine M,Cao H,Kane DW,Reimers M,Stephens RM,Bryant D,Burt SK,Elnekave E,Hari DM,Wynn TA,Cunningham-Rundles C,Stewart DM,Nelson D,Weinstein JN

doi

10.1186/1471-2105-6-168

keywords:

subject

Has Abstract

pub_date

2005-07-05 00:00:00

pages

168

issn

1471-2105

pii

1471-2105-6-168

journal_volume

6

pub_type

杂志文章
  • HH-suite3 for fast remote homology detection and deep protein annotation.

    abstract:BACKGROUND:HH-suite is a widely used open source software suite for sensitive sequence similarity searches and protein fold recognition. It is based on pairwise alignment of profile Hidden Markov models (HMMs), which represent multiple sequence alignments of homologous proteins. RESULTS:We developed a single-instructi...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-019-3019-7

    authors: Steinegger M,Meier M,Mirdita M,Vöhringer H,Haunsberger SJ,Söding J

    更新日期:2019-09-14 00:00:00

  • On the comparison of regulatory sequences with multiple resolution Entropic Profiles.

    abstract:BACKGROUND:Enhancers are stretches of DNA (100-1000 bp) that play a major role in development gene expression, evolution and disease. It has been recently shown that in high-level eukaryotes enhancers rarely work alone, instead they collaborate by forming clusters of cis-regulatory modules (CRMs). Although the binding ...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-016-0980-2

    authors: Comin M,Antonello M

    更新日期:2016-03-18 00:00:00

  • The development of a comparison approach for Illumina bead chips unravels unexpected challenges applying newest generation microarrays.

    abstract:BACKGROUND:The MAQC project demonstrated that microarrays with comparable content show inter- and intra-platform reproducibility. However, since the content of gene databases still increases, the development of new generations of microarrays covering new content is mandatory. To better understand the potential challeng...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-10-186

    authors: Eggle D,Debey-Pascher S,Beyer M,Schultze JL

    更新日期:2009-06-18 00:00:00

  • Identifying overrepresented concepts in gene lists from literature: a statistical approach based on Poisson mixture model.

    abstract:BACKGROUND:Large-scale genomic studies often identify large gene lists, for example, the genes sharing the same expression patterns. The interpretation of these gene lists is generally achieved by extracting concepts overrepresented in the gene lists. This analysis often depends on manual annotation of genes based on c...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-11-272

    authors: He X,Sarma MS,Ling X,Chee B,Zhai C,Schatz B

    更新日期:2010-05-20 00:00:00

  • PseUI: Pseudouridine sites identification based on RNA sequence information.

    abstract:BACKGROUND:Pseudouridylation is the most prevalent type of posttranscriptional modification in various stable RNAs of all organisms, which significantly affects many cellular processes that are regulated by RNA. Thus, accurate identification of pseudouridine (Ψ) sites in RNA will be of great benefit for understanding t...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-018-2321-0

    authors: He J,Fang T,Zhang Z,Huang B,Zhu X,Xiong Y

    更新日期:2018-08-29 00:00:00

  • A discriminative method for protein remote homology detection and fold recognition combining Top-n-grams and latent semantic analysis.

    abstract:BACKGROUND:Protein remote homology detection and fold recognition are central problems in bioinformatics. Currently, discriminative methods based on support vector machine (SVM) are the most effective and accurate methods for solving these problems. A key step to improve the performance of the SVM-based methods is to f...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-9-510

    authors: Liu B,Wang X,Lin L,Dong Q,Wang X

    更新日期:2008-12-01 00:00:00

  • MATLIGN: a motif clustering, comparison and matching tool.

    abstract:BACKGROUND:Sequence motifs representing transcription factor binding sites (TFBS) are commonly encoded as position frequency matrices (PFM) or degenerate consensus sequences (CS). These formats are used to represent the characterised TFBS profiles stored in transcription factor databases, as well as to represent the po...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-8-189

    authors: Kankainen M,Löytynoja A

    更新日期:2007-06-08 00:00:00

  • DisCons: a novel tool to quantify and classify evolutionary conservation of intrinsic protein disorder.

    abstract:BACKGROUND:Analyzing the amino acid sequence of an intrinsically disordered protein (IDP) in an evolutionary context can yield novel insights on the functional role of disordered regions and sequence element(s). However, in the case of many IDPs, the lack of evolutionary conservation of the primary sequence can hamper ...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-015-0592-2

    authors: Varadi M,Guharoy M,Zsolyomi F,Tompa P

    更新日期:2015-05-13 00:00:00

  • Textpresso Central: a customizable platform for searching, text mining, viewing, and curating biomedical literature.

    abstract:BACKGROUND:The biomedical literature continues to grow at a rapid pace, making the challenge of knowledge retrieval and extraction ever greater. Tools that provide a means to search and mine the full text of literature thus represent an important way by which the efficiency of these processes can be improved. RESULTS:...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-018-2103-8

    authors: Müller HM,Van Auken KM,Li Y,Sternberg PW

    更新日期:2018-03-09 00:00:00

  • methCancer-gen: a DNA methylome dataset generator for user-specified cancer type based on conditional variational autoencoder.

    abstract:BACKGROUND:Recently, DNA methylation has drawn great attention due to its strong correlation with abnormal gene activities and informative representation of the cancer status. As a number of studies focus on DNA methylation signatures in cancer, demand for utilizing publicly available methylome dataset has been increas...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-020-3516-8

    authors: Choi J,Chae H

    更新日期:2020-05-11 00:00:00

  • Conceptual-level workflow modeling of scientific experiments using NMR as a case study.

    abstract:BACKGROUND:Scientific workflows improve the process of scientific experiments by making computations explicit, underscoring data flow, and emphasizing the participation of humans in the process when intuition and human reasoning are required. Workflows for experiments also highlight transitions among experimental phase...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-8-31

    authors: Verdi KK,Ellis HJ,Gryk MR

    更新日期:2007-01-30 00:00:00

  • Efficient prediction of human protein-protein interactions at a global scale.

    abstract:BACKGROUND:Our knowledge of global protein-protein interaction (PPI) networks in complex organisms such as humans is hindered by technical limitations of current methods. RESULTS:On the basis of short co-occurring polypeptide regions, we developed a tool called MP-PIPE capable of predicting a global human PPI network ...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-014-0383-1

    authors: Schoenrock A,Samanfar B,Pitre S,Hooshyar M,Jin K,Phillips CA,Wang H,Phanse S,Omidi K,Gui Y,Alamgir M,Wong A,Barrenäs F,Babu M,Benson M,Langston MA,Green JR,Dehne F,Golshani A

    更新日期:2014-12-10 00:00:00

  • In silico design of targeted SRM-based experiments.

    abstract::Selected reaction monitoring (SRM)-based proteomics approaches enable highly sensitive and reproducible assays for profiling of thousands of peptides in one experiment. The development of such assays involves the determination of retention time, detectability and fragmentation properties of peptides, followed by an op...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-13-S16-S8

    authors: Nahnsen S,Kohlbacher O

    更新日期:2012-01-01 00:00:00

  • Comparative study on gene set and pathway topology-based enrichment methods.

    abstract:BACKGROUND:Enrichment analysis is a popular approach to identify pathways or sets of genes which are significantly enriched in the context of differentially expressed genes. The traditional gene set enrichment approach considers a pathway as a simple gene list disregarding any knowledge of gene or protein interactions....

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-015-0751-5

    authors: Bayerlová M,Jung K,Kramer F,Klemm F,Bleckmann A,Beißbarth T

    更新日期:2015-10-22 00:00:00

  • CNN-based ranking for biomedical entity normalization.

    abstract:BACKGROUND:Most state-of-the-art biomedical entity normalization systems, such as rule-based systems, merely rely on morphological information of entity mentions, but rarely consider their semantic information. In this paper, we introduce a novel convolutional neural network (CNN) architecture that regards biomedical e...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-017-1805-7

    authors: Li H,Chen Q,Tang B,Wang X,Xu H,Wang B,Huang D

    更新日期:2017-10-03 00:00:00

  • Promoter prediction in E. coli based on SIDD profiles and Artificial Neural Networks.

    abstract:BACKGROUND:One of the major challenges in biology is the correct identification of promoter regions. Computational methods based on motif searching have been the traditional approach taken. Recent studies have shown that DNA structural properties, such as curvature, stacking energy, and stress-induced duplex destabiliz...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-11-S6-S17

    authors: Bland C,Newsome AS,Markovets AA

    更新日期:2010-10-07 00:00:00

  • Network motif-based identification of transcription factor-target gene relationships by integrating multi-source biological data.

    abstract:BACKGROUND:Integrating data from multiple global assays and curated databases is essential to understand the spatio-temporal interactions within cells. Different experiments measure cellular processes at various widths and depths, while databases contain biological information based on established facts or published da...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-9-203

    authors: Zhang Y,Xuan J,de los Reyes BG,Clarke R,Ressom HW

    更新日期:2008-04-21 00:00:00

  • SCOPA and META-SCOPA: software for the analysis and aggregation of genome-wide association studies of multiple correlated phenotypes.

    abstract:BACKGROUND:Genome-wide association studies (GWAS) of single nucleotide polymorphisms (SNPs) have been successful in identifying loci contributing genetic effects to a wide range of complex human diseases and quantitative traits. The traditional approach to GWAS analysis is to consider each phenotype separately, despite...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-016-1437-3

    authors: Mägi R,Suleimanov YV,Clarke GM,Kaakinen M,Fischer K,Prokopenko I,Morris AP

    更新日期:2017-01-11 00:00:00

  • A context-blocks model for identifying clinical relationships in patient records.

    abstract:BACKGROUND:Patient records contain valuable information regarding explanation of diagnosis, progression of disease, prescription and/or effectiveness of treatment, and more. Automatic recognition of clinically important concepts and the identification of relationships between those concepts in patient records are preli...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-12-S3-S3

    authors: Islamaj Doğan R,Névéol A,Lu Z

    更新日期:2011-06-09 00:00:00

  • Integrating multiple molecular sources into a clinical risk prediction signature by extracting complementary information.

    abstract:BACKGROUND:High-throughput technology allows for genome-wide measurements at different molecular levels for the same patient, e.g. single nucleotide polymorphisms (SNPs) and gene expression. Correspondingly, it might be beneficial to also integrate complementary information from different molecular levels when building...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-016-1183-6

    authors: Hieke S,Benner A,Schlenl RF,Schumacher M,Bullinger L,Binder H

    更新日期:2016-08-30 00:00:00

  • Novel domain expansion methods to improve the computational efficiency of the Chemical Master Equation solution for large biological networks.

    abstract:BACKGROUND:Numerical solutions of the chemical master equation (CME) are important for understanding the stochasticity of biochemical systems. However, solving CMEs is a formidable task. This task is complicated due to the nonlinear nature of the reactions and the size of the networks which result in different realizat...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-020-03668-2

    authors: Kosarwal R,Kulasiri D,Samarasinghe S

    更新日期:2020-11-11 00:00:00

  • SEQprocess: a modularized and customizable pipeline framework for NGS processing in R package.

    abstract:BACKGROUNDS:Next-Generation Sequencing (NGS) is now widely used in biomedical research for various applications. Processing of NGS data requires multiple programs and customization of the processing pipelines according to the data platforms. However, rapid progress of the NGS applications and processing methods urgentl...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-019-2676-x

    authors: Joo T,Choi JH,Lee JH,Park SE,Jeon Y,Jung SH,Woo HG

    更新日期:2019-02-20 00:00:00

  • Domain fusion analysis by applying relational algebra to protein sequence and domain databases.

    abstract:BACKGROUND:Domain fusion analysis is a useful method to predict functionally linked proteins that may be involved in direct protein-protein interactions or in the same metabolic or signaling pathway. As separate domain databases like BLOCKS, PROSITE, Pfam, SMART, PRINTS-S, ProDom, TIGRFAMs, and amalgamated domain datab...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-4-16

    authors: Truong K,Ikura M

    更新日期:2003-05-06 00:00:00

  • Identification of germ cell-specific genes in mammalian meiotic prophase.

    abstract:BACKGROUND:Mammalian germ cells undergo meiosis to produce sperm or eggs, haploid cells that are primed to meet and propagate life. Meiosis is initiated by retinoic acid and meiotic prophase is the first and most complex stage of meiosis when homologous chromosomes pair to exchange genetic information. Errors in meiosi...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-14-72

    authors: Li Y,Ray D,Ye P

    更新日期:2013-02-27 00:00:00

  • Qxpak.5: old mixed model solutions for new genomics problems.

    abstract:BACKGROUND:Mixed models have a long and fruitful history in statistics. They are pertinent to genomics problems because they are highly versatile, accommodating a wide variety of situations within the same theoretical and algorithmic framework. RESULTS:Qxpak is a package for versatile statistical genomics, specificall...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-12-202

    authors: Pérez-Enciso M,Misztal I

    更新日期:2011-05-25 00:00:00

  • Fast individual ancestry inference from DNA sequence data leveraging allele frequencies for multiple populations.

    abstract:BACKGROUND:Estimation of individual ancestry from genetic data is useful for the analysis of disease association studies, understanding human population history and interpreting personal genomic variation. New, computationally efficient methods are needed for ancestry inference that can effectively utilize existing inf...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-014-0418-7

    authors: Bansal V,Libiger O

    更新日期:2015-01-16 00:00:00

  • Survival Online: a web-based service for the analysis of correlations between gene expression and clinical and follow-up data.

    abstract:BACKGROUND:Complex microarray gene expression datasets can be used for many independent analyses and are particularly interesting for the validation of potential biomarkers and multi-gene classifiers. This article presents a novel method to perform correlations between microarray gene expression data and clinico-pathol...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-10-S12-S10

    authors: Corradi L,Mirisola V,Porro I,Torterolo L,Fato M,Romano P,Pfeffer U

    更新日期:2009-10-15 00:00:00

  • Predicting nucleosome positioning using a duration Hidden Markov Model.

    abstract:BACKGROUND:The nucleosome is the fundamental packing unit of DNAs in eukaryotic cells. Its detailed positioning on the genome is closely related to chromosome functions. Increasing evidence has shown that genomic DNA sequence itself is highly predictive of nucleosome positioning genome-wide. Therefore a fast software t...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-11-346

    authors: Xi L,Fondufe-Mittendorf Y,Xia L,Flatow J,Widom J,Wang JP

    更新日期:2010-06-24 00:00:00

  • Gene ontology based transfer learning for protein subcellular localization.

    abstract:BACKGROUND:Prediction of protein subcellular localization generally involves many complex factors, and using only one or two aspects of data information may not tell the true story. For this reason, some recent predictive models are deliberately designed to integrate multiple heterogeneous data sources for exploiting m...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-12-44

    authors: Mei S,Fei W,Zhou S

    更新日期:2011-02-02 00:00:00

  • GObar: a gene ontology based analysis and visualization tool for gene sets.

    abstract:BACKGROUND:Microarray experiments, as well as other genomic analyses, often result in large gene sets containing up to several hundred genes. The biological significance of such sets of genes is, usually, not readily apparent. Identification of the functions of the genes in the set can help highlight features of intere...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-6-189

    authors: Lee JS,Katari G,Sachidanandam R

    更新日期:2005-07-25 00:00:00