Meta-analysis of breast cancer microarray studies in conjunction with conserved cis-elements suggest patterns for coordinate regulation.

Abstract:

BACKGROUND: Gene expression measurements from breast cancer (BrCa) tumors are established clinical predictive tools to identify tumor subtypes, identify patients showing poor/good prognosis, and identify patients likely to have disease recurrence. However, diverse breast cancer datasets in conjunction with diagnostic clinical arrays show little overlap in the sets of genes identified. One approach to identify a set of consistently dysregulated candidate genes in these tumors is to employ meta-analysis of multiple independent microarray datasets. This allows one to compare expression data from a diverse collection of breast tumor array datasets generated on either cDNA or oligonucleotide arrays.

RESULTS: We gathered expression data from 9 published microarray studies examining estrogen receptor positive (ER+) and estrogen receptor negative (ER-) BrCa tumor cases from the Oncomine database. We performed a meta-analysis and identified genes that were universally up or down regulated with respect to ER+ versus ER- tumor status. We surveyed both the proximal promoter and 3' untranslated regions (3'UTR) of our top-ranking genes in each expression group to test whether common sequence elements may contribute to the observed expression patterns. Utilizing a combination of known transcription factor binding sites (TFBS), evolutionarily conserved mammalian promoter and 3'UTR motifs, and microRNA (miRNA) seed sequences, we identified numerous motifs that were disproportionately represented between the two gene classes suggesting a common regulatory network for the observed gene expression patterns.

CONCLUSION: Some of the genes we identified distinguish key transcripts previously seen in array studies, while others are newly defined. Many of the genes identified as overexpressed in ER- tumors were previously identified as expression markers for neoplastic transformation in multiple human cancers. Moreover, our motif analysis identified a collection of specific cis-acting target sites which may collectively play a role in the differential gene expression patterns observed in ER+ versus ER- breast cancer tumors. Importantly, the gene sets and associated DNA motifs provide a starting point with which to explore the mechanistic basis for the observed expression patterns in breast tumors.
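
The two analysis steps summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not state which statistics were used, so the direction-consistency filter and the one-sided Fisher exact test below are assumptions chosen for illustration, and the gene names, fold changes, and function names are hypothetical.

```python
from math import comb


def consistent_genes(fold_changes):
    """Return (up, down): genes whose sign agrees in every study.

    fold_changes maps a gene name to a list of log2(ER+ / ER-) values,
    one per study. Genes with mixed signs or no data are dropped.
    """
    up, down = [], []
    for gene, values in fold_changes.items():
        if values and all(v > 0 for v in values):
            up.append(gene)
        elif values and all(v < 0 for v in values):
            down.append(gene)
    return sorted(up), sorted(down)


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    a = class-1 genes with the motif, b = class-1 genes without it,
    c = class-2 genes with the motif, d = class-2 genes without it.
    Tests whether the motif is over-represented in class 1.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    # Sum hypergeometric probabilities for tables at least as extreme as observed.
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1))
    return tail / comb(n, col1)


# Hypothetical toy input: three studies, three placeholder genes.
toy = {
    "GENE_A": [1.2, 0.8, 2.0],
    "GENE_B": [-0.5, -1.1, -0.3],
    "GENE_C": [0.4, -0.2, 0.9],
}
up_genes, down_genes = consistent_genes(toy)
```

In practice a meta-analysis would also weigh effect sizes and correct for multiple testing across motifs; the sketch only shows the shape of the two comparisons.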

journal_name

BMC Bioinformatics

journal_title

BMC bioinformatics

authors

Smith DD,Saetrom P,Snøve O Jr,Lundberg C,Rivas GE,Glackin C,Larson GP

doi

10.1186/1471-2105-9-63

subject

Has Abstract

pub_date

2008-01-28 00:00:00

pages

63

issn

1471-2105

pii

1471-2105-9-63

journal_volume

9

pub_type

Journal Article
  • Multi-label literature classification based on the Gene Ontology graph.

    abstract:BACKGROUND:The Gene Ontology is a controlled vocabulary for representing knowledge related to genes and proteins in a computable form. The current effort of manually annotating proteins with the Gene Ontology is outpaced by the rate of accumulation of biomedical knowledge in literature, which urges the development of t...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-525

    authors: Jin B,Muller B,Zhai C,Lu X

    update_date:2008-12-08 00:00:00

  • Sample size calculation while controlling false discovery rate for differential expression analysis with RNA-sequencing experiments.

    abstract:BACKGROUND:RNA-Sequencing (RNA-seq) experiments have been popularly applied to transcriptome studies in recent years. Such experiments are still relatively costly. As a result, RNA-seq experiments often employ a small number of replicates. Power analysis and sample size calculation are challenging in the context of dif...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-016-0994-9

    authors: Bi R,Liu P

    update_date:2016-03-31 00:00:00

  • Analysis of cancer metabolism with high-throughput technologies.

    abstract:BACKGROUND:Recent advances in genomics and proteomics have allowed us to study the nuances of the Warburg effect--a long-standing puzzle in cancer energy metabolism--at an unprecedented level of detail. While modern next-generation sequencing technologies are extremely powerful, the lack of appropriate data analysis to...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-12-S10-S8

    authors: Markovets AA,Herman D

    update_date:2011-10-18 00:00:00

  • A method of predicting changes in human gene splicing induced by genetic variants in context of cis-acting elements.

    abstract:BACKGROUND:Polymorphic variants and mutations disrupting canonical splicing isoforms are among the leading causes of human hereditary disorders. While there is a substantial evidence of aberrant splicing causing Mendelian diseases, the implication of such events in multi-genic disorders is yet to be well understood. We...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-22

    authors: Churbanov A,Vorechovský I,Hicks C

    update_date:2010-01-12 00:00:00

  • Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments.

    abstract:BACKGROUND:High-throughput sequencing technologies, such as the Illumina Genome Analyzer, are powerful new tools for investigating a wide range of biological and medical questions. Statistical and computational methods are key for drawing meaningful and accurate conclusions from the massive and complex datasets generat...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-94

    authors: Bullard JH,Purdom E,Hansen KD,Dudoit S

    update_date:2010-02-18 00:00:00

  • Amino acid sequence associated with bacteriophage recombination site helps to reveal genes potentially acquired through horizontal gene transfer.

    abstract:BACKGROUND:Horizontal gene transfer, i.e. the acquisition of genetic material from nonparent organism, is considered an important force driving species evolution. Many cases of horizontal gene transfer from prokaryotes to eukaryotes have been registered, but no transfer mechanism has been deciphered so far, although vi...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03599-y

    authors: Daugavet MA,Shabelnikov SV,Podgornaya OI

    update_date:2020-07-24 00:00:00

  • Informative gene selection and the direct classification of tumors based on relative simplicity.

    abstract:BACKGROUND:Selecting a parsimonious set of informative genes to build highly generalized performance classifier is the most important task for the analysis of tumor microarray expression data. Many existing gene pair evaluation methods cannot highlight diverse patterns of gene pairs only used one strategy of vertical c...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-016-0893-0

    authors: Chen Y,Wang L,Li L,Zhang H,Yuan Z

    update_date:2016-01-20 00:00:00

  • Application of protein structure alignments to iterated hidden Markov model protocols for structure prediction.

    abstract:BACKGROUND:One of the most powerful methods for the prediction of protein structure from sequence information alone is the iterative construction of profile-type models. Because profiles are built from sequence alignments, the sequences included in the alignment and the method used to align them will be important to th...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-7-410

    authors: Scheeff ED,Bourne PE

    update_date:2006-09-14 00:00:00

  • Meta-eQTL: a tool set for flexible eQTL meta-analysis.

    abstract:BACKGROUND:Increasing number of eQTL (Expression Quantitative Trait Loci) datasets facilitate genetics and systems biology research. Meta-analysis tools are in need to jointly analyze datasets of same or similar issue types to improve statistical power especially in trans-eQTL mapping. Meta-analysis framework is also n...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-014-0392-0

    authors: Di Narzo AF,Cheng H,Lu J,Hao K

    update_date:2014-11-28 00:00:00

  • Large scale analysis of protein conformational transitions from aqueous to non-aqueous media.

    abstract:BACKGROUND:Biocatalysis in organic solvents is nowadays a common practice with a large potential in Biotechnology. Several studies report that proteins which are co-crystallized or soaked in organic solvents preserve their fold integrity showing almost identical arrangements when compared to their aqueous forms. Howeve...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-018-2044-2

    authors: Rueda AJV,Monzon AM,Ardanaz SM,Iglesias LE,Parisi G

    update_date:2018-01-30 00:00:00

  • LNDriver: identifying driver genes by integrating mutation and expression data based on gene-gene interaction network.

    abstract:BACKGROUND:Cancer is a complex disease which is characterized by the accumulation of genetic alterations during the patient's lifetime. With the development of the next-generation sequencing technology, multiple omics data, such as cancer genomic, epigenomic and transcriptomic data etc., can be measured from each indiv...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-016-1332-y

    authors: Wei PJ,Zhang D,Xia J,Zheng CH

    update_date:2016-12-23 00:00:00

  • Fast and robust group-wise eQTL mapping using sparse graphical models.

    abstract:BACKGROUND:Genome-wide expression quantitative trait loci (eQTL) studies have emerged as a powerful tool to understand the genetic basis of gene expression and complex traits. The traditional eQTL methods focus on testing the associations between individual single-nucleotide polymorphisms (SNPs) and gene expression tra...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-014-0421-z

    authors: Cheng W,Shi Y,Zhang X,Wang W

    update_date:2015-01-16 00:00:00

  • DeepQA: improving the estimation of single protein model quality with deep belief networks.

    abstract:BACKGROUND:Protein quality assessment (QA) useful for ranking and selecting protein models has long been viewed as one of the major challenges for protein tertiary structure prediction. Especially, estimating the quality of a single protein model, which is important for selecting a few good models out of a large model ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-016-1405-y

    authors: Cao R,Bhattacharya D,Hou J,Cheng J

    update_date:2016-12-05 00:00:00

  • CNV Workshop: an integrated platform for high-throughput copy number variation discovery and clinical diagnostics.

    abstract:BACKGROUND:Recent studies have shown that copy number variations (CNVs) are frequent in higher eukaryotes and associated with a substantial portion of inherited and acquired risk for various human diseases. The increasing availability of high-resolution genome surveillance platforms provides opportunity for rapidly ass...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-74

    authors: Gai X,Perin JC,Murphy K,O'Hara R,D'arcy M,Wenocur A,Xie HM,Rappaport EF,Shaikh TH,White PS

    update_date:2010-02-04 00:00:00

  • Performance of a genetic algorithm for mass spectrometry proteomics.

    abstract:BACKGROUND:Recently, mass spectrometry data have been mined using a genetic algorithm to produce discriminatory models that distinguish healthy individuals from those with cancer. This algorithm is the basis for claims of 100% sensitivity and specificity in two related publicly available datasets. To date, no detailed ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-5-180

    authors: Jeffries NO

    update_date:2004-11-19 00:00:00

  • Linear predictive coding representation of correlated mutation for protein sequence alignment.

    abstract:BACKGROUND:Although both conservation and correlated mutation (CM) are important information reflecting the different sorts of context in multiple sequence alignment, most of alignment methods use sequence profiles that only represent conservation. There is no general way to represent correlated mutation and incorporat...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-S2-S2

    authors: Jeong CS,Kim D

    update_date:2010-04-16 00:00:00

  • Augmented annotation and orthologue analysis for Oryctolagus cuniculus: Better Bunny.

    abstract:BACKGROUND:The rabbit is an important model organism used in a wide range of biomedical research. However, the rabbit genome is still sparsely annotated, thus prohibiting extensive functional analysis of gene sets derived from whole-genome experiments. We developed a web-based application that provides augmented annota...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-13-84

    authors: Craig DB,Kannan S,Dombkowski AA

    update_date:2012-05-08 00:00:00

  • MetaMIS: a metagenomic microbial interaction simulator based on microbial community profiles.

    abstract:BACKGROUND:The complexity and dynamics of microbial communities are major factors in the ecology of a system. With the NGS technique, metagenomics data provides a new way to explore microbial interactions. Lotka-Volterra models, which have been widely used to infer animal interactions in dynamic systems, have recently ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-016-1359-0

    authors: Shaw GT,Pao YY,Wang D

    update_date:2016-11-25 00:00:00

  • The scoring of poses in protein-protein docking: current capabilities and future directions.

    abstract:BACKGROUND:Protein-protein docking, which aims to predict the structure of a protein-protein complex from its unbound components, remains an unresolved challenge in structural bioinformatics. An important step is the ranking of docked poses using a scoring function, for which many methods have been developed. There is ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-14-286

    authors: Moal IH,Torchala M,Bates PA,Fernández-Recio J

    update_date:2013-10-01 00:00:00

  • Computational evaluation of TIS annotation for prokaryotic genomes.

    abstract:BACKGROUND:Accurate annotation of translation initiation sites (TISs) is essential for understanding the translation initiation mechanism. However, the reliability of TIS annotation in widely used databases such as RefSeq is uncertain due to the lack of experimental benchmarks. RESULTS:Based on a homogeneity assumptio...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-160

    authors: Hu GQ,Zheng X,Ju LN,Zhu H,She ZS

    update_date:2008-03-25 00:00:00

  • An assessment of catalytic residue 3D ensembles for the prediction of enzyme function.

    abstract:BACKGROUND:The central element of each enzyme is the catalytic site, which commonly catalyzes a single biochemical reaction with high specificity. It was unclear to us how often sites that catalyze the same or highly similar reactions evolved on different, i. e. non-homologous protein folds and how similar their 3D pos...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-015-0807-6

    authors: Žváček C,Friedrichs G,Heizinger L,Merkl R

    update_date:2015-11-04 00:00:00

  • Automated multigroup outlier identification in molecular high-throughput data using bagplots and gemplots.

    abstract:BACKGROUND:Analyses of molecular high-throughput data often lack in robustness, i.e. results are very sensitive to the addition or removal of a single observation. Therefore, the identification of extreme observations is an important step of quality control before doing further data analysis. Standard outlier detection...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-017-1645-5

    authors: Kruppa J,Jung K

    update_date:2017-05-02 00:00:00

  • High-Throughput GoMiner, an 'industrial-strength' integrative gene ontology tool for interpretation of multiple-microarray experiments, with application to studies of Common Variable Immune Deficiency (CVID).

    abstract:BACKGROUND:We previously developed GoMiner, an application that organizes lists of 'interesting' genes (for example, under-and overexpressed genes from a microarray experiment) for biological interpretation in the context of the Gene Ontology. The original version of GoMiner was oriented toward visualization and interp...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-6-168

    authors: Zeeberg BR,Qin H,Narasimhan S,Sunshine M,Cao H,Kane DW,Reimers M,Stephens RM,Bryant D,Burt SK,Elnekave E,Hari DM,Wynn TA,Cunningham-Rundles C,Stewart DM,Nelson D,Weinstein JN

    update_date:2005-07-05 00:00:00

  • New challenges for text mining: mapping between text and manually curated pathways.

    abstract:BACKGROUND:Associating literature with pathways poses new challenges to the Text Mining (TM) community. There are three main challenges to this task: (1) the identification of the mapping position of a specific entity or reaction in a given pathway, (2) the recognition of the causal relationships among multiple reactio...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-S3-S5

    authors: Oda K,Kim JD,Ohta T,Okanohara D,Matsuzaki T,Tateisi Y,Tsujii J

    update_date:2008-04-11 00:00:00

  • A mixture of feature experts approach for protein-protein interaction prediction.

    abstract:BACKGROUND:High-throughput methods can directly detect the set of interacting proteins in model species but the results are often incomplete and exhibit high false positive and false negative rates. A number of researchers have recently presented methods for integrating direct and indirect data for predicting interacti...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-8-S10-S6

    authors: Qi Y,Klein-Seetharaman J,Bar-Joseph Z

    update_date:2007-01-01 00:00:00

  • Predicting anatomic therapeutic chemical classification codes using tiered learning.

    abstract:BACKGROUND:The low success rate and high cost of drug discovery requires the development of new paradigms to identify molecules of therapeutic value. The Anatomical Therapeutic Chemical (ATC) Code System is a World Health Organization (WHO) proposed classification that assigns multi-level codes to compounds based on th...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-017-1660-6

    authors: Olson T,Singh R

    update_date:2017-06-07 00:00:00

  • Optimal sequencing depth design for whole genome re-sequencing in pigs.

    abstract:BACKGROUND:As whole-genome sequencing is becoming a routine technique, it is important to identify a cost-effective depth of sequencing for such studies. However, the relationship between sequencing depth and biological results from the aspects of whole-genome coverage, variant discovery power and the quality of varian...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-3164-z

    authors: Jiang Y,Jiang Y,Wang S,Zhang Q,Ding X

    update_date:2019-11-08 00:00:00

  • Model based analysis of real-time PCR data from DNA binding dye protocols.

    abstract:BACKGROUND:Reverse transcription followed by real-time PCR is widely used for quantification of specific mRNA, and with the use of double-stranded DNA binding dyes it is becoming a standard for microarray data validation. Despite the kinetic information generated by real-time PCR, most popular analysis methods assume c...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-8-85

    authors: Alvarez MJ,Vila-Ortiz GJ,Salibe MC,Podhajcer OL,Pitossi FJ

    update_date:2007-03-09 00:00:00

  • Principal components analysis based methodology to identify differentially expressed genes in time-course microarray data.

    abstract:BACKGROUND:Time-course microarray experiments are being increasingly used to characterize dynamic biological processes. In these experiments, the goal is to identify genes differentially expressed in time-course data, measured between different biological conditions. These differentially expressed genes can reveal the ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-267

    authors: Jonnalagadda S,Srinivasan R

    update_date:2008-06-06 00:00:00

  • Considering scores between unrelated proteins in the search database improves profile comparison.

    abstract:BACKGROUND:Profile-based comparison of multiple sequence alignments is a powerful methodology for the detection remote protein sequence similarity, which is essential for the inference and analysis of protein structure, function, and evolution. Accurate estimation of statistical significance of detected profile similar...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-10-399

    authors: Sadreyev RI,Wang Y,Grishin NV

    update_date:2009-12-04 00:00:00