Assessment of the relationship between pre-chip and post-chip quality measures for Affymetrix GeneChip expression data.

Abstract:

BACKGROUND:Gene expression microarray experiments are expensive to conduct and guidelines for acceptable quality control at intermediate steps before and after the samples are hybridised to chips are vague. We conducted an experiment hybridising RNA from human brain to 117 U133A Affymetrix GeneChips and used these data to explore the relationship between 4 pre-chip variables and 22 post-chip outcomes and quality control measures. RESULTS:We found that the pre-chip variables were significantly correlated with each other but that this correlation was strongest between measures of RNA quality and cRNA yield. Post-mortem interval was negatively correlated with these variables. Four principal components, reflecting array outliers, array adjustment, hybridisation noise and RNA integrity, explain about 75% of the total post-chip measure variability. Two significant canonical correlations existed between the pre-chip and post-chip variables, derived from MAS 5.0, dChip and the Bioconductor packages affy and affyPLM. The strongest (CANCOR 0.838, p < 0.0001) correlated RNA integrity and yield with post chip quality control (QC) measures indexing 3'/5' RNA ratios, bias or scaling of the chip and scaling of the variability of the signal across the chip. Post-mortem interval was relatively unimportant. We also found that the RNA integrity number (RIN) could be moderately well predicted by post-chip measures B_ACTIN35, GAPDH35 and SF. CONCLUSION:We have found that the post-chip variables having the strongest association with quantities measurable before hybridisation are those reflecting RNA integrity. Other aspects of quality, such as noise measures (reflecting the execution of the assay) or measures reflecting data quality (outlier status and array adjustment variables) are not well predicted by the variables we were able to determine ahead of time. There could be other variables measurable pre-hybridisation which may be better associated with expression data quality measures. Uncovering such connections could create savings on costly microarray experiments by eliminating poor samples before hybridisation.

journal_name

BMC Bioinformatics

journal_title

BMC bioinformatics

authors

Jones L,Goldstein DR,Hughes G,Strand AD,Collin F,Dunnett SB,Kooperberg C,Aragaki A,Olson JM,Augood SJ,Faull RL,Luthi-Carter R,Moskvina V,Hodges AK

doi

10.1186/1471-2105-7-211

subject

Has Abstract

pub_date

2006-04-19 00:00:00

pages

211

issn

1471-2105

pii

1471-2105-7-211

journal_volume

7

pub_type

杂志文章
  • Critique of the pairwise method for estimating qPCR amplification efficiency: beware of correlated data!

    abstract:BACKGROUND:A recently proposed method for estimating qPCR amplification efficiency E analyzes fluorescence intensity ratios from pairs of points deemed to lie in the exponential growth region on the amplification curves for all reactions in a dilution series. This method suffers from a serious problem: The resulting ra...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-020-03604-4

    authors: Tellinghuisen J

    更新日期:2020-07-08 00:00:00

  • Recursive model for dose-time responses in pharmacological studies.

    abstract:BACKGROUND:Clinical studies often track dose-response curves of subjects over time. One can easily model the dose-response curve at each time point with Hill equation, but such a model fails to capture the temporal evolution of the curves. On the other hand, one can use Gompertz equation to model the temporal behaviors...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-019-2831-4

    authors: Dhruba SR,Rahman A,Rahman R,Ghosh S,Pal R

    更新日期:2019-06-20 00:00:00

  • Integrating biological knowledge into variable selection: an empirical Bayes approach with an application in cancer biology.

    abstract:BACKGROUND:An important question in the analysis of biochemical data is that of identifying subsets of molecular variables that may jointly influence a biological response. Statistical variable selection methods have been widely used for this purpose. In many settings, it may be important to incorporate ancillary biolo...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-13-94

    authors: Hill SM,Neve RM,Bayani N,Kuo WL,Ziyad S,Spellman PT,Gray JW,Mukherjee S

    更新日期:2012-05-11 00:00:00

  • Multiple sequence alignment accuracy and evolutionary distance estimation.

    abstract:BACKGROUND:Sequence alignment is a common tool in bioinformatics and comparative genomics. It is generally assumed that multiple sequence alignment yields better results than pair wise sequence alignment, but this assumption has rarely been tested, and never with the control provided by simulation analysis. This study ...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-6-278

    authors: Rosenberg MS

    更新日期:2005-11-23 00:00:00

  • Leveraging TCGA gene expression data to build predictive models for cancer drug response.

    abstract:BACKGROUND:Machine learning has been utilized to predict cancer drug response from multi-omics data generated from sensitivities of cancer cell lines to different therapeutic compounds. Here, we build machine learning models using gene expression data from patients' primary tumor tissues to predict whether a patient wi...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-020-03690-4

    authors: Clayton EA,Pujol TA,McDonald JF,Qiu P

    更新日期:2020-09-30 00:00:00

  • An algorithm for automated closure during assembly.

    abstract:BACKGROUND:Finishing is the process of improving the quality and utility of draft genome sequences generated by shotgun sequencing and computational assembly. Finishing can involve targeted sequencing. Finishing reads may be incorporated by manual or automated means. One automated method uses targeted addition by local...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-11-457

    authors: Koren S,Miller JR,Walenz BP,Sutton G

    更新日期:2010-09-10 00:00:00

  • RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome.

    abstract:BACKGROUND:RNA-Seq is revolutionizing the way transcript abundances are measured. A key challenge in transcript quantification from RNA-Seq data is the handling of reads that map to multiple genes or isoforms. This issue is particularly important for quantification with de novo transcriptome assemblies in the absence o...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-12-323

    authors: Li B,Dewey CN

    更新日期:2011-08-04 00:00:00

  • Filling out the structural map of the NTF2-like superfamily.

    abstract:BACKGROUND:The NTF2-like superfamily is a versatile group of protein domains sharing a common fold. The sequences of these domains are very diverse and they share no common sequence motif. These domains serve a range of different functions within the proteins in which they are found, including both catalytic and non-ca...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-14-327

    authors: Eberhardt RY,Chang Y,Bateman A,Murzin AG,Axelrod HL,Hwang WC,Aravind L

    更新日期:2013-11-19 00:00:00

  • An automatic device for detection and classification of malaria parasite species in thick blood film.

    abstract:BACKGROUND:Current malaria diagnosis relies primarily on microscopic examination of Giemsa-stained thick and thin blood films. This method requires vigorously trained technicians to efficiently detect and classify the malaria parasite species such as Plasmodium falciparum (Pf) and Plasmodium vivax (Pv) for an appropria...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-13-S17-S18

    authors: Kaewkamnerd S,Uthaipibull C,Intarapanich A,Pannarut M,Chaotheing S,Tongsima S

    更新日期:2012-01-01 00:00:00

  • Hotspot Hunter: a computational system for large-scale screening and selection of candidate immunological hotspots in pathogen proteomes.

    abstract:BACKGROUND:T-cell epitopes that promiscuously bind to multiple alleles of a human leukocyte antigen (HLA) supertype are prime targets for development of vaccines and immunotherapies because they are relevant to a large proportion of the human population. The presence of clusters of promiscuous T-cell epitopes, immunolo...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-9-S1-S19

    authors: Zhang GL,Khan AM,Srinivasan KN,Heiny A,Lee K,Kwoh CK,August JT,Brusic V

    更新日期:2008-01-01 00:00:00

  • SitesIdentify: a protein functional site prediction tool.

    abstract:BACKGROUND:The rate of protein structures being deposited in the Protein Data Bank surpasses the capacity to experimentally characterise them and therefore computational methods to analyse these structures have become increasingly important. Identifying the region of the protein most likely to be involved in function i...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-10-379

    authors: Bray T,Chan P,Bougouffa S,Greaves R,Doig AJ,Warwicker J

    更新日期:2009-11-18 00:00:00

  • MOSBIE: a tool for comparison and analysis of rule-based biochemical models.

    abstract:BACKGROUND:Mechanistic models that describe the dynamical behaviors of biochemical systems are common in computational systems biology, especially in the realm of cellular signaling. The development of families of such models, either by a single research group or by different groups working within the same area, presen...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-15-316

    authors: Wenskovitch JE Jr,Harris LA,Tapia JJ,Faeder JR,Marai GE

    更新日期:2014-09-25 00:00:00

  • Rigorous assessment and integration of the sequence and structure based features to predict hot spots.

    abstract:BACKGROUND:Systematic mutagenesis studies have shown that only a few interface residues termed hot spots contribute significantly to the binding free energy of protein-protein interactions. Therefore, hot spots prediction becomes increasingly important for well understanding the essence of proteins interactions and hel...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-12-311

    authors: Chen R,Chen W,Yang S,Wu D,Wang Y,Tian Y,Shi Y

    更新日期:2011-07-29 00:00:00

  • OpWise: operons aid the identification of differentially expressed genes in bacterial microarray experiments.

    abstract:BACKGROUND:Differentially expressed genes are typically identified by analyzing the variation between replicate measurements. These procedures implicitly assume that there are no systematic errors in the data even though several sources of systematic error are known. RESULTS:OpWise estimates the amount of systematic e...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-7-19

    authors: Price MN,Arkin AP,Alm EJ

    更新日期:2006-01-13 00:00:00

  • GeneBins: a database for classifying gene expression data, with application to plant genome arrays.

    abstract:BACKGROUND:To interpret microarray experiments, several ontological analysis tools have been developed. However, current tools are limited to specific organisms. RESULTS:We developed a bioinformatics system to assign the probe set sequences of any organism to a hierarchical functional classification modelled on KEGG o...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-8-87

    authors: Goffard N,Weiller G

    更新日期:2007-03-12 00:00:00

  • How to decide which are the most pertinent overly-represented features during gene set enrichment analysis.

    abstract:BACKGROUND:The search for enriched features has become widely used to characterize a set of genes or proteins. A key aspect of this technique is its ability to identify correlations amongst heterogeneous data such as Gene Ontology annotations, gene expression data and genome location of genes. Despite the rapid growth ...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-8-332

    authors: Barriot R,Sherman DJ,Dutour I

    更新日期:2007-09-11 00:00:00

  • FocAn: automated 3D analysis of DNA repair foci in image stacks acquired by confocal fluorescence microscopy.

    abstract:BACKGROUND:Phosphorylated histone H2AX, also known as γH2AX, forms μm-sized nuclear foci at the sites of DNA double-strand breaks (DSBs) induced by ionizing radiation and other agents. Due to their specificity and sensitivity, γH2AX immunoassays have become the gold standard for studying DSB induction and repair. One o...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-020-3370-8

    authors: Memmel S,Sisario D,Zimmermann H,Sauer M,Sukhorukov VL,Djuzenova CS,Flentje M

    更新日期:2020-01-28 00:00:00

  • Partition-based optimization model for generative anatomy modeling language (POM-GAML).

    abstract:BACKGROUND:This paper presents a novel approach for Generative Anatomy Modeling Language (GAML). This approach automatically detects the geometric partitions in 3D anatomy that in turn speeds up integrated non-linear optimization model in GAML for 3D anatomy modeling with constraints (e.g. joints). This integrated non-...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-019-2626-7

    authors: Demirel D,Cetinsaya B,Halic T,Kockara S,Ahmadi S

    更新日期:2019-03-14 00:00:00

  • Bayesian Unidimensional Scaling for visualizing uncertainty in high dimensional datasets with latent ordering of observations.

    abstract:BACKGROUND:Detecting patterns in high-dimensional multivariate datasets is non-trivial. Clustering and dimensionality reduction techniques often help in discerning inherent structures. In biological datasets such as microbial community composition or gene expression data, observations can be generated from a continuous...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-017-1790-x

    authors: Nguyen LH,Holmes S

    更新日期:2017-09-13 00:00:00

  • Comparison of methods to detect copy number alterations in cancer using simulated and real genotyping data.

    abstract:BACKGROUND:The detection of genomic copy number alterations (CNA) in cancer based on SNP arrays requires methods that take into account tumour specific factors such as normal cell contamination and tumour heterogeneity. A number of tools have been recently developed but their performance needs yet to be thoroughly asse...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-13-192

    authors: Mosén-Ansorena D,Aransay AM,Rodríguez-Ezpeleta N

    更新日期:2012-08-07 00:00:00

  • A machine learning strategy for predicting localization of post-translational modification sites in protein-protein interacting regions.

    abstract:BACKGROUND:One very important functional domain of proteins is the protein-protein interacting region (PPIR), which forms the binding interface between interacting polypeptide chains. Post-translational modifications (PTMs) that occur in the PPIR can either interfere with or facilitate the interaction between proteins....

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-016-1165-8

    authors: Saethang T,Payne DM,Avihingsanon Y,Pisitkun T

    更新日期:2016-08-17 00:00:00

  • Vertical decomposition with Genetic Algorithm for Multiple Sequence Alignment.

    abstract:BACKGROUND:Many Bioinformatics studies begin with a multiple sequence alignment as the foundation for their research. This is because multiple sequence alignment can be a useful technique for studying molecular evolution and analyzing sequence structure relationships. RESULTS:In this paper, we have proposed a Vertical...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-12-353

    authors: Naznin F,Sarker R,Essam D

    更新日期:2011-08-25 00:00:00

  • Detecting variants with Metabolic Design, a new software tool to design probes for explorative functional DNA microarray development.

    abstract:BACKGROUND:Microorganisms display vast diversity, and each one has its own set of genes, cell components and metabolic reactions. To assess their huge unexploited metabolic potential in different ecosystems, we need high throughput tools, such as functional microarrays, that allow the simultaneous analysis of thousands...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-11-478

    authors: Terrat S,Peyretaillade E,Gonçalves O,Dugat-Bony E,Gravelat F,Moné A,Biderre-Petit C,Boucher D,Troquet J,Peyret P

    更新日期:2010-09-23 00:00:00

  • mRNA:guanine-N7 cap methyltransferases: identification of novel members of the family, evolutionary analysis, homology modeling, and analysis of sequence-structure-function relationships.

    abstract:BACKGROUND:The 5'-terminal cap structure plays an important role in many aspects of mRNA metabolism. Capping enzymes encoded by viruses and pathogenic fungi are attractive targets for specific inhibitors. There is a large body of experimental data on viral and cellular methyltransferases (MTases) that carry out guanine...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-2-2

    authors: Bujnicki JM,Feder M,Radlinska M,Rychlewski L

    更新日期:2001-01-01 00:00:00

  • Identification of functional hubs and modules by converting interactome networks into hierarchical ordering of proteins.

    abstract:BACKGROUND:Protein-protein interactions play a key role in biological processes of proteins within a cell. Recent high-throughput techniques have generated protein-protein interaction data in a genome-scale. A wide range of computational approaches have been applied to interactome network analysis for uncovering functi...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-11-S3-S3

    authors: Cho YR,Zhang A

    更新日期:2010-04-29 00:00:00

  • In situ analysis of cross-hybridisation on microarrays and the inference of expression correlation.

    abstract:BACKGROUND:Microarray co-expression signatures are an important tool for studying gene function and relations between genes. In addition to genuine biological co-expression, correlated signals can result from technical deficiencies like hybridization of reporters with off-target transcripts. An approach that is able to...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-8-461

    authors: Casneuf T,Van de Peer Y,Huber W

    更新日期:2007-11-26 00:00:00

  • Analyzing miRNA co-expression networks to explore TF-miRNA regulation.

    abstract:BACKGROUND:Current microRNA (miRNA) research in progress has engendered rapid accumulation of expression data evolving from microarray experiments. Such experiments are generally performed over different tissues belonging to a specific species of metazoan. For disease diagnosis, microarray probes are also prepared with...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-10-163

    authors: Bandyopadhyay S,Bhattacharyya M

    更新日期:2009-05-28 00:00:00

  • Integrating diverse biological and computational sources for reliable protein-protein interactions.

    abstract:BACKGROUND:Protein-protein interactions (PPIs) play important roles in various cellular processes. However, the low quality of current PPI data detected from high-throughput screening techniques has diminished the potential usefulness of the data. We need to develop a method to address the high data noise and incomplet...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-11-S7-S8

    authors: Wu M,Li X,Chua HN,Kwoh CK,Ng SK

    更新日期:2010-10-15 00:00:00

  • GraphCrunch: a tool for large network analyses.

    abstract:BACKGROUND:The recent explosion in biological and other real-world network data has created the need for improved tools for large network analyses. In addition to well established global network properties, several new mathematical techniques for analyzing local structural properties of large networks have been develop...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-9-70

    authors: Milenković T,Lai J,Przulj N

    更新日期:2008-01-30 00:00:00

  • BioIMAX: a Web 2.0 approach for easy exploratory and collaborative access to multivariate bioimage data.

    abstract:BACKGROUND:Innovations in biological and biomedical imaging produce complex high-content and multivariate image data. For decision-making and generation of hypotheses, scientists need novel information technology tools that enable them to visually explore and analyze the data and to discuss and communicate results or f...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-12-297

    authors: Loyek C,Rajpoot NM,Khan M,Nattkemper TW

    更新日期:2011-07-21 00:00:00