Alternative mapping of probes to genes for Affymetrix chips.

Abstract:

BACKGROUND: Short oligonucleotide arrays have several probes measuring the expression level of each target transcript, so the selection of probes is a key component of measurement quality. However, once probes have been selected and synthesized on an array, it is still possible to re-evaluate the results using an updated mapping of probes to genes that takes into account the latest biological knowledge available. METHODS: We investigated how probes found on recent commercial microarrays for human genes (Affymetrix HG-U133A) matched a recent curated collection of human transcripts: the NCBI RefSeq database. We also built mappings and used them in place of the original probe-to-gene associations provided by the manufacturer of the arrays. RESULTS: In a large number of cases, 36%, the probes matching a reference sequence were consistent with the grouping of probes by the manufacturer of the chips. For the remaining cases there were discrepancies, and we show how these can affect the analysis of data. CONCLUSIONS: While the probes on Affymetrix arrays remain the same for several years, the biological knowledge concerning the genomic sequences evolves rapidly. Using up-to-date knowledge can apparently change the outcome of an analysis.
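
The remapping idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes exact substring matching, probe sequences given in the same orientation as the transcripts, and toy data. The function names, probe IDs, probe set names, and transcript IDs below are all hypothetical.

```python
from collections import defaultdict


def map_probes_to_transcripts(probes, transcripts):
    """Group probes by the transcripts they match exactly.

    probes: dict of probe id -> probe sequence (hypothetical toy input)
    transcripts: dict of transcript id -> transcript sequence
    Returns a dict of transcript id -> sorted list of matching probe ids.
    """
    mapping = defaultdict(list)
    for probe_id, probe_seq in probes.items():
        for tx_id, tx_seq in transcripts.items():
            # Assumption: a probe "matches" when it occurs verbatim in the transcript.
            if probe_seq in tx_seq:
                mapping[tx_id].append(probe_id)
    return {tx_id: sorted(ids) for tx_id, ids in mapping.items()}


def consistent_with_manufacturer(mapping, probe_to_probeset):
    """Flag transcripts whose matching probes all share one manufacturer probe set."""
    return {
        tx_id: len({probe_to_probeset[p] for p in ids}) == 1
        for tx_id, ids in mapping.items()
    }


# Toy example with made-up sequences (real probes are much longer).
probes = {"p1": "ACGT", "p2": "GGCC", "p3": "TTAA"}
transcripts = {"NM_A": "AAACGTGGCCAA", "NM_B": "CCTTAACC"}
probe_to_probeset = {"p1": "set1", "p2": "set2", "p3": "set3"}

mapping = map_probes_to_transcripts(probes, transcripts)
consistency = consistent_with_manufacturer(mapping, probe_to_probeset)
```

In this toy case the probes matching NM_A come from two different manufacturer probe sets, which is the kind of discrepancy the abstract refers to; a real analysis would need a proper alignment tool and the actual array annotation files.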

journal_name

BMC Bioinformatics

journal_title

BMC bioinformatics

authors

Gautier L,Møller M,Friis-Hansen L,Knudsen S

doi

10.1186/1471-2105-5-111

keywords:

subject

Has Abstract

pub_date

2004-08-14 00:00:00

pages

111

issn

1471-2105

pii

1471-2105-5-111

journal_volume

5

pub_type

Journal Article
  • NextSV: a meta-caller for structural variants from low-coverage long-read sequencing data.

    abstract:BACKGROUND:Structural variants (SVs) in human genomes are implicated in a variety of human diseases. Long-read sequencing delivers much longer read lengths than short-read sequencing and may greatly improve SV detection. However, due to the relatively high cost of long-read sequencing, it is unclear what coverage is ne...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-018-2207-1

    authors: Fang L,Hu J,Wang D,Wang K

    updated:2018-05-23 00:00:00

  • A note on generalized Genome Scan Meta-Analysis statistics.

    abstract:BACKGROUND:Wise et al. introduced a rank-based statistical technique for meta-analysis of genome scans, the Genome Scan Meta-Analysis (GSMA) method. Levinson et al. recently described two generalizations of the GSMA statistic: (i) a weighted version of the GSMA statistic, so that different studies could be ascribed dif...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-6-32

    authors: Koziol JA,Feng AC

    updated:2005-02-17 00:00:00

  • Application of text-mining for updating protein post-translational modification annotation in UniProtKB.

    abstract:BACKGROUND:The annotation of protein post-translational modifications (PTMs) is an important task of UniProtKB curators and, with continuing improvements in experimental methodology, an ever greater number of articles are being published on this topic. To help curators cope with this growing body of information we have...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-14-104

    authors: Veuthey AL,Bridge A,Gobeill J,Ruch P,McEntyre JR,Bougueleret L,Xenarios I

    updated:2013-03-22 00:00:00

  • Clustering analysis of tumor metabolic networks.

    abstract:BACKGROUND:Biological networks are representative of the diverse molecular interactions that occur within cells. Some of the commonly studied biological networks are modeled through protein-protein interactions, gene regulatory, and metabolic pathways. Among these, metabolic networks are probably the most studied, as t...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03564-9

    authors: Manipur I,Granata I,Maddalena L,Guarracino MR

    updated:2020-08-21 00:00:00

  • The COG database: an updated version includes eukaryotes.

    abstract:BACKGROUND:The availability of multiple, essentially complete genome sequences of prokaryotes and eukaryotes spurred both the demand and the opportunity for the construction of an evolutionary classification of genes from these genomes. Such a classification system based on orthologous relationships between genes appea...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-4-41

    authors: Tatusov RL,Fedorova ND,Jackson JD,Jacobs AR,Kiryutin B,Koonin EV,Krylov DM,Mazumder R,Mekhedov SL,Nikolskaya AN,Rao BS,Smirnov S,Sverdlov AV,Vasudevan S,Wolf YI,Yin JJ,Natale DA

    updated:2003-09-11 00:00:00

  • EVA: Exome Variation Analyzer, an efficient and versatile tool for filtering strategies in medical genomics.

    abstract:BACKGROUND:Whole exome sequencing (WES) has become the strategy of choice to identify a coding allelic variant for a rare human monogenic disorder. This approach is a revolution in medical genetics history, impacting both fundamental research, and diagnostic methods leading to personalized medicine. A plethora of effic...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-13-S14-S9

    authors: Coutant S,Cabot C,Lefebvre A,Léonard M,Prieur-Gaston E,Campion D,Lecroq T,Dauchel H

    updated:2012-01-01 00:00:00

  • Benchmarking the HLA typing performance of Polysolver and Optitype in 50 Danish parental trios.

    abstract:BACKGROUND:The adaptive immune response intrinsically depends on hypervariable human leukocyte antigen (HLA) genes. Concomitantly, correct HLA phenotyping is crucial for successful donor-patient matching in organ transplantation. The cost and technical limitations of current laboratory techniques, together with advance...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-018-2239-6

    authors: Matey-Hernandez ML,Danish Pan Genome Consortium.,Brunak S,Izarzugaza JMG

    updated:2018-06-25 00:00:00

  • CNV-seq, a new method to detect copy number variation using high-throughput sequencing.

    abstract:BACKGROUND:DNA copy number variation (CNV) has been recognized as an important source of genetic variation. Array comparative genomic hybridization (aCGH) is commonly used for CNV detection, but the microarray platform has a number of inherent limitations. RESULTS:Here, we describe a method to detect copy number varia...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-10-80

    authors: Xie C,Tammi MT

    updated:2009-03-06 00:00:00

  • Use of a multi-way method to analyze the amino acid composition of a conserved group of orthologous proteins in prokaryotes.

    abstract:BACKGROUND:Amino acids in proteins are not used equally. Some of the differences in the amino acid composition of proteins are between species (mainly due to nucleotide composition and lifestyle) and some are between proteins from the same species (related to protein function, expression or subcellular localization, fo...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-7-257

    authors: Pasamontes A,Garcia-Vallve S

    updated:2006-05-18 00:00:00

  • DraGnET: software for storing, managing and analyzing annotated draft genome sequence data.

    abstract:BACKGROUND:New "next generation" DNA sequencing technologies offer individual researchers the ability to rapidly generate large amounts of genome sequence data at dramatically reduced costs. As a result, a need has arisen for new software tools for storage, management and analysis of genome sequence data. Although bioi...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-100

    authors: Duncan S,Sirkanungo R,Miller L,Phillips GJ

    updated:2010-02-22 00:00:00

  • Amino acid sequence associated with bacteriophage recombination site helps to reveal genes potentially acquired through horizontal gene transfer.

    abstract:BACKGROUND:Horizontal gene transfer, i.e. the acquisition of genetic material from nonparent organism, is considered an important force driving species evolution. Many cases of horizontal gene transfer from prokaryotes to eukaryotes have been registered, but no transfer mechanism has been deciphered so far, although vi...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03599-y

    authors: Daugavet MA,Shabelnikov SV,Podgornaya OI

    updated:2020-07-24 00:00:00

  • Genome Projector: zoomable genome map with multiple views.

    abstract:BACKGROUND:Molecular biology data exist on diverse scales, from the level of molecules to -omics. At the same time, the data at each scale can be categorised into multiple layers, such as the genome, transcriptome, proteome, metabolome, and biochemical pathways. Due to the highly multi-layer and multi-dimensional natur...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-10-31

    authors: Arakawa K,Tamaki S,Kono N,Kido N,Ikegami K,Ogawa R,Tomita M

    updated:2009-01-23 00:00:00

  • 3DScapeCS: application of three dimensional, parallel, dynamic network visualization in Cytoscape.

    abstract:BACKGROUND:The exponential growth of gigantic biological data from various sources, such as protein-protein interaction (PPI), genome sequences scaffolding, Mass spectrometry (MS) molecular networking and metabolic flux, demands an efficient way for better visualization and interpretation beyond the conventional, two-d...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-14-322

    authors: Wang Q,Tang B,Song L,Ren B,Liang Q,Xie F,Zhuo Y,Liu X,Zhang L

    updated:2013-11-14 00:00:00

  • Quantiprot - a Python package for quantitative analysis of protein sequences.

    abstract:BACKGROUND:The field of protein sequence analysis is dominated by tools rooted in substitution matrices and alignments. A complementary approach is provided by methods of quantitative characterization. A major advantage of the approach is that quantitative properties defines a multidimensional solution space, where seq...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-017-1751-4

    authors: Konopka BM,Marciniak M,Dyrka W

    updated:2017-07-17 00:00:00

  • PDB-UF: database of predicted enzymatic functions for unannotated protein structures from structural genomics.

    abstract:BACKGROUND:The number of protein structures from structural genomics centers dramatically increases in the Protein Data Bank (PDB). Many of these structures are functionally unannotated because they have no sequence similarity to proteins of known function. However, it is possible to successfully infer function using o...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-7-53

    authors: von Grotthuss M,Plewczynski D,Ginalski K,Rychlewski L,Shakhnovich EI

    updated:2006-02-06 00:00:00

  • Protein Sequence Annotation Tool (PSAT): a centralized web-based meta-server for high-throughput sequence annotations.

    abstract:BACKGROUND:Here we introduce the Protein Sequence Annotation Tool (PSAT), a web-based, sequence annotation meta-server for performing integrated, high-throughput, genome-wide sequence analyses. Our goals in building PSAT were to (1) create an extensible platform for integration of multiple sequence-based bioinformatics...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-016-0887-y

    authors: Leung E,Huang A,Cadag E,Montana A,Soliman JL,Zhou CL

    updated:2016-01-20 00:00:00

  • Efficient estimation of grouped survival models.

    abstract:BACKGROUND:Time- and dose-to-event phenotypes used in basic science and translational studies are commonly measured imprecisely or incompletely due to limitations of the experimental design or data collection schema. For example, drug-induced toxicities are not reported by the actual time or dose triggering the event, ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-2899-x

    authors: Li Z,Lin J,Sibley AB,Truong T,Chua KC,Jiang Y,McCarthy J,Kroetz DL,Allen A,Owzar K

    updated:2019-05-28 00:00:00

  • Scoredist: a simple and robust protein sequence distance estimator.

    abstract:BACKGROUND:Distance-based methods are popular for reconstructing evolutionary trees thanks to their speed and generality. A number of methods exist for estimating distances from sequence alignments, which often involves some sort of correction for multiple substitutions. The problem is to accurately estimate the number...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-6-108

    authors: Sonnhammer EL,Hollich V

    updated:2005-04-27 00:00:00

  • Knowledge-guided multi-scale independent component analysis for biomarker identification.

    abstract:BACKGROUND:Many statistical methods have been proposed to identify disease biomarkers from gene expression profiles. However, from gene expression profile data alone, statistical methods often fail to identify biologically meaningful biomarkers related to a specific disease under study. In this paper, we develop a nove...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-416

    authors: Chen L,Xuan J,Wang C,Shih IeM,Wang Y,Zhang Z,Hoffman E,Clarke R

    updated:2008-10-06 00:00:00

  • Predicting bacterial resistance from whole-genome sequences using k-mers and stability selection.

    abstract:BACKGROUND:Several studies demonstrated the feasibility of predicting bacterial antibiotic resistance phenotypes from whole-genome sequences, the prediction process usually amounting to detecting the presence of genes involved in antibiotic resistance mechanisms, or of specific mutations, previously identified from a t...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-018-2403-z

    authors: Mahé P,Tournoud M

    updated:2018-10-17 00:00:00

  • Simulating autosomal genotypes with realistic linkage disequilibrium and a spiked-in genetic effect.

    abstract:BACKGROUND:To evaluate statistical methods for genome-wide genetic analyses, one needs to be able to simulate realistic genotypes. We here describe a method, applicable to a broad range of association study designs, that can simulate autosome-wide single-nucleotide polymorphism data with realistic linkage disequilibriu...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-017-2004-2

    authors: Shi M,Umbach DM,Wise AS,Weinberg CR

    updated:2018-01-02 00:00:00

  • A MATLAB tool for pathway enrichment using a topology-based pathway regulation score.

    abstract:BACKGROUND:Handling the vast amount of gene expression data generated by genome-wide transcriptional profiling techniques is a challenging task, demanding an informed combination of pre-processing, filtering and analysis methods if meaningful biological conclusions are to be drawn. For example, a range of traditional s...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-014-0358-2

    authors: Ibrahim M,Jassim S,Cawthorne MA,Langlands K

    updated:2014-11-04 00:00:00

  • Ortholog-based protein-protein interaction prediction and its application to inter-species interactions.

    abstract:BACKGROUND:The rapid growth of protein-protein interaction (PPI) data has led to the emergence of PPI network analysis. Despite advances in high-throughput techniques, the interactomes of several model organisms are still far from complete. Therefore, it is desirable to expand these interactomes with ortholog-based and...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-S12-S11

    authors: Lee SA,Chan CH,Tsai CH,Lai JM,Wang FS,Kao CY,Huang CY

    updated:2008-12-12 00:00:00

  • Challenging popular tools for the annotation of genetic variations with a real case, pathogenic mutations of lysosomal alpha-galactosidase.

    abstract:BACKGROUND:Severity gradation of missense mutations is a big challenge for exome annotation. Predictors of deleteriousness that are most frequently used to filter variants found by next generation sequencing, produce qualitative predictions, but also numerical scores. It has never been tested if these scores correlate ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-018-2416-7

    authors: Cimmaruta C,Citro V,Andreotti G,Liguori L,Cubellis MV,Hay Mele B

    updated:2018-11-30 00:00:00

  • Uncovering packaging features of co-regulated modules based on human protein interaction and transcriptional regulatory networks.

    abstract:BACKGROUND:Network co-regulated modules are believed to have the functionality of packaging multiple biological entities, and can thus be assumed to coordinate many biological functions in their network neighbouring regions. RESULTS:Here, we weighted edges of a human protein interaction network and a transcriptional r...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-392

    authors: Chen L,Wang H,Zhang L,Li W,Wang Q,Shang Y,He Y,He W,Li X,Tai J,Li X

    updated:2010-07-22 00:00:00

  • Cloning, analysis and functional annotation of expressed sequence tags from the Earthworm Eisenia fetida.

    abstract:BACKGROUND:Eisenia fetida, commonly known as red wiggler or compost worm, belongs to the Lumbricidae family of the Annelida phylum. Little is known about its genome sequence although it has been extensively used as a test organism in terrestrial ecotoxicology. In order to understand its gene expression response to envi...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-8-S7-S7

    authors: Pirooznia M,Gong P,Guan X,Inouye LS,Yang K,Perkins EJ,Deng Y

    updated:2007-11-01 00:00:00

  • Disease candidate gene identification and prioritization using protein interaction networks.

    abstract:BACKGROUND:Although most of the current disease candidate gene identification and prioritization methods depend on functional annotations, the coverage of the gene functional annotations is a limiting factor. In the current study, we describe a candidate gene prioritization method that is entirely based on protein-prot...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-10-73

    authors: Chen J,Aronow BJ,Jegga AG

    updated:2009-02-27 00:00:00

  • Identification of consensus RNA secondary structures using suffix arrays.

    abstract:BACKGROUND:The identification of a consensus RNA motif often consists in finding a conserved secondary structure with minimum free energy in an ensemble of aligned sequences. However, an alignment is often difficult to obtain without prior structural information. Thus the need for tools to automate this process. RESUL...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-7-244

    authors: Anwar M,Nguyen T,Turcotte M

    updated:2006-05-05 00:00:00

  • Automatic classification of protein structures using low-dimensional structure space mappings.

    abstract:BACKGROUND:Protein function is closely intertwined with protein structure. Discovery of meaningful structure-function relationships is of utmost importance in protein biochemistry and has led to creation of high-quality, manually curated classification databases, such as the gold-standard SCOP (Structural Classificatio...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-15-S2-S1

    authors: Asarnow D,Singh R

    updated:2014-01-01 00:00:00

  • Mutation status coupled with RNA-sequencing data can efficiently identify important non-significantly mutated genes serving as diagnostic biomarkers of endometrial cancer.

    abstract:BACKGROUND:Endometrial cancers (ECs) are one of the most common types of malignant tumor in females. Substantial efforts had been made to identify significantly mutated genes (SMGs) in ECs and use them as biomarkers for the classification of histological subtypes and the prediction of clinical outcomes. However, the im...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-017-1891-6

    authors: Liu K,He L,Liu Z,Xu J,Liu Y,Kuang Q,Wen Z,Li M

    updated:2017-12-28 00:00:00