A hybrid imputation approach for microarray missing value estimation.

Abstract:

BACKGROUND: Missing data are inevitable in gene expression microarray experiments because of instrument failure or human error, and they degrade the performance of downstream analysis. Most existing analysis approaches suffer from this prevalent problem. Imputation is one of the most frequently used methods for handling missing data, and research on estimating missing values has advanced considerably. The remaining challenge is to improve imputation accuracy for data with a large missing rate.

METHODS: Inspired by the idea of collaborative training, we propose a novel hybrid imputation method called Recursive Mutual Imputation (RMI). RMI exploits both the global correlation information and the local structure in the data, captured by two popular methods, Bayesian Principal Component Analysis (BPCA) and Local Least Squares (LLS), respectively. The mutual strategy is implemented by sharing the estimated data sequences at each recursive step. We also order the imputation by the number of missing entries in the target gene. A weight-based integration method is used in the final assembling step.

RESULTS: We compare RMI with three state-of-the-art algorithms (BPCA, LLS, and Iterated Local Least Squares imputation (ItrLLS)) on four publicly available microarray datasets. The experimental results show that RMI significantly outperforms the compared methods in terms of Normalized Root Mean Square Error (NRMSE), especially for datasets with large missing rates and fewer complete genes.

CONCLUSIONS: The proposed hybrid imputation approach incorporates both global and local information of microarray genes and achieves lower NRMSE values than either single approach alone. This study also highlights the need for imputation methods to consider the order in which missing entries are imputed.
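The abstract names three building blocks without giving formulas: the NRMSE metric, an imputation order based on the number of missing entries per gene, and a weight-based combination of two estimates. The sketch below illustrates each in plain Python under stated assumptions, and it is not the authors' implementation. The NRMSE form (root mean squared error divided by the standard deviation of the true values) is one definition commonly used in the microarray imputation literature. The ascending order (fewest missing entries first), the single scalar weight, and the function names `nrmse`, `imputation_order`, and `weighted_combine` are hypothetical choices made only for illustration.

```python
import math


def nrmse(true_vals, imputed_vals):
    """RMSE between imputed and true values, normalized by the
    (population) standard deviation of the true values.
    Assumed definition; the abstract does not state the formula."""
    n = len(true_vals)
    if n == 0 or n != len(imputed_vals):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((t - p) ** 2 for t, p in zip(true_vals, imputed_vals)) / n
    mean_t = sum(true_vals) / n
    var_t = sum((t - mean_t) ** 2 for t in true_vals) / n
    if var_t == 0:
        raise ValueError("true values have zero variance")
    return math.sqrt(mse / var_t)


def imputation_order(matrix):
    """Row indices of genes that contain missing entries (None),
    sorted by missing count. Ascending order is an assumption."""
    counts = [(sum(v is None for v in row), i) for i, row in enumerate(matrix)]
    return [i for count, i in sorted(counts) if count > 0]


def weighted_combine(est_a, est_b, weight_a):
    """Convex combination of two estimate vectors (for example, one
    global and one local estimate). The scalar weight is hypothetical."""
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(est_a, est_b)]


if __name__ == "__main__":
    genes = [[1.0, None, None], [1.0, 2.0, 3.0], [None, 2.0, 3.0]]
    print(imputation_order(genes))
    print(weighted_combine([1.0, 2.0], [3.0, 4.0], 0.5))
    print(nrmse([1.0, 3.0], [2.0, 2.0]))
```

A lower NRMSE means the imputed values lie closer to the held-out true values. A value near 1 means the imputation does no better than predicting the mean of the true values.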

journal_name

BMC Genomics

journal_title

BMC genomics

authors

Li H,Zhao C,Shao F,Li GZ,Wang X

doi

10.1186/1471-2164-16-S9-S1

subject

Has Abstract

pub_date

2015-01-01 00:00:00

pages

S1

issn

1471-2164

pii

1471-2164-16-S9-S1

journal_volume

16 Suppl 9

pub_type

Journal Article
  • A transcriptome approach towards understanding the development of ripening capacity in 'Bartlett' pears (Pyrus communis L.).

    abstract:BACKGROUND:The capacity of European pear fruit (Pyrus communis L.) to ripen after harvest develops during the final stages of growth on the tree. The objective of this study was to characterize changes in 'Bartlett' pear fruit physico-chemical properties and transcription profiles during fruit maturation leading to att...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-015-1939-9

    authors: Nham NT,de Freitas ST,Macnish AJ,Carr KM,Kietikul T,Guilatco AJ,Jiang CZ,Zakharov F,Mitcham EJ

    update_date:2015-10-09 00:00:00

  • Large scale genome-wide association and LDLA mapping study identifies QTLs for boar taint and related sex steroids.

    abstract:BACKGROUND:Boar taint is observed in a high proportion of uncastrated male pigs and is characterized by an unpleasant odor/flavor in cooked meat, primarily caused by elevated levels of androstenone and skatole. Androstenone is a steroid produced in the testis in parallel with biosynthesis of other sex steroids like tes...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-12-362

    authors: Grindflek E,Lien S,Hamland H,Hansen MH,Kent M,van Son M,Meuwissen TH

    update_date:2011-07-13 00:00:00

  • The complete and fully assembled genome sequence of Aeromonas salmonicida subsp. pectinolytica and its comparative analysis with other Aeromonas species: investigation of the mobilome in environmental and pathogenic strains.

    abstract:BACKGROUND:Due to the predominant usage of short-read sequencing to date, most bacterial genome sequences reported in the last years remain at the draft level. This precludes certain types of analyses, such as the in-depth analysis of genome plasticity. RESULTS:Here we report the finalized genome sequence of the envir...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-017-4301-6

    authors: Pfeiffer F,Zamora-Lagos MA,Blettinger M,Yeroslaviz A,Dahl A,Gruber S,Habermann BH

    update_date:2018-01-05 00:00:00

  • EPAS1 gene variants are associated with sprint/power athletic performance in two cohorts of European athletes.

    abstract:BACKGROUND:The endothelial PAS domain protein 1 (EPAS1) activates genes that are involved in erythropoiesis and angiogenesis, thus favoring a better delivery of oxygen to the tissues and is a plausible candidate to influence athletic performance. Using innovative statistical methods we compared genotype distributions a...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-15-382

    authors: Voisin S,Cieszczyk P,Pushkarev VP,Dyatlov DA,Vashlyayev BF,Shumaylov VA,Maciejewska-Karlowska A,Sawczuk M,Skuza L,Jastrzebski Z,Bishop DJ,Eynon N

    update_date:2014-05-18 00:00:00

  • Regulatory network changes between cell lines and their tissues of origin.

    abstract:BACKGROUND:Cell lines are an indispensable tool in biomedical research and often used as surrogates for tissues. Although there are recognized important cellular and transcriptomic differences between cell lines and tissues, a systematic overview of the differences between the regulatory processes of a cell line and th...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-017-4111-x

    authors: Lopes-Ramos CM,Paulson JN,Chen CY,Kuijjer ML,Fagny M,Platig J,Sonawane AR,DeMeo DL,Quackenbush J,Glass K

    update_date:2017-09-12 00:00:00

  • SCUD: Saccharomyces cerevisiae ubiquitination database.

    abstract:BACKGROUND:Ubiquitination is an important post-translational modification involved in diverse biological processes. Therefore, genomewide representation of the ubiquitination system for a species is important. DESCRIPTION:SCUD is a web-based database for the ubiquitination system in Saccharomyces cerevisiae (Baker's y...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-9-440

    authors: Lee WC,Lee M,Jung JW,Kim KP,Kim D

    update_date:2008-09-24 00:00:00

  • Improving mammalian genome scaffolding using large insert mate-pair next-generation sequencing.

    abstract:BACKGROUND:Paired-tag sequencing approaches are commonly used for the analysis of genome structure. However, mammalian genomes have a complex organization with a variety of repetitive elements that complicate comprehensive genome-wide analyses. RESULTS:Here, we systematically assessed the utility of paired-end and mat...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-14-257

    authors: van Heesch S,Kloosterman WP,Lansu N,Ruzius FP,Levandowsky E,Lee CC,Zhou S,Goldstein S,Schwartz DC,Harkins TT,Guryev V,Cuppen E

    update_date:2013-04-16 00:00:00

  • Contrasting genetic variation and positive selection followed the divergence of NBS-encoding genes in Asian and European pears.

    abstract:BACKGROUND:The NBS disease-related gene family coordinates the inherent immune system in plants in response to pathogen infections. Previous studies have identified NBS-encoding genes in Pyrus bretschneideri ('Dangshansuli', an Asian pear) and Pyrus communis ('Bartlett', a European pear) genomes, but the patterns of ge...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-020-07226-1

    authors: Sun M,Zhang M,Singh J,Song B,Tang Z,Liu Y,Wang R,Qin M,Li J,Khan A,Wu J

    update_date:2020-11-19 00:00:00

  • Genomic divergences among cattle, dog and human estimated from large-scale alignments of genomic sequences.

    abstract:BACKGROUND:Approximately 11 Mb of finished high quality genomic sequences were sampled from cattle, dog and human to estimate genomic divergences and their regional variation among these lineages. RESULTS:Optimal three-way multi-species global sequence alignments for 84 cattle clones or loci (each >50 kb of genomic se...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-7-140

    authors: Liu GE,Matukumalli LK,Sonstegard TS,Shade LL,Van Tassell CP

    update_date:2006-06-07 00:00:00

  • Analysis of the SOS response of Vibrio and other bacteria with multiple chromosomes.

    abstract:BACKGROUND:The SOS response is a well-known regulatory network present in most bacteria and aimed at addressing DNA damage. It has also been linked extensively to stress-induced mutagenesis, virulence and the emergence and dissemination of antibiotic resistance determinants. Recently, the SOS response has been shown to...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-13-58

    authors: Sanchez-Alberola N,Campoy S,Barbé J,Erill I

    update_date:2012-02-03 00:00:00

  • Comprehensive SNP array study of frequently used neuroblastoma cell lines; copy neutral loss of heterozygosity is common in the cell lines but uncommon in primary tumors.

    abstract:BACKGROUND:Copy neutral loss of heterozygosity (CN-LOH) refers to a special case of LOH occurring without any resulting loss in copy number. These alterations are sometimes seen in tumors as a way to inactivate a tumor suppressor gene and have been found to be important in several types of cancer. RESULTS:We have used ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-12-443

    authors: Kryh H,Carén H,Erichsen J,Sjöberg RM,Abrahamsson J,Kogner P,Martinsson T

    update_date:2011-09-07 00:00:00

  • Characterization of human plasma-derived exosomal RNAs by deep sequencing.

    abstract:BACKGROUND:Exosomes, endosome-derived membrane microvesicles, contain specific RNA transcripts that are thought to be involved in cell-cell communication. These RNA transcripts have great potential as disease biomarkers. To characterize exosomal RNA profiles systemically, we performed RNA sequencing analysis using thre...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-14-319

    authors: Huang X,Yuan T,Tschannen M,Sun Z,Jacob H,Du M,Liang M,Dittmar RL,Liu Y,Liang M,Kohli M,Thibodeau SN,Boardman L,Wang L

    update_date:2013-05-10 00:00:00

  • Genome-wide association study of prolactin levels in blood plasma and cerebrospinal fluid.

    abstract:BACKGROUND:Prolactin is a polypeptide hormone secreted by the anterior pituitary gland that plays an essential role in lactation, tissue growth, and suppressing apoptosis to increase cell survival. Prolactin serves as a key player in many life-critical processes, including immune system and reproduction. Prolactin is a...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-016-2785-0

    authors: Staley LA,Ebbert MT,Parker S,Bailey M,Alzheimer’s Disease Neuroimaging Initiative.,Ridge PG,Goate AM,Kauwe JS

    update_date:2016-06-29 00:00:00

  • Small RNAs from plants, bacteria and fungi within the order Hypocreales are ubiquitous in human plasma.

    abstract:BACKGROUND:The human microbiome plays a significant role in maintaining normal physiology. Changes in its composition have been associated with bowel disease, metabolic disorders and atherosclerosis. Sequences of microbial origin have been observed within small RNA sequencing data obtained from blood samples. The aim o...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-15-933

    authors: Beatty M,Guduric-Fuchs J,Brown E,Bridgett S,Chakravarthy U,Hogg RE,Simpson DA

    update_date:2014-10-25 00:00:00

  • Transcriptome profiling of litchi leaves in response to low temperature reveals candidate regulatory genes and key metabolic events during floral induction.

    abstract:BACKGROUND:Litchi (Litchi chinensis Sonn.) is an economically important evergreen fruit tree widely cultivated in subtropical areas. Low temperature is absolutely required for floral induction of litchi, but its molecular mechanism is not fully understood. Leaves of litchi played a key role during floral induction and ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-017-3747-x

    authors: Zhang H,Shen J,Wei Y,Chen H

    update_date:2017-05-10 00:00:00

  • Comparison between two amplicon-based sequencing panels of different scales in the detection of somatic mutations associated with gastric cancer.

    abstract:BACKGROUND:Sequencing data from The Cancer Genome Atlas (TCGA), the International Cancer Genome Consortium and other research institutes have revealed the presence of genetic alterations in several tumor types, including gastric cancer. These data have been combined into a catalog of significantly mutated genes for eac...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-016-3166-4

    authors: Hirotsu Y,Kojima Y,Okimoto K,Amemiya K,Mochizuki H,Omata M

    update_date:2016-10-26 00:00:00

  • Whole genome sequencing analysis of multiple Salmonella serovars provides insights into phylogenetic relatedness, antimicrobial resistance, and virulence markers across humans, food animals and agriculture environmental sources.

    abstract:BACKGROUND:Salmonella enterica is a significant foodborne pathogen, which can be transmitted via several distinct routes, and reports on acquisition of antimicrobial resistance (AMR) are increasing. To better understand the association between human Salmonella clinical isolates and the potential environmental/animal re...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-018-5137-4

    authors: Pornsukarom S,van Vliet AHM,Thakur S

    update_date:2018-11-06 00:00:00

  • Construction of a highly flexible and comprehensive gene collection representing the ORFeome of the human pathogen Chlamydia pneumoniae.

    abstract:BACKGROUND:The Gram-negative bacterium Chlamydia pneumoniae (Cpn) is the leading intracellular human pathogen responsible for respiratory infections such as pneumonia and bronchitis. Basic and applied research in pathogen biology, especially the elaboration of new mechanism-based anti-pathogen strategies, target discov...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-13-632

    authors: Maier CJ,Maier RH,Virok DP,Maass M,Hintner H,Bauer JW,Onder K

    update_date:2012-11-16 00:00:00

  • Cluster analysis of replicated alternative polyadenylation data using canonical correlation analysis.

    abstract:BACKGROUND:Alternative polyadenylation (APA) has emerged as a pervasive mechanism that contributes to the transcriptome complexity and dynamics of gene regulation. The current tsunami of whole genome poly(A) site data from various conditions generated by 3' end sequencing provides a valuable data source for the study o...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-019-5433-7

    authors: Ye W,Long Y,Ji G,Su Y,Ye P,Fu H,Wu X

    update_date:2019-01-22 00:00:00

  • Single feature polymorphism (SFP)-based selective sweep identification and association mapping of growth-related metabolic traits in Arabidopsis thaliana.

    abstract:BACKGROUND:Natural accessions of Arabidopsis thaliana are characterized by a high level of phenotypic variation that can be used to investigate the extent and mode of selection on the primary metabolic traits. A collection of 54 A. thaliana natural accession-derived lines were subjected to deep genotyping through Singl...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-11-188

    authors: Childs LH,Witucka-Wall H,Günther T,Sulpice R,Korff MV,Stitt M,Walther D,Schmid KJ,Altmann T

    update_date:2010-03-20 00:00:00

  • Bayesian prediction of bacterial growth temperature range based on genome sequences.

    abstract:BACKGROUND:The preferred habitat of a given bacterium can provide a hint of which types of enzymes of potential industrial interest it might produce. These might include enzymes that are stable and active at very high or very low temperatures. Being able to accurately predict this based on a genomic sequence, would thu...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-13-S7-S3

    authors: Jensen DB,Vesth TC,Hallin PF,Pedersen AG,Ussery DW

    update_date:2012-01-01 00:00:00

  • Cross-species hybridisation of pig RNA to human nylon microarrays.

    abstract:BACKGROUND:The objective of this research was to investigate the reproducibility of cross-species microarray hybridisation. Comparisons between same- and cross-species hybridisations were also made. Nine hybridisations between a single pig skeletal muscle RNA sample and three human cDNA nylon microarrays were completed...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-3-27

    authors: Moody DE,Zou Z,McIntyre L

    update_date:2002-09-27 00:00:00

  • Outlier analysis of functional genomic profiles enriches for oncology targets and enables precision medicine.

    abstract:BACKGROUND:Genome-scale functional genomic screens across large cell line panels provide a rich resource for discovering tumor vulnerabilities that can lead to the next generation of targeted therapies. Their data analysis typically has focused on identifying genes whose knockdown enhances response in various pre-defin...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-016-2807-y

    authors: Zhu Z,Ihle NT,Rejto PA,Zarrinkar PP

    update_date:2016-06-13 00:00:00

  • Whole genome sequence analysis of the TALLYHO/Jng mouse.

    abstract:BACKGROUND:The TALLYHO/Jng (TH) mouse is a polygenic model for obesity and type 2 diabetes first described in the literature in 2001. The origin of the TH strain is an outbred colony of the Theiler Original strain and mice derived from this source were selectively bred for male hyperglycemia establishing an inbred stra...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-016-3245-6

    authors: Denvir J,Boskovic G,Fan J,Primerano DA,Parkman JK,Kim JH

    update_date:2016-11-11 00:00:00

  • Genomic diversity dynamics in conserved chicken populations are revealed by genome-wide SNPs.

    abstract:BACKGROUND:Maintaining maximum genetic diversity and preserving breed viability in conserved populations necessitates the rigorous evaluation of conservation schemes. Three chicken breeds (Baier Yellow Chicken (BEC), Beijing You Chicken (BYC) and Langshan Chicken (LSC)) are currently in conservation programs in China. ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-018-4973-6

    authors: Zhang M,Han W,Tang H,Li G,Zhang M,Xu R,Liu Y,Yang T,Li W,Zou J,Wu K

    update_date:2018-08-09 00:00:00

  • Temporal transcriptome changes induced by MDV in Marek's disease-resistant and -susceptible inbred chickens.

    abstract:BACKGROUND:Marek's disease (MD) is a lymphoproliferative disease in chickens caused by Marek's disease virus (MDV) and characterized by T cell lymphoma and infiltration of lymphoid cells into various organs such as liver, spleen, peripheral nerves and muscle. Resistance to MD and disease risk have long been thought to ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-12-501

    authors: Yu Y,Luo J,Mitra A,Chang S,Tian F,Zhang H,Yuan P,Zhou H,Song J

    update_date:2011-10-12 00:00:00

  • Unravelling the complex trait of harvest index in rapeseed (Brassica napus L.) with association mapping.

    abstract:BACKGROUND:Harvest index (HI), the ratio of grain yield to total biomass, is considered as a measure of biological success in partitioning assimilated photosynthate to the harvestable product. While crop production can be dramatically improved by increasing HI, the underlying molecular genetic mechanism of HI in rapese...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-015-1607-0

    authors: Luo X,Ma C,Yue Y,Hu K,Li Y,Duan Z,Wu M,Tu J,Shen J,Yi B,Fu T

    update_date:2015-05-12 00:00:00

  • High-throughput genome sequencing of lichenizing fungi to assess gene loss in the ammonium transporter/ammonia permease gene family.

    abstract:BACKGROUND:Horizontal gene transfer has shaped the evolution of the ammonium transporter/ammonia permease gene family. Horizontal transfers of ammonium transporter/ammonia permease genes into the fungi include one transfer from archaea to the filamentous ascomycetes associated with the adaptive radiation of the leotiom...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-14-225

    authors: McDonald TR,Mueller O,Dietrich FS,Lutzoni F

    update_date:2013-04-04 00:00:00

  • Transcript profiling of Populus tomentosa genes in normal, tension, and opposite wood by RNA-seq.

    abstract:BACKGROUND:Wood formation affects the chemical and physical properties of wood, and thus affects its utility as a building material or a feedstock for biofuels, pulp and paper. To obtain genome-wide insights on the transcriptome changes and regulatory networks in wood formation, we used high-throughput RNA sequencing t...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-015-1390-y

    authors: Chen J,Chen B,Zhang D

    update_date:2015-03-10 00:00:00

  • Flashy flagella: flagellin modification is relatively common and highly versatile among the Enterobacteriaceae.

    abstract:BACKGROUND:Post-translational glycosylation of the flagellin protein is relatively common among Gram-negative bacteria, and has been linked to several phenotypes, including flagellar biosynthesis and motility, biofilm formation, host immune evasion and manipulation and virulence. However to date, despite extensive phys...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-016-2735-x

    authors: De Maayer P,Cowan DA

    update_date:2016-05-20 00:00:00