A hybrid imputation approach for microarray missing value estimation.

Abstract:

BACKGROUND: Missing data are inevitable in gene expression microarray experiments because of instrument failure or human error, and they degrade the performance of downstream analysis. Most existing analysis methods suffer from this prevalent problem. Imputation is one of the most frequently used ways of handling missing data, and research on missing value estimation has made considerable progress. The remaining challenge is to improve imputation accuracy for data with a large missing rate.

METHODS: Inspired by collaborative training, we propose a novel hybrid imputation method called Recursive Mutual Imputation (RMI). RMI exploits both the global correlation information and the local structure in the data, captured by two popular methods, Bayesian Principal Component Analysis (BPCA) and Local Least Squares (LLS), respectively. The mutual strategy is implemented by sharing the estimated data sequences at each recursive step. In addition, the order of imputation is determined by the number of missing entries in the target gene. Finally, a weight-based integration method is used in the final assembly step.

RESULTS: We compare RMI with three state-of-the-art algorithms (BPCA, LLS, and Iterated Local Least Squares imputation (ItrLLS)) on four publicly available microarray datasets. Experimental results show that RMI significantly outperforms the compared methods in terms of Normalized Root Mean Square Error (NRMSE), especially for datasets with large missing rates and fewer complete genes.

CONCLUSIONS: The proposed hybrid imputation approach incorporates both global and local information of microarray genes and achieves lower NRMSE values than any single approach alone. This study also highlights the need for imputation methods to consider the order in which missing entries are imputed.
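The abstract does not define NRMSE, the imputation order, or the weighting scheme, so the following is only a minimal sketch under stated assumptions. It assumes the NRMSE form common in the microarray imputation literature (root of mean squared error divided by the variance of the true values), assumes that genes with fewer missing entries are imputed first, and uses a fixed hypothetical weight to blend two estimates. The function names `nrmse`, `imputation_order`, and `weighted_combine` are illustrative and do not come from the paper. BPCA and LLS themselves are not implemented here.

```python
import math


def nrmse(true_values, imputed_values):
    """Assumed NRMSE: sqrt(mean squared error / variance of the true values)."""
    n = len(true_values)
    mean_true = sum(true_values) / n
    mse = sum((g - t) ** 2 for t, g in zip(true_values, imputed_values)) / n
    var_true = sum((t - mean_true) ** 2 for t in true_values) / n
    return math.sqrt(mse / var_true)


def imputation_order(expr):
    """Indices of genes (rows) with missing entries (NaN), fewest missing first.

    The fewest-first direction is an assumption; the abstract only says the
    order depends on the number of missing entries in the target gene.
    """
    counts = [sum(math.isnan(v) for v in row) for row in expr]
    incomplete = [i for i, c in enumerate(counts) if c > 0]
    return sorted(incomplete, key=lambda i: counts[i])


def weighted_combine(est_a, est_b, weight_a=0.5):
    """Blend two estimates of the same entries with a hypothetical fixed weight."""
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(est_a, est_b)]


if __name__ == "__main__":
    nan = float("nan")
    expr = [
        [1.0, nan, nan],  # two missing entries
        [1.0, 2.0, 3.0],  # complete gene
        [nan, 2.0, 3.0],  # one missing entry
    ]
    print(imputation_order(expr))
    print(weighted_combine([1.0, 3.0], [3.0, 5.0]))
    print(nrmse([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))
```

Under this definition, filling every removed entry with the mean of the true values gives an NRMSE of 1, so values well below 1 indicate that an imputation method is recovering real structure.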

journal_name

BMC Genomics

journal_title

BMC genomics

authors

Li H,Zhao C,Shao F,Li GZ,Wang X

doi

10.1186/1471-2164-16-S9-S1

pub_date

2015-01-01 00:00:00

pages

S1

issn

1471-2164

pii

1471-2164-16-S9-S1

journal_volume

16 Suppl 9

pub_type

Journal Article
  • Efficient depletion of ribosomal RNA for RNA sequencing in planarians.

    abstract:BACKGROUND:The astounding regenerative abilities of planarian flatworms prompt steadily growing interest in examining their molecular foundation. Planarian regeneration was found to require hundreds of genes and is hence a complex process. Thus, RNA interference followed by transcriptome-wide gene expression analysis b...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-019-6292-y

    authors: Kim IV,Ross EJ,Dietrich S,Döring K,Sánchez Alvarado A,Kuhn CD

    update_date:2019-11-29 00:00:00

  • Evaluation and optimisation of indel detection workflows for ion torrent sequencing of the BRCA1 and BRCA2 genes.

    abstract:BACKGROUND:The Ion Torrent PGM is a popular benchtop sequencer that shows promise in replacing conventional Sanger sequencing as the gold standard for mutation detection. Despite the PGM's reported high accuracy in calling single nucleotide variations, it tends to generate many false positive calls in detecting inserti...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-15-516

    authors: Yeo ZX,Wong JC,Rozen SG,Lee AS

    update_date:2014-06-24 00:00:00

  • Acquisition of genome information from single-celled unculturable organisms (radiolaria) by exploiting genome profiling (GP).

    abstract:BACKGROUND:There is no effective method to obtain genome information from single-celled unculturable organisms such as radiolarians. Even worse, such organisms are often very difficult to collect. Sequence analysis of 18S rDNA has been carried out, but obtaining the data has been difficult and it has provided a rather ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-7-135

    authors: Kouduka M,Matuoka A,Nishigaki K

    update_date:2006-06-02 00:00:00

  • Systemic treatment of xenografts with vaccinia virus GLV-1h68 reveals the immunologic facet of oncolytic therapy.

    abstract:BACKGROUND:GLV-1h68 is an attenuated recombinant vaccinia virus (VACV) that selectively colonizes established human xenografts inducing their complete regression. RESULTS:Here, we explored xenograft/VACV/host interactions in vivo adopting organism-specific expression arrays and tumor cell/VACV in vitro comparing VACV ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-10-301

    authors: Worschech A,Chen N,Yu YA,Zhang Q,Pos Z,Weibel S,Raab V,Sabatino M,Monaco A,Liu H,Monsurró V,Buller RM,Stroncek DF,Wang E,Szalay AA,Marincola FM

    update_date:2009-07-07 00:00:00

  • Comparative transcriptomics uncovers alternative splicing changes and signatures of selection from maize improvement.

    abstract:BACKGROUND:Alternative splicing (AS) is an important regulatory mechanism that greatly contributes to eukaryotic transcriptome diversity. A substantial amount of evidence has demonstrated that AS complexity is relevant to eukaryotic evolution, development, adaptation, and complexity. In this study, six teosinte and ten...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-015-1582-5

    authors: Huang J,Gao Y,Jia H,Liu L,Zhang D,Zhang Z

    update_date:2015-05-08 00:00:00

  • Analysis of plant LTR-retrotransposons at the fine-scale family level reveals individual molecular patterns.

    abstract:BACKGROUND:Sugarcane is an important crop worldwide for sugar production and increasingly, as a renewable energy source. Modern cultivars have polyploid, large complex genomes, with highly unequal contributions from ancestral genomes. Long Terminal Repeat retrotransposons (LTR-RTs) are the single largest components of ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-13-137

    authors: Domingues DS,Cruz GM,Metcalfe CJ,Nogueira FT,Vicentini R,Alves Cde S,Van Sluys MA

    update_date:2012-04-16 00:00:00

  • Sequencing and analysis of the gene-rich space of cowpea.

    abstract:BACKGROUND:Cowpea, Vigna unguiculata (L.) Walp., is one of the most important food and forage legumes in the semi-arid tropics because of its drought tolerance and ability to grow on poor quality soils. Approximately 80% of cowpea production takes place in the dry savannahs of tropical West and Central Africa, mostly b...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-9-103

    authors: Timko MP,Rushton PJ,Laudeman TW,Bokowiec MT,Chipumuro E,Cheung F,Town CD,Chen X

    update_date:2008-02-27 00:00:00

  • Diversity of the cell-wall associated genomic island of the archaeon Haloquadratum walsbyi.

    abstract:BACKGROUND:Haloquadratum walsbyi represents up to 80% of cells in NaCl-saturated brines worldwide, but is notoriously difficult to maintain under laboratory conditions. In order to establish the extent of genetic diversity in a natural population of this microbe, we screened a H. walsbyi enriched metagenomic fosmid lib...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-015-1794-8

    authors: Martin-Cuadrado AB,Pašić L,Rodriguez-Valera F

    update_date:2015-08-13 00:00:00

  • Leaps and lulls in the developmental transcriptome of Dictyostelium discoideum.

    abstract:BACKGROUND:Development of the soil amoeba Dictyostelium discoideum is triggered by starvation. When placed on a solid substrate, the starving solitary amoebae cease growth, communicate via extracellular cAMP, aggregate by tens of thousands and develop into multicellular organisms. Early phases of the developmental prog...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-015-1491-7

    authors: Rosengarten RD,Santhanam B,Fuller D,Katoh-Kurasawa M,Loomis WF,Zupan B,Shaulsky G

    update_date:2015-04-13 00:00:00

  • Correction to: Comparative transcriptomics reveals PrrAB-mediated control of metabolic, respiration, energy-generating, and dormancy pathways in Mycobacterium smegmatis.

    abstract:Following the publication of the original article [1], the authors reported an error in Fig. 2 of the PDF version of their article. ...

    journal_title:BMC genomics

    pub_type: Journal Article, Published Erratum

    doi:10.1186/s12864-019-6419-1

    authors: Maarsingh JD,Yang S,Park JG,Haydel SE

    update_date:2019-12-31 00:00:00

  • Cross-species global and subset gene expression profiling identifies genes involved in prostate cancer response to selenium.

    abstract:BACKGROUND:Gene expression technologies have the ability to generate vast amounts of data, yet there often resides only limited resources for subsequent validation studies. This necessitates the ability to perform sorting and prioritization of the output data. Previously described methodologies have used functional pat...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-5-58

    authors: Schlicht M,Matysiak B,Brodzeller T,Wen X,Liu H,Zhou G,Dhir R,Hessner MJ,Tonellato P,Suckow M,Pollard M,Datta MW

    update_date:2004-08-20 00:00:00

  • Global analyses of Ceratocystis cacaofunesta mitochondria: from genome to proteome.

    abstract:BACKGROUND:The ascomycete fungus Ceratocystis cacaofunesta is the causal agent of wilt disease in cacao, which results in significant economic losses in the affected producing areas. Despite the economic importance of the Ceratocystis complex of species, no genomic data are available for any of its members. Given that ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-14-91

    authors: Ambrosio AB,do Nascimento LC,Oliveira BV,Teixeira PJ,Tiburcio RA,Toledo Thomazella DP,Leme AF,Carazzolle MF,Vidal RO,Mieczkowski P,Meinhardt LW,Pereira GA,Cabrera OG

    update_date:2013-02-11 00:00:00

  • A phenomics-based approach for the detection and interpretation of shared genetic influences on 29 biochemical indices in southern Chinese men.

    abstract:BACKGROUND:Phenomics provides new technologies and platforms as a systematic phenome-genome approach. However, few studies have reported on the systematic mining of shared genetics among clinical biochemical indices based on phenomics methods, especially in China. This study aimed to apply phenomics to systematically e...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-019-6363-0

    authors: Hu Y,Tan A,Yu L,Hou C,Kuang H,Wu Q,Su J,Zhou Q,Zhu Y,Zhang C,Wei W,Li L,Li W,Huang Y,Huang H,Xie X,Lu T,Zhang H,Yang X,Gao Y,Li T,Jiang Y,Mo Z

    update_date:2019-12-16 00:00:00

  • The GAMYB gene in rye: sequence, polymorphisms, map location, allele-specific markers, and relationship with α-amylase activity.

    abstract:BACKGROUND:Transcription factor (TF) GAMYB, belonging to MYB family (named after the gene of the avian myeloblastosis virus) is a master gibberellin (GA)-induced regulatory protein that is crucial for development and germination of cereal grain and involved in anther formation. It activates many genes including high-mo...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-020-06991-3

    authors: Bienias A,Góralska M,Masojć P,Milczarski P,Myśków B

    update_date:2020-08-24 00:00:00

  • Obesity-related known and candidate SNP markers can significantly change affinity of TATA-binding protein for human gene promoters.

    abstract:BACKGROUND:Obesity affects quality of life and life expectancy and is associated with cardiovascular disorders, cancer, diabetes, reproductive disorders in women, prostate diseases in men, and congenital anomalies in children. The use of single nucleotide polymorphism (SNP) markers of diseases and drug responses (i.e.,...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-16-S13-S5

    authors: Arkova OV,Ponomarenko MP,Rasskazov DA,Drachkova IA,Arshinova TV,Ponomarenko PM,Savinkova LK,Kolchanov NA

    update_date:2015-01-01 00:00:00

  • Effect of paleopolyploidy and allopolyploidy on gene expression in banana.

    abstract:BACKGROUND:Bananas (Musa spp.) are an important crop worldwide. Most modern cultivars resulted from a complex polyploidization history that comprised three whole genome duplications (WGDs) shaping the haploid Musa genome, followed by inter- and intra-specific crosses between Musa acuminata and M. balbisiana (A and B ge...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-019-5618-0

    authors: Cenci A,Hueber Y,Zorrilla-Fontanesi Y,van Wesemael J,Kissel E,Gislard M,Sardos J,Swennen R,Roux N,Carpentier SC,Rouard M

    update_date:2019-03-27 00:00:00

  • Exclusivity offers a sound yet practical species criterion for bacteria despite abundant gene flow.

    abstract:BACKGROUND:The question of whether bacterial species objectively exist has long divided microbiologists. A major source of contention stems from the fact that bacteria regularly engage in horizontal gene transfer (HGT), making it difficult to ascertain relatedness and draw boundaries between taxa. A natural way to defi...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-018-5099-6

    authors: Wright ES,Baum DA

    update_date:2018-10-03 00:00:00

  • Transcriptomic analyses reveal the adaptive features and biological differences of guts from two invasive whitefly species.

    abstract:BACKGROUND:The gut of phloem feeding insects is critical for nutrition uptake and xenobiotics degradation. However, partly due to its tiny size, genomic information for the gut of phloem feeding insects is limited. RESULTS:In this study, the gut transcriptomes of two species of invasive whiteflies in the Bemisia tabac...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-15-370

    authors: Ye XD,Su YL,Zhao QY,Xia WQ,Liu SS,Wang XW

    update_date:2014-05-15 00:00:00

  • DNA and RNA-sequence based GWAS highlights membrane-transport genes as key modulators of milk lactose content.

    abstract:BACKGROUND:Lactose provides an easily-digested energy source for neonates, and is the primary carbohydrate in milk in most species. Bovine lactose is also a key component of many human food products. However, compared to analyses of other milk components, the genetic control of lactose has been little studied. Here we ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-017-4320-3

    authors: Lopdell TJ,Tiplady K,Struchalin M,Johnson TJJ,Keehan M,Sherlock R,Couldrey C,Davis SR,Snell RG,Spelman RJ,Littlejohn MD

    update_date:2017-12-15 00:00:00

  • Inorganic Arsenic-induced cellular transformation is coupled with genome wide changes in chromatin structure, transcriptome and splicing patterns.

    abstract:BACKGROUND:Arsenic (As) exposure is a significant worldwide environmental health concern. Low dose, chronic arsenic exposure has been associated with a higher than normal risk of skin, lung, and bladder cancer, as well as cardiovascular disease and diabetes. While arsenic-induced biological changes play a role in disea...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-015-1295-9

    authors: Riedmann C,Ma Y,Melikishvili M,Godfrey SG,Zhang Z,Chen KC,Rouchka EC,Fondufe-Mittendorf YN

    update_date:2015-03-19 00:00:00

  • Directionality of point mutation and 5-methylcytosine deamination rates in the chimpanzee genome.

    abstract:BACKGROUND:The pattern of point mutation is important for studying mutational mechanisms, genome evolution, and diseases. Previous studies of mutation direction were largely based on substitution data from a limited number of loci. To date, there is no genome-wide analysis of mutation direction or methylation-dependent...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-7-316

    authors: Jiang C,Zhao Z

    update_date:2006-12-13 00:00:00

  • Comparative genome analysis of Prevotella intermedia strain isolated from infected root canal reveals features related to pathogenicity and adaptation.

    abstract:BACKGROUND:Many species of the genus Prevotella are pathogens that cause oral diseases. Prevotella intermedia is known to cause various oral disorders e.g. periodontal disease, periapical periodontitis and noma as well as colonize in the respiratory tract and be associated with cystic fibrosis and chronic bronchitis. I...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-015-1272-3

    authors: Ruan Y,Shen L,Zou Y,Qi Z,Yin J,Jiang J,Guo L,He L,Chen Z,Tang Z,Qin S

    update_date:2015-02-25 00:00:00

  • Genome-wide identification and comparison of differentially expressed profiles of miRNAs and lncRNAs with associated ceRNA networks in the gonads of Chinese soft-shelled turtle, Pelodiscus sinensis.

    abstract:BACKGROUND:The gonad is the major factor affecting animal reproduction. The regulatory mechanism of the expression of protein-coding genes involved in reproduction still remains to be elucidated. Increasing evidence has shown that ncRNAs play key regulatory roles in gene expression in many life processes. The roles of ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-020-06826-1

    authors: Ma X,Cen S,Wang L,Zhang C,Wu L,Tian X,Wu Q,Li X,Wang X

    update_date:2020-06-29 00:00:00

  • Unlocking the mystery of the hard-to-sequence phage genome: PaP1 methylome and bacterial immunity.

    abstract:BACKGROUND:Whole-genome sequencing is an important method to understand the genetic information, gene function, biological characteristics and survival mechanisms of organisms. Sequencing large genomes is very simple at present. However, we encountered a hard-to-sequence genome of Pseudomonas aeruginosa phage PaP1. Sho...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-15-803

    authors: Lu S,Le S,Tan Y,Li M,Liu C,Zhang K,Huang J,Chen H,Rao X,Zhu J,Zou L,Ni Q,Li S,Wang J,Jin X,Hu Q,Yao X,Zhao X,Zhang L,Huang G,Hu F

    update_date:2014-09-19 00:00:00

  • Gene silencing pathways found in the green alga Volvox carteri reveal insights into evolution and origins of small RNA systems in plants.

    abstract:BACKGROUND:Volvox carteri (V. carteri) is a multicellular green alga used as model system for the evolution of multicellularity. So far, the contribution of small RNA pathways to these phenomena is not understood. Thus, we have sequenced V. carteri Argonaute 3 (VcAGO3)-associated small RNAs from different developmental...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-016-3202-4

    authors: Dueck A,Evers M,Henz SR,Unger K,Eichner N,Merkl R,Berezikov E,Engelmann JC,Weigel D,Wenzl S,Meister G

    update_date:2016-11-02 00:00:00

  • Gut microbiota dysbiosis and bacterial community assembly associated with cholesterol gallstones in large-scale study.

    abstract:BACKGROUND:Elucidating gut microbiota among gallstone patients as well as the complex bacterial colonization of cholesterol gallstones may help in both the prediction and subsequent lowered risk of cholelithiasis. To this end, we studied the composition of bacterial communities of gut, bile, and gallstones from 29 gall...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-14-669

    authors: Wu T,Zhang Z,Liu B,Hou D,Liang Y,Zhang J,Shi P

    update_date:2013-10-01 00:00:00

  • Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing.

    abstract:BACKGROUND:Milkweeds (Asclepias L.) have been extensively investigated in diverse areas of evolutionary biology and ecology; however, there are few genetic resources available to facilitate and compliment these studies. This study explored how low coverage genome sequencing of the common milkweed (Asclepias syriaca L.)...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-12-211

    authors: Straub SC,Fishbein M,Livshultz T,Foster Z,Parks M,Weitemier K,Cronn RC,Liston A

    update_date:2011-05-04 00:00:00

  • Comparative mitogenomic analysis of the superfamily Pentatomoidea (Insecta: Hemiptera: Heteroptera) and phylogenetic implications.

    abstract:BACKGROUND:Insect mitochondrial genomes (mitogenomes) are the most extensively used genetic marker for evolutionary and population genetics studies of insects. The Pentatomoidea superfamily is economically important and the largest superfamily within Pentatomomorpha with over 7,000 species. To better understand the div...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-015-1679-x

    authors: Yuan ML,Zhang QL,Guo ZL,Wang J,Shen YY

    update_date:2015-06-16 00:00:00

  • Enrichment of Triticum aestivum gene annotations using ortholog cliques and gene ontologies in other plants.

    abstract:BACKGROUND:While the gargantuan multi-nation effort of sequencing T. aestivum gets close to completion, the annotation process for the vast number of wheat genes and proteins is in its infancy. Previous experimental studies carried out on model plant organisms such as A. thaliana and O. sativa provide a plethora of gen...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-015-1496-2

    authors: Tulpan D,Leger S,Tchagang A,Pan Y

    update_date:2015-04-15 00:00:00

  • Involvements of PCD and changes in gene expression profile during self-pruning of spring shoots in sweet orange (Citrus sinensis).

    abstract:BACKGROUND:Citrus shoot tips abscise at an anatomically distinct abscission zone (AZ) that separates the top part of the shoots into basal and apical portions (citrus self-pruning). Cell separation occurs only at the AZ, which suggests its cells have distinctive molecular regulation. Although several studies have looke...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-15-892

    authors: Zhang JZ,Zhao K,Ai XY,Hu CG

    update_date:2014-10-13 00:00:00