How data analysis affects power, reproducibility and biological insight of RNA-seq studies in complex datasets.

Abstract:

The sequencing of the full transcriptome (RNA-seq) has become the preferred choice for the measurement of genome-wide gene expression. Despite its widespread use, challenges remain in RNA-seq data analysis. One often-overlooked aspect is normalization. Despite the fact that a variety of factors or 'batch effects' can contribute unwanted variation to the data, commonly used RNA-seq normalization methods only correct for sequencing depth. The study of gene expression is particularly problematic when it is influenced simultaneously by a variety of biological factors in addition to the one of interest. Using examples from experimental neuroscience, we show that batch effects can dominate the signal of interest; and that the choice of normalization method affects the power and reproducibility of the results. While commonly used global normalization methods are not able to adequately normalize the data, more recently developed RNA-seq normalization can. We focus on one particular method, RUVSeq, and show that it is able to increase power and biological insight of the results. Finally, we provide a tutorial outlining the implementation of RUVSeq normalization that is applicable to a broad range of studies as well as meta-analysis of publicly available data.

journal_name

Nucleic Acids Res

journal_title

Nucleic acids research

authors

Peixoto L,Risso D,Poplawski SG,Wimmer ME,Speed TP,Wood MA,Abel T

doi

10.1093/nar/gkv736

subject

Has Abstract

pub_date

2015-09-18 00:00:00

pages

7664-74

issue

16

eissn

1362-4962

issn

0305-1048

pii

gkv736

journal_volume

43

pub_type

Journal Article
  • Hepatotoxicity of high affinity gapmer antisense oligonucleotides is mediated by RNase H1 dependent promiscuous reduction of very long pre-mRNA transcripts.

    abstract::High affinity antisense oligonucleotides (ASOs) containing bicyclic modifications (BNA) such as locked nucleic acid (LNA) designed to induce target RNA cleavage have been shown to have enhanced potency along with a higher propensity to cause hepatotoxicity. In order to understand the mechanism of this hepatotoxicity, t...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkv1210

    authors: Burel SA,Hart CE,Cauntay P,Hsiao J,Machemer T,Katz M,Watt A,Bui HH,Younis H,Sabripour M,Freier SM,Hung G,Dan A,Prakash TP,Seth PP,Swayze EE,Bennett CF,Crooke ST,Henry SP

    update_date:2016-03-18 00:00:00

  • Deletion analysis of a unique 3' splice site indicates that alternating guanine and thymine residues represent an efficient splicing signal.

    abstract::The 3' splice site of the second intron (I2) of the human apolipoprotein-AII gene, (GT)16GGGCAG, is unique in that, although fully functional, a stretch of alternating guanine and thymine residues replaces the polypyrimidine tract usually associated with 3' splice junctions. The transient expression of successive 5' d...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/15.9.3787

    authors: Shelley CS,Baralle FE

    update_date:1987-05-11 00:00:00

  • uORFdb--a comprehensive literature database on eukaryotic uORF biology.

    abstract::Approximately half of all human transcripts contain at least one upstream translational initiation site that precedes the main coding sequence (CDS) and gives rise to an upstream open reading frame (uORF). We generated uORFdb, publicly available at http://cbdm.mdc-berlin.de/tools/uorfdb, to serve as a comprehensive li...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkt952

    authors: Wethmar K,Barbosa-Silva A,Andrade-Navarro MA,Leutz A

    update_date:2014-01-01 00:00:00

  • The European Nucleotide Archive.

    abstract::The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) is Europe's primary nucleotide-sequence repository. The ENA consists of three main databases: the Sequence Read Archive (SRA), the Trace Archive and EMBL-Bank. The objective of ENA is to support and promote the use of nucleotide sequencing as an experimen...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkq967

    authors: Leinonen R,Akhtar R,Birney E,Bower L,Cerdeno-Tárraga A,Cheng Y,Cleland I,Faruque N,Goodgame N,Gibson R,Hoad G,Jang M,Pakseresht N,Plaister S,Radhakrishnan R,Reddy K,Sobhany S,Ten Hoopen P,Vaughan R,Zalunin V,Cochr

    update_date:2011-01-01 00:00:00

  • Presence of multiple species of polypeptides immunologically related to transcription factor TFIIIA in adult Xenopus tissues.

    abstract::Transcription of 5S RNA gene in Xenopus oocytes requires a 38 kDa transcription factor TFIIIA, which interacts with the 50 bp internal control region of the gene. We looked for TFIIIA-like polypeptides in the extracts of adult Xenopus tissues on the basis of their antigenic cross-reactivity to anti-TFIIIA antibody. Se...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/17.14.5597

    authors: Yasui W,Ryoji M

    update_date:1989-07-25 00:00:00

  • Selection of sequence elements that substitute for the standard AATAAA motif which signals 3' processing and polyadenylation of late simian virus 40 mRNAs.

    abstract::A method is described which allows selection of sequences which can substitute for the normal AATAAA hexanucleotide involved in polyadenylation of SV40 late mRNAs. Plaques were generated from viral DNA lacking the motif, forcing acquisition of substitute sequences. Four variants were characterized. All displayed wild-...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/13.22.8053

    authors: Swimmer C,Shenk T

    update_date:1985-11-25 00:00:00

  • Histone H3 K79 methylation states play distinct roles in UV-induced sister chromatid exchange and cell cycle checkpoint arrest in Saccharomyces cerevisiae.

    abstract::Histone post-translational modifications have been shown to contribute to DNA damage repair. Prior studies have suggested that specific H3K79 methylation states play distinct roles in the response to UV-induced DNA damage. To evaluate these observations, we examined the effect of altered H3K79 methylation patterns on ...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gku242

    authors: Rossodivita AA,Boudoures AL,Mecoli JP,Steenkiste EM,Karl AL,Vines EM,Cole AM,Ansbro MR,Thompson JS

    update_date:2014-06-01 00:00:00

  • Stimuli of differentiation regulate RNA elongation in the transcription units for the major stage-specific antigens of Trypanosoma brucei.

    abstract::In Trypanosoma brucei, the mutually exclusive expression of the major surface antigens, the variant surface glycoprotein (VSG) of the bloodstream form and procyclin of the procyclic form, is due to a stage-specific accumulation of the respective mRNAs. Through the targeting of a reporter construct in the procyclin pro...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/23.11.1862

    authors: Vanhamme L,Berberof M,Le Ray D,Pays E

    update_date:1995-06-11 00:00:00

  • From benchmarking HITS-CLIP peak detection programs to a new method for identification of miRNA-binding sites from Ago2-CLIP data.

    abstract::Experimental evidence indicates that about 60% of miRNA-binding activity does not follow the canonical rule about the seed matching between miRNA and target mRNAs, but rather a non-canonical miRNA targeting activity outside the seed or with seed-like motifs. Here, we propose a new unbiased method to identify canonic...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkx007

    authors: Bottini S,Hamouda-Tekaya N,Tanasa B,Zaragosi LE,Grandjean V,Repetto E,Trabucchi M

    update_date:2017-05-19 00:00:00

  • Cyber-T web server: differential analysis of high-throughput data.

    abstract::The Bayesian regularization method for high-throughput differential analysis, described in Baldi and Long (A Bayesian framework for the analysis of microarray expression data: regularized t-test and statistical inferences of gene changes. Bioinformatics 2001: 17: 509-519) and implemented in the Cyber-T web server, is ...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gks420

    authors: Kayala MA,Baldi P

    update_date:2012-07-01 00:00:00

  • Terminal labeling and addition of homopolymer tracts to duplex DNA fragments by terminal deoxynucleotidyl transferase.

    abstract::Terminal deoxynucleotidyl transferase, which requires a single-stranded DNA primer under the usual assay conditions, can be made to accept double-stranded DNA as primer for the addition of either rNMP or dNMP, if Mg+2 ion is replaced by Co+2 ion. The priming efficiency in the presence of (C leads to) CO+2 ion with res...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/3.1.101

    authors: Roychoudhury R,Jay E,Wu R

    update_date:1976-01-01 00:00:00

  • Stability of the primary organization of nucleosome core particles upon some conformational transitions.

    abstract::The sequential arrangement of histones along DNA in nucleosome core particles was determined between 0.5 and 600 mM salt and from 0 to 8 M urea. These concentrations of salt and urea up to 6 M had no significant effect on the linear order of histones along DNA but 8 M urea caused the rearrangement of histones. Conform...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/9.5.1053

    authors: Zayetz VW,Bavykin SG,Karpov VL,Mirzabekov AD

    update_date:1981-03-11 00:00:00

  • A simulation of subtractive hybridization.

    abstract::Various strategies employed in genomic DNA cloning by subtractive hybridization have been examined by computer simulations, with the comparison between the predictions and the published results. The result shows that the efficiency of target sequence enrichment and the sensitivity to experimental conditions depend str...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/26.6.1440

    authors: Cho TJ,Park SS

    update_date:1998-03-15 00:00:00

  • Novel DNA binding proteins highly specific to UV-damaged DNA sequences from embryos of Drosophila melanogaster.

    abstract::Three new proteins which selectively bind to UV-damaged DNA were identified and purified to near homogeneity from UV-irradiated Drosophila melanogaster embryos through several column chromatographies. These proteins, tentatively designated as D-DDB P1, P2 and P3, can be identified as different complex bands in a gel s...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/23.14.2600

    authors: Kai M,Takahashi T,Todo T,Sakaguchi K

    update_date:1995-07-25 00:00:00

  • The regulatory region of phage fr replicase cistron. III. Initiation activity of specific fr RNA fragments.

    abstract::RNA fragments from phage fr covering the complete or part of the replicase cistron initiation region have been used as templates in the formation of a ribosomal initiation complex in vitro. The results so obtained together with our earlier findings in a similar approach applied to fragments of the structurally related...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/10.23.7763

    authors: Berzin V,Cielens I,Jansone I,Gren EJ

    update_date:1982-12-11 00:00:00

  • bZIP-Type transcription factors CREB and OASIS bind and stimulate the promoter of the mammalian transcription factor GCMa/Gcm1 in trophoblast cells.

    abstract::One of the master regulators of placental cell fusion in mammals leading to multi-nucleated syncytiotrophoblasts is the transcription factor GCMa. Recently, we proved that the cAMP-driven protein kinase A signaling pathway is fundamental for up-regulation of GCMa transcript levels and protein stability. Here, we show ...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkn306

    authors: Schubert SW,Abendroth A,Kilian K,Vogler T,Mayr B,Knerr I,Hashemolhosseini S

    update_date:2008-06-01 00:00:00

  • The new NHGRI-EBI Catalog of published genome-wide association studies (GWAS Catalog).

    abstract::The NHGRI-EBI GWAS Catalog has provided data from published genome-wide association studies since 2008. In 2015, the database was redesigned and relocated to EMBL-EBI. The new infrastructure includes a new graphical user interface (www.ebi.ac.uk/gwas/), ontology supported search functionality and an improved curation ...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkw1133

    authors: MacArthur J,Bowler E,Cerezo M,Gil L,Hall P,Hastings E,Junkins H,McMahon A,Milano A,Morales J,Pendlington ZM,Welter D,Burdett T,Hindorff L,Flicek P,Cunningham F,Parkinson H

    update_date:2017-01-04 00:00:00

  • Predicting translational diffusion of evolutionary conserved RNA structures by the nucleotide number.

    abstract::Ribonucleic acids are highly conserved essential parts of cellular life. RNA function is determined to a large extent by its hydrodynamic behaviour. The presented study proposes a strategy to predict the hydrodynamic behaviour of RNA single strands on the basis of the polymer size. By atom-level shell-modelling of hig...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkq808

    authors: Werner A

    update_date:2011-02-01 00:00:00

  • A possible origin of newly-born bacterial genes: significance of GC-rich nonstop frame on antisense strand.

    abstract::Base compositions were examined at every position in codons of more than 50 genes from taxonomically different bacteria and of the corresponding antisense sequences on the bacterial genes. We propose that the nonstop frame on antisense strand [NSF(a)] of GC-rich bacterial genes is the most promising sequence for newly...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/24.21.4249

    authors: Ikehara K,Amada F,Yoshida S,Mikata Y,Tanaka A

    update_date:1996-11-01 00:00:00

  • The conserved ribonuclease aCPSF1 triggers genome-wide transcription termination of Archaea via a 3'-end cleavage mode.

    abstract::Transcription termination defines accurate transcript 3'-ends and ensures programmed transcriptomes, making it critical to life. However, transcription termination mechanisms remain largely unknown in Archaea. Here, we reported the physiological significance of the newly identified general transcription termination fa...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkaa702

    authors: Yue L,Li J,Zhang B,Qi L,Li Z,Zhao F,Li L,Zheng X,Dong X

    update_date:2020-09-25 00:00:00

  • Increased erythroid-specific expression of a mutated HPFH gamma-globin promoter requires the erythroid factor NFE-1.

    abstract::The -175 T>C mutation in the promoter of the A gamma- or G gamma-globin gene causes a 50-100 fold increase of the expression of the respective gene in adult erythroid cells (Hereditary Persistence of Fetal Hemoglobin). We show here that this mutation increases 3-9 fold the expression of a gamma-CAT report...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/17.14.5509

    authors: Nicolis S,Ronchi A,Malgaretti N,Mantovani R,Giglioni B,Ottolenghi S

    update_date:1989-07-25 00:00:00

  • Coding capacity of complementary DNA strands.

    abstract::A Fortran computer algorithm has been used to analyze the nucleotide sequence of several structural genes. The analysis performed on both coding and complementary DNA strands shows that whereas open reading frames shorter than 100 codons are randomly distributed on both DNA strands, open reading frames longer than 100...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/9.6.1499

    authors: Casino A,Cipollaro M,Guerrini AM,Mastrocinque G,Spena A,Scarlato V

    update_date:1981-03-25 00:00:00

  • Analysis of intermolecular base pair formation of prohead RNA of the phage phi29 DNA packaging motor using NMR spectroscopy.

    abstract::The bacteriophage ø29 DNA packaging motor that assembles on the precursor capsid (prohead) contains an essential 174-nt structural RNA (pRNA) that forms multimers. To determine the structural features of the CE- and D-loops believed to be involved in multimerization of pRNA, 35- and 19-nt RNA molecules containing the ...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkm874

    authors: Kitamura A,Jardine PJ,Anderson DL,Grimes S,Matsuo H

    update_date:2008-02-01 00:00:00

  • PCNA-MutSalpha-mediated binding of MutLalpha to replicative DNA with mismatched bases to induce apoptosis in human cells.

    abstract::Modified bases, such as O6-methylguanines, are produced in cells exposed to alkylating agents and cause apoptosis. In human cells treated with N-methyl-N-nitrosourea, we detected a protein complex composed of MutSalpha, MutLalpha and PCNA on damaged DNA by immunoprecipitation method using chromatin extracts, in which ...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gki878

    authors: Hidaka M,Takagi Y,Takano TY,Sekiguchi M

    update_date:2005-10-04 00:00:00

  • A WW-like module in the RAG1 N-terminal domain contributes to previously unidentified protein-protein interactions.

    abstract::More than one-third of the RAG1 protein can be truncated from the N-terminus with only subtle effects on the products of V(D)J recombination in vitro or in a mouse. What, then, is the function of the N-terminal domain? We believe it to be regulatory. We determined, several years ago, that an included RING motif could ...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkp192

    authors: Maitra R,Sadofsky MJ

    update_date:2009-06-01 00:00:00

  • CORUM: the comprehensive resource of mammalian protein complexes.

    abstract::Protein complexes are key molecular entities that integrate multiple gene products to perform cellular functions. The CORUM (http://mips.gsf.de/genre/proj/corum/index.html) database is a collection of experimentally verified mammalian protein complexes. Information is manually derived by critical reading of the scient...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkm936

    authors: Ruepp A,Brauner B,Dunger-Kaltenbach I,Frishman G,Montrone C,Stransky M,Waegele B,Schmidt T,Doudieu ON,Stümpflen V,Mewes HW

    update_date:2008-01-01 00:00:00

  • Complementary addressed modification of yeast tRNA Val 1 with alkylating derivative of d(pC-G)-A. The positions of the alkylated nucleotides and the course of the alkylation in the complex.

    abstract::Yeast tRNA Val 1 alkylation with 2', 3'-O-4-(N-2-chloroethyl-N-methylamino) benzylidene d(pC-G)-A proceeds at 20 degrees - 30 degrees C in the complementary complexes which are formed by d(pC-G)-A greater than RC1 binding to 3 sequences of tRNA Val 1 : psi-C-G58 in the T loop, C-G40 at the 3'-side of the anticodon loo...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/4.5.1609

    authors: Grineva NI,Karpova GG,Kuznetsova LM,Venkstern TV,Bayev AA

    update_date:1977-01-01 00:00:00

  • CRISPR/Cas9-mediated modulation of splicing efficiency reveals short splicing isoform of Xist RNA is sufficient to induce X-chromosome inactivation.

    abstract::Alternative splicing of mRNA precursors results in multiple protein variants from a single gene and is critical for diverse cellular processes and development. Xist encodes a long noncoding RNA which is a central player to induce X-chromosome inactivation in female mammals and has two major splicing variants: long and...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkx1227

    authors: Yue M,Ogawa Y

    update_date:2018-03-16 00:00:00

  • Primary structure differences between proteins C1 and C2 of HeLa 40S nuclear ribonucleoprotein particles.

    abstract::Partial acid cleavage, comparative HPLC tryptic peptide mapping and amino acid sequencing of the C1 and C2 proteins of HeLa heterogeneous nuclear ribonucleoprotein (hnRNP) particles demonstrate that proteins C1 and C2 differ in primary structure by the presence of a 13 amino acid insert sequence in C2. This C2 insert ...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/17.21.8441

    authors: Merrill BM,Barnett SF,LeStourgeon WM,Williams KR

    update_date:1989-11-11 00:00:00

  • Conserved 5' flank homologies in dipteran 5S RNA genes that would function on 'A' form DNA.

    abstract::We have sequenced the 480 base pair (bp) repeating unit of the 5S RNA genes of the Dipteran fly Calliphora erythrocephala and compared this sequence to the three known 5S RNA gene sequences from the Dipteran Genus Drosophila (1,2). A striking series of five perfectly conserved homologies identically positioned within ...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/12.21.8193

    authors: Rubacha A,Sumner W 3rd,Richter L,Beckingham K

    update_date:1984-11-12 00:00:00