BISR-RNAseq: an efficient and scalable RNAseq analysis workflow with interactive report generation.

Abstract:

BACKGROUND: RNA sequencing has become an increasingly affordable way to profile gene expression patterns. Here we introduce a workflow that combines several open-source software tools and can be run in a high-performance computing environment. RESULTS: The pipeline was developed by the Bioinformatics Shared Resource Group (BISR) at the Ohio State University. To demonstrate its feasibility, we applied it to a few publicly available RNAseq datasets downloaded from GEO. Source code is available for the workflow (https://code.bmi.osumc.edu/gadepalli.3/BISR-RNAseq-ICIBM2019) and for the Shiny application (https://code.bmi.osumc.edu/gadepalli.3/BISR_RNASeq_ICIBM19). An example dataset is demonstrated at https://dataportal.bmi.osumc.edu/RNA_Seq/. CONCLUSION: The workflow supports the analysis of raw RNAseq data (alignment, QC, generation of gene-wise counts) and seamlessly integrates quality analysis and differential expression results into a configurable R Shiny web application.

journal_name

BMC Bioinformatics

journal_title

BMC bioinformatics

authors

Gadepalli VS,Ozer HG,Yilmaz AS,Pietrzak M,Webb A

doi

10.1186/s12859-019-3251-1

pub_date

2019-12-20 00:00:00

pages

670

issue

Suppl 24

issn

1471-2105

pii

10.1186/s12859-019-3251-1

journal_volume

20

pub_type

Journal Article
  • On the consistency of orthology relationships.

    abstract:BACKGROUND:Orthologs inference is the starting point of most comparative genomics studies, and a plethora of methods have been designed in the last decade to address this challenging task. In this paper we focus on the problems of deciding consistency with a species tree (known or not) of a partial set of orthology/par...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-016-1267-3

    authors: Jones M,Paul C,Scornavacca C

    update_date: 2016-11-11 00:00:00

  • ImmunoGlobe: enabling systems immunology with a manually curated intercellular immune interaction network.

    abstract:BACKGROUND:While technological advances have made it possible to profile the immune system at high resolution, translating high-throughput data into knowledge of immune mechanisms has been challenged by the complexity of the interactions underlying immune processes. Tools to explore the immune network are critical for ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03702-3

    authors: Atallah MB,Tandon V,Hiam KJ,Boyce H,Hori M,Atallah W,Spitzer MH,Engleman E,Mallick P

    update_date: 2020-08-10 00:00:00

  • Comparative study of discretization methods of microarray data for inferring transcriptional regulatory networks.

    abstract:BACKGROUND:Microarray data discretization is a basic preprocess for many algorithms of gene regulatory network inference. Some common discretization methods in informatics are used to discretize microarray data. Selection of the discretization method is often arbitrary and no systematic comparison of different discreti...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-520

    authors: Li Y,Liu L,Bai X,Cai H,Ji W,Guo D,Zhu Y

    update_date: 2010-10-19 00:00:00

  • SNP and gene networks construction and analysis from classification of copy number variations data.

    abstract:BACKGROUND:Detection of genomic DNA copy number variations (CNVs) can provide a complete and more comprehensive view of human disease. It is interesting to identify and represent relevant CNVs from a genome-wide data due to high data volume and the complexity of interactions. RESULTS:In this paper, we incorporate the ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-12-S5-S4

    authors: Liu Y,Lee YF,Ng MK

    update_date: 2011-01-01 00:00:00

  • Global rank-invariant set normalization (GRSN) to reduce systematic distortions in microarray data.

    abstract:BACKGROUND:Microarray technology has become very popular for globally evaluating gene expression in biological samples. However, non-linear variation associated with the technology can make data interpretation unreliable. Therefore, methods to correct this kind of technical variation are critical. Here we consider a me...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-520

    authors: Pelz CR,Kulesz-Martin M,Bagby G,Sears RC

    update_date: 2008-12-04 00:00:00

  • Comparative study on gene set and pathway topology-based enrichment methods.

    abstract:BACKGROUND:Enrichment analysis is a popular approach to identify pathways or sets of genes which are significantly enriched in the context of differentially expressed genes. The traditional gene set enrichment approach considers a pathway as a simple gene list disregarding any knowledge of gene or protein interactions....

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-015-0751-5

    authors: Bayerlová M,Jung K,Kramer F,Klemm F,Bleckmann A,Beißbarth T

    update_date: 2015-10-22 00:00:00

  • SegCorr a statistical procedure for the detection of genomic regions of correlated expression.

    abstract:BACKGROUND:Detecting local correlations in expression between neighboring genes along the genome has proved to be an effective strategy to identify possible causes of transcriptional deregulation in cancer. It has been successfully used to illustrate the role of mechanisms such as copy number variation (CNV) or epigene...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-017-1742-5

    authors: Delatola EI,Lebarbier E,Mary-Huard T,Radvanyi F,Robin S,Wong J

    update_date: 2017-07-11 00:00:00

  • A novel parametric approach to mine gene regulatory relationship from microarray datasets.

    abstract:BACKGROUND:Microarray has been widely used to measure the gene expression level on the genome scale in the current decade. Many algorithms have been developed to reconstruct gene regulatory networks based on microarray data. Unfortunately, most of these models and algorithms focus on global properties of the expression...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-S11-S15

    authors: Liu W,Li D,Liu Q,Zhu Y,He F

    update_date: 2010-12-14 00:00:00

  • Pathogenic Bacillus anthracis in the progressive gene losses and gains in adaptive evolution.

    abstract:BACKGROUND:Sequence mutations represent a driving force of adaptive evolution in bacterial pathogens. It is especially evident in reductive genome evolution where bacteria underwent lifestyles shifting from a free-living to a strictly intracellular or host-depending life. It resulted in loss-of-function mutations and/o...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-10-S1-S3

    authors: Yu GX

    update_date: 2009-01-30 00:00:00

  • Predicting blood pressure from physiological index data using the SVR algorithm.

    abstract:BACKGROUND:Blood pressure diseases have increasingly been identified as among the main factors threatening human health. How to accurately and conveniently measure blood pressure is the key to the implementation of effective prevention and control measures for blood pressure diseases. Traditional blood pressure measure...

    journal_title:BMC bioinformatics

    pub_type: Clinical Trial, Journal Article

    doi:10.1186/s12859-019-2667-y

    authors: Zhang B,Ren H,Huang G,Cheng Y,Hu C

    update_date: 2019-02-28 00:00:00

  • Evaluation of methods for differential expression analysis on multi-group RNA-seq count data.

    abstract:BACKGROUND:RNA-seq is a powerful tool for measuring transcriptomes, especially for identifying differentially expressed genes or transcripts (DEGs) between sample groups. A number of methods have been developed for this task, and several evaluation studies have also been reported. However, those evaluations so far have...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-015-0794-7

    authors: Tang M,Sun J,Shimizu K,Kadota K

    update_date: 2015-11-04 00:00:00

  • The acquisition of novel N-glycosylation sites in conserved proteins during human evolution.

    abstract:BACKGROUND:N-linked protein glycosylation plays an important role in various biological processes, including protein folding and trafficking, and cell adhesion and signaling. The acquisition of a novel N-glycosylation site may have significant effect on protein structure and function, and therefore, on the phenotype. ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-015-0468-5

    authors: Kim DS,Hahn Y

    update_date: 2015-01-28 00:00:00

  • HAT: hypergeometric analysis of tiling-arrays with application to promoter-GeneChip data.

    abstract:BACKGROUND:Tiling-arrays are applicable to multiple types of biological research questions. Due to its advantages (high sensitivity, resolution, unbiased), the technology is often employed in genome-wide investigations. A major challenge in the analysis of tiling-array data is to define regions-of-interest, i.e., conti...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-275

    authors: Taskesen E,Beekman R,de Ridder J,Wouters BJ,Peeters JK,Touw IP,Reinders MJ,Delwel R

    update_date: 2010-05-21 00:00:00

  • PreBIND and Textomy--mining the biomedical literature for protein-protein interactions using a support vector machine.

    abstract:BACKGROUND:The majority of experimentally verified molecular interaction and biological pathway data are present in the unstructured text of biomedical journal articles where they are inaccessible to computational methods. The Biomolecular interaction network database (BIND) seeks to capture these data in a machine-rea...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-4-11

    authors: Donaldson I,Martin J,de Bruijn B,Wolting C,Lay V,Tuekam B,Zhang S,Baskin B,Bader GD,Michalickova K,Pawson T,Hogue CW

    update_date: 2003-03-27 00:00:00

  • Functional relevance of dynamic properties of Dimeric NADP-dependent Isocitrate Dehydrogenases.

    abstract:BACKGROUND:Isocitrate Dehydrogenases (IDHs) are important enzymes present in all living cells. Three subfamilies of functionally dimeric IDHs (subfamilies I, II, III) are known. Subfamily I are well-studied bacterial IDHs, like that of Escherichia coli. Subfamily II has predominantly eukaryotic members, but it also ha...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-13-S17-S2

    authors: Vinekar R,Verma C,Ghosh I

    update_date: 2012-01-01 00:00:00

  • SAMPI: protein identification with mass spectra alignments.

    abstract:BACKGROUND:Mass spectrometry based peptide mass fingerprints (PMFs) offer a fast, efficient, and robust method for protein identification. A protein is digested (usually by trypsin) and its mass spectrum is compared to simulated spectra for protein sequences in a database. However, existing tools for analyzing PMFs oft...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-8-102

    authors: Kaltenbach HM,Wilke A,Böcker S

    update_date: 2007-03-26 00:00:00

  • AT excursion: a new approach to predict replication origins in viral genomes by locating AT-rich regions.

    abstract:BACKGROUND:Replication origins are considered important sites for understanding the molecular mechanisms involved in DNA replication. Many computational methods have been developed for predicting their locations in archaeal, bacterial and eukaryotic genomes. However, a prediction method designed for a particular kind o...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-8-163

    authors: Chew DS,Leung MY,Choi KP

    update_date: 2007-05-21 00:00:00

  • Prediction of novel long non-coding RNAs based on RNA-Seq data of mouse Klf1 knockout study.

    abstract:BACKGROUND:Study on long non-coding RNAs (lncRNAs) has been promoted by high-throughput RNA sequencing (RNA-Seq). However, it is still not trivial to identify lncRNAs from the RNA-Seq data and it remains a challenge to uncover their functions. RESULTS:We present a computational pipeline for detecting novel lncRNAs fro...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-13-331

    authors: Sun L,Zhang Z,Bailey TL,Perkins AC,Tallack MR,Xu Z,Liu H

    update_date: 2012-12-13 00:00:00

  • Integrating multiple molecular sources into a clinical risk prediction signature by extracting complementary information.

    abstract:BACKGROUND:High-throughput technology allows for genome-wide measurements at different molecular levels for the same patient, e.g. single nucleotide polymorphisms (SNPs) and gene expression. Correspondingly, it might be beneficial to also integrate complementary information from different molecular levels when building...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-016-1183-6

    authors: Hieke S,Benner A,Schlenk RF,Schumacher M,Bullinger L,Binder H

    update_date: 2016-08-30 00:00:00

  • Stepwise kinetic equilibrium models of quantitative polymerase chain reaction.

    abstract:BACKGROUND:Numerous models for use in interpreting quantitative PCR (qPCR) data are present in recent literature. The most commonly used models assume the amplification in qPCR is exponential and fit an exponential model with a constant rate of increase to a select part of the curve. Kinetic theory may be used to model...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-13-203

    authors: Cobbs G

    update_date: 2012-08-16 00:00:00

  • ImiRP: a computational approach to microRNA target site mutation.

    abstract:BACKGROUND:MicroRNAs (miRNAs) are small ~22 nucleotide non-coding RNAs that function as post-transcriptional regulators of messenger RNA (mRNA) through base-pairing to 6-8 nucleotide long target sites, usually located within the mRNA 3' untranslated region. A common approach to validate and probe microRNA-mRNA interact...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-016-1057-y

    authors: Ryan BC,Werner TS,Howard PL,Chow RL

    update_date: 2016-04-27 00:00:00

  • Construction and analysis of the protein-protein interaction networks for schizophrenia, bipolar disorder, and major depression.

    abstract:BACKGROUND:Schizophrenia, bipolar disorder, and major depression are devastating mental diseases, each with distinctive yet overlapping epidemiologic characteristics. Microarray and proteomics data have revealed genes which expressed abnormally in patients. Several single nucleotide polymorphisms (SNPs) and mutations a...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-12-S13-S20

    authors: Lee SA,Tsao TT,Yang KC,Lin H,Kuo YL,Hsu CH,Lee WK,Huang KC,Kao CY

    update_date: 2011-01-01 00:00:00

  • CONSTAX: a tool for improved taxonomic resolution of environmental fungal ITS sequences.

    abstract:BACKGROUND:One of the most crucial steps in high-throughput sequence-based microbiome studies is the taxonomic assignment of sequences belonging to operational taxonomic units (OTUs). Without taxonomic classification, functional and biological information of microbial communities cannot be inferred or interpreted. The ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-017-1952-x

    authors: Gdanetz K,Benucci GMN,Vande Pol N,Bonito G

    update_date: 2017-12-06 00:00:00

  • DeepCryoPicker: fully automated deep neural network for single protein particle picking in cryo-EM.

    abstract:BACKGROUND:Cryo-electron microscopy (Cryo-EM) is widely used in the determination of the three-dimensional (3D) structures of macromolecules. Particle picking from 2D micrographs remains a challenging early step in the Cryo-EM pipeline due to the diversity of particle shapes and the extremely low signal-to-noise ratio ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03809-7

    authors: Al-Azzawi A,Ouadou A,Max H,Duan Y,Tanner JJ,Cheng J

    update_date: 2020-11-09 00:00:00

  • Promoter prediction in E. coli based on SIDD profiles and Artificial Neural Networks.

    abstract:BACKGROUND:One of the major challenges in biology is the correct identification of promoter regions. Computational methods based on motif searching have been the traditional approach taken. Recent studies have shown that DNA structural properties, such as curvature, stacking energy, and stress-induced duplex destabiliz...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-S6-S17

    authors: Bland C,Newsome AS,Markovets AA

    update_date: 2010-10-07 00:00:00

  • Development and tuning of an original search engine for patent libraries in medicinal chemistry.

    abstract:BACKGROUND:The large increase in the size of patent collections has led to the need of efficient search strategies. But the development of advanced text-mining applications dedicated to patents of the biomedical field remains rare, in particular to address the needs of the pharmaceutical & biotech industry, which inten...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-15-S1-S15

    authors: Pasche E,Gobeill J,Kreim O,Oezdemir-Zaech F,Vachon T,Lovis C,Ruch P

    update_date: 2014-01-01 00:00:00

  • PFBNet: a priori-fused boosting method for gene regulatory network inference.

    abstract:BACKGROUND:Inferring gene regulatory networks (GRNs) from gene expression data remains a challenge in system biology. In past decade, numerous methods have been developed for the inference of GRNs. It remains a challenge due to the fact that the data is noisy and high dimensional, and there exists a large number of pot...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03639-7

    authors: Che D,Guo S,Jiang Q,Chen L

    update_date: 2020-07-14 00:00:00

  • Quality determination and the repair of poor quality spots in array experiments.

    abstract:BACKGROUND:A common feature of microarray experiments is the occurrence of missing gene expression data. These missing values occur for a variety of reasons, in particular, because of the filtering of poor quality spots and the removal of undefined values when a logarithmic transformation is applied to negative backgro...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-6-234

    authors: Tom BD,Gilks WR,Brooke-Powell ET,Ajioka JW

    update_date: 2005-09-26 00:00:00

  • Calibration and assessment of channel-specific biases in microarray data with extended dynamical range.

    abstract:BACKGROUND:Non-linearities in observed log-ratios of gene expressions, also known as intensity dependent log-ratios, can often be accounted for by global biases in the two channels being compared. Any step in a microarray process may introduce such offsets and in this article we study the biases introduced by the micro...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-5-177

    authors: Bengtsson H,Jönsson G,Vallon-Christersson J

    update_date: 2004-11-12 00:00:00

  • Efficient and automated large-scale detection of structural relationships in proteins with a flexible aligner.

    abstract:BACKGROUND:The total number of known three-dimensional protein structures is rapidly increasing. Consequently, the need for fast structural search against complete databases without a significant loss of accuracy is increasingly demanding. Recently, TopSearch, an ultra-fast method for finding rigid structural relations...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-015-0866-8

    authors: Gutiérrez FI,Rodriguez-Valenzuela F,Ibarra IL,Devos DP,Melo F

    update_date: 2016-01-05 00:00:00