PseUI: Pseudouridine sites identification based on RNA sequence information.

Abstract:

BACKGROUND: Pseudouridylation is the most prevalent type of posttranscriptional modification in various stable RNAs of all organisms, and it significantly affects many cellular processes that are regulated by RNA. Accurate identification of pseudouridine (Ψ) sites in RNA will therefore be of great benefit for understanding these cellular processes. Because currently available experimental methods are inefficient and costly, computational methods that detect Ψ sites in RNA sequences accurately and efficiently are highly desirable. However, the predictive accuracy of existing computational methods is not satisfactory and still needs improvement.
RESULTS: In this study, we developed a new model, PseUI, for identifying Ψ sites in three species: H. sapiens, S. cerevisiae, and M. musculus. First, five kinds of features were generated from RNA segments: nucleotide composition (NC), dinucleotide composition (DC), pseudo dinucleotide composition (pseDNC), position-specific nucleotide propensity (PSNP), and position-specific dinucleotide propensity (PSDP). Then, a sequential forward feature selection strategy was used to obtain a compact feature subset that retains discriminative power. Based on the selected feature subsets, we built our model using a support vector machine (SVM). Finally, the generalization of our model was validated by both the jackknife test and independent validation tests on the benchmark datasets. The experimental results showed that our model is more accurate and stable than previously published models. We also provide a user-friendly web server for our model at http://zhulab.ahu.edu.cn/PseUI , together with brief instructions for its use in this paper. By following these instructions, academic users can conveniently obtain their desired results without complicated calculations.
CONCLUSION: In this study, we proposed a new predictor, PseUI, to detect Ψ sites in RNA sequences. Our model outperformed the existing state-of-the-art models. We expect PseUI to become a useful tool for the accurate identification of RNA Ψ sites.
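
To make two of the steps named in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It computes the nucleotide composition (NC) and dinucleotide composition (DC) of an RNA segment and runs a generic greedy sequential forward selection. The function names, the treatment of T as U, and the `score` callback are illustrative assumptions; in a pipeline like the one described, `score` would presumably wrap a classifier evaluation such as the jackknife accuracy of an SVM, which is not reproduced here.

```python
from itertools import product
from typing import Callable, List, Optional

NUCLEOTIDES = "ACGU"
# The 16 dinucleotides in a fixed order: AA, AC, AG, AU, CA, ...
DINUCLEOTIDES = ["".join(p) for p in product(NUCLEOTIDES, repeat=2)]


def _normalize(seq: str) -> str:
    # Assumption: DNA-style input is accepted by mapping T to U.
    return seq.upper().replace("T", "U")


def nucleotide_composition(seq: str) -> List[float]:
    """Frequency of A, C, G, U in a non-empty segment (4 values)."""
    seq = _normalize(seq)
    return [seq.count(base) / len(seq) for base in NUCLEOTIDES]


def dinucleotide_composition(seq: str) -> List[float]:
    """Frequency of each overlapping dinucleotide (16 values).

    Assumes the segment has at least two nucleotides.
    """
    seq = _normalize(seq)
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    return [pairs.count(d) / len(pairs) for d in DINUCLEOTIDES]


def sequential_forward_selection(
    n_features: int,
    score: Callable[[List[int]], float],
    max_features: Optional[int] = None,
) -> List[int]:
    """Greedy forward selection over feature indices.

    Starting from the empty set, repeatedly add the single feature that
    gives the highest score, and stop when no addition improves it.
    `score` is a hypothetical callback that evaluates a feature subset.
    """
    selected: List[int] = []
    remaining = list(range(n_features))
    best = float("-inf")
    limit = n_features if max_features is None else max_features
    while remaining and len(selected) < limit:
        scored = [(score(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(scored, key=lambda t: t[0])
        if top_score <= best:
            break  # no candidate improves the current subset
        selected.append(top_feature)
        remaining.remove(top_feature)
        best = top_score
    return selected


if __name__ == "__main__":
    segment = "GCAUUAGCUAG"  # made-up segment, for illustration only
    features = nucleotide_composition(segment) + dinucleotide_composition(segment)
    print(len(features))  # 4 NC values followed by 16 DC values
```

The selection routine only decides which feature indices to keep; any classifier and validation scheme can be plugged in through `score`, which keeps the sketch independent of a particular machine-learning library.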

journal_name

BMC Bioinformatics

journal_title

BMC bioinformatics

authors

He J,Fang T,Zhang Z,Huang B,Zhu X,Xiong Y

doi

10.1186/s12859-018-2321-0

pub_date

2018-08-29 00:00:00

pages

306

issue

1

issn

1471-2105

pii

10.1186/s12859-018-2321-0

journal_volume

19

pub_type

Journal Article
  • Mining published lists of cancer related microarray experiments: identification of a gene expression signature having a critical role in cell-cycle control.

    abstract:BACKGROUND:Routine application of gene expression microarray technology is rapidly producing large amounts of data that necessitate new approaches of analysis. The analysis of a specific microarray experiment profits enormously from cross-comparing to other experiments. This process is generally performed by numerical ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-6-S4-S14

    authors: Finocchiaro G,Mancuso F,Muller H

    update_date:2005-12-01 00:00:00

  • SHIVA - a web application for drug resistance and tropism testing in HIV.

    abstract:BACKGROUND:Drug resistance testing is mandatory in antiretroviral therapy in human immunodeficiency virus (HIV) infected patients for successful treatment. The emergence of resistances against antiretroviral agents remains the major obstacle in inhibition of viral replication and thus to control infection. Due to the h...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-016-1179-2

    authors: Riemenschneider M,Hummel T,Heider D

    update_date:2016-08-22 00:00:00

  • Epiviz: a view inside the design of an integrated visual analysis software for genomics.

    abstract:BACKGROUND:Computational and visual data analysis for genomics has traditionally involved a combination of tools and resources, of which the most ubiquitous consist of genome browsers, focused mainly on integrative visualization of large numbers of big datasets, and computational environments, focused on data modeling ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-16-S11-S4

    authors: Chelaru F,Corrada Bravo H

    update_date:2015-01-01 00:00:00

  • SDA: a semi-parametric differential abundance analysis method for metabolomics and proteomics data.

    abstract:BACKGROUND:Identifying differentially abundant features between different experimental groups is a common goal for many metabolomics and proteomics studies. However, analyzing data from mass spectrometry (MS) is difficult because the data may not be normally distributed and there is often a large fraction of zero value...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-3067-z

    authors: Li Y,Fan TWM,Lane AN,Kang WY,Arnold SM,Stromberg AJ,Wang C,Chen L

    update_date:2019-10-17 00:00:00

  • Image-based classification of plant genus and family for trained and untrained plant species.

    abstract:BACKGROUND:Modern plant taxonomy reflects phylogenetic relationships among taxa based on proposed morphological and genetic similarities. However, taxonomical relation is not necessarily reflected by close overall resemblance, but rather by commonality of very specific morphological characters or similarity on the mole...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-018-2474-x

    authors: Seeland M,Rzanny M,Boho D,Wäldchen J,Mäder P

    update_date:2019-01-03 00:00:00

  • Improving interoperability between microbial information and sequence databases.

    abstract:BACKGROUND:Biological resources are essential tools for biomedical research. Their availability is promoted through on-line catalogues. Common Access to Biological Resources and Information (CABRI) is a service for distribution of biological resources and related data collected by 28 European culture collections. Linki...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-6-S4-S23

    authors: Romano P,Dawyndt P,Piersigilli F,Swings J

    update_date:2005-12-01 00:00:00

  • Verification and validation of bioinformatics software without a gold standard: a case study of BWA and Bowtie.

    abstract:BACKGROUND:Bioinformatics software quality assurance is essential in genomic medicine. Systematic verification and validation of bioinformatics software is difficult because it is often not possible to obtain a realistic "gold standard" for systematic evaluation. Here we apply a technique that originates from the softw...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-15-S16-S15

    authors: Giannoulatou E,Park SH,Humphreys DT,Ho JW

    update_date:2014-01-01 00:00:00

  • circRNAprofiler: an R-based computational framework for the downstream analysis of circular RNAs.

    abstract:BACKGROUND:Circular RNAs (circRNAs) are a newly appreciated class of non-coding RNA molecules. Numerous tools have been developed for the detection of circRNAs, however computational tools to perform downstream functional analysis of circRNAs are scarce. RESULTS:We present circRNAprofiler, an R-based computational fra...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-3500-3

    authors: Aufiero S,Reckman YJ,Tijsen AJ,Pinto YM,Creemers EE

    update_date:2020-04-29 00:00:00

  • Non-coding RNA detection methods combined to improve usability, reproducibility and precision.

    abstract:BACKGROUND:Non-coding RNAs gain more attention as their diverse roles in many cellular processes are discovered. At the same time, the need for efficient computational prediction of ncRNAs increases with the pace of sequencing technology. Existing tools are based on various approaches and techniques, but none of them p...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-491

    authors: Raasch P,Schmitz U,Patenge N,Vera J,Kreikemeyer B,Wolkenhauer O

    update_date:2010-09-29 00:00:00

  • IPRStats: visualization of the functional potential of an InterProScan run.

    abstract:BACKGROUND:InterPro is a collection of protein signatures for the classification and automated annotation of proteins. Interproscan is a software tool that scans protein sequences against Interpro member databases using a variety of profile-based, hidden markov model and positional specific score matrix methods. It not...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-S12-S13

    authors: Kelly RJ,Vincent DE,Friedberg I

    update_date:2010-12-21 00:00:00

  • SPdb--a signal peptide database.

    abstract:BACKGROUND:The signal peptide plays an important role in protein targeting and protein translocation in both prokaryotic and eukaryotic cells. This transient, short peptide sequence functions like a postal address on an envelope by targeting proteins for secretion or for transfer to specific organelles for further proc...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-6-249

    authors: Choo KH,Tan TW,Ranganathan S

    update_date:2005-10-13 00:00:00

  • Bioinformatics Resource Manager: a systems biology web tool for microRNA and omics data integration.

    abstract:BACKGROUND:The Bioinformatics Resource Manager (BRM) is a web-based tool developed to facilitate identifier conversion and data integration for Homo sapiens (human), Mus musculus (mouse), Rattus norvegicus (rat), Danio rerio (zebrafish), and Macaca mulatta (macaque), as well as perform orthologous conversions among the...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-2805-6

    authors: Brown J,Phillips AR,Lewis DA,Mans MA,Chang Y,Tanguay RL,Peterson ES,Waters KM,Tilton SC

    update_date:2019-05-17 00:00:00

  • COMBINE archive and OMEX format: one file to share all information to reproduce a modeling project.

    abstract:BACKGROUND:With the ever increasing use of computational models in the biosciences, the need to share models and reproduce the results of published studies efficiently and easily is becoming more important. To this end, various standards have been proposed that can be used to describe models, simulations, data or other...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-014-0369-z

    authors: Bergmann FT,Adams R,Moodie S,Cooper J,Glont M,Golebiewski M,Hucka M,Laibe C,Miller AK,Nickerson DP,Olivier BG,Rodriguez N,Sauro HM,Scharm M,Soiland-Reyes S,Waltemath D,Yvon F,Le Novère N

    update_date:2014-12-14 00:00:00

  • Influence of microarrays experiments missing values on the stability of gene groups by hierarchical clustering.

    abstract:BACKGROUND:Microarray technologies produced large amount of data. The hierarchical clustering is commonly used to identify clusters of co-expressed genes. However, microarray datasets often contain missing values (MVs) representing a major drawback for the use of the clustering methods. Usually the MVs are not treated,...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-5-114

    authors: de Brevern AG,Hazout S,Malpertuy A

    update_date:2004-08-23 00:00:00

  • TableButler - a Windows based tool for processing large data tables generated with high-throughput methods.

    abstract:BACKGROUND:High-throughput "omics" based data analysis play emerging roles in life sciences and molecular diagnostics. This emphasizes the urgent need for user-friendly windows-based software interfaces that could process the diversity of large tab-delimited raw data files generated by these methods. Depending on the s...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-10-235

    authors: Schwager C,Wirkner U,Abdollahi A,Huber PE

    update_date:2009-07-29 00:00:00

  • The tumor as an organ: comprehensive spatial and temporal modeling of the tumor and its microenvironment.

    abstract:BACKGROUND:Research related to cancer is vast, and continues in earnest in many directions. Due to the complexity of cancer, a better understanding of tumor growth dynamics can be gleaned from a dynamic computational model. We present a comprehensive, fully executable, spatial and temporal 3D computational model of the...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-016-1168-5

    authors: Bloch N,Harel D

    update_date:2016-08-24 00:00:00

  • An improved method for identifying functionally linked proteins using phylogenetic profiles.

    abstract:BACKGROUND:Phylogenetic profiles record the occurrence of homologs of genes across fully sequenced organisms. Proteins with similar profiles are typically components of protein complexes or metabolic pathways. Various existing methods measure similarity between two profiles and, hence, the likelihood that the two prote...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-8-S4-S7

    authors: Cokus S,Mizutani S,Pellegrini M

    update_date:2007-05-22 00:00:00

  • Exploring community structure in biological networks with random graphs.

    abstract:BACKGROUND:Community structure is ubiquitous in biological networks. There has been an increased interest in unraveling the community structure of biological systems as it may provide important insights into a system's functional components and the impact of local structures on dynamics at a global scale. Choosing an a...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-15-220

    authors: Sah P,Singh LO,Clauset A,Bansal S

    update_date:2014-06-25 00:00:00

  • GObar: a gene ontology based analysis and visualization tool for gene sets.

    abstract:BACKGROUND:Microarray experiments, as well as other genomic analyses, often result in large gene sets containing up to several hundred genes. The biological significance of such sets of genes is, usually, not readily apparent. Identification of the functions of the genes in the set can help highlight features of intere...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-6-189

    authors: Lee JS,Katari G,Sachidanandam R

    update_date:2005-07-25 00:00:00

  • ETHNOPRED: a novel machine learning method for accurate continental and sub-continental ancestry identification and population stratification correction.

    abstract:BACKGROUND:Population stratification is a systematic difference in allele frequencies between subpopulations. This can lead to spurious association findings in the case-control genome wide association studies (GWASs) used to identify single nucleotide polymorphisms (SNPs) associated with disease-linked phenotypes. Meth...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-14-61

    authors: Hajiloo M,Sapkota Y,Mackey JR,Robson P,Greiner R,Damaraju S

    update_date:2013-02-22 00:00:00

  • De novo profile generation based on sequence context specificity with the long short-term memory network.

    abstract:BACKGROUND:Long short-term memory (LSTM) is one of the most attractive deep learning methods to learn time series or contexts of input data. Increasing studies, including biological sequence analyses in bioinformatics, utilize this architecture. Amino acid sequence profiles are widely used for bioinformatics studies, s...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-018-2284-1

    authors: Yamada KD,Kinoshita K

    update_date:2018-07-18 00:00:00

  • Identification of properties important to protein aggregation using feature selection.

    abstract:BACKGROUND:Protein aggregation is a significant problem in the biopharmaceutical industry (protein drug stability) and is associated medically with over 40 human diseases. Although a number of computational models have been developed for predicting aggregation propensity and identifying aggregation-prone regions in pro...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-14-314

    authors: Fang Y,Gao S,Tai D,Middaugh CR,Fang J

    update_date:2013-10-28 00:00:00

  • A new method for 2D gel spot alignment: application to the analysis of large sample sets in clinical proteomics.

    abstract:BACKGROUND:In current comparative proteomics studies, the large number of images generated by 2D gels is currently compared using spot matching algorithms. Unfortunately, differences in gel migration and sample variability make efficient spot alignment very difficult to obtain, and, as consequence most of the software ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-460

    authors: Pérès S,Molina L,Salvetat N,Granier C,Molina F

    update_date:2008-10-28 00:00:00

  • Computing all hybridization networks for multiple binary phylogenetic input trees.

    abstract:BACKGROUND:The computation of phylogenetic trees on the same set of species that are based on different orthologous genes can lead to incongruent trees. One possible explanation for this behavior are interspecific hybridization events recombining genes of different species. An important approach to analyze such events ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-015-0660-7

    authors: Albrecht B

    update_date:2015-07-30 00:00:00

  • Robustness of signal detection in cryo-electron microscopy via a bi-objective-function approach.

    abstract:BACKGROUND:The detection of weak signals and selection of single particles from low-contrast micrographs of frozen hydrated biomolecules by cryo-electron microscopy (cryo-EM) represents a major practical bottleneck in cryo-EM data analysis. Template-based particle picking by an objective function using fast local corre...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-2714-8

    authors: Wang WL,Yu Z,Castillo-Menendez LR,Sodroski J,Mao Y

    update_date:2019-04-03 00:00:00

  • proTRAC--a software for probabilistic piRNA cluster detection, visualization and analysis.

    abstract:BACKGROUND:Throughout the metazoan lineage, typically gonadal expressed Piwi proteins and their guiding piRNAs (~26-32nt in length) form a protective mechanism of RNA interference directed against the propagation of transposable elements (TEs). Most piRNAs are generated from genomic piRNA clusters. Annotation of experi...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-13-5

    authors: Rosenkranz D,Zischler H

    update_date:2012-01-10 00:00:00

  • Automated NMR relaxation dispersion data analysis using NESSY.

    abstract:BACKGROUND:Proteins are dynamic molecules with motions ranging from picoseconds to longer than seconds. Many protein functions, however, appear to occur on the micro to millisecond timescale and therefore there has been intense research of the importance of these motions in catalysis and molecular interactions. Nuclear...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-12-421

    authors: Bieri M,Gooley PR

    update_date:2011-10-27 00:00:00

  • The exploration of disease-specific gene regulatory networks in esophageal carcinoma and stomach adenocarcinoma.

    abstract:BACKGROUND:Feed-forward loops (FFLs), consisting of miRNAs, transcription factors (TFs) and their common target genes, have been validated to be important for the initialization and development of complex diseases, including cancer. Esophageal Carcinoma (ESCA) and Stomach Adenocarcinoma (STAD) are two types of malignan...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-3230-6

    authors: Qin G,Yang L,Ma Y,Liu J,Huo Q

    update_date:2019-12-30 00:00:00

  • A novel statistical approach for identification of the master regulator transcription factor.

    abstract:BACKGROUND:Transcription factors are known to play key roles in carcinogenesis and therefore, are gaining popularity as potential therapeutic targets in drug development. A 'master regulator' transcription factor often appears to control most of the regulatory activities of the other transcription factors and the assoc...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-017-1499-x

    authors: Sikdar S,Datta S

    update_date:2017-02-02 00:00:00

  • Ontological representation, integration, and analysis of LINCS cell line cells and their cellular responses.

    abstract:BACKGROUND:Aiming to understand cellular responses to different perturbations, the NIH Common Fund Library of Integrated Network-based Cellular Signatures (LINCS) program involves many institutes and laboratories working on over a thousand cell lines. The community-based Cell Line Ontology (CLO) is selected as the defa...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-017-1981-5

    authors: Ong E,Xie J,Ni Z,Liu Q,Sarntivijai S,Lin Y,Cooper D,Terryn R,Stathias V,Chung C,Schürer S,He Y

    update_date:2017-12-21 00:00:00