REW-ISA: unveiling local functional blocks in epi-transcriptome profiling data via an RNA expression-weighted iterative signature algorithm.

Abstract:

BACKGROUND: Recent studies have shown that N6-methyladenosine (m6A) plays a critical role in numerous biological processes and complex human diseases. However, the regulatory mechanisms of most methylation sites remain uncharted. An in-depth study of the epi-transcriptomic patterns of m6A may therefore provide insights into its complex functional and regulatory mechanisms.

RESULTS: Because wet-lab experiments are costly and time-consuming, revealing methylation patterns through computational models has become a preferred approach and is drawing increasing attention. Building on the theory and applications of conventional clustering methods, an RNA Expression Weighted Iterative Signature Algorithm (REW-ISA) is proposed to find potential local functional blocks (LFBs) in MeRIP-Seq data, in which sites are simultaneously hyper-methylated or hypo-methylated across specific conditions. REW-ISA uses the RNA expression level of each site as a weight, so that sites with lower expression are less significant. It starts from random sets of sites and then iteratively searches, using row and column thresholds, for LFBs in the m6A methylation profile. Applied to MeRIP-Seq data of 69,446 methylation sites under 32 experimental conditions, it unveiled 6 LFBs, which achieve higher enrichment scores than those found by ISA. Pathway analysis and an enzyme specificity test showed that the sites retained in LFBs are highly relevant to the m6A methyltransferases, such as METTL3, METTL14, WTAP and KIAA1429. Further detailed analyses of each LFB showed that some LFBs are condition-specific, indicating that the methylation profiles of some specific sites may be condition-relevant.

CONCLUSIONS: REW-ISA finds potential local functional patterns present in m6A profiles, where sites are co-methylated under specific conditions.
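The iterative search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the scoring formulas, so the weighting rule (expression divided by the maximum expression), the z-score thresholds, and the names `weighted_isa`, `zscore` and `select` are all assumptions made for illustration. The sketch refines a seed set of sites by alternately thresholding condition scores and site scores until the block stops changing, and it only looks for hyper-methylated blocks.

```python
from statistics import mean, pstdev


def zscore(values):
    """Standardise to zero mean, unit variance; constant input maps to zeros."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd if sd > 0 else 0.0 for v in values]


def select(scores, threshold):
    """Indices whose standardised score exceeds the threshold."""
    return {i for i, z in enumerate(zscore(scores)) if z > threshold}


def weighted_isa(methylation, expression, seed_sites, t_site, t_cond, max_iter=50):
    """Refine a seed set of sites into a (sites, conditions) block.

    Rows are sites, columns are conditions. Hypothetical weighting: each
    standardised methylation value is scaled by expression / max expression,
    which assumes positive expression values.
    """
    n_sites, n_conds = len(methylation), len(methylation[0])
    top = max(max(row) for row in expression)
    weight = [[e / top for e in row] for row in expression]

    # Row-standardised values score sites; column-standardised values score conditions.
    rows = [zscore(row) for row in methylation]
    cols = [zscore([row[j] for row in methylation]) for j in range(n_conds)]
    by_row = [[rows[i][j] * weight[i][j] for j in range(n_conds)] for i in range(n_sites)]
    by_col = [[cols[j][i] * weight[i][j] for j in range(n_conds)] for i in range(n_sites)]

    sites, conds = set(seed_sites), set()
    for _ in range(max_iter):
        cond_scores = [sum(by_col[i][j] for i in sites) for j in range(n_conds)]
        new_conds = select(cond_scores, t_cond)
        site_scores = [sum(by_row[i][j] for j in new_conds) for i in range(n_sites)]
        new_sites = select(site_scores, t_site)
        if new_sites == sites and new_conds == conds:
            break  # fixed point reached
        sites, conds = new_sites, new_conds
    return sites, conds


# Toy data: sites 0-3 are methylated in conditions 0 and 1; site 3 is weakly expressed.
methylation = [[5, 5, 0, 0]] * 4 + [[0, 0, 0, 0]] * 2
expression = [[100] * 4 for _ in range(6)]
expression[3] = [1] * 4
print(weighted_isa(methylation, expression, {0}, t_site=0.5, t_cond=0.5))
```

In this toy example the weakly expressed site 3 is left out of the block even though its methylation pattern matches sites 0 to 2; with uniform expression it would be included. A fuller run would repeat the search from many random seed sets, as the abstract describes, and merge duplicate blocks.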

journal_name

BMC Bioinformatics

journal_title

BMC bioinformatics

authors

Zhang L,Chen S,Zhu J,Meng J,Liu H

doi

10.1186/s12859-020-03787-w

subject

Has Abstract

pub_date

2020-10-09 00:00:00

pages

447

issue

1

issn

1471-2105

pii

10.1186/s12859-020-03787-w

journal_volume

21

pub_type

Journal Article
  • Quantiprot - a Python package for quantitative analysis of protein sequences.

    abstract:BACKGROUND:The field of protein sequence analysis is dominated by tools rooted in substitution matrices and alignments. A complementary approach is provided by methods of quantitative characterization. A major advantage of the approach is that quantitative properties define a multidimensional solution space, where seq...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-017-1751-4

    authors: Konopka BM,Marciniak M,Dyrka W

    update_date: 2017-07-17 00:00:00

  • Proceedings of the 2018 MidSouth Computational Biology and Bioinformatics Society (MCBIOS) conference.

    abstract: ...

    journal_title:BMC bioinformatics

    pub_type: Historical Article, Journal Article

    doi:10.1186/s12859-019-2618-7

    authors: Wren JD,Doerkson RJ,Toby IT,Nanduri B,Homayouni R,Manda P,Thakkar S

    update_date: 2019-03-14 00:00:00

  • Structator: fast index-based search for RNA sequence-structure patterns.

    abstract:BACKGROUND:The secondary structure of RNA molecules is intimately related to their function and often more conserved than the sequence. Hence, the important task of searching databases for RNAs requires matching sequence-structure patterns. Unfortunately, current tools for this task have, in the best case, a running ti...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-12-214

    authors: Meyer F,Kurtz S,Backofen R,Will S,Beckstette M

    update_date: 2011-05-27 00:00:00

  • A weighted string kernel for protein fold recognition.

    abstract:BACKGROUND:Alignment-free methods for comparing protein sequences have proved to be viable alternatives to approaches that first rely on an alignment of the sequences to be compared. Much work, however, needs to be done before those methods provide reliable fold recognition for proteins whose sequences share little simila...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-017-1795-5

    authors: Nojoomi S,Koehl P

    update_date: 2017-08-25 00:00:00

  • Network-based group variable selection for detecting expression quantitative trait loci (eQTL).

    abstract:BACKGROUND:Analysis of expression quantitative trait loci (eQTL) aims to identify the genetic loci associated with the expression level of genes. Penalized regression with a proper penalty is suitable for the high-dimensional biological data. Its performance should be enhanced when we incorporate biological knowledge o...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-12-269

    authors: Wang W,Zhang X

    update_date: 2011-06-30 00:00:00

  • A computational framework for modeling and studying pertussis epidemiology and vaccination.

    abstract:BACKGROUND:Emerging and re-emerging infectious diseases such as Zika, SARS, COVID-19 and pertussis pose a compelling challenge for epidemiologists due to their significant impact on global public health. In this context, computational models and computer simulations are among the available research tools that epidemi...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03648-6

    authors: Castagno P,Pernice S,Ghetti G,Povero M,Pradelli L,Paolotti D,Balbo G,Sereno M,Beccuti M

    update_date: 2020-09-16 00:00:00

  • Predicting blood pressure from physiological index data using the SVR algorithm.

    abstract:BACKGROUND:Blood pressure diseases have increasingly been identified as among the main factors threatening human health. How to accurately and conveniently measure blood pressure is the key to the implementation of effective prevention and control measures for blood pressure diseases. Traditional blood pressure measure...

    journal_title:BMC bioinformatics

    pub_type: Clinical Trial, Journal Article

    doi:10.1186/s12859-019-2667-y

    authors: Zhang B,Ren H,Huang G,Cheng Y,Hu C

    update_date: 2019-02-28 00:00:00

  • 3DScapeCS: application of three dimensional, parallel, dynamic network visualization in Cytoscape.

    abstract:BACKGROUND:The exponential growth of gigantic biological data from various sources, such as protein-protein interaction (PPI), genome sequences scaffolding, Mass spectrometry (MS) molecular networking and metabolic flux, demands an efficient way for better visualization and interpretation beyond the conventional, two-d...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-14-322

    authors: Wang Q,Tang B,Song L,Ren B,Liang Q,Xie F,Zhuo Y,Liu X,Zhang L

    update_date: 2013-11-14 00:00:00

  • Integrating biological knowledge into variable selection: an empirical Bayes approach with an application in cancer biology.

    abstract:BACKGROUND:An important question in the analysis of biochemical data is that of identifying subsets of molecular variables that may jointly influence a biological response. Statistical variable selection methods have been widely used for this purpose. In many settings, it may be important to incorporate ancillary biolo...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-13-94

    authors: Hill SM,Neve RM,Bayani N,Kuo WL,Ziyad S,Spellman PT,Gray JW,Mukherjee S

    update_date: 2012-05-11 00:00:00

  • Application of the common base method to regression and analysis of covariance (ANCOVA) in qPCR experiments and subsequent relative expression calculation.

    abstract:BACKGROUND:Quantitative polymerase chain reaction (qPCR) is the technique of choice for quantifying gene expression. While the technique itself is well established, approaches for the analysis of qPCR data continue to improve. RESULTS:Here we expand on the common base method to develop procedures for testing linear re...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03696-y

    authors: Ganger MT,Dietz GD,Headley P,Ewing SJ

    update_date: 2020-09-29 00:00:00

  • Predicting substrates of the human breast cancer resistance protein using a support vector machine method.

    abstract:BACKGROUND:Human breast cancer resistance protein (BCRP) is an ATP-binding cassette (ABC) efflux transporter that confers multidrug resistance in cancers and also plays an important role in the absorption, distribution and elimination of drugs. Prediction as to if drugs or new molecular entities are BCRP substrates sho...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-14-130

    authors: Hazai E,Hazai I,Ragueneau-Majlessi I,Chung SP,Bikadi Z,Mao Q

    update_date: 2013-04-15 00:00:00

  • ConEVA: a toolbox for comprehensive assessment of protein contacts.

    abstract:BACKGROUND:In recent years, successful contact prediction methods and contact-guided ab initio protein structure prediction methods have highlighted the importance of incorporating contact information into protein structure prediction methods. It is also observed that for almost all globular proteins, the quality of co...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-016-1404-z

    authors: Adhikari B,Nowotny J,Bhattacharya D,Hou J,Cheng J

    update_date: 2016-12-07 00:00:00

  • VITCOMIC: visualization tool for taxonomic compositions of microbial communities based on 16S rRNA gene sequences.

    abstract:BACKGROUND:Understanding the community structure of microbes is typically accomplished by sequencing 16S ribosomal RNA (16S rRNA) genes. These community data can be represented by constructing a phylogenetic tree and comparing it with other samples using statistical methods. However, owing to high computational complex...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-332

    authors: Mori H,Maruyama F,Kurokawa K

    update_date: 2010-06-18 00:00:00

  • tcR: an R package for T cell receptor repertoire advanced data analysis.

    abstract:BACKGROUND:The Immunoglobulins (IG) and the T cell receptors (TR) play the key role in antigen recognition during the adaptive immune response. Recent progress in next-generation sequencing technologies has provided an opportunity for the deep T cell receptor repertoire profiling. However, a specialised software is req...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-015-0613-1

    authors: Nazarov VI,Pogorelyy MV,Komech EA,Zvyagin IV,Bolotin DA,Shugay M,Chudakov DM,Lebedev YB,Mamedov IZ

    update_date: 2015-05-28 00:00:00

  • DLAD4U: deriving and prioritizing disease lists from PubMed literature.

    abstract:BACKGROUND:Due to recent technology advancements, disease related knowledge is growing rapidly. It becomes nontrivial to go through all published literature to identify associations between human diseases and genetic, environmental, and life style factors, disease symptoms, and treatment strategies. Here we report DLAD...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-018-2463-0

    authors: Shen J,Vasaikar S,Zhang B

    update_date: 2018-12-28 00:00:00

  • CorrelaGenes: a new tool for the interpretation of the human transcriptome.

    abstract:BACKGROUND:The amount of gene expression data available in public repositories has grown exponentially in recent years, now requiring new data mining tools to transform them into information easily accessible to biologists. RESULTS:By exploiting expression data publicly available in the Gene Expression Omnibus (GEO) d...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-15-S1-S6

    authors: Cremaschi P,Rovida S,Sacchi L,Lisa A,Calvi F,Montecucco A,Biamonti G,Bione S,Sacchi G

    update_date: 2014-01-01 00:00:00

  • Multi-label literature classification based on the Gene Ontology graph.

    abstract:BACKGROUND:The Gene Ontology is a controlled vocabulary for representing knowledge related to genes and proteins in a computable form. The current effort of manually annotating proteins with the Gene Ontology is outpaced by the rate of accumulation of biomedical knowledge in literature, which urges the development of t...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-525

    authors: Jin B,Muller B,Zhai C,Lu X

    update_date: 2008-12-08 00:00:00

  • GO2MSIG, an automated GO based multi-species gene set generator for gene set enrichment analysis.

    abstract:BACKGROUND:Despite the widespread use of high throughput expression platforms and the availability of a desktop implementation of Gene Set Enrichment Analysis (GSEA) that enables non-experts to perform gene set based analyses, the availability of the necessary precompiled gene sets is rare for species other than human....

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-15-146

    authors: Powell JA

    update_date: 2014-05-17 00:00:00

  • AT excursion: a new approach to predict replication origins in viral genomes by locating AT-rich regions.

    abstract:BACKGROUND:Replication origins are considered important sites for understanding the molecular mechanisms involved in DNA replication. Many computational methods have been developed for predicting their locations in archaeal, bacterial and eukaryotic genomes. However, a prediction method designed for a particular kind o...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-8-163

    authors: Chew DS,Leung MY,Choi KP

    update_date: 2007-05-21 00:00:00

  • A simple method for assessing sample sizes in microarray experiments.

    abstract:BACKGROUND:In this short article, we discuss a simple method for assessing sample size requirements in microarray experiments. RESULTS:Our method starts with the output from a permutation-based analysis for a set of pilot data, e.g. from the SAM package. Then for a given hypothesized mean difference and various sample...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-7-106

    authors: Tibshirani R

    update_date: 2006-03-02 00:00:00

  • New challenges for text mining: mapping between text and manually curated pathways.

    abstract:BACKGROUND:Associating literature with pathways poses new challenges to the Text Mining (TM) community. There are three main challenges to this task: (1) the identification of the mapping position of a specific entity or reaction in a given pathway, (2) the recognition of the causal relationships among multiple reactio...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-S3-S5

    authors: Oda K,Kim JD,Ohta T,Okanohara D,Matsuzaki T,Tateisi Y,Tsujii J

    update_date: 2008-04-11 00:00:00

  • The tumor as an organ: comprehensive spatial and temporal modeling of the tumor and its microenvironment.

    abstract:BACKGROUND:Research related to cancer is vast, and continues in earnest in many directions. Due to the complexity of cancer, a better understanding of tumor growth dynamics can be gleaned from a dynamic computational model. We present a comprehensive, fully executable, spatial and temporal 3D computational model of the...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-016-1168-5

    authors: Bloch N,Harel D

    update_date: 2016-08-24 00:00:00

  • An improved method for identifying functionally linked proteins using phylogenetic profiles.

    abstract:BACKGROUND:Phylogenetic profiles record the occurrence of homologs of genes across fully sequenced organisms. Proteins with similar profiles are typically components of protein complexes or metabolic pathways. Various existing methods measure similarity between two profiles and, hence, the likelihood that the two prote...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-8-S4-S7

    authors: Cokus S,Mizutani S,Pellegrini M

    update_date: 2007-05-22 00:00:00

  • Sequence-based identification of recombination spots using pseudo nucleic acid representation and recursive feature extraction by linear kernel SVM.

    abstract:BACKGROUND:Identification of the recombination hot/cold spots is critical for understanding the mechanism of recombination as well as the genome evolution process. However, experimental identification of recombination spots is both time-consuming and costly. Developing an accurate and automated method for reliably and ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-15-340

    authors: Li L,Yu S,Xiao W,Li Y,Huang L,Zheng X,Zhou S,Yang H

    update_date: 2014-11-20 00:00:00

  • Critique of the pairwise method for estimating qPCR amplification efficiency: beware of correlated data!

    abstract:BACKGROUND:A recently proposed method for estimating qPCR amplification efficiency E analyzes fluorescence intensity ratios from pairs of points deemed to lie in the exponential growth region on the amplification curves for all reactions in a dilution series. This method suffers from a serious problem: The resulting ra...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03604-4

    authors: Tellinghuisen J

    update_date: 2020-07-08 00:00:00

  • Relation extraction between bacteria and biotopes from biomedical texts with attention mechanisms and domain-specific contextual representations.

    abstract:BACKGROUND:The Bacteria Biotope (BB) task is a biomedical relation extraction (RE) that aims to study the interaction between bacteria and their locations. This task is considered to pertain to fundamental knowledge in applied microbiology. Some previous investigations conducted the study by applying feature-based mode...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-3217-3

    authors: Jettakul A,Wichadakul D,Vateekul P

    update_date: 2019-12-03 00:00:00

  • PIGS: improved estimates of identity-by-descent probabilities by probabilistic IBD graph sampling.

    abstract:Identifying segments in the genome of different individuals that are identical-by-descent (IBD) is a fundamental element of genetics. IBD data is used for numerous applications including demographic inference, heritability estimation, and mapping disease loci. Simultaneous detection of IBD over multiple haplotypes has...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-16-S5-S9

    authors: Park DS,Baran Y,Hormozdiari F,Eng C,Torgerson DG,Burchard EG,Zaitlen N

    update_date: 2015-01-01 00:00:00

  • Linear space string correction algorithm using the Damerau-Levenshtein distance.

    abstract:BACKGROUND:The Damerau-Levenshtein (DL) distance metric has been widely used in the biological sciences. It identifies similar regions of DNA, RNA and protein sequences by transforming one sequence into another using substitution, insertion, deletion and transposition operations. Lowrance and Wagner have ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-3184-8

    authors: Zhao C,Sahni S

    update_date: 2020-12-09 00:00:00

  • Logical development of the cell ontology.

    abstract:BACKGROUND:The Cell Ontology (CL) is an ontology for the representation of in vivo cell types. As biological ontologies such as the CL grow in complexity, they become increasingly difficult to use and maintain. By making the information in the ontology computable, we can use automated reasoners to detect errors and ass...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-12-6

    authors: Meehan TF,Masci AM,Abdulla A,Cowell LG,Blake JA,Mungall CJ,Diehl AD

    update_date: 2011-01-05 00:00:00

  • Hierarchical modularity of nested bow-ties in metabolic networks.

    abstract:BACKGROUND:The exploration of the structural topology and the organizing principles of genome-based large-scale metabolic networks is essential for studying possible relations between structure and functionality of metabolic networks. Topological analysis of graph models has often been applied to study the structural c...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-7-386

    authors: Zhao J,Yu H,Luo JH,Cao ZW,Li YX

    update_date: 2006-08-18 00:00:00