Graph regularized L2,1-nonnegative matrix factorization for miRNA-disease association prediction.

Abstract:

BACKGROUND: The aberrant expression of microRNAs (miRNAs) is closely connected to the occurrence and development of many human diseases. To study these diseases, researchers have presented numerous effective computational models.

RESULTS: Here, we present a computational framework based on graph Laplacian regularized L2,1-nonnegative matrix factorization (GRL2,1-NMF) for inferring potential human disease-related miRNAs. First, manually validated disease-connected miRNAs were integrated, and miRNA functional similarity and two kinds of disease semantic similarity were calculated. Next, we computed Gaussian interaction profile (GIP) kernel similarities for both diseases and miRNAs. Then, we applied a preprocessing step, weighted K nearest known neighbours (WKNKN), to reduce the sparsity of the miRNA-disease association matrix. Finally, the GRL2,1-NMF framework was used to predict associations between miRNAs and diseases.

CONCLUSIONS: GRL2,1-NMF achieved AUC values of 0.9280 and 0.9276 in global leave-one-out cross validation (global LOOCV) and five-fold cross validation (5-CV), respectively, showing that it can effectively discover potential disease-related miRNAs, even when no associated disease is known.
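
The abstract names a Gaussian interaction profile (GIP) kernel similarity step but gives no formula. The following is a minimal sketch of the commonly used GIP definition, in which the similarity of two entities is exp(-gamma * squared distance between their association profiles) and gamma is a bandwidth divided by the mean squared profile norm. The function name `gip_kernel`, the `bandwidth` default of 1.0, and the toy matrix are illustrative assumptions, not details taken from the paper.

```python
import math

def gip_kernel(profiles, bandwidth=1.0):
    """GIP kernel between rows of a binary association matrix (hypothetical helper)."""
    n = len(profiles)
    # The mean squared norm of the profiles normalizes the kernel width.
    mean_sq_norm = sum(sum(v * v for v in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("association matrix has no known associations")
    gamma = bandwidth / mean_sq_norm
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            # Squared Euclidean distance between the two interaction profiles.
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = kernel[j][i] = math.exp(-gamma * sq_dist)
    return kernel

# Toy association matrix (made-up values): rows are miRNAs, columns are diseases.
toy = [
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
]
mirna_sim = gip_kernel(toy)                                   # miRNA-miRNA similarity
disease_sim = gip_kernel([list(col) for col in zip(*toy)])    # disease-disease similarity
```

Applying the same function to the transposed matrix gives the disease-side similarity, so one helper covers both kernels mentioned in the abstract.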

journal_name

BMC Bioinformatics

authors

Gao Z,Wang YT,Wu QW,Ni JC,Zheng CH

doi

10.1186/s12859-020-3409-x

pub_date

2020-02-18 00:00:00

pages

61

issue

1

issn

1471-2105

journal_volume

21

pub_type

Journal Article

  • The identification of informative genes from multiple datasets with increasing complexity.

    abstract:BACKGROUND:In microarray data analysis, factors such as data quality, biological variation, and the increasingly multi-layered nature of more complex biological systems complicate the modelling of regulatory networks that can represent and capture the interactions among genes. We believe that the use of multiple datas...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-32

    authors: Anvar SY,'t Hoen PA,Tucker A

    update_date:2010-01-15 00:00:00

  • A semi-supervised learning approach to predict synthetic genetic interactions by combining functional and topological properties of functional gene network.

    abstract:BACKGROUND:Genetic interaction profiles are highly informative and helpful for understanding the functional linkages between genes, and therefore have been extensively exploited for annotating gene functions and dissecting specific pathway structures. However, our understanding is rather limited to the relationship bet...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-343

    authors: You ZH,Yin Z,Han K,Huang DS,Zhou X

    update_date:2010-06-24 00:00:00

  • Analysis of cancer metabolism with high-throughput technologies.

    abstract:BACKGROUND:Recent advances in genomics and proteomics have allowed us to study the nuances of the Warburg effect--a long-standing puzzle in cancer energy metabolism--at an unprecedented level of detail. While modern next-generation sequencing technologies are extremely powerful, the lack of appropriate data analysis to...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-12-S10-S8

    authors: Markovets AA,Herman D

    update_date:2011-10-18 00:00:00

  • Optimal neighborhood indexing for protein similarity search.

    abstract:BACKGROUND:Similarity inference, one of the main bioinformatics tasks, has to face an exponential growth of the biological data. A classical approach used to cope with this data flow involves heuristics with large seed indexes. In order to speed up this technique, the index can be enhanced by storing additional informa...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-534

    authors: Peterlongo P,Noé L,Lavenier D,Nguyen VH,Kucherov G,Giraud M

    update_date:2008-12-16 00:00:00

  • Determining gene expression on a single pair of microarrays.

    abstract:BACKGROUND:In microarray experiments the numbers of replicates are often limited due to factors such as cost, availability of sample or poor hybridization. There are currently few choices for the analysis of a pair of microarrays where N = 1 in each condition. In this paper, we demonstrate the effectiveness of a new al...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-489

    authors: Reid RW,Fodor AA

    update_date:2008-11-21 00:00:00

  • Enhancing HMM-based protein profile-profile alignment with structural features and evolutionary coupling information.

    abstract:BACKGROUND:Protein sequence profile-profile alignment is an important approach to recognizing remote homologs and generating accurate pairwise alignments. It plays an important role in protein sequence database search, protein structure prediction, protein function prediction, and phylogenetic analysis. RESULTS:In thi...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-15-252

    authors: Deng X,Cheng J

    update_date:2014-07-25 00:00:00

  • Enrichment of homologs in insignificant BLAST hits by co-complex network alignment.

    abstract:BACKGROUND:Homology is a crucial concept in comparative genomics. The algorithm probably most widely used for homology detection in comparative genomics, is BLAST. Usually a stringent score cutoff is applied to distinguish putative homologs from possible false positive hits. As a consequence, some BLAST hits are discar...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-86

    authors: Fokkens L,Botelho SM,Boekhorst J,Snel B

    update_date:2010-02-12 00:00:00

  • A benchmark study of sequence alignment methods for protein clustering.

    abstract:BACKGROUND:Protein sequence alignment analyses have become a crucial step for many bioinformatics studies during the past decades. Multiple sequence alignment (MSA) and pair-wise sequence alignment (PSA) are two major approaches in sequence alignment. Former benchmark studies revealed drawbacks of MSA methods on nucleo...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-018-2524-4

    authors: Wang Y,Wu H,Cai Y

    update_date:2018-12-31 00:00:00

  • CNV-WebStore: online CNV analysis, storage and interpretation.

    abstract:BACKGROUND:Microarray technology allows the analysis of genomic aberrations at an ever increasing resolution, making functional interpretation of these vast amounts of data the main bottleneck in routine implementation of high resolution array platforms, and emphasising the need for a centralised and easy to use CNV da...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-12-4

    authors: Vandeweyer G,Reyniers E,Wuyts W,Rooms L,Kooy RF

    update_date:2011-01-05 00:00:00

  • Shared data science infrastructure for genomics data.

    abstract:BACKGROUND:Creating a scalable computational infrastructure to analyze the wealth of information contained in data repositories is difficult due to significant barriers in organizing, extracting and analyzing relevant data. Shared data science infrastructures like Boag are needed to efficiently process and parse data co...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-2967-2

    authors: Bagheri H,Muppirala U,Masonbrink RE,Severin AJ,Rajan H

    update_date:2019-08-22 00:00:00

  • Knowledge discovery of drug data on the example of adverse reaction prediction.

    abstract:BACKGROUND:Antibiotics are the widely prescribed drugs for children and most likely to be related with adverse reactions. Record on adverse reactions and allergies from antibiotics considerably affect the prescription choices. We consider this a biomedical decision-making problem and explore hidden knowledge in survey ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-15-S6-S7

    authors: Yildirim P,Majnarić L,Ekmekci O,Holzinger A

    update_date:2014-01-01 00:00:00

  • GlyStruct: glycation prediction using structural properties of amino acid residues.

    abstract:BACKGROUND:Glycation is one of the post-translational modifications (PTM) where sugar molecules and residues in protein sequences are covalently bonded. It has become one of the clinically important PTM in recent times attributed to many chronic and age related complications. Being a non-enzymatic reaction, it is a g...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-018-2547-x

    authors: Reddy HM,Sharma A,Dehzangi A,Shigemizu D,Chandra AA,Tsunoda T

    update_date:2019-02-04 00:00:00

  • Bayesian semiparametric regression models to characterize molecular evolution.

    abstract:BACKGROUND:Statistical models and methods that associate changes in the physicochemical properties of amino acids with natural selection at the molecular level typically do not take into account the correlations between such properties. We propose a Bayesian hierarchical regression model with a generalization of the Di...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-13-278

    authors: Datta S,Rodriguez A,Prado R

    update_date:2012-10-30 00:00:00

  • NEAT: an efficient network enrichment analysis test.

    abstract:BACKGROUND:Network enrichment analysis is a powerful method, which allows to integrate gene enrichment analysis with the information on relationships between genes that is provided by gene networks. Existing tests for network enrichment analysis deal only with undirected networks, they can be computationally slow and a...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-016-1203-6

    authors: Signorelli M,Vinciotti V,Wit EC

    update_date:2016-09-05 00:00:00

  • Statistical modeling of biomedical corpora: mining the Caenorhabditis Genetic Center Bibliography for genes related to life span.

    abstract:BACKGROUND:The statistical modeling of biomedical corpora could yield integrated, coarse-to-fine views of biological phenomena that complement discoveries made from analysis of molecular sequence and profiling data. Here, the potential of such modeling is demonstrated by examining the 5,225 free-text items in the Caeno...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-7-250

    authors: Blei DM,Franks K,Jordan MI,Mian IS

    update_date:2006-05-08 00:00:00

  • Meta-analysis of breast cancer microarray studies in conjunction with conserved cis-elements suggest patterns for coordinate regulation.

    abstract:BACKGROUND:Gene expression measurements from breast cancer (BrCa) tumors are established clinical predictive tools to identify tumor subtypes, identify patients showing poor/good prognosis, and identify patients likely to have disease recurrence. However, diverse breast cancer datasets in conjunction with diagnostic cl...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-63

    authors: Smith DD,Saetrom P,Snøve O Jr,Lundberg C,Rivas GE,Glackin C,Larson GP

    update_date:2008-01-28 00:00:00

  • Assessing stationary distributions derived from chromatin contact maps.

    abstract:BACKGROUND:The spatial configuration of chromosomes is essential to various cellular processes, notably gene regulation, while architecture related alterations, such as translocations and gene fusions, are often cancer drivers. Thus, eliciting chromatin conformation is important, yet challenging due to compaction, dyna...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-3424-y

    authors: Segal MR,Fletez-Brant K

    update_date:2020-02-24 00:00:00

  • Bayesian mixture regression analysis for regulation of Pluripotency in ES cells.

    abstract:BACKGROUND:Observed levels of gene expression strongly depend on both activity of DNA binding transcription factors (TFs) and chromatin state through different histone modifications (HMs). In order to recover the functional relationship between local chromatin state, TF binding and observed levels of gene expression, r...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-3331-2

    authors: Aflakparast M,Geeven G,de Gunst MCM

    update_date:2020-01-02 00:00:00

  • Efficient prediction of human protein-protein interactions at a global scale.

    abstract:BACKGROUND:Our knowledge of global protein-protein interaction (PPI) networks in complex organisms such as humans is hindered by technical limitations of current methods. RESULTS:On the basis of short co-occurring polypeptide regions, we developed a tool called MP-PIPE capable of predicting a global human PPI network ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-014-0383-1

    authors: Schoenrock A,Samanfar B,Pitre S,Hooshyar M,Jin K,Phillips CA,Wang H,Phanse S,Omidi K,Gui Y,Alamgir M,Wong A,Barrenäs F,Babu M,Benson M,Langston MA,Green JR,Dehne F,Golshani A

    update_date:2014-12-10 00:00:00

  • Algal Functional Annotation Tool: a web-based analysis suite to functionally interpret large gene lists using integrated annotation and expression data.

    abstract:BACKGROUND:Progress in genome sequencing is proceeding at an exponential pace, and several new algal genomes are becoming available every year. One of the challenges facing the community is the association of protein sequences encoded in the genomes with biological function. While most genome assembly projects generate...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-12-282

    authors: Lopez D,Casero D,Cokus SJ,Merchant SS,Pellegrini M

    update_date:2011-07-12 00:00:00

  • A novel similarity-measure for the analysis of genetic data in complex phenotypes.

    abstract:BACKGROUND:Recent technological advances in DNA sequencing and genotyping have led to the accumulation of a remarkable quantity of data on genetic polymorphisms. However, the development of new statistical and computational tools for effective processing of these data has not been equally as fast. In particular, Machin...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-10-S6-S24

    authors: Lagani V,Montesanto A,Di Cianni F,Moreno V,Landi S,Conforti D,Rose G,Passarino G

    update_date:2009-06-16 00:00:00

  • Critique of the pairwise method for estimating qPCR amplification efficiency: beware of correlated data!

    abstract:BACKGROUND:A recently proposed method for estimating qPCR amplification efficiency E analyzes fluorescence intensity ratios from pairs of points deemed to lie in the exponential growth region on the amplification curves for all reactions in a dilution series. This method suffers from a serious problem: The resulting ra...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03604-4

    authors: Tellinghuisen J

    update_date:2020-07-08 00:00:00

  • ProbPS: a new model for peak selection based on quantifying the dependence of the existence of derivative peaks on primary ion intensity.

    abstract:BACKGROUND:The analysis of mass spectra suggests that the existence of derivative peaks is strongly dependent on the intensity of the primary peaks. Peak selection from tandem mass spectrum is used to filter out noise and contaminant peaks. It is widely accepted that a valid primary peak tends to have high intensity an...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-12-346

    authors: Zhang S,Wang Y,Bu D,Zhang H,Sun S

    update_date:2011-08-17 00:00:00

  • GOmotif: A web server for investigating the biological role of protein sequence motifs.

    abstract:BACKGROUND:Many proteins contain conserved sequence patterns (motifs) that contribute to their functionality. The process of experimentally identifying and validating novel protein motifs can be difficult, expensive, and time consuming. A means for helping to identify in advance the possible function of a novel motif i...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-12-379

    authors: Bristow F,He R,Van Domselaar G

    update_date:2011-09-26 00:00:00

  • Algorithm-driven artifacts in median polish summarization of microarray data.

    abstract:BACKGROUND:High-throughput measurement of transcript intensities using Affymetrix type oligonucleotide microarrays has produced a massive quantity of data during the last decade. Different preprocessing techniques exist to convert the raw signal intensities measured by these chips into gene expression estimates. Althou...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-553

    authors: Giorgi FM,Bolger AM,Lohse M,Usadel B

    update_date:2010-11-11 00:00:00

  • Meta-eQTL: a tool set for flexible eQTL meta-analysis.

    abstract:BACKGROUND:Increasing number of eQTL (Expression Quantitative Trait Loci) datasets facilitate genetics and systems biology research. Meta-analysis tools are in need to jointly analyze datasets of same or similar issue types to improve statistical power especially in trans-eQTL mapping. Meta-analysis framework is also n...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-014-0392-0

    authors: Di Narzo AF,Cheng H,Lu J,Hao K

    update_date:2014-11-28 00:00:00

  • Current approaches to gene regulatory network modelling.

    abstract:Many different approaches have been developed to model and simulate gene regulatory networks. We proposed the following categories for gene regulatory network models: network parts lists, network topology models, network control logic models, and dynamic models. Here we will describe some examples for each of these ca...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-8-S6-S9

    authors: Schlitt T,Brazma A

    update_date:2007-09-27 00:00:00

  • A format for databasing and comparison of AFLP fingerprint profiles.

    abstract:BACKGROUND:Amplified fragment length polymorphism (AFLP) is a PCR-based technique that involves restriction of genomic DNA followed by ligation of adaptors to the fragments generated and selective PCR amplification of a subset of these fragments. The amplified fragments are separated on a sequencing gel and visualized ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-4-7

    authors: Hong Y,Chuah A

    update_date:2003-02-25 00:00:00

  • Identification of exonic regions in DNA sequences using cross-correlation and noise suppression by discrete wavelet transform.

    abstract:BACKGROUND:The identification of protein coding regions (exons) in DNA sequences using signal processing techniques is an important component of bioinformatics and biological signal processing. In this paper, a new method is presented for the identification of exonic regions in DNA sequences. This method is based on th...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-12-430

    authors: Abbasi O,Rostami A,Karimian G

    update_date:2011-11-03 00:00:00

  • An automated framework for understanding structural variations in the binding grooves of MHC class II molecules.

    abstract:BACKGROUND:MHC/HLA class II molecules are important components of the immune system and play a critical role in processes such as phagocytosis. Understanding peptide recognition properties of the hundreds of MHC class II alleles is essential to appreciate determinants of antigenicity and ultimately to predict epitopes....

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-S1-S55

    authors: Yeturu K,Utriainen T,Kemp GJ,Chandra N

    update_date:2010-01-18 00:00:00