Graph regularized L2,1-nonnegative matrix factorization for miRNA-disease association prediction.

Abstract:

BACKGROUND: The aberrant expression of microRNAs (miRNAs) is closely connected to the occurrence and development of many human diseases. Researchers have presented numerous effective computational models for studying these diseases. RESULTS: Here, we present a computational framework based on graph Laplacian regularized L2,1-nonnegative matrix factorization (GRL2,1-NMF) for inferring possible human disease-connected miRNAs. First, manually validated disease-connected microRNAs were integrated, and microRNA functional similarity information along with two kinds of disease semantic similarities were calculated. Next, we measured Gaussian interaction profile (GIP) kernel similarities for both diseases and microRNAs. Then, we adopted a preprocessing step, namely, weighted K nearest known neighbours (WKNKN), to decrease the sparsity of the miRNA-disease association matrix. Finally, the GRL2,1-NMF framework was used to predict links between microRNAs and diseases. CONCLUSIONS: The new method (GRL2,1-NMF) achieved AUC values of 0.9280 and 0.9276 in global leave-one-out cross validation (global LOOCV) and five-fold cross validation (5-CV), respectively, showing that GRL2,1-NMF can effectively discover potential disease-related miRNAs, even if there is no known associated disease.
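
The GIP kernel step named in the abstract can be illustrated with a minimal sketch. It uses the commonly cited form K(i, j) = exp(-gamma * ||A_i - A_j||^2), with gamma normalised by the mean squared norm of the interaction profiles. The function names, the `gamma_prime` parameter, and the toy matrix are assumptions made for illustration and are not taken from the paper. The row-wise L2,1 norm shown is one common convention, and the paper's exact definition may differ.

```python
import numpy as np


def gip_kernel(A, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of A.

    A is a binary association matrix; each row is one entity's
    interaction profile. gamma_prime is a hypothetical bandwidth
    parameter (1.0 is a frequently used default, assumed here).
    """
    A = np.asarray(A, dtype=float)
    sq_norms = np.sum(A * A, axis=1)      # squared norm of every profile
    mean_sq = sq_norms.mean()
    if mean_sq == 0:
        raise ValueError("association matrix contains no known associations")
    gamma = gamma_prime / mean_sq         # normalised bandwidth
    # Pairwise squared Euclidean distances via the expansion
    # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (A @ A.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative values
    return np.exp(-gamma * sq_dists)


def l21_norm(X):
    """Row-wise L2,1 norm: the sum of the Euclidean norms of the rows."""
    X = np.asarray(X, dtype=float)
    return float(np.sum(np.sqrt(np.sum(X * X, axis=1))))


# Toy association matrix: rows are miRNAs, columns are diseases.
A = np.array([[1, 0, 1],
              [1, 0, 1],
              [0, 1, 0]])
K_mirna = gip_kernel(A)      # miRNA-miRNA similarity
K_disease = gip_kernel(A.T)  # disease-disease similarity
```

In this toy example the first two miRNAs share an identical profile, so their kernel similarity is 1, while the third miRNA shares no disease with them and receives a lower score.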

journal_name

BMC Bioinformatics

authors

Gao Z,Wang YT,Wu QW,Ni JC,Zheng CH

doi

10.1186/s12859-020-3409-x

pub_date

2020-02-18 00:00:00

pages

61

issue

1

issn

1471-2105

pii

10.1186/s12859-020-3409-x

journal_volume

21

pub_type

Journal Article
  • Prediction of heart disease and classifiers' sensitivity analysis.

    abstract:BACKGROUND:Heart disease (HD) is one of the most common diseases nowadays, and an early diagnosis of such a disease is a crucial task for many health care providers to prevent such a disease in their patients and to save lives. In this paper, a comparative analysis of different classifiers was performed for the classi...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03626-y

    authors: Almustafa KM

    update_date:2020-07-02 00:00:00

  • FANTOM: Functional and taxonomic analysis of metagenomes.

    abstract:BACKGROUND:Interpretation of quantitative metagenomics data is important for our understanding of ecosystem functioning and assessing differences between various environmental samples. There is a need for an easy to use tool to explore the often complex metagenomics data in taxonomic and functional context. RESULTS:He...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-14-38

    authors: Sanli K,Karlsson FH,Nookaew I,Nielsen J

    update_date:2013-02-01 00:00:00

  • The identification of informative genes from multiple datasets with increasing complexity.

    abstract:BACKGROUND:In microarray data analysis, factors such as data quality, biological variation, and the increasingly multi-layered nature of more complex biological systems complicate the modelling of regulatory networks that can represent and capture the interactions among genes. We believe that the use of multiple datas...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-32

    authors: Anvar SY,'t Hoen PA,Tucker A

    update_date:2010-01-15 00:00:00

  • HTPheno: an image analysis pipeline for high-throughput plant phenotyping.

    abstract:BACKGROUND:In the last few years high-throughput analysis methods have become state-of-the-art in the life sciences. One of the latest developments is automated greenhouse systems for high-throughput plant phenotyping. Such systems allow the non-destructive screening of plants over a period of time by means of image ac...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-12-148

    authors: Hartmann A,Czauderna T,Hoffmann R,Stein N,Schreiber F

    update_date:2011-05-12 00:00:00

  • CGHpower: exploring sample size calculations for chromosomal copy number experiments.

    abstract:BACKGROUND:Determining a suitable sample size is an important step in the planning of microarray experiments. Increasing the number of arrays gives more statistical power, but adds to the total cost of the experiment. Several approaches for sample size determination have been developed for expression array studies, but...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-331

    authors: Scheinin I,Ferreira JA,Knuutila S,Meijer GA,van de Wiel MA,Ylstra B

    update_date:2010-06-17 00:00:00

  • Structural alignment of protein descriptors - a combinatorial model.

    abstract:BACKGROUND:Structural alignment of proteins is one of the most challenging problems in molecular biology. The tertiary structure of a protein strictly correlates with its function and computationally predicted structures are nowadays a main premise for understanding the latter. However, computationally derived 3D model...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-016-1237-9

    authors: Antczak M,Kasprzak M,Lukasiak P,Blazewicz J

    update_date:2016-09-17 00:00:00

  • VKCDB: voltage-gated potassium channel database.

    abstract:BACKGROUND:The family of voltage-gated potassium channels comprises a functionally diverse group of membrane proteins. They help maintain and regulate the potassium ion-based component of the membrane potential and are thus central to many critical physiological processes. VKCDB (Voltage-gated potassium [K] Channel Dat...

    journal_title:BMC bioinformatics

    pub_type: Journal Article, Review

    doi:10.1186/1471-2105-5-3

    authors: Li B,Gallin WJ

    update_date:2004-01-09 00:00:00

  • A novel substitution matrix fitted to the compositional bias in Mollicutes improves the prediction of homologous relationships.

    abstract:BACKGROUND:Substitution matrices are key parameters for the alignment of two protein sequences, and consequently for most comparative genomics studies. The composition of biological sequences can vary importantly between species and groups of species, and classical matrices such as those in the BLOSUM series fail to ac...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-12-457

    authors: Lemaitre C,Barré A,Citti C,Tardy F,Thiaucourt F,Sirand-Pugnet P,Thébault P

    update_date:2011-11-24 00:00:00

  • Blazing Signature Filter: a library for fast pairwise similarity comparisons.

    abstract:BACKGROUND:Identifying similarities between datasets is a fundamental task in data mining and has become an integral part of modern scientific investigation. Whether the task is to identify co-expressed genes in large-scale expression surveys or to predict combinations of gene knockouts which would elicit a similar phe...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-018-2210-6

    authors: Lee JY,Fujimoto GM,Wilson R,Wiley HS,Payne SH

    update_date:2018-06-11 00:00:00

  • MATLIGN: a motif clustering, comparison and matching tool.

    abstract:BACKGROUND:Sequence motifs representing transcription factor binding sites (TFBS) are commonly encoded as position frequency matrices (PFM) or degenerate consensus sequences (CS). These formats are used to represent the characterised TFBS profiles stored in transcription factor databases, as well as to represent the po...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-8-189

    authors: Kankainen M,Löytynoja A

    update_date:2007-06-08 00:00:00

  • Reverse engineering gene regulatory networks: coupling an optimization algorithm with a parameter identification technique.

    abstract:BACKGROUND:To infer gene regulatory networks from time series gene profiles, two important tasks that are related to biological systems must be undertaken. One task is to determine a valid network structure that has topological properties that can influence the network dynamics profoundly. The other task is to optimize...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-15-S15-S8

    authors: Hsiao YT,Lee WP

    update_date:2014-01-01 00:00:00

  • Amino acid sequence associated with bacteriophage recombination site helps to reveal genes potentially acquired through horizontal gene transfer.

    abstract:BACKGROUND:Horizontal gene transfer, i.e. the acquisition of genetic material from nonparent organism, is considered an important force driving species evolution. Many cases of horizontal gene transfer from prokaryotes to eukaryotes have been registered, but no transfer mechanism has been deciphered so far, although vi...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03599-y

    authors: Daugavet MA,Shabelnikov SV,Podgornaya OI

    update_date:2020-07-24 00:00:00

  • ICoVax 2013: the 3rd ISV Pre-conference Computational Vaccinology Workshop.

    abstract:Following last year's computational vaccinology workshop in Shanghai, China, the third ISV Pre-conference Computational Vaccinology Workshop (ICoVax 2013) was held in Barcelona, Spain. ICoVax 2013 provided an international platform for the attendees to showcase their research and discuss problems and solutions in the ...

    journal_title:BMC bioinformatics

    pub_type:

    doi:10.1186/1471-2105-15-S4-I1

    authors: De Groot AS,De Groot P,He Y

    update_date:2014-01-01 00:00:00

  • m6Acomet: large-scale functional prediction of individual m6A RNA methylation sites from an RNA co-methylation network.

    abstract:BACKGROUND:Over one hundred different types of post-transcriptional RNA modifications have been identified in humans. Researchers discovered that RNA modifications can regulate various biological processes, and RNA methylation, especially N6-methyladenosine, has become one of the most researched topics in epigenetics. ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-2840-3

    authors: Wu X,Wei Z,Chen K,Zhang Q,Su J,Liu H,Zhang L,Meng J

    update_date:2019-05-02 00:00:00

  • Predicting substrates of the human breast cancer resistance protein using a support vector machine method.

    abstract:BACKGROUND:Human breast cancer resistance protein (BCRP) is an ATP-binding cassette (ABC) efflux transporter that confers multidrug resistance in cancers and also plays an important role in the absorption, distribution and elimination of drugs. Prediction as to if drugs or new molecular entities are BCRP substrates sho...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-14-130

    authors: Hazai E,Hazai I,Ragueneau-Majlessi I,Chung SP,Bikadi Z,Mao Q

    update_date:2013-04-15 00:00:00

  • MZmine 2: modular framework for processing, visualizing, and analyzing mass spectrometry-based molecular profile data.

    abstract:BACKGROUND:Mass spectrometry (MS) coupled with online separation methods is commonly applied for differential and quantitative profiling of biological samples in metabolomic as well as proteomic research. Such approaches are used for systems biology, functional genomics, and biomarker discovery, among others. An ongoin...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-395

    authors: Pluskal T,Castillo S,Villar-Briones A,Oresic M

    update_date:2010-07-23 00:00:00

  • Qxpak.5: old mixed model solutions for new genomics problems.

    abstract:BACKGROUND:Mixed models have a long and fruitful history in statistics. They are pertinent to genomics problems because they are highly versatile, accommodating a wide variety of situations within the same theoretical and algorithmic framework. RESULTS:Qxpak is a package for versatile statistical genomics, specificall...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-12-202

    authors: Pérez-Enciso M,Misztal I

    update_date:2011-05-25 00:00:00

  • MapMi: automated mapping of microRNA loci.

    abstract:BACKGROUND:A large effort to discover microRNAs (miRNAs) has been under way. Currently miRBase is their primary repository, providing annotations of primary sequences, precursors and probable genomic loci. In many cases miRNAs are identical or very similar between related (or in some cases more distant) species. Howeve...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-133

    authors: Guerra-Assunção JA,Enright AJ

    update_date:2010-03-16 00:00:00

  • Alignment-free clustering of large data sets of unannotated protein conserved regions using minhashing.

    abstract:BACKGROUND:Clustering of protein sequences is of key importance in predicting the structure and function of newly sequenced proteins and is also of use for their annotation. With the advent of multiple high-throughput sequencing technologies, new protein sequences are becoming available at an extraordinary rate. The ra...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-018-2080-y

    authors: Abnousi A,Broschat SL,Kalyanaraman A

    update_date:2018-03-05 00:00:00

  • MIENTURNET: an interactive web tool for microRNA-target enrichment and network-based analysis.

    abstract:BACKGROUND:miRNAs regulate the expression of several genes with one miRNA able to target multiple genes and with one gene able to be simultaneously targeted by more than one miRNA. Therefore, it has become indispensable to shorten the long list of miRNA-target interactions to put in the spotlight in order to gain insig...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-3105-x

    authors: Licursi V,Conte F,Fiscon G,Paci P

    update_date:2019-11-04 00:00:00

  • Insertion and deletion correcting DNA barcodes based on watermarks.

    abstract:BACKGROUND:Barcode multiplexing is a key strategy for sharing the rising capacity of next-generation sequencing devices: synthetic DNA tags, called barcodes, are attached to natural DNA fragments within the library preparation procedure. Different libraries can individually be labeled with barcodes for a joint sequenc...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-015-0482-7

    authors: Kracht D,Schober S

    update_date:2015-02-18 00:00:00

  • Colony size measurement of the yeast gene deletion strains for functional genomics.

    abstract:BACKGROUND:Numerous functional genomics approaches have been developed to study the model organism yeast, Saccharomyces cerevisiae, with the aim of systematically understanding the biology of the cell. Some of these techniques are based on yeast growth differences under different conditions, such as those generated by ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-8-117

    authors: Memarian N,Jessulat M,Alirezaie J,Mir-Rashed N,Xu J,Zareie M,Smith M,Golshani A

    update_date:2007-04-04 00:00:00

  • MQAPRank: improved global protein model quality assessment by learning-to-rank.

    abstract:BACKGROUND:Protein structure prediction has achieved a lot of progress during the last few decades and a greater number of models for a certain sequence can be predicted. Consequently, assessing the qualities of predicted protein models in perspective is one of the key components of successful protein structure predict...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-017-1691-z

    authors: Jing X,Dong Q

    update_date:2017-05-25 00:00:00

  • CaPSID: a bioinformatics platform for computational pathogen sequence identification in human genomes and transcriptomes.

    abstract:BACKGROUND:It is now well established that nearly 20% of human cancers are caused by infectious agents, and the list of human oncogenic pathogens will grow in the future for a variety of cancer types. Whole tumor transcriptome and genome sequencing by next-generation sequencing technologies presents an unparalleled opp...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-13-206

    authors: Borozan I,Wilson S,Blanchette P,Laflamme P,Watt SN,Krzyzanowski PM,Sircoulomb F,Rottapel R,Branton PE,Ferretti V

    update_date:2012-08-17 00:00:00

  • Network-based group variable selection for detecting expression quantitative trait loci (eQTL).

    abstract:BACKGROUND:Analysis of expression quantitative trait loci (eQTL) aims to identify the genetic loci associated with the expression level of genes. Penalized regression with a proper penalty is suitable for the high-dimensional biological data. Its performance should be enhanced when we incorporate biological knowledge o...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-12-269

    authors: Wang W,Zhang X

    update_date:2011-06-30 00:00:00

  • A computational diffusion model to study antibody transport within reconstructed tumor microenvironments.

    abstract:BACKGROUND:Antibodies revolutionized cancer treatment over the past decades. Despite their successful application, there are still challenges to overcome to improve efficacy, such as the heterogeneous distribution of antibodies within tumors. Tumor microenvironment features, such as the distribution of tumor and othe...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03854-2

    authors: Cartaxo AL,Almeida J,Gualda EJ,Marsal M,Loza-Alvarez P,Brito C,Isidro IA

    update_date:2020-11-17 00:00:00

  • A study on multi-omic oscillations in Escherichia coli metabolic networks.

    abstract:BACKGROUND:Two important challenges in the analysis of molecular biology information are data (multi-omic information) integration and the detection of patterns across large scale molecular networks and sequences. They are actually coupled because the integration of omic information may provide better means to detec...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-018-2175-5

    authors: Bardozzo F,Lió P,Tagliaferri R

    update_date:2018-07-09 00:00:00

  • Fast and robust group-wise eQTL mapping using sparse graphical models.

    abstract:BACKGROUND:Genome-wide expression quantitative trait loci (eQTL) studies have emerged as a powerful tool to understand the genetic basis of gene expression and complex traits. The traditional eQTL methods focus on testing the associations between individual single-nucleotide polymorphisms (SNPs) and gene expression tra...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-014-0421-z

    authors: Cheng W,Shi Y,Zhang X,Wang W

    update_date:2015-01-16 00:00:00

  • Fast batch searching for protein homology based on compression and clustering.

    abstract:BACKGROUND:In the bioinformatics community, many tasks are associated with matching a set of protein query sequences in large sequence datasets. To conduct multiple queries in the database, a commonly used method is to run BLAST on each original query or on the concatenated queries. It is inefficient since it doesn't exploit the...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-017-1938-8

    authors: Ge H,Sun L,Yu J

    update_date:2017-11-21 00:00:00

  • Ferret: a sentence-based literature scanning system.

    abstract:BACKGROUND:The rapid pace of bioscience research makes it very challenging to track relevant articles in one's area of interest. MEDLINE, a primary source for biomedical literature, offers access to more than 20 million citations with three-quarters of a million new ones added each year. Thus it is not surprising to se...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-015-0630-0

    authors: Srinivasan P,Zhang XN,Bouten R,Chang C

    update_date:2015-06-20 00:00:00