MeSHHeading2vec: a new method for representing MeSH headings as vectors based on graph embedding algorithm.

Abstract:

Effectively representing Medical Subject Headings (MeSH) headings (terms), such as diseases and drugs, as discriminative vectors could greatly improve the performance of downstream computational prediction models. However, these terms are often abstract and difficult to quantify. In this paper, we converted the MeSH tree structure into a relationship network and applied several graph embedding algorithms to it to represent these terms. Specifically, the relationship network consists of nodes (MeSH headings) and edges (relationships), and can be constructed from the tree numbers. Five graph embedding algorithms (DeepWalk, LINE, SDNE, LAP and HOPE) were then applied to the relationship network to represent MeSH headings as vectors. To evaluate the proposed methods, we carried out node classification and relationship prediction tasks. The results show that MeSH headings characterized by graph embedding algorithms can not only serve as an independent representation, but can also be used as additional information to enhance the representation ability of other vectors. Thus, they can serve as input to computational models related to diseases, drugs, microbes, etc. In addition, we hope our method will inspire researchers to study the representation of terms from this network perspective.
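
The abstract does not spell out how the relationship network is derived from tree numbers, so the following is only a minimal sketch of one plausible reading: a heading is linked to the heading that owns the parent of each of its tree numbers, where the parent is obtained by dropping the last dot-separated segment. The function names, the toy heading labels (A to D) and the toy tree numbers (X01, X02, ...) are hypothetical and are not taken from the paper or from MeSH itself.

```python
def parent_tree_number(tree_number):
    """Return the parent tree number, or None for a top-level node."""
    head, sep, _ = tree_number.rpartition(".")
    return head if sep else None


def build_edges(heading_to_tree_numbers):
    """Link each heading to the heading that owns its parent tree number."""
    # Map every tree number to the heading it belongs to.
    owner = {}
    for heading, numbers in heading_to_tree_numbers.items():
        for number in numbers:
            owner[number] = heading

    # One directed (parent, child) edge per parent/child tree-number pair.
    edges = set()
    for number, child in owner.items():
        parent = parent_tree_number(number)
        if parent in owner and owner[parent] != child:
            edges.add((owner[parent], child))
    return sorted(edges)


# Toy input (hypothetical): heading C appears in two branches of the tree.
toy = {
    "A": ["X01"],
    "B": ["X01.100"],
    "C": ["X01.100.200", "X02.300"],
    "D": ["X02"],
}
edges = build_edges(toy)
print(edges)
```

Because a heading may carry several tree numbers, the result is a general graph rather than a strict tree. An edge list of this kind is the usual input format for graph embedding methods such as those named in the abstract, though the authors' actual preprocessing may differ.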

journal_name

Brief Bioinform

authors

Guo ZH,You ZH,Huang DS,Yi HC,Zheng K,Chen ZH,Wang YB

doi

10.1093/bib/bbaa037

subject

Has Abstract

pub_date

2020-03-31 00:00:00

eissn

1477-4054

issn

1467-5463

pii

5813844

pub_type

Journal Article
  • Accounting for differential variability in detecting differentially methylated regions.

    abstract::DNA methylation plays an essential role in cancer. Differential variability (DV) in cancer was recently observed that contributes to cancer heterogeneity and has been shown to be crucial in detecting epigenetic field defects, DNA methylation alterations happening early in carcinogenesis. As neighboring CpG sites are h...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbx097

    authors: Wang Y,Teschendorff AE,Widschwendter M,Wang S

    update_date:2019-01-18 00:00:00

  • Deep learning for brain disorders: from data processing to disease treatment.

    abstract::In order to reach precision medicine and improve patients' quality of life, machine learning is increasingly used in medicine. Brain disorders are often complex and heterogeneous, and several modalities such as demographic, clinical, imaging, genetics and environmental data have been studied to improve their understan...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbaa310

    authors: Burgos N,Bottani S,Faouzi J,Thibeau-Sutre E,Colliot O

    update_date:2020-12-15 00:00:00

  • Proteome-scale analysis of phase-separated proteins in immunofluorescence images.

    abstract::Phase separation is an important mechanism that mediates the spatial distribution of proteins in different cellular compartments. While phase-separated proteins share certain sequence characteristics, including intrinsically disordered regions (IDRs) and prion-like domains, such characteristics are insufficient for ma...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbaa187

    authors: Yu C,Shen B,You K,Huang Q,Shi M,Wu C,Chen Y,Zhang C,Li T

    update_date:2020-09-02 00:00:00

  • TOD-CUP: a gene expression rank-based majority vote algorithm for tissue origin diagnosis of cancers of unknown primary.

    abstract::Gene expression profiling holds great potential as a new approach to histological diagnosis and precision medicine of cancers of unknown primary (CUP). Batch effects and different data types greatly decrease the predictive performance of biomarker-based algorithms, and few methods have been widely applied to identify ...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbaa031

    authors: Shen Y,Chu Q,Yin X,He Y,Bai P,Wang Y,Fang W,Timko MP,Fan L,Jiang W

    update_date:2020-04-08 00:00:00

  • Comparative study of computational methods to detect the correlated reaction sets in biochemical networks.

    abstract::Correlated reaction sets (Co-Sets) are mathematically defined modules in biochemical reaction networks which facilitate the study of biological processes by decomposing complex reaction networks into conceptually simple units. According to the degree of association, Co-Sets can be classified into three types: perfect,...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbp068

    authors: Xi Y,Chen YP,Qian C,Wang F

    update_date:2011-03-01 00:00:00

  • LARMD: integration of bioinformatic resources to profile ligand-driven protein dynamics with a case on the activation of estrogen receptor.

    abstract::Protein dynamics is central to all biological processes, including signal transduction, cellular regulation and biological catalysis. Among them, in-depth exploration of ligand-driven protein dynamics contributes to an optimal understanding of protein function, which is particularly relevant to drug discovery. Hence, ...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbz141

    authors: Yang JF,Wang F,Chen YZ,Hao GF,Yang GF

    update_date:2020-12-01 00:00:00

  • Investigating microRNA-mediated regulation of the nascent nuclear transcripts in plants: a bioinformatics workflow.

    abstract::Most of the microRNAs (miRNAs) play their regulatory roles through posttranscriptional target decay or translational inhibition. For both plants and animals, these regulatory events were previously considered to take place in cytoplasm, as mature miRNAs were observed to be exported to the cytoplasm for Argonaute (AGO)...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbx069

    authors: Yu D,Tang Z,Shao C,Ma X,Xiang T,Fan Z,Wang H,Meng Y

    update_date:2018-11-27 00:00:00

  • Understanding the unimodal distributions of cancer occurrence rates: it takes two factors for a cancer to occur.

    abstract::Data from the SEER reports reveal that the occurrence rate of a cancer type generally follows a unimodal distribution over age, peaking at an age that is cancer-type specific and ranges from 30+ through 70+. Previous studies attribute such bell-shaped distributions to the reduced proliferative potential in senior year...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbaa349

    authors: Qiu S,An Z,Tan R,He PA,Jing J,Li H,Wu S,Xu Y

    update_date:2020-12-30 00:00:00

  • Bioinformatics tools and challenges in structural analysis of lipidomics MS/MS data.

    abstract::Lipidomics, the systematic study of the lipid composition of a cell or tissue, is an invaluable complement to knowledge gained by genomics and proteomics research. Mass spectrometry provides a means to detect hundreds of lipids in parallel, and this includes low abundance species of lipids. Nevertheless, frequently oc...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbs030

    authors: Hartler J,Tharakan R,Köfeler HC,Graham DR,Thallinger GG

    update_date:2013-05-01 00:00:00

  • Bioinformatics resources for SARS-CoV-2 discovery and surveillance.

    abstract::In early January 2020, the novel coronavirus (SARS-CoV-2) responsible for a pneumonia outbreak in Wuhan, China, was identified using next-generation sequencing (NGS) and readily available bioinformatics pipelines. In addition to virus discovery, these NGS technologies and bioinformatics resources are currently being e...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbaa386

    authors: Hu T,Li J,Zhou H,Li C,Holmes EC,Shi W

    update_date:2021-01-08 00:00:00

  • Using prior knowledge from cellular pathways and molecular networks for diagnostic specimen classification.

    abstract::For many complex diseases, an earlier and more reliable diagnosis is considered a key prerequisite for developing more effective therapies to prevent or delay disease progression. Classical statistical learning approaches for specimen classification using omics data, however, often cannot provide diagnostic models wit...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbv044

    authors: Glaab E

    update_date:2016-05-01 00:00:00

  • Cloud 3D-QSAR: a web tool for the development of quantitative structure-activity relationship models in drug discovery.

    abstract::Effective drug discovery contributes to the treatment of numerous diseases but is limited by high costs and long cycles. The Quantitative Structure-Activity Relationship (QSAR) method was introduced to evaluate the activity of a large number of compounds virtually, reducing the time and labor costs required for chemic...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbaa276

    authors: Wang YL,Wang F,Shi XX,Jia CY,Wu FX,Hao GF,Yang GF

    update_date:2020-11-03 00:00:00

  • Empirical comparison and analysis of web-based cell-penetrating peptide prediction tools.

    abstract::Cell-penetrating peptides (CPPs) facilitate the delivery of therapeutically relevant molecules, including DNA, proteins and oligonucleotides, into cells both in vitro and in vivo. This unique ability explores the possibility of CPPs as therapeutic delivery and its potential applications in clinical therapy. Over the l...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bby124

    authors: Su R,Hu J,Zou Q,Manavalan B,Wei L

    update_date:2020-03-23 00:00:00

  • Comprehensive characterization of tissue-specific circular RNAs in the human and mouse genomes.

    abstract::Circular RNA (circRNA) is a group of RNA family generated by RNA circularization, which was discovered ubiquitously across different species and tissues. However, there is no global view of tissue specificity for circRNAs to date. Here we performed the comprehensive analysis to characterize the features of human and m...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbw081

    authors: Xia S,Feng J,Lei L,Hu J,Xia L,Wang J,Xiang Y,Liu L,Zhong S,Han L,He C

    update_date:2017-11-01 00:00:00

  • GenoPheno: cataloging large-scale phenotypic and next-generation sequencing data within human datasets.

    abstract::Precision medicine promises to revolutionize treatment, shifting therapeutic approaches from the classical one-size-fits-all to those more tailored to the patient's individual genomic profile, lifestyle and environmental exposures. Yet, to advance precision medicine's main objective-ensuring the optimum diagnosis, tre...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbaa033

    authors: Gutiérrez-Sacristán A,De Niz C,Kothari C,Kong SW,Mandl KD,Avillach P

    update_date:2021-01-18 00:00:00

  • AlzRiskMR database: an online database for the impact of exposure factors on Alzheimer's disease.

    abstract::In view of great difficulties in the pathogenesis analysis of Alzheimer's disease (AD) presently, profiling the modifiable risk factors is crucial for early detection and intervention of AD. However, the causal associations among them have yet to be identified, and the effective integration and application of these da...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbaa213

    authors: Wang Z,Meng L,Liu H,Shen L,Ji HF

    update_date:2020-09-21 00:00:00

  • Computational aspects of host-parasite phylogenies.

    abstract::Computational aspects of host-parasite phylogenies form part of a set of general associations between areas and organisms, hosts and parasites, and species and genes. The problem is not new and the commonalities of exploring vicariance biogeography (organisms tracking areas) and host-parasite co-speciation (parasites ...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article, Review

    doi:10.1093/bib/5.4.339

    authors: Stevens J

    update_date:2004-12-01 00:00:00

  • Conceptual and computational framework for logical modelling of biological networks deregulated in diseases.

    abstract::Mathematical models can serve as a tool to formalize biological knowledge from diverse sources, to investigate biological questions in a formal way, to test experimental hypotheses, to predict the effect of perturbations and to identify underlying mechanisms. We present a pipeline of computational tools that performs ...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbx163

    authors: Montagud A,Traynard P,Martignetti L,Bonnet E,Barillot E,Zinovyev A,Calzone L

    update_date:2019-07-19 00:00:00

  • A feature-based approach to predict hot spots in protein-DNA binding interfaces.

    abstract::DNA-binding hot spot residues of proteins are dominant and fundamental interface residues that contribute most of the binding free energy of protein-DNA interfaces. As experimental methods for identifying hot spots are expensive and time consuming, computational approaches are urgently required in predicting hot spots...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbz037

    authors: Zhang S,Zhao L,Zheng CH,Xia J

    update_date:2020-05-21 00:00:00

  • Comparison of software packages for detecting differential expression in RNA-seq studies.

    abstract::RNA-sequencing (RNA-seq) has rapidly become a popular tool to characterize transcriptomes. A fundamental research problem in many RNA-seq studies is the identification of reliable molecular markers that show differential expression between distinct sample groups. Together with the growing popularity of RNA-seq, a numb...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbt086

    authors: Seyednasrollah F,Laiho A,Elo LL

    update_date:2015-01-01 00:00:00

  • Result verification, code verification and computation of support values in phylogenetics.

    abstract::Verification in phylogenetics represents an extremely difficult subject. Phylogenetic analysis deals with the reconstruction of evolutionary histories of species, and as long as mankind is not able to travel in time, it will not be possible to verify deep evolutionary histories reconstructed with modern computational ...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbq079

    authors: Stamatakis A,Izquierdo-Carrasco F

    update_date:2011-05-01 00:00:00

  • HITS-PR-HHblits: protein remote homology detection by combining PageRank and Hyperlink-Induced Topic Search.

    abstract::As one of the most important fundamental problems in protein sequence analysis, protein remote homology detection is critical for both theoretical research (protein structure and function studies) and real world applications (drug design). Although several computational predictors have been proposed, their detection p...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bby104

    authors: Liu B,Jiang S,Zou Q

    update_date:2018-11-07 00:00:00

  • Development of biomarker classifiers from high-dimensional data.

    abstract::Recent development of high-throughput technology has accelerated interest in the development of molecular biomarker classifiers for safety assessment, disease diagnostics and prognostics, and prediction of response for patient assignment. This article reviews and evaluates some important aspects and key issues in the ...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article, Review

    doi:10.1093/bib/bbp016

    authors: Baek S,Tsai CA,Chen JJ

    update_date:2009-09-01 00:00:00

  • Advanced bioinformatics methods for practical applications in proteomics.

    abstract::Mass spectrometry (MS)-based proteomics has undergone rapid advancements in recent years, creating challenging problems for bioinformatics. We focus on four aspects where bioinformatics plays a crucial role (and proteomics is needed for clinical application): peptide-spectra matching (PSM) based on the new data-indepe...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbx128

    authors: Goh WWB,Wong L

    update_date:2019-01-18 00:00:00

  • The mechanistic, diagnostic and therapeutic novel nucleic acids for hepatocellular carcinoma emerging in past score years.

    abstract::Despite The Central Dogma states the destiny of gene as 'DNA makes RNA and RNA makes protein', the nucleic acids not only store and transmit genetic information but also, surprisingly, join in intracellular vital movement as a regulator of gene expression. Bioinformatics has contributed to knowledge for a series of em...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbaa023

    authors: Zhang S,Zhou Y,Wang Y,Wang Z,Xiao Q,Zhang Y,Lou Y,Qiu Y,Zhu F

    update_date:2020-04-06 00:00:00

  • Multiple Testing of Gene Sets from Gene Ontology: Possibilities and Pitfalls.

    abstract::The use of multiple testing procedures in the context of gene-set testing is an important but relatively underexposed topic. If a multiple testing method is used, this is usually a standard familywise error rate (FWER) or false discovery rate (FDR) controlling procedure in which the logical relationships that exist be...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbv091

    authors: Meijer RJ,Goeman JJ

    update_date:2016-09-01 00:00:00

  • Exon array data analysis using Affymetrix power tools and R statistical software.

    abstract::The use of microarray technology to measure gene expression on a genome-wide scale has been well established for more than a decade. Methods to process and analyse the vast quantity of expression data generated by a typical microarray experiment are similarly well-established. The Affymetrix Exon 1.0 ST array is a rel...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbq086

    authors: Lockstone HE

    update_date:2011-11-01 00:00:00

  • Systematic review of computational methods for identifying miRNA-mediated RNA-RNA crosstalk.

    abstract::Posttranscriptional crosstalk and communication between RNAs yield large regulatory competing endogenous RNA (ceRNA) networks via shared microRNAs (miRNAs), as well as miRNA synergistic networks. The ceRNA crosstalk represents a novel layer of gene regulation that controls both physiological and pathological processes...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbx137

    authors: Li Y,Jin X,Wang Z,Li L,Chen H,Lin X,Yi S,Zhang Y,Xu J

    update_date:2019-07-19 00:00:00

  • The Beta Workbench: a computational tool to study the dynamics of biological systems.

    abstract::We introduce the Beta Workbench (BWB), a scalable tool built on top of the newly defined BlenX language to model, simulate and analyse biological systems. We show the features and the incremental modelling process supported by the BWB on a running example based on the mitogen-activated kinase pathway. Finally, we prov...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbn023

    authors: Dematté L,Priami C,Romanel A

    update_date:2008-09-01 00:00:00

  • A proteogenomic approach to understand splice isoform functions through sequence and expression-based computational modeling.

    abstract::The products of multi-exon genes are a mixture of alternatively spliced isoforms, from which the translated proteins can have similar, different or even opposing functions. It is therefore essential to differentiate and annotate functions for individual isoforms. Computational approaches provide an efficient complemen...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbv109

    authors: Li HD,Omenn GS,Guan Y

    update_date:2016-11-01 00:00:00