LGscore: A method to identify disease-related genes using biological literature and Google data.


Since the genome project began in the 1990s, numerous gene-related studies have been conducted, and researchers have confirmed that genes are involved in disease. For this reason, identifying the relationships between diseases and genes is important in biology. We propose a method called LGscore, which identifies disease-related genes using Google data and literature data. To implement this method, we first construct a disease-related gene network from text-mining results. We then extract gene-gene interactions based on co-occurrences in abstracts obtained from PubMed and calculate the edge weights of the gene network by Z-scoring. Each weight combines two values: a frequency value extracted from the literature data and the number of Google search results. We then assign a score to each gene through network analysis, assuming that genes with many links, numerous Google search results, and high frequency values are more likely to be involved in disease. For validation, we checked the top 20 inferred genes for five different diseases against answer sets drawn from six databases that contain information on disease-gene relationships. We identified a significant number of disease-related genes as well as candidate genes for Alzheimer's disease, diabetes, colon cancer, lung cancer, and prostate cancer. Our method was up to 40% more accurate than existing methods.
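The pipeline described in the abstract (co-occurrence extraction, Z-scored edge weights, network-based gene scoring) could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the exact formulas, so adding the two Z-scores to form an edge weight and summing edge weights per gene are assumptions, and the function names and the `search_hits` mapping (pair of gene symbols to a search-result count) are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations
from statistics import mean, pstdev


def cooccurrence_counts(abstract_genes):
    """Count how often each unordered gene pair appears in the same abstract.

    abstract_genes: iterable of gene-symbol lists, one list per abstract.
    Pairs are stored as alphabetically sorted tuples.
    """
    counts = Counter()
    for genes in abstract_genes:
        for a, b in combinations(sorted(set(genes)), 2):
            counts[(a, b)] += 1
    return counts


def z_scores(values):
    """Standardize a dict of numbers; all zeros if there is no spread."""
    mu = mean(values.values())
    sigma = pstdev(values.values())
    if sigma == 0:
        return {key: 0.0 for key in values}
    return {key: (value - mu) / sigma for key, value in values.items()}


def score_genes(abstract_genes, search_hits):
    """Rank genes by the sum of the weights of their edges.

    Assumption: edge weight = z(frequency) + z(search hits).
    search_hits maps a sorted gene pair to a search-result count;
    missing pairs are treated as 0 hits.
    """
    freq = cooccurrence_counts(abstract_genes)
    hits = {pair: search_hits.get(pair, 0) for pair in freq}
    z_freq = z_scores(freq)
    z_hits = z_scores(hits)

    scores = defaultdict(float)
    for pair in freq:
        weight = z_freq[pair] + z_hits[pair]
        for gene in pair:
            scores[gene] += weight
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Because Z-scores can be negative, a gene with many weak links can score below a gene with one strong link under this simple sum; a fuller implementation would probably rescale the weights or count links separately to match the stated assumption that well-connected genes rank higher.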


J Biomed Inform


Kim J,Kim H,Yoon Y,Park S






2015-04-01 00:00:00












  • Use of morphological analysis in protein name recognition.

    abstract::Protein name recognition aims to detect each and every protein name appearing in a PubMed abstract. The task is not simple, as the graphic word boundary (space separator) assumed in conventional preprocessing does not necessarily coincide with the protein name boundary. Such boundary disagreement caused by tokenizati...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Yamamoto K,Kudo T,Konagaya A,Matsumoto Y

    update_date: 2004-12-01 00:00:00

  • Word sense disambiguation across two domains: biomedical literature and clinical notes.

    abstract::The aim of this study is to explore the word sense disambiguation (WSD) problem across two biomedical domains-biomedical literature and clinical notes. A supervised machine learning technique was used for the WSD task. One of the challenges addressed is the creation of a suitable clinical corpus with manual sense anno...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Savova GK,Coden AR,Sominsky IL,Johnson R,Ogren PV,de Groen PC,Chute CG

    update_date: 2008-12-01 00:00:00

  • The EU-ADR corpus: annotated drugs, diseases, targets, and their relationships.

    abstract::Corpora with specific entities and relationships annotated are essential to train and evaluate text-mining systems that are developed to extract specific structured information from a large corpus. In this paper we describe an approach where a named-entity recognition system produces a first annotation and annotators ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: van Mulligen EM,Fourrier-Reglat A,Gurwitz D,Molokhia M,Nieto A,Trifiro G,Kors JA,Furlong LI

    update_date: 2012-10-01 00:00:00

  • Consensus and Meta-analysis regulatory networks for combining multiple microarray gene expression datasets.

    abstract::Microarray data is a key source of experimental data for modelling gene regulatory interactions from expression levels. With the rapid increase of publicly available microarray data comes the opportunity to produce regulatory network models based on multiple datasets. Such models are potentially more robust with great...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article, Meta-Analysis


    authors: Steele E,Tucker A

    update_date: 2008-12-01 00:00:00

  • Predicting severe clinical events by learning about life-saving actions and outcomes using distant supervision.

    abstract::Medical error is a leading cause of patient death in the United States. Among the different types of medical errors, harm to patients caused by doctors missing early signs of deterioration is especially challenging to address due to the heterogeneity of patients' physiological patterns. In this study, we implemented r...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Lee DH,Yetisgen M,Vanderwende L,Horvitz E

    update_date: 2020-07-01 00:00:00

  • A model-driven methodology for exploring complex disease comorbidities applied to autism spectrum disorder and inflammatory bowel disease.

    abstract::We propose a model-driven methodology aimed to shed light on complex disorders. Our approach enables exploring shared etiologies of comorbid diseases at the molecular pathway level. The method, Comparative Comorbidities Simulation (CCS), uses stochastic Petri net simulation for examining the phenotypic effects of pert...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Somekh J,Peleg M,Eran A,Koren I,Feiglin A,Demishtein A,Shiloh R,Heiner M,Kong SW,Elazar Z,Kohane I

    update_date: 2016-10-01 00:00:00

  • Predicting the function of transplanted kidney in long-term care processes: Application of a hybrid model.

    abstract:BACKGROUND:A tool that can predict the estimated glomerular filtration rate (eGFR) in routine daily care can help clinicians to make better decisions for kidney transplant patients and to improve transplantation outcome. In this paper, we proposed a hybrid prediction model for predicting a future value for eGFR during ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Rashidi Khazaee P,Bagherzadeh M J,Niazkhani Z,Pirnejad H

    update_date: 2019-03-01 00:00:00

  • Towards an on-demand peer feedback system for a clinical knowledge base: a case study with order sets.

    abstract:OBJECTIVE:We have developed an automated knowledge base peer feedback system as part of an effort to facilitate the creation and refinement of sound clinical knowledge content within an enterprise-wide knowledge base. The program collects clinical data stored in our Clinical Data Repository during usage of a physician ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Hulse NC,Del Fiol G,Bradshaw RL,Roemer LK,Rocha RA

    update_date: 2008-02-01 00:00:00

  • HBLAST: Parallelised sequence similarity--A Hadoop MapReducable basic local alignment search tool.

    abstract::The recent exponential growth of genomic databases has resulted in the common task of sequence alignment becoming one of the major bottlenecks in the field of computational biology. It is typical for these large datasets and complex computations to require cost prohibitive High Performance Computing (HPC) to function....

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: O'Driscoll A,Belogrudov V,Carroll J,Kropp K,Walsh P,Ghazal P,Sleator RD

    update_date: 2015-04-01 00:00:00

  • Integrating cancer diagnosis terminologies based on logical definitions of SNOMED CT concepts.

    abstract::In oncology, the reuse of data is confronted with the heterogeneity of terminologies. It is necessary to semantically integrate these distinct terminologies. The semantic integration by using a third terminology as a support is a conventional approach for the integration of two terminologies that are not very structur...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Nikiema JN,Jouhet V,Mougin F

    update_date: 2017-10-01 00:00:00

  • Annotating longitudinal clinical narratives for de-identification: The 2014 i2b2/UTHealth corpus.

    abstract::The 2014 i2b2/UTHealth natural language processing shared task featured a track focused on the de-identification of longitudinal medical records. For this track, we de-identified a set of 1304 longitudinal medical records describing 296 patients. This corpus was de-identified under a broad interpretation of the HIPAA ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Stubbs A,Uzuner Ö

    update_date: 2015-12-01 00:00:00

  • Interacting agents through a web-based health serviceflow management system.

    abstract::The management of chronic and out-patients is a complex process which requires the cooperation of different agents belonging to several organizational units. Patients have to move to different locations to access the necessary services and to communicate their health status data. From their point of view there should ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Leonardi G,Panzarasa S,Quaglini S,Stefanelli M,van der Aalst WM

    update_date: 2007-10-01 00:00:00

  • Mapping high-dimensional data onto a relative distance plane--an exact method for visualizing and characterizing high-dimensional patterns.

    abstract::We introduce a distance (similarity)-based mapping for the visualization of high-dimensional patterns and their relative relationships. The mapping preserves exactly the original distances between points with respect to any two reference patterns in a special two-dimensional coordinate system, the relative distance pl...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Somorjai RL,Dolenko B,Demko A,Mandelzweig M,Nikulin AE,Baumgartner R,Pizzi NJ

    update_date: 2004-10-01 00:00:00

  • A Bayesian system to detect and characterize overlapping outbreaks.

    abstract::Outbreaks of infectious diseases such as influenza are a significant threat to human health. Because there are different strains of influenza which can cause independent outbreaks, and influenza can affect demographic groups at different rates and times, there is a need to recognize and characterize multiple outbreaks...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Aronis JM,Millett NE,Wagner MM,Tsui F,Ye Y,Ferraro JP,Haug PJ,Gesteland PH,Cooper GF

    update_date: 2017-09-01 00:00:00

  • A reference ontology for biomedical informatics: the Foundational Model of Anatomy.

    abstract::The Foundational Model of Anatomy (FMA), initially developed as an enhancement of the anatomical content of UMLS, is a domain ontology of the concepts and relationships that pertain to the structural organization of the human body. It encompasses the material objects from the molecular to the macroscopic levels that c...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Rosse C,Mejino JL Jr

    update_date: 2003-12-01 00:00:00

  • A novel web informatics approach for automated surveillance of cancer mortality trends.

    abstract::Cancer surveillance data are collected every year in the United States via the National Program of Cancer Registries (NPCR) and the Surveillance, Epidemiology and End Results (SEER) Program of the National Cancer Institute (NCI). General trends are closely monitored to measure the nation's progress against cancer. The...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Tourassi G,Yoon HJ,Xu S

    update_date: 2016-06-01 00:00:00

  • The impact of SNOMED CT revisions on a mapped interface terminology: terminology development and implementation issues.

    abstract::Large-scale mapping efforts have been done in attempts to migrate systems that use proprietary concepts to ones that use terminological standards such as SNOMED CT. As efforts move towards implementation, the target maps should retain a predictable structure including those targets requiring post-coordination of SNOME...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Wade G,Rosenbloom ST

    update_date: 2009-06-01 00:00:00

  • Comparison of orthogonal NLP methods for clinical phenotyping and assessment of bone scan utilization among prostate cancer patients.

    abstract:OBJECTIVE:Clinical care guidelines recommend that newly diagnosed prostate cancer patients at high risk for metastatic spread receive a bone scan prior to treatment and that low risk patients not receive it. The objective was to develop an automated pipeline to interrogate heterogeneous data to evaluate the use of bone...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Coquet J,Bozkurt S,Kan KM,Ferrari MK,Blayney DW,Brooks JD,Hernandez-Boussard T

    update_date: 2019-06-01 00:00:00

  • DyKOSMap: A framework for mapping adaptation between biomedical knowledge organization systems.

    abstract:BACKGROUND:Knowledge Organization Systems (KOS) and their associated mappings play a central role in several decision support systems. However, by virtue of knowledge evolution, KOS entities are modified over time, impacting mappings and potentially turning them invalid. This requires semi-automatic methods to maintain...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Dos Reis JC,Pruski C,Da Silveira M,Reynaud-Delaître C

    update_date: 2015-06-01 00:00:00

  • Analysis of microarray leukemia data using an efficient MapReduce-based K-nearest-neighbor classifier.

    abstract::Microarray-based gene expression profiling has emerged as an efficient technique for classification, prognosis, diagnosis, and treatment of cancer. Frequent changes in the behavior of this disease generate an enormous volume of data. Microarray data satisfies both the veracity and velocity properties of big data, as ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Kumar M,Rath NK,Rath SK

    update_date: 2016-04-01 00:00:00

  • A deep learning approach for predicting the quality of online health expert question-answering services.

    abstract::Recently, online health expert question-answering (HQA) services (systems) have attracted more and more health consumers to ask health-related questions everywhere at any time due to the convenience and effectiveness. However, the quality of answers in existing HQA systems varies in different situations. It is signifi...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Hu Z,Zhang Z,Yang H,Chen Q,Zuo D

    update_date: 2017-07-01 00:00:00

  • Automated annotation and classification of BI-RADS assessment from radiology reports.

    abstract::The Breast Imaging Reporting and Data System (BI-RADS) was developed to reduce variation in the descriptions of findings. Manual analysis of breast radiology report data is challenging but is necessary for clinical and healthcare quality assurance activities. The objective of this study is to develop a natural languag...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Castro SM,Tseytlin E,Medvedeva O,Mitchell K,Visweswaran S,Bekhuis T,Jacobson RS

    update_date: 2017-05-01 00:00:00

  • Development of a clinician reputation metric to identify appropriate problem-medication pairs in a crowdsourced knowledge base.

    abstract:BACKGROUND:Correlation of data within electronic health records is necessary for implementation of various clinical decision support functions, including patient summarization. A key type of correlation is linking medications to clinical problems; while some databases of problem-medication links are available, they are...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: McCoy AB,Wright A,Rogith D,Fathiamini S,Ottenbacher AJ,Sittig DF

    update_date: 2014-04-01 00:00:00

  • Selecting significant genes by randomization test for cancer classification using gene expression data.

    abstract::Gene selection is an important task in bioinformatics studies, because the accuracy of cancer classification generally depends upon the genes that have biological relevance to the classifying problems. In this work, randomization test (RT) is used as a gene selection method for dealing with gene expression data. In th...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Mao Z,Cai W,Shao X

    update_date: 2013-08-01 00:00:00

  • Evaluating performance of early warning indices to predict physiological instabilities.

    abstract::Patient monitoring algorithms that analyze multiple features from physiological signals can produce an index that serves as a predictive or prognostic measure for a specific critical health event or physiological instability. Classical detection metrics such as sensitivity and positive predictive value are often used ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Scully CG,Daluwatte C

    update_date: 2017-11-01 00:00:00

  • Monitoring Obstructive Sleep Apnea by means of a real-time mobile system based on the automatic extraction of sets of rules through Differential Evolution.

    abstract::Real-time Obstructive Sleep Apnea (OSA) episode detection and monitoring are important for society in terms of an improvement in the health of the general population and of a reduction in mortality and healthcare costs. Currently, to diagnose OSA, patients undergo PolySomnoGraphy (PSG), a complicated and invasive test ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Sannino G,De Falco I,De Pietro G

    update_date: 2014-06-01 00:00:00

  • A framework for modeling health behavior protocols and their linkage to behavioral theory.

    abstract::With the rise in chronic, behavior-related disease, computerized behavioral protocols (CBPs) that help individuals improve behaviors have the potential to play an increasing role in the future health of society. To be effective and widely used CBPs should be based on accepted behavioral theory. However, designing CBPs...

    journal_title:Journal of biomedical informatics

    pub_type: Clinical Trial, Journal Article


    authors: Lenert L,Norman GJ,Mailhot M,Patrick K

    update_date: 2005-08-01 00:00:00

  • Inductive creation of an annotation schema for manually indexing clinical conditions from emergency department reports.

    abstract::Evaluating automated indexing applications requires comparing automatically indexed terms against manual reference standard annotations. However, there are no standard guidelines for determining which words from a textual document to include in manual annotations, and the vague task can result in substantial variation...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Chapman WW,Dowling JN

    update_date: 2006-04-01 00:00:00

  • Classification of forensic autopsy reports through conceptual graph-based document representation model.

    abstract::Text categorization has been used extensively in recent years to classify plain-text clinical reports. This study employs text categorization techniques for the classification of open narrative forensic autopsy reports. One of the key steps in text classification is document representation. In document representation,...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Mujtaba G,Shuib L,Raj RG,Rajandram R,Shaikh K,Al-Garadi MA

    update_date: 2018-06-01 00:00:00

  • Using natural language processing to extract mammographic findings.

    abstract:OBJECTIVE:Structured data on mammographic findings are difficult to obtain without manual review. We developed and evaluated a rule-based natural language processing (NLP) system to extract mammographic findings from free-text mammography reports. MATERIALS AND METHODS:The NLP system extracted four mammographic findin...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Gao H,Aiello Bowles EJ,Carrell D,Buist DS

    update_date: 2015-04-01 00:00:00