RedMed: Extending drug lexicons for social media applications.

Abstract:

:Social media has been identified as a promising potential source of information for pharmacovigilance. The adoption of social media data has been hindered by the massive and noisy nature of the data. Initial attempts to use social media data have relied on exact text matches to drugs of interest, and therefore suffer from the gap between formal drug lexicons and the informal nature of social media. The Reddit comment archive represents an ideal corpus for bridging this gap. We trained a word embedding model, RedMed, to facilitate the identification and retrieval of health entities from Reddit data. We compare the performance of our model trained on a consumer-generated corpus against publicly available models trained on expert-generated corpora. Our automated classification pipeline achieves an accuracy of 0.88 and a specificity of >0.9 across four different term classes. Of all drug mentions, an average of 79% (±0.5%) were exact matches to a generic or trademark drug name, 14% (±0.5%) were misspellings, 6.4% (±0.3%) were synonyms, and 0.13% (±0.05%) were pill marks. We find that our system captures an additional 20% of mentions; these would have been missed by approaches that rely solely on exact string matches. We provide a lexicon of misspellings and synonyms for 2978 drugs and a word embedding model trained on a health-oriented subset of Reddit.

journal_name

J Biomed Inform

authors

Lavertu A,Altman RB

doi

10.1016/j.jbi.2019.103307

subject

Has Abstract

pub_date

2019-11-01 00:00:00

pages

103307

eissn

1532-0464

issn

1532-0480

pii

S1532-0464(19)30226-6

journal_volume

99

pub_type

杂志文章
  • Building a robust, scalable and standards-driven infrastructure for secondary use of EHR data: the SHARPn project.

    abstract::The Strategic Health IT Advanced Research Projects (SHARP) Program, established by the Office of the National Coordinator for Health Information Technology in 2010 supports research findings that remove barriers for increased adoption of health IT. The improvements envisioned by the SHARP Area 4 Consortium (SHARPn) wi...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2012.01.009

    authors: Rea S,Pathak J,Savova G,Oniki TA,Westberg L,Beebe CE,Tao C,Parker CG,Haug PJ,Huff SM,Chute CG

    更新日期:2012-08-01 00:00:00

  • A method and software framework for enriching private biomedical sources with data from public online repositories.

    abstract::Modern biomedical research relies on the semantic integration of heterogeneous data sources to find data correlations. Researchers access multiple datasets of disparate origin, and identify elements-e.g. genes, compounds, pathways-that lead to interesting correlations. Normally, they must refer to additional public da...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2016.02.004

    authors: Anguita A,García-Remesal M,Graf N,Maojo V

    更新日期:2016-04-01 00:00:00

  • A survey on literature based discovery approaches in biomedical domain.

    abstract::Literature Based Discovery (LBD) refers to the problem of inferring new and interesting knowledge by logically connecting independent fragments of information units through explicit or implicit means. This area of research, which incorporates techniques from Natural Language Processing (NLP), Information Retrieval and...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章,评审

    doi:10.1016/j.jbi.2019.103141

    authors: Gopalakrishnan V,Jha K,Jin W,Zhang A

    更新日期:2019-05-01 00:00:00

  • Quantifying semantic similarity of clinical evidence in the biomedical literature to facilitate related evidence synthesis.

    abstract:OBJECTIVE:Published clinical trials and high quality peer reviewed medical publications are considered as the main sources of evidence used for synthesizing systematic reviews or practicing Evidence Based Medicine (EBM). Finding all relevant published evidence for a particular medical case is a time and labour intensiv...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2019.103321

    authors: Hassanzadeh H,Nguyen A,Verspoor K

    更新日期:2019-12-01 00:00:00

  • The impact of SNOMED CT revisions on a mapped interface terminology: terminology development and implementation issues.

    abstract::Large-scale mapping efforts have been done in attempts to migrate systems that use proprietary concepts to ones that use terminological standards such as SNOMED CT. As efforts move towards implementation, the target maps should retain a predictable structure including those targets requiring post-coordination of SNOME...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2009.03.004

    authors: Wade G,Rosenbloom ST

    更新日期:2009-06-01 00:00:00

  • Cognitive simulators for medical education and training.

    abstract::Simulators for honing procedural skills (such as surgical skills and central venous catheter placement) have proven to be valuable tools for medical educators and students. While such simulations represent an effective paradigm in surgical education, there is an opportunity to add a layer of cognitive exercises to the...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2009.02.008

    authors: Kahol K,Vankipuram M,Smith ML

    更新日期:2009-08-01 00:00:00

  • Classification of forensic autopsy reports through conceptual graph-based document representation model.

    abstract::Text categorization has been used extensively in recent years to classify plain-text clinical reports. This study employs text categorization techniques for the classification of open narrative forensic autopsy reports. One of the key steps in text classification is document representation. In document representation,...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2018.04.013

    authors: Mujtaba G,Shuib L,Raj RG,Rajandram R,Shaikh K,Al-Garadi MA

    更新日期:2018-06-01 00:00:00

  • Computing with evidence Part II: An evidential approach to predicting metabolic drug-drug interactions.

    abstract::We describe a novel experiment that we conducted with the Drug Interaction Knowledge-base (DIKB) to determine which combinations of evidence enable a rule-based theory of metabolic drug-drug interactions to make the most optimal set of predictions. The focus of the experiment was a group of 16 drugs including six memb...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2009.05.010

    authors: Boyce R,Collins C,Horn J,Kalet I

    更新日期:2009-12-01 00:00:00

  • A Hidden Semi-Markov Model based approach for rehabilitation exercise assessment.

    abstract::In this paper, a Hidden Semi-Markov Model (HSMM) based approach is proposed to evaluate and monitor body motion during a rehabilitation training program. The approach extracts clinically relevant motion features from skeleton joint trajectories, acquired by the RGB-D camera, and provides a score for the subject's perf...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2017.12.012

    authors: Capecci M,Ceravolo MG,Ferracuti F,Iarlori S,Kyrki V,Monteriù A,Romeo L,Verdini F

    更新日期:2018-02-01 00:00:00

  • Decision-making model for early diagnosis of congestive heart failure using rough set and decision tree approaches.

    abstract::The accurate diagnosis of heart failure in emergency room patients is quite important, but can also be quite difficult due to our insufficient understanding of the characteristics of heart failure. The purpose of this study is to design a decision-making model that provides critical factors and knowledge associated wi...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2012.04.013

    authors: Son CS,Kim YN,Kim HS,Park HS,Kim MS

    更新日期:2012-10-01 00:00:00

  • Comparing image search behaviour in the ARRS GoldMiner search engine and a clinical PACS/RIS.

    abstract::Information search has changed the way we manage knowledge and the ubiquity of information access has made search a frequent activity, whether via Internet search engines or increasingly via mobile devices. Medical information search is in this respect no different and much research has been devoted to analyzing the w...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2015.04.013

    authors: De-Arteaga M,Eggel I,Do B,Rubin D,Kahn CE Jr,Müller H

    更新日期:2015-08-01 00:00:00

  • A model-driven methodology for exploring complex disease comorbidities applied to autism spectrum disorder and inflammatory bowel disease.

    abstract::We propose a model-driven methodology aimed to shed light on complex disorders. Our approach enables exploring shared etiologies of comorbid diseases at the molecular pathway level. The method, Comparative Comorbidities Simulation (CCS), uses stochastic Petri net simulation for examining the phenotypic effects of pert...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2016.08.008

    authors: Somekh J,Peleg M,Eran A,Koren I,Feiglin A,Demishtein A,Shiloh R,Heiner M,Kong SW,Elazar Z,Kohane I

    更新日期:2016-10-01 00:00:00

  • Using UMLS to construct a generalized hierarchical concept-based dictionary of brain functions for information extraction from the fMRI literature.

    abstract::With a rapid progress in the field, a great many fMRI studies are published every year, to the extent that it is now becoming difficult for researchers to keep up with the literature, since reading papers is extremely time-consuming and labor-intensive. Thus, automatic information extraction has become an important is...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2009.04.003

    authors: Hsiao MY,Chen CC,Chen JH

    更新日期:2009-10-01 00:00:00

  • Systematic comparison of the protein-protein interaction databases from a user's perspective.

    abstract::In absence of periodic systematic comparisons, biologists/bioinformaticians may be forced to make a subjective selection among the many protein-protein interaction (PPI) databases and tools. We conducted a comprehensive compilation and comparison of such resources. We compiled 375 PPI resources, short-listed 125 impor...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2020.103380

    authors: Bajpai AK,Davuluri S,Tiwary K,Narayanan S,Oguru S,Basavaraju K,Dayalan D,Thirumurugan K,Acharya KK

    更新日期:2020-03-01 00:00:00

  • Automatic signal extraction, prioritizing and filtering approaches in detecting post-marketing cardiovascular events associated with targeted cancer drugs from the FDA Adverse Event Reporting System (FAERS).

    abstract:OBJECTIVE:Targeted drugs dramatically improve the treatment outcomes in cancer patients; however, these innovative drugs are often associated with unexpectedly high cardiovascular toxicity. Currently, cardiovascular safety represents both a challenging issue for drug developers, regulators, researchers, and clinicians ...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2013.10.008

    authors: Xu R,Wang Q

    更新日期:2014-02-01 00:00:00

  • Quality assurance of chemical ingredient classification for the National Drug File - Reference Terminology.

    abstract::The National Drug File - Reference Terminology (NDF-RT) is a large and complex drug terminology consisting of several classification hierarchies on top of an extensive collection of drug concepts. These hierarchies provide important information about clinical drugs, e.g., their chemical ingredients, mechanisms of acti...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2017.07.013

    authors: Zheng L,Yumak H,Chen L,Ochs C,Geller J,Kapusnik-Uner J,Perl Y

    更新日期:2017-09-01 00:00:00

  • LGscore: A method to identify disease-related genes using biological literature and Google data.

    abstract::Since the genome project in 1990s, a number of studies associated with genes have been conducted and researchers have confirmed that genes are involved in disease. For this reason, the identification of the relationships between diseases and genes is important in biology. We propose a method called LGscore, which iden...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2015.01.003

    authors: Kim J,Kim H,Yoon Y,Park S

    更新日期:2015-04-01 00:00:00

  • Methodological variations in lagged regression for detecting physiologic drug effects in EHR data.

    abstract::We studied how lagged linear regression can be used to detect the physiologic effects of drugs from data in the electronic health record (EHR). We systematically examined the effect of methodological variations ((i) time series construction, (ii) temporal parameterization, (iii) intra-subject normalization, (iv) diffe...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2018.08.014

    authors: Levine ME,Albers DJ,Hripcsak G

    更新日期:2018-10-01 00:00:00

  • Concept and implementation of a study dashboard module for a continuous monitoring of trial recruitment and documentation.

    abstract:BACKGROUND:The difficulty of managing patient recruitment and documentation for clinical trials prompts a demand for instruments for closely monitoring these critical but unpredictable processes. Increasingly adopted Electronic Data Capture (EDC) applications provide novel opportunities to reutilize stored information ...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2016.10.010

    authors: Toddenroth D,Sivagnanasundaram J,Prokosch HU,Ganslandt T

    更新日期:2016-12-01 00:00:00

  • Medical diagnosis of atherosclerosis from Carotid Artery Doppler Signals using principal component analysis (PCA), k-NN based weighting pre-processing and Artificial Immune Recognition System (AIRS).

    abstract::In this study, we proposed a new medical diagnosis system based on principal component analysis (PCA), k-NN based weighting pre-processing, and Artificial Immune Recognition System (AIRS) for diagnosis of atherosclerosis from Carotid Artery Doppler Signals. The suggested system consists of four stages. First, in the f...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2007.04.001

    authors: Latifoğlu F,Polat K,Kara S,Güneş S

    更新日期:2008-02-01 00:00:00

  • Comparison between passive vision-based system and a wearable inertial-based system for estimating temporal gait parameters related to the GAITRite electronic walkway.

    abstract::Quantitative gait analysis allows clinicians to assess the inherent gait variability over time which is a functional marker to aid in the diagnosis of disabilities or diseases such as frailty, the onset of cognitive decline and neurodegenerative diseases, among others. However, despite the accuracy achieved by the cur...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2016.07.009

    authors: González I,López-Nava IH,Fontecha J,Muñoz-Meléndez A,Pérez-SanPablo AI,Quiñones-Urióstegui I

    更新日期:2016-08-01 00:00:00

  • A hybrid of whale optimization and late acceptance hill climbing based imputation to enhance classification performance in electronic health records.

    abstract::Electronic health records (EHR) are a major source of information in biomedical informatics. Yet, missing values are prominent characteristics of EHR. Prediction on dataset with missing values results in inaccurate inferences. Nearest neighbour imputation based on lazy learning approach is a proven technique for missi...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2019.103190

    authors: Nagarajan G,Dhinesh Babu LD

    更新日期:2019-06-01 00:00:00

  • Mining association language patterns using a distributional semantic model for negative life event classification.

    abstract:PURPOSE:Negative life events, such as the death of a family member, an argument with a spouse or the loss of a job, play an important role in triggering depressive episodes. Therefore, it is worthwhile to develop psychiatric services that can automatically identify such events. This study describes the use of associati...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2011.01.006

    authors: Yu LC,Chan CL,Lin CC,Lin IC

    更新日期:2011-08-01 00:00:00

  • Unstructured medical image query using big data - An epilepsy case study.

    abstract::Big data technologies are critical to the medical field which requires new frameworks to leverage them. Such frameworks would benefit medical experts to test hypotheses by querying huge volumes of unstructured medical data to provide better patient care. The objective of this work is to implement and examine the feasi...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2015.12.005

    authors: Istephan S,Siadat MR

    更新日期:2016-02-01 00:00:00

  • Using natural language processing to extract mammographic findings.

    abstract:OBJECTIVE:Structured data on mammographic findings are difficult to obtain without manual review. We developed and evaluated a rule-based natural language processing (NLP) system to extract mammographic findings from free-text mammography reports. MATERIALS AND METHODS:The NLP system extracted four mammographic findin...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2015.01.010

    authors: Gao H,Aiello Bowles EJ,Carrell D,Buist DS

    更新日期:2015-04-01 00:00:00

  • Evaluating performance of early warning indices to predict physiological instabilities.

    abstract::Patient monitoring algorithms that analyze multiple features from physiological signals can produce an index that serves as a predictive or prognostic measure for a specific critical health event or physiological instability. Classical detection metrics such as sensitivity and positive predictive value are often used ...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2017.09.008

    authors: Scully CG,Daluwatte C

    更新日期:2017-11-01 00:00:00

  • A graph-based approach to auditing RxNorm.

    abstract:OBJECTIVES:RxNorm is a standardized nomenclature for clinical drug entities developed by the National Library of Medicine. In this paper, we audit relations in RxNorm for consistency and completeness through the systematic analysis of the graph of its concepts and relationships. METHODS:The representation of multi-ing...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2009.04.004

    authors: Bodenreider O,Peters LB

    更新日期:2009-06-01 00:00:00

  • Modelling and analysing the dynamics of disease progression from cross-sectional studies.

    abstract::Clinical trials are typically conducted over a population within a defined time period in order to illuminate certain characteristics of a health issue or disease process. These cross-sectional studies give us a 'snapshot' of this disease process over a large number of people but do not allow us to model the temporal ...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2012.11.003

    authors: Li Y,Swift S,Tucker A

    更新日期:2013-04-01 00:00:00

  • A Bayesian system to detect and characterize overlapping outbreaks.

    abstract::Outbreaks of infectious diseases such as influenza are a significant threat to human health. Because there are different strains of influenza which can cause independent outbreaks, and influenza can affect demographic groups at different rates and times, there is a need to recognize and characterize multiple outbreaks...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2017.08.003

    authors: Aronis JM,Millett NE,Wagner MM,Tsui F,Ye Y,Ferraro JP,Haug PJ,Gesteland PH,Cooper GF

    更新日期:2017-09-01 00:00:00

  • Developing EHR-driven heart failure risk prediction models using CPXR(Log) with the probabilistic loss function.

    abstract::Computerized survival prediction in healthcare identifying the risk of disease mortality, helps healthcare providers to effectively manage their patients by providing appropriate treatment options. In this study, we propose to apply a classification algorithm, Contrast Pattern Aided Logistic Regression (CPXR(Log)) wit...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2016.01.009

    authors: Taslimitehrani V,Dong G,Pereira NL,Panahiazar M,Pathak J

    更新日期:2016-04-01 00:00:00