A clustering approach for detecting implausible observation values in electronic health records data.

Abstract:

BACKGROUND: Identifying implausible clinical observations (e.g., laboratory test and vital sign values) in Electronic Health Record (EHR) data using rule-based procedures is challenging. Anomaly/outlier detection methods can be applied as an alternative algorithmic approach to flagging such implausible values in EHRs. METHODS: The primary objectives of this research were to develop and test an unsupervised clustering-based anomaly/outlier detection approach for detecting implausible observations in EHR data as an alternative algorithmic solution to the existing procedures. Our approach is built upon two underlying hypotheses: (i) when there is a large number of observations, implausible records should be sparse, and therefore (ii) if these data are clustered properly, clusters with sparse populations should represent implausible observations. To test these hypotheses, we applied an unsupervised clustering algorithm to EHR observation data on 50 laboratory tests from Partners HealthCare. We tested different specifications of the clustering approach and computed confusion matrix indices against a set of silver-standard plausibility thresholds. We compared the results from the proposed approach with conventional anomaly detection (CAD) approaches, including standard deviation and Mahalanobis distance. RESULTS: We found that the clustering approach produced results with exceptional specificity and high sensitivity. Compared with the conventional anomaly detection approaches, our proposed clustering approach resulted in a significantly smaller number of false positive cases.
CONCLUSION: Our contributions include (i) a clustering approach for identifying implausible EHR observations, (ii) evidence that implausible observations are sparse in EHR laboratory test results, (iii) a parallel implementation of the clustering approach on the i2b2 star schema, and (iv) a set of silver-standard plausibility thresholds for 50 laboratory tests that can be used in other studies for validation. The proposed algorithmic solution can augment human decisions to improve data quality. Therefore, a workflow is needed to complement the algorithm and initiate the actions required to improve the quality of the data.
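
The two hypotheses in the abstract can be illustrated with a minimal sketch: cluster the values of one laboratory test, then flag every observation that falls in a sparsely populated cluster. The abstract does not name the clustering algorithm or its settings, so the plain 1-D k-means below is a stand-in, and the names `kmeans_1d`, `flag_sparse_clusters`, `k`, and `min_share`, as well as the cutoff of 1000 used as a "silver-standard" threshold in the usage note, are hypothetical choices for illustration only.

```python
from typing import List, Sequence, Tuple


def kmeans_1d(values: Sequence[float], k: int, iters: int = 50) -> List[int]:
    """Plain Lloyd's k-means on scalar values; returns one cluster label per value."""
    lo, hi = min(values), max(values)
    # Deterministic start (an assumption): centers evenly spaced over the observed range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assign each value to its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            # An empty cluster keeps its previous center.
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def flag_sparse_clusters(values: Sequence[float], k: int = 4,
                         min_share: float = 0.01) -> List[bool]:
    """Flag observations whose cluster holds less than `min_share` of all records."""
    labels = kmeans_1d(values, k)
    n = len(values)
    counts = [labels.count(c) for c in range(k)]
    sparse = {c for c in range(k) if 0 < counts[c] / n < min_share}
    return [lab in sparse for lab in labels]


def sensitivity_specificity(flags: Sequence[bool],
                            truth: Sequence[bool]) -> Tuple[float, float]:
    """Confusion-matrix indices of the flags against reference labels."""
    tp = sum(f and t for f, t in zip(flags, truth))
    tn = sum((not f) and (not t) for f, t in zip(flags, truth))
    fp = sum(f and (not t) for f, t in zip(flags, truth))
    fn = sum((not f) and t for f, t in zip(flags, truth))
    sens = tp / (tp + fn) if tp + fn else float("nan")
    spec = tn / (tn + fp) if tn + fp else float("nan")
    return sens, spec
```

As a usage example on synthetic data, `flag_sparse_clusters([70 + (i % 50) for i in range(500)] + [9000.0, 9999.0])` flags only the last two values, and comparing those flags with `[v > 1000 for v in values]` through `sensitivity_specificity` gives the kind of confusion-matrix indices the abstract refers to. Real laboratory data would need per-test tuning of `k` and `min_share`.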

authors

Estiri H,Klann JG,Murphy SN

doi

10.1186/s12911-019-0852-6

subject

Has Abstract

pub_date

2019-07-23 00:00:00

pages

142

issue

1

issn

1472-6947

pii

10.1186/s12911-019-0852-6

journal_volume

19

pub_type

Journal Article
  • Shared decision making in surgery: a scoping review of patient and surgeon preferences.

    abstract:BACKGROUND:Many suggest that shared decision-making (SDM) is the most effective approach to clinical counseling. It is unclear if this applies to surgical decision-making-especially regarding urgent, highly-morbid operations. In this scoping review, we identify articles that address patient and surgeon preferences towa...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article, Review

    doi:10.1186/s12911-020-01211-0

    authors: Shinkunas LA,Klipowicz CJ,Carlisle EM

    update_date:2020-08-12 00:00:00

  • The impact of health information technologies on quality improvement methodologies' efficiency, throughput and financial outcomes: a retrospective observational study.

    abstract:BACKGROUND:To evaluate whether or not the utilization of Health Information Technologies (HITs) in Quality Improvement Methodologies (QIMs) has impacts on QIMs' efficiency, throughput and financial outcomes at healthcare organizations and physician practices in the United States. METHODS:This is a retrospective observ...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-016-0395-z

    authors: AlHazme RH,Haque SS,Wiggin H,Rana AM

    update_date:2016-12-05 00:00:00

  • Legal assessment tool (LAT): an interactive tool to address privacy and data protection issues for data sharing.

    abstract:BACKGROUND:In an unprecedented rate data in the life sciences is generated and stored in many different databases. An ever increasing part of this data is human health data and therefore falls under data protected by legal regulations. As part of the BioMedBridges project, which created infrastructures that connect mor...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-016-0325-0

    authors: Kuchinke W,Krauth C,Bergmann R,Karakoyun T,Woollard A,Schluender I,Braasch B,Eckert M,Ohmann C

    update_date:2016-07-07 00:00:00

  • Information sharing across generations and environments (InfoSAGE): study design and methodology protocol.

    abstract:BACKGROUND:Longevity creates increasing care needs for healthcare providers and family caregivers. Increasingly, the burden of care falls to one primary caregiver, increasing stress and reducing health outcomes. Additionally, little has been published on adults', over the age of 75, preferences in the development of he...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-018-0697-4

    authors: Quintana Y,Crotty B,Fahy D,Lipsitz L,Davis RB,Safran C

    update_date:2018-11-20 00:00:00

  • Healthcare professional acceptance of telemonitoring for chronic care patients in primary care.

    abstract:BACKGROUND:A pilot experimentation of a telemonitoring system for chronic care patients is conducted in the Bilbao Primary Care Health Region (Basque Country, Spain). It seems important to understand the factors related to healthcare professionals' acceptance of this new technology in order to inform its extension to t...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/1472-6947-12-139

    authors: Asua J,Orruño E,Reviriego E,Gagnon MP

    update_date:2012-11-30 00:00:00

  • Quality of human-computer interaction--results of a national usability survey of hospital-IT in Germany.

    abstract:BACKGROUND:Due to the increasing functionality of medical information systems, it is hard to imagine day to day work in hospitals without IT support. Therefore, the design of dialogues between humans and information systems is one of the most important issues to be addressed in health care. This survey presents an anal...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/1472-6947-11-69

    authors: Bundschuh BB,Majeed RW,Bürkle T,Kuhn K,Sax U,Seggewies C,Vosseler C,Röhrig R

    update_date:2011-11-09 00:00:00

  • Web-based online resources about adverse interactions or side effects associated with complementary and alternative medicine: a systematic review, summarization and quality assessment.

    abstract:BACKGROUND:Given an increased global prevalence of complementary and alternative medicine (CAM) use, healthcare providers commonly seek CAM-related health information online. Numerous online resources containing CAM-specific information exist, many of which are readily available/accessible, containing information share...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-020-01298-5

    authors: Ng JY,Munford V,Thakar H

    update_date:2020-11-09 00:00:00

  • A simulation study comparing aberration detection algorithms for syndromic surveillance.

    abstract:BACKGROUND:The usefulness of syndromic surveillance for early outbreak detection depends in part on effective statistical aberration detection. However, few published studies have compared different detection algorithms on identical data. In the largest simulation study conducted to date, we compared the performance of...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/1472-6947-7-6

    authors: Jackson ML,Baer A,Painter I,Duchin J

    update_date:2007-03-01 00:00:00

  • Novel methodology to measure pre-procedure antimicrobial prophylaxis: integrating text searches with structured data from the Veterans Health Administration's electronic medical record.

    abstract:BACKGROUND:Antimicrobial prophylaxis is an evidence-proven strategy for reducing procedure-related infections; however, measuring this key quality metric typically requires manual review, due to the way antimicrobial prophylaxis is documented in the electronic medical record (EMR). Our objective was to electronically m...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-020-1031-5

    authors: Mull HJ,Stolzmann K,Kalver E,Shin MH,Schweizer ML,Asundi A,Mehta P,Stanislawski M,Branch-Elliman W

    update_date:2020-01-30 00:00:00

  • Implementation of machine learning algorithms to create diabetic patient re-admission profiles.

    abstract:BACKGROUND:Machine learning is a branch of Artificial Intelligence that is concerned with the design and development of algorithms, and it enables today's computers to have the property of learning. Machine learning is gradually growing and becoming a critical approach in many domains such as health, education, and bus...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-019-0990-x

    authors: Alloghani M,Aljaaf A,Hussain A,Baker T,Mustafina J,Al-Jumeily D,Khalaf M

    update_date:2019-12-12 00:00:00

  • Automatic schizophrenic discrimination on fNIRS by using complex brain network analysis and SVM.

    abstract:BACKGROUND:Schizophrenia is a kind of serious mental illness. Due to the lack of an objective physiological data supporting and a unified data analysis method, doctors can only rely on the subjective experience of the data to distinguish normal people and patients, which easily lead to misdiagnosis. In recent years, fu...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-017-0559-5

    authors: Song H,Chen L,Gao R,Bogdan IIM,Yang J,Wang S,Dong W,Quan W,Dang W,Yu X

    update_date:2017-12-20 00:00:00

  • Use of and attitudes to a hospital information system by medical secretaries, nurses and physicians deprived of the paper-based medical record: a case report.

    abstract:BACKGROUND:Most hospitals keep and update their paper-based medical records after introducing an electronic medical record or a hospital information system (HIS). This case report describes a HIS in a hospital where the paper-based medical records are scanned and eliminated. To evaluate the HIS comprehensively, the per...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/1472-6947-4-18

    authors: Laerum H,Karlsen TH,Faxvaag A

    update_date:2004-10-16 00:00:00

  • Real-time automatic hospital-wide surveillance of nosocomial infections and outbreaks in a large Chinese tertiary hospital.

    abstract:BACKGROUND:We aimed to develop a real-time nosocomial infection surveillance system (RT-NISS) to monitor all nosocomial infections (NIs) and outbreaks in a Chinese comprehensive hospital to better prevent and control NIs. METHODS:The screening algorithm used in RT-NISS included microbiological reports, antibiotic usag...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/1472-6947-14-9

    authors: Du M,Xing Y,Suo J,Liu B,Jia N,Huo R,Chen C,Liu Y

    update_date:2014-01-29 00:00:00

  • Evaluating performance of health care facilities at meeting HIV-indicator reporting requirements in Kenya: an application of K-means clustering algorithm.

    abstract:BACKGROUND:The ability to report complete, accurate and timely data by HIV care providers and other entities is a key aspect in monitoring trends in HIV prevention, treatment and care, hence contributing to its eradication. In many low-middle-income-countries (LMICs), aggregate HIV data reporting is done through the Di...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-020-01367-9

    authors: Gesicho MB,Were MC,Babic A

    update_date:2021-01-06 00:00:00

  • The development of a web- and a print-based decision aid for prostate cancer screening.

    abstract:BACKGROUND:Whether early detection and treatment of prostate cancer (PCa) will reduce disease-related mortality remains uncertain. As a result, tools are needed to facilitate informed decision making. While there have been several decision aids (DAs) developed and tested, very few have included an exercise to help men ...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article, Randomized Controlled Trial

    doi:10.1186/1472-6947-10-12

    authors: Dorfman CS,Williams RM,Kassan EC,Red SN,Dawson DL,Tuong W,Parker ER,Ohene-Frempong J,Davis KM,Krist AH,Woolf SH,Schwartz MD,Fishman MB,Cole C,Taylor KL

    update_date:2010-03-03 00:00:00

  • Investigating the satisfaction level of physicians in regards to implementing medical Picture Archiving and Communication System (PACS).

    abstract:BACKGROUND:User satisfaction with PACS is considered as one of the important criteria for assessing success in using PACS. The objective of this study was to determine the level of user satisfaction with PACS and to compare its functional features with traditional film-based systems. METHODS:This study was conducted i...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-020-01203-0

    authors: Abbasi R,Sadeqi Jabali M,Khajouei R,Tadayon H

    update_date:2020-08-05 00:00:00

  • A Severe Acute Respiratory Syndrome Extranet: supporting local communication and information dissemination.

    abstract:BACKGROUND:The objective of this study was to explore the use and perceptions of a local Severe Acute Respiratory Syndrome (SARS) Extranet and its potential to support future information and communication applications. The SARS Extranet was a single, managed electronic and limited access system to manage local, provinc...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/1472-6947-5-17

    authors: Valaitis RK,Akhtar-Danesh N,Kealey CM,Brunetti GM,Thomas H

    update_date:2005-06-20 00:00:00

  • Caregivers' role in using a personal electronic health record: a qualitative study of cancer patients and caregivers in Germany.

    abstract:BACKGROUND:Particularly in the context of severe diseases like cancer, many patients wish to include caregivers in the planning of treatment and care. Many caregivers like to be involved but feel insufficiently enabled. This study aimed at providing insight into patients' and caregivers' perspectives on caregivers' rol...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-020-01172-4

    authors: Weis A,Pohlmann S,Poss-Doering R,Strauss B,Ullrich C,Hofmann H,Ose D,Winkler EC,Szecsenyi J,Wensing M

    update_date:2020-07-13 00:00:00

  • Establishing a baseline for literature mining human genetic variants and their relationships to disease cohorts.

    abstract:BACKGROUND:The Variome corpus, a small collection of published articles about inherited colorectal cancer, includes annotations of 11 entity types and 13 relation types related to the curation of the relationship between genetic variation and disease. Due to the richness of these annotations, the corpus provides a good...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-016-0294-3

    authors: Verspoor KM,Heo GE,Kang KY,Song M

    update_date:2016-07-18 00:00:00

  • Leveraging healthcare utilization to explore outcomes from musculoskeletal disorders: methodology for defining relevant variables from a health services data repository.

    abstract:BACKGROUND:Large healthcare databases, with their ability to collect many variables from daily medical practice, greatly enable health services research. These longitudinal databases provide large cohorts and longitudinal time frames, allowing for highly pragmatic assessment of healthcare delivery. The purpose of this ...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-018-0588-8

    authors: Rhon DI,Clewley D,Young JL,Sissel CD,Cook CE

    update_date:2018-01-31 00:00:00

  • Initial development of Supportive care Assessment, Prioritization and Recommendations for Kids (SPARK), a symptom screening and management application.

    abstract:BACKGROUND:We developed Supportive care Prioritization, Assessment and Recommendations for Kids (SPARK), a web-based application designed to facilitate symptom screening by children receiving cancer treatments and access to supportive care clinical practice guidelines primarily by healthcare providers. The objective wa...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-018-0715-6

    authors: Cook S,Vettese E,Soman D,Hyslop S,Kuczynski S,Spiegler B,Davis H,Duong N,Ou Wai S,Golabek R,Golabek P,Antoszek-Rallo A,Schechter T,Lee Dupuis L,Sung L

    update_date:2019-01-10 00:00:00

  • Development of a web-based patient decision aid for initiating disease modifying anti-rheumatic drugs using user-centred design methods.

    abstract:BACKGROUND:A main element of patient-centred care, Patient Decision Aids (PtDAs) facilitate shared decision-making (SDM). A recent update of the International Patient Decision Aids Standards (IPDAS) emphasised patient involvement during PtDA development, but omitted a methodology for doing so. This article reports on t...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-017-0433-5

    authors: Nota I,Drossaert CHC,Melissant HC,Taal E,Vonkeman HE,Haagsma CJ,van de Laar MAFJ

    update_date:2017-04-26 00:00:00

  • Violence detection explanation via semantic roles embeddings.

    abstract:BACKGROUND:Emergency room reports pose specific challenges to natural language processing techniques. In this setting, violence episodes on women, elderly and children are often under-reported. Categorizing textual descriptions as containing violence-related injuries (V) vs. non-violence-related injuries (NV) is thus a...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-020-01237-4

    authors: Mensa E,Colla D,Dalmasso M,Giustini M,Mamo C,Pitidis A,Radicioni DP

    update_date:2020-10-15 00:00:00

  • A cohort study of a tailored web intervention for preconception care.

    abstract:BACKGROUND:Preconception care may be an efficacious tool to reduce risk factors for adverse pregnancy outcomes that are associated with lifestyles and health status before pregnancy. We conducted a web-based cohort study in Italian women planning a pregnancy to assess whether a tailored web intervention may change know...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/1472-6947-14-33

    authors: Agricola E,Pandolfi E,Gonfiantini MV,Gesualdo F,Romano M,Carloni E,Mastroiacovo P,Tozzi AE

    update_date:2014-04-15 00:00:00

  • Surveillance of dengue vectors using spatio-temporal Bayesian modeling.

    abstract:BACKGROUND:At present, dengue control focuses on reducing the density of the primary vector for the disease, Aedes aegypti, which is the only vulnerable link in the chain of transmission. The use of new approaches for dengue entomological surveillance is extremely important, since present methods are inefficient. With ...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-015-0219-6

    authors: C Costa AC,Codeço CT,Honório NA,Pereira GR,N Pinheiro CF,Nobre AA

    update_date:2015-11-13 00:00:00

  • AliClu - Temporal sequence alignment for clustering longitudinal clinical data.

    abstract:BACKGROUND:Patient stratification is a critical task in clinical decision making since it can allow physicians to choose treatments in a personalized way. Given the increasing availability of electronic medical records (EMRs) with longitudinal data, one crucial problem is how to efficiently cluster the patients based o...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-019-1013-7

    authors: Rama K,Canhão H,Carvalho AM,Vinga S

    update_date:2019-12-30 00:00:00

  • Stratification of coronary artery disease patients for revascularization procedure based on estimating adverse effects.

    abstract:BACKGROUND:Percutaneous coronary intervention (PCI) is the most commonly performed treatment for coronary atherosclerosis. It is associated with a higher incidence of repeat revascularization procedures compared to coronary artery bypass grafting surgery. Recent results indicate that PCI is only cost-effective for a su...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-015-0131-0

    authors: Pölsterl S,Singh M,Katouzian A,Navab N,Kastrati A,Ladic L,Kamen A

    update_date:2015-02-14 00:00:00

  • Utility-preserving anonymization for health data publishing.

    abstract:BACKGROUND:Publishing raw electronic health records (EHRs) may be considered as a breach of the privacy of individuals because they usually contain sensitive information. A common practice for the privacy-preserving data publishing is to anonymize the data before publishing, and thus satisfy privacy models such as k-an...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-017-0499-0

    authors: Lee H,Kim S,Kim JW,Chung YD

    update_date:2017-07-11 00:00:00

  • The validity of synthetic clinical data: a validation study of a leading synthetic data generator (Synthea) using clinical quality measures.

    abstract:BACKGROUND:Clinical data synthesis aims at generating realistic data for healthcare research, system implementation and training. It protects patient confidentiality, deepens our understanding of the complexity in healthcare, and is a promising tool for situations where real world data is difficult to obtain or unneces...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-019-0793-0

    authors: Chen J,Chun D,Patel M,Chiang E,James J

    update_date:2019-03-14 00:00:00

  • A method for managing re-identification risk from small geographic areas in Canada.

    abstract:BACKGROUND:A common disclosure control practice for health datasets is to identify small geographic areas and either suppress records from these small areas or aggregate them into larger ones. A recent study provided a method for deciding when an area is too small based on the uniqueness criterion. The uniqueness crite...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/1472-6947-10-18

    authors: El Emam K,Brown A,AbdelMalik P,Neisa A,Walker M,Bottomley J,Roffey T

    update_date:2010-04-02 00:00:00