The validity of synthetic clinical data: a validation study of a leading synthetic data generator (Synthea) using clinical quality measures.

Abstract:

BACKGROUND: Clinical data synthesis aims to generate realistic data for healthcare research, system implementation, and training. It protects patient confidentiality, deepens our understanding of the complexity of healthcare, and is a promising tool for situations where real-world data are difficult to obtain or unnecessary. However, its validity has not been fully examined, and no previous study has validated it from the perspective of healthcare quality, a critical aspect of a healthcare system. This study fills this gap by calculating clinical quality measures using synthetic data.

METHODS: We examined Synthea, an open-source, well-documented synthetic data generator that incorporates the key advancements in this emerging technique. We selected a representative 1.2-million Massachusetts patient cohort generated by Synthea. Four quality measures, Colorectal Cancer Screening, Chronic Obstructive Pulmonary Disease (COPD) 30-Day Mortality, Rate of Complications after Hip/Knee Replacement, and Controlling High Blood Pressure, were selected based on clinical significance. Calculated rates were then compared with publicly reported rates based on real-world data from Massachusetts and the United States.

RESULTS: Of the total Synthea Massachusetts population (n = 1,193,439), 394,476 were eligible for the "colorectal cancer screening" quality measure, and 248,433 (63%) were considered compliant, compared with publicly reported Massachusetts and national rates of 77.3% and 69.8%, respectively. Of the 409 eligible patients, 0.7% died within 30 days after COPD exacerbation, versus 7% reported in Massachusetts and 8% nationally. Using an expanded logic, this rate increased to 5.7%. No Synthea residents had complications after Hip/Knee Replacement (Massachusetts: 2.9%, national: 2.8%) or had their blood pressure controlled after being diagnosed with hypertension (Massachusetts: 74.52%, national: 69.7%). Results show that Synthea is quite reliable in modeling demographics and the probabilities of services being offered in an average healthcare setting. However, its ability to model heterogeneous health outcomes after services is limited.

CONCLUSIONS: Synthea and other synthetic patient generators do not currently model deviations in care or the outcomes that may result from them. To output a more realistic data set, we propose that synthetic data generators consider important quality measures in their logic and model when clinicians may deviate from standard practice.
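
The comparison reported above can be illustrated with a minimal Python sketch. It is not the authors' code: the helper `rate_pct` and the class `MeasureComparison` are hypothetical names introduced only for illustration, and the sketch simply re-derives the colorectal cancer screening rate from the counts in the abstract and sets each Synthea rate against the publicly reported Massachusetts and national figures quoted there.

```python
from dataclasses import dataclass


def rate_pct(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage (hypothetical helper)."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * numerator / denominator


@dataclass(frozen=True)
class MeasureComparison:
    """One quality measure; all values are percentages taken from the abstract."""
    name: str
    synthetic_pct: float  # rate calculated on the Synthea cohort
    state_pct: float      # publicly reported Massachusetts rate
    national_pct: float   # publicly reported United States rate

    def gap_vs_state(self) -> float:
        # Negative means the synthetic rate is below the Massachusetts rate.
        return self.synthetic_pct - self.state_pct

    def gap_vs_national(self) -> float:
        # Negative means the synthetic rate is below the national rate.
        return self.synthetic_pct - self.national_pct


measures = [
    # 248,433 compliant out of 394,476 eligible patients.
    MeasureComparison("Colorectal Cancer Screening",
                      rate_pct(248_433, 394_476), 77.3, 69.8),
    # Rate under the original (non-expanded) measure logic.
    MeasureComparison("COPD 30-Day Mortality", 0.7, 7.0, 8.0),
    MeasureComparison("Complications after Hip/Knee Replacement", 0.0, 2.9, 2.8),
    MeasureComparison("Controlling High Blood Pressure", 0.0, 74.52, 69.7),
]

for m in measures:
    print(f"{m.name}: synthetic {m.synthetic_pct:.1f}%, "
          f"gap vs MA {m.gap_vs_state():+.1f}, "
          f"gap vs US {m.gap_vs_national():+.1f}")
```

Percentage-point gaps are used here only as a simple descriptive comparison; the abstract does not state which statistical comparison, if any, the study applied.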

authors

Chen J,Chun D,Patel M,Chiang E,James J

doi

10.1186/s12911-019-0793-0

subject

Has Abstract

pub_date

2019-03-14 00:00:00

pages

44

issue

1

issn

1472-6947

pii

10.1186/s12911-019-0793-0

journal_volume

19

pub_type

Journal Article
  • Comparison of clinical knowledge management capabilities of commercially-available and leading internally-developed electronic health records.

    abstract:BACKGROUND:We have carried out an extensive qualitative research program focused on the barriers and facilitators to successful adoption and use of various features of advanced, state-of-the-art electronic health records (EHRs) within large, academic, teaching facilities with long-standing EHR research and development ...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/1472-6947-11-13

    authors: Sittig DF,Wright A,Meltzer S,Simonaitis L,Evans RS,Nichol WP,Ash JS,Middleton B

    update_date:2011-02-17 00:00:00

  • Implementation of automated reporting of estimated glomerular filtration rate among Veterans Affairs laboratories: a retrospective study.

    abstract:BACKGROUND:Automated reporting of estimated glomerular filtration rate (eGFR) is a recent advance in laboratory information technology (IT) that generates a measure of kidney function with chemistry laboratory results to aid early detection of chronic kidney disease (CKD). Because accurate diagnosis of CKD is critical ...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/1472-6947-12-69

    authors: Hall RK,Wang V,Jackson GL,Hammill BG,Maciejewski ML,Yano EM,Svetkey LP,Patel UD

    update_date:2012-07-12 00:00:00

  • Adaptation of a web-based, open source electronic medical record system platform to support a large study of tuberculosis epidemiology.

    abstract:BACKGROUND:In 2006, we were funded by the US National Institutes of Health to implement a study of tuberculosis epidemiology in Peru. The study required a secure information system to manage data from a target goal of 16,000 subjects who needed to be followed for at least one year. With previous experience in the devel...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/1472-6947-12-125

    authors: Fraser HS,Thomas D,Tomaylla J,Garcia N,Lecca L,Murray M,Becerra MC

    update_date:2012-11-07 00:00:00

  • User acceptance of a picture archiving and communication system (PACS) in a Saudi Arabian hospital radiology department.

    abstract:BACKGROUND:Compared with the increasingly widespread use of picture archiving and communication systems (PACSs), knowledge concerning users' acceptance of such systems is limited. Knowledge of acceptance is needed given the large (and growing) financial investment associated with the implementation of PACSs, and becaus...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/1472-6947-12-44

    authors: Aldosari B

    update_date:2012-05-28 00:00:00

  • Simulating an emergency department: the importance of modeling the interactions between physicians and delegates in a discrete event simulation.

    abstract:BACKGROUND:Computer simulation studies of the emergency department (ED) are often patient driven and consider the physician as a human resource whose primary activity is interacting directly with the patient. In many EDs, physicians supervise delegates such as residents, physician assistants and nurse practitioners eac...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/1472-6947-13-59

    authors: Lim ME,Worster A,Goeree R,Tarride JÉ

    update_date:2013-05-22 00:00:00

  • Information sharing across generations and environments (InfoSAGE): study design and methodology protocol.

    abstract:BACKGROUND:Longevity creates increasing care needs for healthcare providers and family caregivers. Increasingly, the burden of care falls to one primary caregiver, increasing stress and reducing health outcomes. Additionally, little has been published on adults', over the age of 75, preferences in the development of he...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-018-0697-4

    authors: Quintana Y,Crotty B,Fahy D,Lipsitz L,Davis RB,Safran C

    update_date:2018-11-20 00:00:00

  • Prediction of blood culture outcome using hybrid neural network model based on electronic health records.

    abstract:BACKGROUND:Blood cultures are often performed to detect patients who have a serious illness without infections and patients with bloodstream infections. Early positive blood culture prediction is important, as bloodstream infections may cause inflammation of the body, even organ failure or death. However, existing work ...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-020-1113-4

    authors: Cheng M,Zhao X,Ding X,Gao J,Xiong S,Ren Y

    update_date:2020-07-09 00:00:00

  • Duke Surgery Research Central: an open-source Web application for the improvement of compliance with research regulation.

    abstract:BACKGROUND:Although regulatory compliance in academic research is enforced by law to ensure high quality and safety to participants, its implementation is frequently hindered by cost and logistical barriers. In order to decrease these barriers, we have developed a Web-based application, Duke Surgery Research Central (D...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/1472-6947-6-32

    authors: Pietrobon R,Shah A,Kuo P,Harker M,McCready M,Butler C,Martins H,Moorman CT,Jacobs DO

    update_date:2006-07-27 00:00:00

  • Mobile phone usage in patients with type II diabetes and their intention to use it for self-management: a cross-sectional study in Iran.

    abstract:BACKGROUND:Mobile health has potential for promotion of self-management in patients with chronic diseases. This study was conducted to investigate smartphone usage in patients with type II diabetes and their intention to use it for self-management. METHODS:This cross-sectional study was conducted in 2018 with 176 pati...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-020-1038-y

    authors: Rangraz Jeddi F,Nabovati E,Hamidi R,Sharif R

    update_date:2020-02-07 00:00:00

  • Recent advances in Swedish and Spanish medical entity recognition in clinical texts using deep neural approaches.

    abstract:BACKGROUND:Text mining and natural language processing of clinical text, such as notes from electronic health records, requires specific consideration of the specialized characteristics of these texts. Deep learning methods could potentially mitigate domain specific challenges such as limited access to in-domain tools ...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-019-0981-y

    authors: Weegar R,Pérez A,Casillas A,Oronoz M

    update_date:2019-12-23 00:00:00

  • Shared decision making in surgery: a scoping review of patient and surgeon preferences.

    abstract:BACKGROUND:Many suggest that shared decision-making (SDM) is the most effective approach to clinical counseling. It is unclear if this applies to surgical decision-making-especially regarding urgent, highly-morbid operations. In this scoping review, we identify articles that address patient and surgeon preferences towa...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article, Review

    doi:10.1186/s12911-020-01211-0

    authors: Shinkunas LA,Klipowicz CJ,Carlisle EM

    update_date:2020-08-12 00:00:00

  • Data mining EEG signals in depression for their diagnostic value.

    abstract:BACKGROUND:Quantitative electroencephalogram (EEG) is one neuroimaging technique that has been shown to differentiate patients with major depressive disorder (MDD) and non-depressed healthy volunteers (HV) at the group-level, but its diagnostic potential for detecting differences at the individual level has yet to be r...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-015-0227-6

    authors: Mohammadi M,Al-Azab F,Raahemi B,Richards G,Jaworska N,Smith D,de la Salle S,Blier P,Knott V

    update_date:2015-12-23 00:00:00

  • Stratification of coronary artery disease patients for revascularization procedure based on estimating adverse effects.

    abstract:BACKGROUND:Percutaneous coronary intervention (PCI) is the most commonly performed treatment for coronary atherosclerosis. It is associated with a higher incidence of repeat revascularization procedures compared to coronary artery bypass grafting surgery. Recent results indicate that PCI is only cost-effective for a su...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-015-0131-0

    authors: Pölsterl S,Singh M,Katouzian A,Navab N,Kastrati A,Ladic L,Kamen A

    update_date:2015-02-14 00:00:00

  • Investigating the satisfaction level of physicians in regards to implementing medical Picture Archiving and Communication System (PACS).

    abstract:BACKGROUND:User satisfaction with PACS is considered as one of the important criteria for assessing success in using PACS. The objective of this study was to determine the level of user satisfaction with PACS and to compare its functional features with traditional film-based systems. METHODS:This study was conducted i...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-020-01203-0

    authors: Abbasi R,Sadeqi Jabali M,Khajouei R,Tadayon H

    update_date:2020-08-05 00:00:00

  • The effect of model selection on cost-effectiveness research: a comparison of kidney function-based microsimulation and disease grade-based microsimulation in chronic kidney disease modeling.

    abstract:BACKGROUND:Cost effectiveness research is emerging in the chronic kidney disease (CKD) research field. Especially, an individual-level state transition model (microsimulation) is widely used for these researches. Some researchers set CKD grades as discrete health states, and the transition probabilities between these s...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-018-0678-7

    authors: Hiragi S,Tamura H,Goto R,Kuroda T

    update_date:2018-11-09 00:00:00

  • An automated pipeline for analyzing medication event reports in clinical settings.

    abstract:BACKGROUND:Medication events in clinical settings are significant threats to patient safety. Analyzing and learning from the medication event reports is an important way to prevent the recurrence of these events. Currently, the analysis of medication event reports is ineffective and requires heavy workloads for clinici...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-018-0687-6

    authors: Zhou S,Kang H,Yao B,Gong Y

    update_date:2018-12-07 00:00:00

  • Recommended practices for computerized clinical decision support and knowledge management in community settings: a qualitative study.

    abstract:BACKGROUND:The purpose of this study was to identify recommended practices for computerized clinical decision support (CDS) development and implementation and for knowledge management (KM) processes in ambulatory clinics and community hospitals using commercial or locally developed systems in the U.S. METHODS:Guided b...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/1472-6947-12-6

    authors: Ash JS,Sittig DF,Guappone KP,Dykstra RH,Richardson J,Wright A,Carpenter J,McMullen C,Shapiro M,Bunce A,Middleton B

    update_date:2012-02-14 00:00:00

  • Multiple constraints compromise decision-making about implantable medical devices for individual patients: qualitative interviews with physicians.

    abstract:BACKGROUND:Little research has examined how physicians choose medical devices for treating individual patients to reveal if interventions are needed to support decision-making and reduce device-associated morbidity and mortality. This study explored factors that influence choice of implantable device from among availab...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-017-0577-3

    authors: Gagliardi AR,Ducey A,Lehoux P,Turgeon T,Kolbunik J,Ross S,Trbovich P,Easty A,Bell C,Urbach DR

    update_date:2017-12-22 00:00:00

  • Fine-grained information extraction from German transthoracic echocardiography reports.

    abstract:BACKGROUND:Information extraction techniques that get structured representations out of unstructured data make a large amount of clinically relevant information about patients accessible for semantic applications. These methods typically rely on standardized terminologies that guide this process. Many languages and cli...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-015-0215-x

    authors: Toepfer M,Corovic H,Fette G,Klügl P,Störk S,Puppe F

    update_date:2015-11-12 00:00:00

  • An app for supporting older people receiving home care - usage, aspects of health and health literacy: a quasi-experimental study.

    abstract:BACKGROUND:During the last decade, there has been an increase in studies describing use of mHealth, using smartphones with apps, in the healthcare system by a variety of populations. Despite this, few interventions including apps are targeting older people receiving home care. Developing mobile technology to its full p...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-020-01246-3

    authors: Göransson C,Wengström Y,Hälleberg-Nyman M,Langius-Eklöf A,Ziegert K,Blomberg K

    update_date:2020-09-15 00:00:00

  • Data cleaning process for HIV-indicator data extracted from DHIS2 national reporting system: a case study of Kenya.

    abstract:BACKGROUND:The District Health Information Software-2 (DHIS2) is widely used by countries for national-level aggregate reporting of health-data. To best leverage DHIS2 data for decision-making, countries need to ensure that data within their systems are of the highest quality. Comprehensive, systematic, and transparent...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-020-01315-7

    authors: Gesicho MB,Were MC,Babic A

    update_date:2020-11-13 00:00:00

  • A tool for sharing annotated research data: the "Category 0" UMLS (Unified Medical Language System) vocabularies.

    abstract:BACKGROUND:Large biomedical data sets have become increasingly important resources for medical researchers. Modern biomedical data sets are annotated with standard terms to describe the data and to support data linking between databases. The largest curated listing of biomedical terms is the National Library of Med...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/1472-6947-3-6

    authors: Berman JJ

    update_date:2003-06-16 00:00:00

  • Perception of healthcare workers on mobile app-based clinical guideline for the detection and treatment of mental health problems in primary care: a qualitative study in Nepal.

    abstract:BACKGROUND:In recent years, a significant change has taken place in the health care delivery systems due to the availability of smartphones and mobile software applications. The use of mobile technology can help to reduce a number of barriers for mental health care such as providers' workload, lack of qualified personn...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-021-01386-0

    authors: Pokhrel P,Karmacharya R,Taylor Salisbury T,Carswell K,Kohrt BA,Jordans MJD,Lempp H,Thornicroft G,Luitel NP

    update_date:2021-01-19 00:00:00

  • Algorithms for optimizing drug therapy.

    abstract:BACKGROUND:Drug therapy has become increasingly efficient, with more drugs available for treatment of an ever-growing number of conditions. Yet, drug use is reported to be sub optimal in several aspects, such as dosage, patient's adherence and outcome of therapy. The aim of the current study was to investigate the poss...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/1472-6947-4-10

    authors: Wanger P,Martin L

    update_date:2004-07-20 00:00:00

  • Assessing accuracy of an electronic provincial medication repository.

    abstract:BACKGROUND:Jurisdictional drug information systems are being implemented in many regions around the world. British Columbia, Canada has had a provincial medication dispensing record, PharmaNet, system since 1995. Little is known about how accurately PharmaNet reflects actual medication usage. METHODS:This prospective,...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article, Multicenter Study

    doi:10.1186/1472-6947-12-42

    authors: Price M,Bowen M,Lau F,Kitson N,Bardal S

    update_date:2012-05-23 00:00:00

  • The anxious wait: assessing the impact of patient accessible EHRs for breast cancer patients.

    abstract:BACKGROUND:Personal health records (PHRs) provide patients with access to personal health information (PHI) and targeted education. The use of PHRs has the potential to improve a wide range of outcomes, including empowering patients to be more active participants in their care. There are a number of widespread barriers...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/1472-6947-10-46

    authors: Wiljer D,Leonard KJ,Urowitz S,Apatu E,Massey C,Quartey NK,Catton P

    update_date:2010-09-01 00:00:00

  • Non-redundant association rules between diseases and medications: an automated method for knowledge base construction.

    abstract:BACKGROUND:The widespread use of electronic health records (EHRs) has generated massive clinical data storage. Association rules mining is a feasible technique to convert this large amount of data into usable knowledge for clinical decision making, research or billing. We present a data driven method to create a knowle...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-015-0151-9

    authors: Séverac F,Sauleau EA,Meyer N,Lefèvre H,Nisand G,Jay N

    update_date:2015-04-15 00:00:00

  • A hybrid solution for extracting structured medical information from unstructured data in medical records via a double-reading/entry system.

    abstract:BACKGROUND:Healthcare providers generate a huge amount of biomedical data stored in either legacy system (paper-based) format or electronic medical records (EMR) around the world, which are collectively referred to as big biomedical data (BBD). To realize the promise of BBD for clinical use and research, it is an essen...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-016-0357-5

    authors: Luo L,Li L,Hu J,Wang X,Hou B,Zhang T,Zhao LP

    update_date:2016-08-30 00:00:00

  • SciReader enables reading of medical content with instantaneous definitions.

    abstract:BACKGROUND:A major problem patients encounter when reading about health related issues is document interpretation, which limits reading comprehension and therefore negatively impacts health care. Currently, searching for medical definitions from an external source is time consuming, distracting, and negatively impacts ...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/1472-6947-11-4

    authors: Gradie PR,Litster M,Thomas R,Vyas J,Schiller MR

    update_date:2011-01-25 00:00:00

  • Leveraging healthcare utilization to explore outcomes from musculoskeletal disorders: methodology for defining relevant variables from a health services data repository.

    abstract:BACKGROUND:Large healthcare databases, with their ability to collect many variables from daily medical practice, greatly enable health services research. These longitudinal databases provide large cohorts and longitudinal time frames, allowing for highly pragmatic assessment of healthcare delivery. The purpose of this ...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-018-0588-8

    authors: Rhon DI,Clewley D,Young JL,Sissel CD,Cook CE

    update_date:2018-01-31 00:00:00