Customization scenarios for de-identification of clinical notes.

Abstract:

BACKGROUND: Automated machine-learning systems are able to de-identify electronic medical records, including free-text clinical notes. Use of such systems would greatly boost the amount of data available to researchers, yet their deployment has been limited due to uncertainty about their performance when applied to new datasets.

OBJECTIVE: We present practical options for clinical note de-identification, assessing performance of machine learning systems ranging from off-the-shelf to fully customized.

METHODS: We implement a state-of-the-art machine learning de-identification system, training and testing on pairs of datasets that match the deployment scenarios. We use clinical notes from two i2b2 competition corpora, the Physionet Gold Standard corpus, and parts of the MIMIC-III dataset.

RESULTS: Fully customized systems remove 97-99% of personally identifying information. Performance of off-the-shelf systems varies by dataset, with performance mostly above 90%. Providing a small labeled dataset or large unlabeled dataset allows for fine-tuning that improves performance over off-the-shelf systems.

CONCLUSION: Health organizations should be aware of the levels of customization available when selecting a de-identification deployment solution, in order to choose the one that best matches their resources and target performance level.

authors

Hartman T,Howell MD,Dean J,Hoory S,Slyper R,Laish I,Gilon O,Vainstein D,Corrado G,Chou K,Po MJ,Williams J,Ellis S,Bee G,Hassidim A,Amira R,Beryozkin G,Szpektor I,Matias Y

doi

10.1186/s12911-020-1026-2

subject

Has Abstract

pub_date

2020-01-30 00:00:00

pages

14

issue

1

issn

1472-6947

pii

10.1186/s12911-020-1026-2

journal_volume

20

pub_type

Journal Article
  • Designing a multifaceted survivorship care plan to meet the information and communication needs of breast cancer patients and their family physicians: results of a qualitative pilot study.

    abstract:BACKGROUND:Following the completion of treatment and as they enter the follow-up phase, breast cancer patients (BCPs) often recount feeling 'lost in transition', and are left with many questions concerning how their ongoing care and monitoring for recurrence will be managed. Family physicians (FPs) also frequently repo...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/1472-6947-13-76

    authors: Haq R,Heus L,Baker NA,Dastur D,Leung FH,Leung E,Li B,Vu K,Parsons JA

    update_date:2013-07-25 00:00:00

  • Considering patient safety in autonomous e-mental health systems - detecting risk situations and referring patients back to human care.

    abstract:BACKGROUND:Digital health interventions can fill gaps in mental healthcare provision. However, autonomous e-mental health (AEMH) systems also present challenges for effective risk management. To balance autonomy and safety, AEMH systems need to detect risk situations and act on these appropriately. One option is sendin...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-019-0796-x

    authors: Tielman ML,Neerincx MA,Pagliari C,Rizzo A,Brinkman WP

    update_date:2019-03-18 00:00:00

  • A predictive model for the early identification of patients at risk for a prolonged intensive care unit length of stay.

    abstract:BACKGROUND:Patients with a prolonged intensive care unit (ICU) length of stay account for a disproportionate amount of resource use. Early identification of patients at risk for a prolonged length of stay can lead to quality enhancements that reduce ICU stay. This study developed and validated a model that identifies p...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/1472-6947-10-27

    authors: Kramer AA,Zimmerman JE

    update_date:2010-05-13 00:00:00

  • A machine learning framework for accurately recognizing circular RNAs for clinical decision-supporting.

    abstract:BACKGROUND:Circular RNAs (circRNAs) are those RNA molecules that lack the poly (A) tails, which present the closed-loop structure. Recent studies emphasized that some circRNAs imply different functions from canonical transcripts, and further associated with complex diseases. Several computational methods have been deve...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-020-1117-0

    authors: Wang Y,Zhang X,Wang T,Xing J,Wu Z,Li W,Wang J

    update_date:2020-07-09 00:00:00

  • Automated extraction of Biomarker information from pathology reports.

    abstract:BACKGROUND:Pathology reports are written in free-text form, which precludes efficient data gathering. We aimed to overcome this limitation and design an automated system for extracting biomarker profiles from accumulated pathology reports. METHODS:We designed a new data model for representing biomarker knowledge. The ...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-018-0609-7

    authors: Lee J,Song HJ,Yoon E,Park SB,Park SH,Seo JW,Park P,Choi J

    update_date:2018-05-21 00:00:00

  • An end-to-end hybrid algorithm for automated medication discrepancy detection.

    abstract:BACKGROUND:In this study we implemented and developed state-of-the-art machine learning (ML) and natural language processing (NLP) technologies and built a computerized algorithm for medication reconciliation. Our specific aims are: (1) to develop a computerized algorithm for medication discrepancy detection between pa...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-015-0160-8

    authors: Li Q,Spooner SA,Kaiser M,Lingren N,Robbins J,Lingren T,Tang H,Solti I,Ni Y

    update_date:2015-05-06 00:00:00

  • Adherence to standardized assessments through a complexity-based model for categorizing rehabilitation©: design and implementation in an acute hospital.

    abstract:BACKGROUND:The use of measurement instruments has become a major issue in physical therapy, but their use in daily practice is rare. The aim of this paper is to describe adherence to standardized assessments by physical therapists using a complexity-based model for categorizing rehabilitation (CMCR) at the Clínica Alem...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-018-0590-1

    authors: Gutiérrez Panchana T,Hidalgo Cabalín V

    update_date:2018-03-12 00:00:00

  • Digital health system for personalised COPD long-term management.

    abstract:BACKGROUND:Recent telehealth studies have demonstrated minor impact on patients affected by long-term conditions. The use of technology does not guarantee the compliance required for sustained collection of high-quality symptom and physiological data. Remote monitoring alone is not sufficient for successful disease man...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article, Randomized Controlled Trial

    doi:10.1186/s12911-017-0414-8

    authors: Velardo C,Shah SA,Gibson O,Clifford G,Heneghan C,Rutter H,Farmer A,Tarassenko L,EDGE COPD Team.

    update_date:2017-02-20 00:00:00

  • Process evaluation of discharge planning implementation in healthcare using normalization process theory.

    abstract:BACKGROUND:Discharge planning is a care process that aims to secure the transfer of care for the patient at transition from home to the hospital and back home. Information exchange and collaboration between care providers are essential, but deficits are common. A wide range of initiatives to improve the discharge plann...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-016-0285-4

    authors: Nordmark S,Zingmark K,Lindberg I

    update_date:2016-04-27 00:00:00

  • A survey of factors affecting clinician acceptance of clinical decision support.

    abstract:BACKGROUND:Real-time clinical decision support (CDS) integrated into clinicians' workflow has the potential to profoundly affect the cost, quality, and safety of health care delivery. Recent reports have identified a surprisingly low acceptance rate for different types of CDS. We hypothesized that factors affecting CDS...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/1472-6947-6-6

    authors: Sittig DF,Krall MA,Dykstra RH,Russell A,Chin HL

    update_date:2006-02-01 00:00:00

  • The information imperative: to study the impact of informational discontinuity on clinical decision making among doctors.

    abstract:BACKGROUND:Informational discontinuity can have far reaching consequences like medical errors, increased re-hospitalization rates and adverse events among others. Thus the holy grail of seamless informational continuity in healthcare has been an enigma with some nations going the digital way. Digitization in healthcare...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-020-01190-2

    authors: Gowda NR,Kumar A,Arya SK,H V

    update_date:2020-07-28 00:00:00

  • Data mining EEG signals in depression for their diagnostic value.

    abstract:BACKGROUND:Quantitative electroencephalogram (EEG) is one neuroimaging technique that has been shown to differentiate patients with major depressive disorder (MDD) and non-depressed healthy volunteers (HV) at the group-level, but its diagnostic potential for detecting differences at the individual level has yet to be r...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-015-0227-6

    authors: Mohammadi M,Al-Azab F,Raahemi B,Richards G,Jaworska N,Smith D,de la Salle S,Blier P,Knott V

    update_date:2015-12-23 00:00:00

  • Web-based online resources about adverse interactions or side effects associated with complementary and alternative medicine: a systematic review, summarization and quality assessment.

    abstract:BACKGROUND:Given an increased global prevalence of complementary and alternative medicine (CAM) use, healthcare providers commonly seek CAM-related health information online. Numerous online resources containing CAM-specific information exist, many of which are readily available/accessible, containing information share...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-020-01298-5

    authors: Ng JY,Munford V,Thakar H

    update_date:2020-11-09 00:00:00

  • Design and evaluation of a mobile application to assist the self-monitoring of the chronic kidney disease in developing countries.

    abstract:BACKGROUND:The chronic kidney disease (CKD) is a worldwide critical problem, especially in developing countries. CKD patients usually begin their treatment in advanced stages, which requires dialysis and kidney transplantation, and consequently, affects mortality rates. This issue is faced by a mobile health (mHealth) ...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-018-0587-9

    authors: Sobrinho A,da Silva LD,Perkusich A,Pinheiro ME,Cunha P

    update_date:2018-01-12 00:00:00

  • A practical approach for incorporating dependence among fields in probabilistic record linkage.

    abstract:BACKGROUND:Methods for linking real-world healthcare data often use a latent class model, where the latent, or unknown, class is the true match status of candidate record-pairs. This commonly used model assumes that agreement patterns among multiple fields within a latent class are independent. When this assumption is ...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/1472-6947-13-97

    authors: Daggy JK,Xu H,Hui SL,Gamache RE,Grannis SJ

    update_date:2013-08-30 00:00:00

  • Selected articles from the Third International Workshop on Semantics-Powered Data Analytics (SEPDA 2018).

    abstract:In this editorial, we first summarize the Third International Workshop on Semantics-Powered Data Analytics (SEPDA 2018) held on December 3, 2018 in conjunction with the 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM 2018) in Madrid, Spain, and then briefly introduce five research articles i...

    journal_title:BMC medical informatics and decision making

    pub_type: Editorial

    doi:10.1186/s12911-019-0855-3

    authors: He Z,Bian J,Tao C,Zhang R

    update_date:2019-08-08 00:00:00

  • Attitudes of pediatric intensive care unit physicians towards the use of cognitive aids: a qualitative study.

    abstract:BACKGROUND:Cognitive aids are increasingly recommended in clinical practice, yet little is known about the attitudes of physicians towards these tools. METHODS:We employed a qualitative, descriptive design to explore physician attitudes towards cognitive aids in pediatric intensive care units (PICUs). Semi-structured ...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-016-0291-6

    authors: Weiss MJ,Kramer C,Tremblay S,Côté L

    update_date:2016-05-21 00:00:00

  • Comparison of machine learning techniques to predict all-cause mortality using fitness data: the Henry ford exercIse testing (FIT) project.

    abstract:BACKGROUND:Prior studies have demonstrated that cardiorespiratory fitness (CRF) is a strong marker of cardiovascular health. Machine learning (ML) can enhance the prediction of outcomes through classification techniques that classify the data into predetermined categories. The aim of this study is to present an evaluat...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-017-0566-6

    authors: Sakr S,Elshawi R,Ahmed AM,Qureshi WT,Brawner CA,Keteyian SJ,Blaha MJ,Al-Mallah MH

    update_date:2017-12-19 00:00:00

  • Visibility of medical informatics regarding bibliometric indices and databases.

    abstract:BACKGROUND:The quantitative study of the publication output (bibliometrics) deeply influences how scientific work is perceived (bibliometric visibility). Recently, new bibliometric indices and databases have been established, which may change the visibility of disciplines, institutions and individuals. This study exami...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/1472-6947-11-24

    authors: Spreckelsen C,Deserno TM,Spitzer K

    update_date:2011-04-15 00:00:00

  • Information sharing across generations and environments (InfoSAGE): study design and methodology protocol.

    abstract:BACKGROUND:Longevity creates increasing care needs for healthcare providers and family caregivers. Increasingly, the burden of care falls to one primary caregiver, increasing stress and reducing health outcomes. Additionally, little has been published on adults', over the age of 75, preferences in the development of he...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-018-0697-4

    authors: Quintana Y,Crotty B,Fahy D,Lipsitz L,Davis RB,Safran C

    update_date:2018-11-20 00:00:00

  • TASKA: A modular task management system to support health research studies.

    abstract:BACKGROUND:Many healthcare databases have been routinely collected over the past decades, to support clinical practice and administrative services. However, their secondary use for research is often hindered by restricted governance rules. Furthermore, health research studies typically involve many participants with co...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-019-0844-6

    authors: Almeida JR,Gini R,Roberto G,Rijnbeek P,Oliveira JL

    update_date:2019-07-02 00:00:00

  • Comprehensive user requirements engineering methodology for secure and interoperable health data exchange.

    abstract:BACKGROUND:Increased digitalization of healthcare comes along with the cost of cybercrime proliferation. This results to patients' and healthcare providers' skepticism to adopt Health Information Technologies (HIT). In Europe, this shortcoming hampers efficient cross-border health data exchange, which requires a holist...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-018-0664-0

    authors: Natsiavas P,Rasmussen J,Voss-Knude M,Votis Κ,Coppolino L,Campegiani P,Cano I,Marí D,Faiella G,Clemente F,Nalin M,Grivas E,Stan O,Gelenbe E,Dumortier J,Petersen J,Tzovaras D,Romano L,Komnios I,Koutkias V

    update_date:2018-10-16 00:00:00

  • Perception of healthcare workers on mobile app-based clinical guideline for the detection and treatment of mental health problems in primary care: a qualitative study in Nepal.

    abstract:BACKGROUND:In recent years, a significant change has taken place in the health care delivery systems due to the availability of smartphones and mobile software applications. The use of mobile technology can help to reduce a number of barriers for mental health care such as providers' workload, lack of qualified personn...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-021-01386-0

    authors: Pokhrel P,Karmacharya R,Taylor Salisbury T,Carswell K,Kohrt BA,Jordans MJD,Lempp H,Thornicroft G,Luitel NP

    update_date:2021-01-19 00:00:00

  • A method for managing re-identification risk from small geographic areas in Canada.

    abstract:BACKGROUND:A common disclosure control practice for health datasets is to identify small geographic areas and either suppress records from these small areas or aggregate them into larger ones. A recent study provided a method for deciding when an area is too small based on the uniqueness criterion. The uniqueness crite...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/1472-6947-10-18

    authors: El Emam K,Brown A,AbdelMalik P,Neisa A,Walker M,Bottomley J,Roffey T

    update_date:2010-04-02 00:00:00

  • Sharing longitudinal, non-biological birth cohort data: a cross-sectional analysis of parent consent preferences.

    abstract:BACKGROUND:Mandates abound to share publicly-funded research data for reuse, while data platforms continue to emerge to facilitate such reuse. Birth cohorts (BC) involve longitudinal designs, significant sample sizes and rich and deep datasets. Data sharing benefits include more analyses, greater research complexity, i...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-018-0683-x

    authors: Manhas KP,Dodd SX,Page S,Letourneau N,Adair CE,Cui X,Tough SC

    update_date:2018-11-12 00:00:00

  • Evaluating performance of health care facilities at meeting HIV-indicator reporting requirements in Kenya: an application of K-means clustering algorithm.

    abstract:BACKGROUND:The ability to report complete, accurate and timely data by HIV care providers and other entities is a key aspect in monitoring trends in HIV prevention, treatment and care, hence contributing to its eradication. In many low-middle-income-countries (LMICs), aggregate HIV data reporting is done through the Di...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-020-01367-9

    authors: Gesicho MB,Were MC,Babic A

    update_date:2021-01-06 00:00:00

  • Assessing data availability and quality within an electronic health record system through external validation against an external clinical data source.

    abstract:BACKGROUND:Approximately 20% of deaths in the US each year are attributable to smoking, yet current practices in the recording of this health risk in electronic health records (EHRs) have not led to discernable changes in health outcomes. Several groups have developed algorithms for extracting smoking behaviors from cl...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-019-0864-2

    authors: Palmer EL,Higgins J,Hassanpour S,Sargent J,Robinson CM,Doherty JA,Onega T

    update_date:2019-07-25 00:00:00

  • The FeverApp registry - ecological momentary assessment (EMA) of fever management in families regarding conformity to up-to-date recommendations.

    abstract:BACKGROUND:Fever is one of the most common symptoms of pediatric consultations and its mismanagement is a health care burden. Guidelines on fever management are incoherent and data on fever management are still missing. This study protocol describes an app-based registry to evaluate the fever management of parents. OB...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-020-01269-w

    authors: Martin D,Wachtmeister J,Ludwigs K,Jenetzky E

    update_date:2020-10-01 00:00:00

  • "Assessment of the social influence and facilitating conditions that support nurses' adoption of hospital electronic information management systems (HEIMS) in Ghana using the unified theory of acceptance and use of technology (UTAUT) model".

    abstract:BACKGROUND:Hospital electronic information management systems (HEIMS) are widely used in Ghana, and hence its performance must be carefully assessed. Nurses as clinical health personnel are the largest cluster of hospital staff and are the pillar of healthcare delivery. Therefore, they play a crucial role in the adopti...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-019-0956-z

    authors: Zhou LL,Owusu-Marfo J,Asante Antwi H,Antwi MO,Kachie ADT,Ampon-Wireko S

    update_date:2019-11-21 00:00:00

  • Internet videos and colorectal cancer in mainland China: a content analysis.

    abstract:BACKGROUND:Colorectal cancer incidence and mortality have been increasing in China and as one of the most important health problems facing the nation. Adequate dissemination of correct information about colorectal cancer could help in reducing cancer-related morbidity and mortality. This study aims to assess the comple...

    journal_title:BMC medical informatics and decision making

    pub_type: Journal Article

    doi:10.1186/s12911-018-0711-x

    authors: Zhang S,Yang Y,Yan D,Yuan B,Jiang X,Song C

    update_date:2018-12-04 00:00:00