Combining glass box and black box evaluations in the identification of heart disease risk factors and their temporal relations from clinical records.

Abstract:

BACKGROUND: The determination of risk factors and their temporal relations in natural language patient records is a complex task, which was addressed in the i2b2/UTHealth 2014 shared task. Most participating systems broadly decomposed it into two sub-tasks implemented by two components: entity detection and temporal relation determination. Task-level ("black box") evaluation is relevant for the final clinical application, whereas component-level ("glass box") evaluation is important for system development and progress monitoring. Unfortunately, because entity representation and temporal relation representation interact, glass box and black box evaluation cannot be performed straightforwardly at the same time in the setting of the i2b2/UTHealth 2014 task, which makes it difficult to reliably assess the relative performance of the individual components and their contribution to the overall task.

OBJECTIVE: To identify obstacles and propose methods to cope with this difficulty, and to illustrate them through experiments on the i2b2/UTHealth 2014 dataset.

METHODS: We outline several solutions to this problem and examine their requirements in terms of adequacy for component-level and task-level evaluation and of changes to the task framework. We select the solution that requires the fewest modifications to the i2b2 evaluation framework and illustrate it with our system. This system identifies risk factor mentions with a CRF system complemented by hand-designed patterns, identifies and normalizes temporal expressions through a tailored version of the Heideltime tool, and determines the temporal relations of each risk factor with a One Rule classifier.

RESULTS: Giving a fixed value to the temporal attribute in risk factor identification proved to be the simplest way to evaluate the risk factor detection component independently. This evaluation method enabled us to identify the risk factor detection component as the main contributor to the false negatives and false positives of the global system. This led us to redirect further effort to this component, focusing on medication detection, with gains of 7 to 20 recall points and of 3 to 6 F-measure points depending on the corpus and evaluation.

CONCLUSION: We proposed a method to achieve a clearer glass box evaluation of risk factor detection and temporal relation detection in clinical texts, which can serve as an example to help system development in similar tasks. This glass box evaluation was instrumental in refocusing our efforts and obtaining substantial improvements in risk factor detection.
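The idea of evaluating risk factor detection independently by fixing the temporal attribute can be illustrated with a minimal sketch. It is not the authors' evaluation script or the official i2b2 scorer: the tuple layout `(doc_id, risk_factor, time)`, the constant `FIXED_TIME`, the helper names, and the toy annotations are all assumptions made for illustration.

```python
from typing import Set, Tuple

# Hypothetical annotation layout: (document id, risk factor label, time attribute).
Annotation = Tuple[str, str, str]

# Placeholder value given to every temporal attribute in glass box mode (assumed name).
FIXED_TIME = "FIXED"


def neutralize_time(annotations: Set[Annotation]) -> Set[Annotation]:
    """Replace each time attribute with one fixed value.

    Annotations that differ only in their time attribute collapse into one,
    so only the presence of the risk factor in the document is scored.
    """
    return {(doc, factor, FIXED_TIME) for doc, factor, _ in annotations}


def prf(gold: Set[Annotation], pred: Set[Annotation]) -> Tuple[float, float, float]:
    """Precision, recall and F-measure over exact-match annotation tuples."""
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy data, invented for this sketch.
gold = {
    ("d1", "hypertension", "before"),
    ("d1", "diabetes", "during"),
    ("d2", "medication", "after"),
}
pred = {
    ("d1", "hypertension", "during"),  # right risk factor, wrong time
    ("d1", "diabetes", "during"),      # fully correct
    ("d2", "obesity", "during"),       # wrong risk factor
}

# Black box: risk factor and time must both match.
black_box = prf(gold, pred)
# Glass box for risk factor detection: time is neutralized on both sides.
glass_box = prf(neutralize_time(gold), neutralize_time(pred))

print("black box P/R/F:", black_box)
print("glass box P/R/F:", glass_box)
```

In this toy case the gap between the two scores comes from the temporal relation component alone, while the errors remaining in the glass box score come from risk factor detection.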

journal_name

J Biomed Inform

authors

Grouin C,Moriceau V,Zweigenbaum P

doi

10.1016/j.jbi.2015.06.014

subject

Has Abstract

pub_date

2015-12-01 00:00:00

pages

S133-42

eissn

1532-0480

issn

1532-0464

pii

S1532-0464(15)00124-0

journal_volume

58 Suppl

pub_type

Journal Article
  • Feature selection techniques for maximum entropy based biomedical named entity recognition.

    abstract::Named entity recognition is an extremely important and fundamental task of biomedical text mining. Biomedical named entities include mentions of proteins, genes, DNA, RNA, etc which often have complex structures, but it is challenging to identify and classify such entities. Machine learning methods like CRF, MEMM and ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2008.12.012

    authors: Saha SK,Sarkar S,Mitra P

    update date:2009-10-01 00:00:00

  • The Counterfactual χ-GAN: Finding comparable cohorts in observational health data.

    abstract::Causal inference often relies on the counterfactual framework, which requires that treatment assignment is independent of the outcome, known as strong ignorability. Approaches to enforcing strong ignorability in causal analyses of observational data include weighting and matching methods. Effect estimates, such as the...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2020.103515

    authors: Averitt AJ,Vanitchanant N,Ranganath R,Perotte AJ

    update date:2020-09-01 00:00:00

  • Classification of forensic autopsy reports through conceptual graph-based document representation model.

    abstract::Text categorization has been used extensively in recent years to classify plain-text clinical reports. This study employs text categorization techniques for the classification of open narrative forensic autopsy reports. One of the key steps in text classification is document representation. In document representation,...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2018.04.013

    authors: Mujtaba G,Shuib L,Raj RG,Rajandram R,Shaikh K,Al-Garadi MA

    update date:2018-06-01 00:00:00

  • Combining automatic table classification and relationship extraction in extracting anticancer drug-side effect pairs from full-text articles.

    abstract::Anticancer drug-associated side effect knowledge often exists in multiple heterogeneous and complementary data sources. A comprehensive anticancer drug-side effect (drug-SE) relationship knowledge base is important for computation-based drug target discovery, drug toxicity prediction and drug repositioning. In this s...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2014.10.002

    authors: Xu R,Wang Q

    update date:2015-02-01 00:00:00

  • Modeling association detection in order to discover compounds to inhibit oral cancer.

    abstract::In the past, algorithms exploiting varying semantics in interactions between biological objects such as genes and diseases have been used in bioinformatics to uncover latent relationships within biological datasets. In this paper, we consider the algorithm Medusa in parallel with binary classification in order to find...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2018.07.005

    authors: Vittal S,Karthikeyan G

    update date:2018-08-01 00:00:00

  • A knowledge-based system to find over-the-counter medicines for self-medication.

    abstract::This study developed a medicine query system based on Semantic Web and open data especially for self-medication users to search over-the-counter (OTC) medicines. Most existing medicine query systems are based on keyword searches. If users are uncertain about the exact search words, these query systems do not offer eff...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2020.103504

    authors: Sung HY,Chi YL

    update date:2020-08-01 00:00:00

  • Vaidurya: a multiple-ontology, concept-based, context-sensitive clinical-guideline search engine.

    abstract::We designed and implemented a generic search engine (Vaidurya), as part of our Digital clinical-Guideline Library (DeGeL) framework. Two search methods were implemented in addition to full-text search: (1) concept-based search, which relies on pre-indexing the guidelines in a clinically meaningful fashion, and (2) con...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2008.07.003

    authors: Moskovitch R,Shahar Y

    update date:2009-02-01 00:00:00

  • Colorado Care Tablet: the design of an interoperable Personal Health Application to help older adults with multimorbidity manage their medications.

    abstract::Medication errors are common and cause serious health issues during care transitions, particularly for older adults with multiple chronic conditions. In this paper, we discuss the design and evaluation of the Colorado Care Tablet, a Personal Health Application (PHA) that helps older adults and their lay caregivers man...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2010.05.007

    authors: Siek KA,Ross SE,Khan DU,Haverhals LM,Cali SR,Meyers J

    update date:2010-10-01 00:00:00

  • Specifying computer-based counseling systems in health care: a new approach to user-interface and interaction design.

    abstract::Computer-based counseling systems in health care play an important role in the toolset available for medical doctors to inform, motivate and challenge their patients according to a well-defined therapeutic goal. The design, development and implementation of such systems require close collaboration between users, i.e. ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2008.10.005

    authors: Herzberg D,Marsden N,Kübler P,Leonhardt C,Thomanek S,Jung H,Becker A

    update date:2009-04-01 00:00:00

  • PharmActa: Personalized pharmaceutical care eHealth platform for patients and pharmacists.

    abstract::Community pharmacists are critically placed in the patient care chain being an extended frontline within primary healthcare networks across Europe. They are trained to ensure safe and effective medication use, a crucial and responsible role, extending beyond the common misconception limited to just providing timely ac...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2019.103336

    authors: Spanakis M,Sfakianakis S,Kallergis G,Spanakis EG,Sakkalis V

    update date:2019-12-01 00:00:00

  • Desiderata for domain reference ontologies in biomedicine.

    abstract::Domain reference ontologies represent knowledge about a particular part of the world in a way that is independent from specific objectives, through a theory of the domain. An example of reference ontology in biomedical informatics is the Foundational Model of Anatomy (FMA), an ontology of anatomy that covers the entir...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2005.09.002

    authors: Burgun A

    update date:2006-06-01 00:00:00

  • Patient empowerment for cancer patients through a novel ICT infrastructure.

    abstract::As a result of recent advances in cancer research and "precision medicine" approaches, i.e. the idea of treating each patient with the right drug at the right time, more and more cancer patients are being cured, or might have to cope with a life with cancer. For many people, cancer survival today means living with a c...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2019.103342

    authors: Kondylakis H,Bucur A,Crico C,Dong F,Graf N,Hoffman S,Koumakis L,Manenti A,Marias K,Mazzocco K,Pravettoni G,Renzi C,Schera F,Triberti S,Tsiknakis M,Kiefer S

    update date:2020-01-01 00:00:00

  • DiseaSE: A biomedical text analytics system for disease symptom extraction and characterization.

    abstract::Due to increasing volume and unstructured nature of the scientific literatures in biomedical domain, most of the information embedded within them remain untapped. This paper presents a biomedical text analytics system, DiseaSE (Disease Symptom Extraction), to identify and extract disease symptoms and their association...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2019.103324

    authors: Abulaish M,Parwez MA,Jahiruddin

    update date:2019-12-01 00:00:00

  • Knowledge-based personalized search engine for the Web-based Human Musculoskeletal System Resources (HMSR) in biomechanics.

    abstract::Human musculoskeletal system resources of the human body are valuable for learning and medical purposes. Internet-based information from conventional search engines such as Google or Yahoo cannot respond to the need for useful, accurate, reliable and good-quality human musculoskeletal resources related to medical ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2012.11.001

    authors: Dao TT,Hoang TN,Ta XH,Tho MC

    update date:2013-02-01 00:00:00

  • Homology assessment and molecular sequence alignment.

    abstract::Hypotheses of homology are the basis of phylogenetic analysis. All character data are considered to be equivalent regardless of the source of those characters. Putative homology statements are designated based on observations of similarity. Pairwise sequence alignment using the Needleman-Wunsch algorithm is the basis ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article, Review

    doi:10.1016/j.jbi.2005.11.005

    authors: Phillips AJ

    update date:2006-02-01 00:00:00

  • Multi-step ahead meningitis case forecasting based on decomposition and multi-objective optimization methods.

    abstract::Epidemiological time series forecasting plays an important role in public health systems, due to its ability to allow managers to develop strategic planning to avoid possible epidemics. In this paper, a hybrid learning framework is developed to forecast multi-step-ahead (one, two, and three-month-ahead) meningitis cas...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2020.103575

    authors: Ribeiro MHDM,Mariani VC,Coelho LDS

    update date:2020-11-01 00:00:00

  • Inductive creation of an annotation schema for manually indexing clinical conditions from emergency department reports.

    abstract::Evaluating automated indexing applications requires comparing automatically indexed terms against manual reference standard annotations. However, there are no standard guidelines for determining which words from a textual document to include in manual annotations, and the vague task can result in substantial variation...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2005.06.004

    authors: Chapman WW,Dowling JN

    update date:2006-04-01 00:00:00

  • Prediction of influenza vaccination outcome by neural networks and logistic regression.

    abstract::The major challenge in influenza vaccination is to predict vaccine efficacy. The purpose of this study was to design a model to enable successful prediction of the outcome of influenza vaccination based on real historical medical data. A non-linear neural network approach was used, and its performance compared to logi...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2010.04.011

    authors: Trtica-Majnaric L,Zekic-Susac M,Sarlija N,Vitale B

    update date:2010-10-01 00:00:00

  • Chronic disease modeling and simulation software.

    abstract::Computers allow describing the progress of a disease using computerized models. These models allow aggregating expert and clinical information to allow researchers and decision makers to forecast disease progression. To make this forecast reliable, good models and therefore good modeling tools are required. This paper...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2010.06.003

    authors: Barhak J,Isaman DJ,Ye W,Lee D

    update date:2010-10-01 00:00:00

  • Unleashing genotypes in epidemiology - A novel method for managing high throughput information.

    abstract::The large amounts of data generated when high-throughput genotyping methods are used in large-scale epidemiological studies (>10,000 participants) present an enormous challenge to researchers in terms of structured data management. In order to face these challenges, a system has been designed and implemented where gen...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2009.07.005

    authors: Olund G,Brinne A,Lindqvist P,Litton JE

    update date:2009-12-01 00:00:00

  • Explorative data analysis techniques and unsupervised clustering methods to support clinical assessment of Chronic Obstructive Pulmonary Disease (COPD) phenotypes.

    abstract::Chronic Obstructive Pulmonary Disease (COPD) is the fourth leading cause of death worldwide and represents one of the major causes of chronic morbidity. Cigarette smoking is the most important risk factor for COPD. In these patients, the airflow limitation is caused by a mixture of small airways disease and parenchyma...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2009.05.008

    authors: Paoletti M,Camiciottoli G,Meoni E,Bigazzi F,Cestelli L,Pistolesi M,Marchesi C

    update date:2009-12-01 00:00:00

  • Evaluation of relational and NoSQL database architectures to manage genomic annotations.

    abstract::While the adoption of next generation sequencing has rapidly expanded, the informatics infrastructure used to manage the data generated by this technology has not kept pace. Historically, relational databases have provided much of the framework for data storage and retrieval. Newer technologies based on NoSQL architec...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2016.10.015

    authors: Schulz WL,Nelson BG,Felker DK,Durant TJS,Torres R

    update date:2016-12-01 00:00:00

  • Semi-supervised medical entity recognition: A study on Spanish and Swedish clinical corpora.

    abstract:OBJECTIVE:The goal of this study is to investigate entity recognition within Electronic Health Records (EHRs) focusing on Spanish and Swedish. Of particular importance is a robust representation of the entities. In our case, we utilized unsupervised methods to generate such representations. METHODS:The significance of...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2017.05.009

    authors: Pérez A,Weegar R,Casillas A,Gojenola K,Oronoz M,Dalianis H

    update date:2017-07-01 00:00:00

  • Annotating longitudinal clinical narratives for de-identification: The 2014 i2b2/UTHealth corpus.

    abstract::The 2014 i2b2/UTHealth natural language processing shared task featured a track focused on the de-identification of longitudinal medical records. For this track, we de-identified a set of 1304 longitudinal medical records describing 296 patients. This corpus was de-identified under a broad interpretation of the HIPAA ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2015.07.020

    authors: Stubbs A,Uzuner Ö

    update date:2015-12-01 00:00:00

  • Serum cancer biomarker discovery through analysis of gene expression data sets across multiple tumor and normal tissues.

    abstract::The development of convenient serum bioassays for cancer screening, diagnosis, prognosis, and monitoring of treatment is one of top priorities in cancer research community. Although numerous biomarker candidates have been generated by applying high-throughput technologies such as transcriptomics, proteomics, and metab...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2011.08.010

    authors: Jin H,Lee HC,Park SS,Jeong YS,Kim SY

    update date:2011-12-01 00:00:00

  • Identifying complexity in infectious diseases inpatient settings: An observation study.

    abstract:BACKGROUND:Understanding complexity in healthcare has the potential to reduce decision and treatment uncertainty. Therefore, identifying both patient and task complexity may offer better task allocation and design recommendation for next-generation health information technology system design. OBJECTIVE:To identify spe...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2016.10.018

    authors: Roosan D,Weir C,Samore M,Jones M,Rahman M,Stoddard GJ,Del Fiol G

    update date:2017-07-01 00:00:00

  • Concept and implementation of a study dashboard module for a continuous monitoring of trial recruitment and documentation.

    abstract:BACKGROUND:The difficulty of managing patient recruitment and documentation for clinical trials prompts a demand for instruments for closely monitoring these critical but unpredictable processes. Increasingly adopted Electronic Data Capture (EDC) applications provide novel opportunities to reutilize stored information ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2016.10.010

    authors: Toddenroth D,Sivagnanasundaram J,Prokosch HU,Ganslandt T

    update date:2016-12-01 00:00:00

  • Enhancing phylogeography by improving geographical information from GenBank.

    abstract::Phylogeography is a field that focuses on the geographical lineages of species such as vertebrates or viruses. Here, geographical data, such as location of a species or viral host is as important as the sequence information extracted from the species. Together, this information can help illustrate the migration of the...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2011.06.005

    authors: Scotch M,Sarkar IN,Mei C,Leaman R,Cheung KH,Ortiz P,Singraur A,Gonzalez G

    update date:2011-12-01 00:00:00

  • 3D interactive surgical visualization system using mobile spatial information acquisition and autostereoscopic display.

    abstract::Three-dimensional (3D) visualization of preoperative and intraoperative medical information becomes more and more important in minimally invasive surgery. We develop a 3D interactive surgical visualization system using mobile spatial information acquisition and autostereoscopic display for surgeons to observe surgical...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2017.05.014

    authors: Fan Z,Weng Y,Chen G,Liao H

    update date:2017-07-01 00:00:00

  • Unstructured medical image query using big data - An epilepsy case study.

    abstract::Big data technologies are critical to the medical field which requires new frameworks to leverage them. Such frameworks would benefit medical experts to test hypotheses by querying huge volumes of unstructured medical data to provide better patient care. The objective of this work is to implement and examine the feasi...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2015.12.005

    authors: Istephan S,Siadat MR

    update date:2016-02-01 00:00:00