Combining glass box and black box evaluations in the identification of heart disease risk factors and their temporal relations from clinical records.

Abstract:

BACKGROUND: The determination of risk factors and their temporal relations in natural language patient records is a complex task that was addressed in the i2b2/UTHealth 2014 shared task. Most systems decomposed it into two sub-tasks implemented by two components: entity detection and temporal relation determination. Task-level ("black box") evaluation is relevant for the final clinical application, whereas component-level ("glass box") evaluation is important for system development and progress monitoring. Unfortunately, because entity representation and temporal relation representation interact, glass box and black box evaluation cannot straightforwardly be performed at the same time in the setting of the i2b2/UTHealth 2014 task, which makes it difficult to assess reliably the relative performance of the individual components and their contribution to the overall task.

OBJECTIVE: To identify obstacles and propose methods to cope with this difficulty, and to illustrate them through experiments on the i2b2/UTHealth 2014 dataset.

METHODS: We outline several solutions to this problem and examine their requirements in terms of adequacy for component-level and task-level evaluation and of changes to the task framework. We select the solution that requires the fewest modifications to the i2b2 evaluation framework and illustrate it with our system. This system identifies risk factor mentions with a CRF system complemented by hand-designed patterns, identifies and normalizes temporal expressions through a tailored version of the HeidelTime tool, and determines the temporal relations of each risk factor with a One Rule classifier.

RESULTS: Giving a fixed value to the temporal attribute in risk factor identification proved to be the simplest way to evaluate the risk factor detection component independently. This evaluation method enabled us to identify the risk factor detection component as the main contributor to the false negatives and false positives of the overall system. This led us to redirect further effort to this component, focusing on medication detection, with gains of 7 to 20 recall points and of 3 to 6 F-measure points depending on the corpus and evaluation.

CONCLUSION: We proposed a method to achieve a clearer glass box evaluation of risk factor detection and temporal relation detection in clinical texts, which can provide an example to help system development in similar tasks. This glass box evaluation was instrumental in refocusing our efforts and obtaining substantial improvements in risk factor detection.
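
The idea of giving the temporal attribute a fixed value can be illustrated with a small sketch. It is not the authors' code or the i2b2 evaluation script: the `Annotation` record, the helper names, the toy data, and the choice to fix the attribute on both the reference and the system side are all assumptions made for illustration. Scoring the full annotations gives a task-level (black box) figure, while scoring after the temporal attribute has been overwritten with a constant isolates risk factor detection (glass box).

```python
from typing import Iterable, NamedTuple, Set, Tuple


class Annotation(NamedTuple):
    """Hypothetical document-level annotation: a risk factor plus its temporal attribute."""
    doc_id: str
    risk_factor: str  # illustrative labels such as "DIABETES" or "MEDICATION"
    time: str         # illustrative values such as "before DCT", "during DCT"


def prf(gold: Iterable[Annotation], predicted: Iterable[Annotation]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall and F-measure using exact matching."""
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    denominator = precision + recall
    f_measure = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f_measure


def fix_time(annotations: Iterable[Annotation], value: str = "FIXED") -> Set[Annotation]:
    """Overwrite the temporal attribute with a constant so it cannot cause mismatches."""
    return {a._replace(time=value) for a in annotations}


# Toy data (invented): one temporal error, one correct item, one detection error.
GOLD = [
    Annotation("d1", "DIABETES", "before DCT"),
    Annotation("d1", "MEDICATION", "during DCT"),
    Annotation("d2", "HYPERTENSION", "after DCT"),
]
PREDICTED = [
    Annotation("d1", "DIABETES", "during DCT"),    # right factor, wrong time
    Annotation("d1", "MEDICATION", "during DCT"),  # fully correct
    Annotation("d2", "SMOKER", "before DCT"),      # wrong factor
]

black_box = prf(GOLD, PREDICTED)                      # whole task
glass_box = prf(fix_time(GOLD), fix_time(PREDICTED))  # detection only

print("black box P/R/F: %.2f %.2f %.2f" % black_box)
print("glass box P/R/F: %.2f %.2f %.2f" % glass_box)
```

In this toy case the gap between the two scores is attributable to the temporal component, and the errors that remain in the glass box score are attributable to detection.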

journal_name

J Biomed Inform

authors

Grouin C,Moriceau V,Zweigenbaum P

doi

10.1016/j.jbi.2015.06.014

subject

Has Abstract

pub_date

2015-12-01 00:00:00

pages

S133-42

eissn

1532-0480

issn

1532-0464

pii

S1532-0464(15)00124-0

journal_volume

58 Suppl

pub_type

Journal Article
  • A flexible approach to distributed data anonymization.

    abstract::Sensitive biomedical data is often collected from distributed sources, involving different information systems and different organizational units. Local autonomy and legal reasons lead to the need of privacy preserving integration concepts. In this article, we focus on anonymization, which plays an important role for ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2013.12.002

    authors: Kohlmayer F,Prasser F,Eckert C,Kuhn KA

    update_date:2014-08-01 00:00:00

  • Decision-making model for early diagnosis of congestive heart failure using rough set and decision tree approaches.

    abstract::The accurate diagnosis of heart failure in emergency room patients is quite important, but can also be quite difficult due to our insufficient understanding of the characteristics of heart failure. The purpose of this study is to design a decision-making model that provides critical factors and knowledge associated wi...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2012.04.013

    authors: Son CS,Kim YN,Kim HS,Park HS,Kim MS

    update_date:2012-10-01 00:00:00

  • Knowledge-based personalized search engine for the Web-based Human Musculoskeletal System Resources (HMSR) in biomechanics.

    abstract::Human musculoskeletal system resources of the human body are valuable for learning and medical purposes. Internet-based information from conventional search engines such as Google or Yahoo cannot respond to the need for useful, accurate, reliable and good-quality human musculoskeletal resources related to medical ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2012.11.001

    authors: Dao TT,Hoang TN,Ta XH,Tho MC

    update_date:2013-02-01 00:00:00

  • Comparison of orthogonal NLP methods for clinical phenotyping and assessment of bone scan utilization among prostate cancer patients.

    abstract:OBJECTIVE:Clinical care guidelines recommend that newly diagnosed prostate cancer patients at high risk for metastatic spread receive a bone scan prior to treatment and that low risk patients not receive it. The objective was to develop an automated pipeline to interrogate heterogeneous data to evaluate the use of bone...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2019.103184

    authors: Coquet J,Bozkurt S,Kan KM,Ferrari MK,Blayney DW,Brooks JD,Hernandez-Boussard T

    update_date:2019-06-01 00:00:00

  • Analysis of eligibility criteria representation in industry-standard clinical trial protocols.

    abstract::Previous research on standardization of eligibility criteria and its feasibility has traditionally been conducted on clinical trial protocols from ClinicalTrials.gov (CT). The portability and use of such standardization for full-text industry-standard protocols has not been studied in-depth. Towards this end, in this ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2013.06.001

    authors: Bhattacharya S,Cantor MN

    update_date:2013-10-01 00:00:00

  • Relevance feedback for enhancing content based image retrieval and automatic prediction of semantic image features: Application to bone tumor radiographs.

    abstract:BACKGROUND:The majority of current medical CBIR systems perform retrieval based only on "imaging signatures" generated by extracting pixel-level quantitative features, and only rarely has a feedback mechanism been incorporated to improve retrieval performance. In addition, current medical CBIR approaches do not routine...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2018.07.002

    authors: Banerjee I,Kurtz C,Devorah AE,Do B,Rubin DL,Beaulieu CF

    update_date:2018-08-01 00:00:00

  • Digital subtraction angiogram registration method with local distortion vectors to decrease motion artifact.

    abstract::We have been investigating registration methods for improving digital subtraction angiography (DSA) images to extract blood vessels by reducing artifacts due to body motion, such as rotation, contraction, and dilation. In this paper, we propose a new and simple DSA registration algorithm with local distortion vectors ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1006/jbin.2001.1018

    authors: Hiroshima K,Funakami R,Hiratsuka K,Nishino J,Odaka T,Ogura H,Fukushima T,Nishimoto Y,Tanaka M,Ito H,Yamamoto K

    update_date:2001-06-01 00:00:00

  • FRR: fair remote retrieval of outsourced private medical records in electronic health networks.

    abstract::Cloud computing is emerging as the next-generation IT architecture. However, cloud computing also raises security and privacy concerns since the users have no physical control over the outsourced data. This paper focuses on fairly retrieving encrypted private medical records outsourced to remote untrusted cloud server...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2014.02.008

    authors: Wang H,Wu Q,Qin B,Domingo-Ferrer J

    update_date:2014-08-01 00:00:00

  • PharmActa: Personalized pharmaceutical care eHealth platform for patients and pharmacists.

    abstract::Community pharmacists are critically placed in the patient care chain being an extended frontline within primary healthcare networks across Europe. They are trained to ensure safe and effective medication use, a crucial and responsible role, extending beyond the common misconception limited to just providing timely ac...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2019.103336

    authors: Spanakis M,Sfakianakis S,Kallergis G,Spanakis EG,Sakkalis V

    update_date:2019-12-01 00:00:00

  • TRAK ontology: defining standard care for the rehabilitation of knee conditions.

    abstract::In this paper we discuss the design and development of TRAK (Taxonomy for RehAbilitation of Knee conditions), an ontology that formally models information relevant for the rehabilitation of knee conditions. TRAK provides the framework that can be used to collect coded data in sufficient detail to support epidemiologic...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2013.04.009

    authors: Button K,van Deursen RW,Soldatova L,Spasić I

    update_date:2013-08-01 00:00:00

  • Multi-faceted informatics system for digitising and streamlining the reablement care model.

    abstract::Reablement is a new paradigm for increasing independence in the home amongst the ageing population, and it remains a challenge to design an optimal electronic system to streamline and integrate reablement into current healthcare infrastructure. Furthermore, given reablement requires collaboration with a range of organisati...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2015.05.008

    authors: Bond RR,Mulvenna MD,Finlay DD,Martin S

    update_date:2015-08-01 00:00:00

  • A machine-learned knowledge discovery method for associating complex phenotypes with complex genotypes. Application to pain.

    abstract:BACKGROUND:The association of genotyping information with common traits is not satisfactorily solved. One of the most complex traits is pain and association studies have failed so far to provide reproducible predictions of pain phenotypes from genotypes in the general population despite a well-established genetic basis...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2013.07.010

    authors: Lötsch J,Ultsch A

    update_date:2013-10-01 00:00:00

  • A deep learning approach for predicting the quality of online health expert question-answering services.

    abstract::Recently, online health expert question-answering (HQA) services (systems) have attracted more and more health consumers to ask health-related questions everywhere at any time due to the convenience and effectiveness. However, the quality of answers in existing HQA systems varies in different situations. It is signifi...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2017.06.012

    authors: Hu Z,Zhang Z,Yang H,Chen Q,Zuo D

    update_date:2017-07-01 00:00:00

  • Personal discovery in diabetes self-management: Discovering cause and effect using self-monitoring data.

    abstract:OBJECTIVE:To outline new design directions for informatics solutions that facilitate personal discovery with self-monitoring data. We investigate this question in the context of chronic disease self-management with the focus on type 2 diabetes. MATERIALS AND METHODS:We conducted an observational qualitative study of d...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2017.09.013

    authors: Mamykina L,Heitkemper EM,Smaldone AM,Kukafka R,Cole-Lewis HJ,Davidson PG,Mynatt ED,Cassells A,Tobin JN,Hripcsak G

    update_date:2017-12-01 00:00:00

  • Applying semantic-based probabilistic context-free grammar to medical language processing--a preliminary study on parsing medication sentences.

    abstract::Semantic-based sublanguage grammars have been shown to be an efficient method for medical language processing. However, given the complexity of the medical domain, parsers using such grammars inevitably encounter ambiguous sentences, which could be interpreted by different groups of production rules and consequently r...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2011.08.009

    authors: Xu H,AbdelRahman S,Lu Y,Denny JC,Doan S

    update_date:2011-12-01 00:00:00

  • MeSHing molecular sequences and clinical trials: a feasibility study.

    abstract::The centralized and public availability of molecular sequence and clinical trial data presents an opportunity to identify potentially valuable linkages across the bench-to-bedside "T1" translational barrier. In this study, we sought to leverage keyword metadata (Medical Subject Heading [MeSH] descriptors) to infer rel...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2009.10.003

    authors: Chen ES,Sarkar IN

    update_date:2010-06-01 00:00:00

  • MorphoCol: An ontology-based knowledgebase for the characterisation of clinically significant bacterial colony morphologies.

    abstract:BACKGROUND:One of the major concerns of the biomedical community is the increasing prevalence of antimicrobial resistant microorganisms. Recent findings show that the diversification of colony morphology may be indicative of the expression of virulence factors and increased resistance to antibiotic therapeutics. To tra...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2015.03.007

    authors: Sousa AM,Pereira MO,Lourenço A

    update_date:2015-06-01 00:00:00

  • Feature selection techniques for maximum entropy based biomedical named entity recognition.

    abstract::Named entity recognition is an extremely important and fundamental task of biomedical text mining. Biomedical named entities include mentions of proteins, genes, DNA, RNA, etc which often have complex structures, but it is challenging to identify and classify such entities. Machine learning methods like CRF, MEMM and ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2008.12.012

    authors: Saha SK,Sarkar S,Mitra P

    update_date:2009-10-01 00:00:00

  • Integrated network analysis of symptom clusters across disease conditions.

    abstract::Identifying the symptom clusters (two or more related symptoms) with shared underlying molecular mechanisms has been a vital analysis task to promote the symptom science and precision health. Related studies have applied the clustering algorithms (e.g. k-means, latent class model) to detect the symptom clusters mostly...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2020.103482

    authors: Lu K,Yang K,Niyongabo E,Shu Z,Wang J,Chang K,Zou Q,Jiang J,Jia C,Liu B,Zhou X

    update_date:2020-07-01 00:00:00

  • Developing EHR-driven heart failure risk prediction models using CPXR(Log) with the probabilistic loss function.

    abstract::Computerized survival prediction in healthcare identifying the risk of disease mortality, helps healthcare providers to effectively manage their patients by providing appropriate treatment options. In this study, we propose to apply a classification algorithm, Contrast Pattern Aided Logistic Regression (CPXR(Log)) wit...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2016.01.009

    authors: Taslimitehrani V,Dong G,Pereira NL,Panahiazar M,Pathak J

    update_date:2016-04-01 00:00:00

  • An automated reasoning framework for translational research.

    abstract::In this paper we propose a novel approach to the design and implementation of knowledge-based decision support systems for translational research, specifically tailored to the analysis and interpretation of data from high-throughput experiments. Our approach is based on a general epistemological model of the scientifi...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2009.11.005

    authors: Riva A,Nuzzo A,Stefanelli M,Bellazzi R

    update_date:2010-06-01 00:00:00

  • glUCModel: a monitoring and modeling system for chronic diseases applied to diabetes.

    abstract::Chronic patients must carry out a rigorous control of diverse factors in their lives. Diet, sport activity, medical analysis or blood glucose levels are some of them. This is a hard task, because some of these controls are performed very often, for instance some diabetics measure their glucose levels several times eve...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2013.12.015

    authors: Hidalgo JI,Maqueda E,Risco-Martín JL,Cuesta-Infante A,Colmenar JM,Nobel J

    update_date:2014-04-01 00:00:00

  • A genetic algorithm-support vector machine method with parameter optimization for selecting the tag SNPs.

    abstract::SNPs (Single Nucleotide Polymorphisms) include millions of changes in human genome, and therefore, are promising tools for disease-gene association studies. However, this kind of studies is constrained by the high expense of genotyping millions of SNPs. For this reason, it is required to obtain a suitable subset of SN...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2012.12.002

    authors: Ilhan I,Tezel G

    update_date:2013-04-01 00:00:00

  • Matching patients to clinical trials using semantically enriched document representation.

    abstract::Recruiting eligible patients for clinical trials is crucial for reliably answering specific questions about medical interventions and evaluation. However, clinical trial recruitment is a bottleneck in clinical research and drug development. Our goal is to provide an approach towards automating this manual and time-con...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2020.103406

    authors: Hassanzadeh H,Karimi S,Nguyen A

    update_date:2020-05-01 00:00:00

  • On the reproducibility of results of pathway analysis in genome-wide expression studies of colorectal cancers.

    abstract::One of the major problems in genomics and medicine is the identification of gene networks and pathways deregulated in complex and polygenic diseases, like cancer. In this paper, we address the problem of assessing the variability of results of pathways analysis identified in different and independent genome wide expre...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2009.09.005

    authors: Maglietta R,Distaso A,Piepoli A,Palumbo O,Carella M,D'Addabbo A,Mukherjee S,Ancona N

    update_date:2010-06-01 00:00:00

  • Automatic signal extraction, prioritizing and filtering approaches in detecting post-marketing cardiovascular events associated with targeted cancer drugs from the FDA Adverse Event Reporting System (FAERS).

    abstract:OBJECTIVE:Targeted drugs dramatically improve the treatment outcomes in cancer patients; however, these innovative drugs are often associated with unexpectedly high cardiovascular toxicity. Currently, cardiovascular safety represents both a challenging issue for drug developers, regulators, researchers, and clinicians ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2013.10.008

    authors: Xu R,Wang Q

    update_date:2014-02-01 00:00:00

  • Unstructured medical image query using big data - An epilepsy case study.

    abstract::Big data technologies are critical to the medical field which requires new frameworks to leverage them. Such frameworks would benefit medical experts to test hypotheses by querying huge volumes of unstructured medical data to provide better patient care. The objective of this work is to implement and examine the feasi...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2015.12.005

    authors: Istephan S,Siadat MR

    update_date:2016-02-01 00:00:00

  • Software tools for simultaneous data visualization and T cell epitopes and disorder prediction in proteins.

    abstract::We have developed EpDis and MassPred, extendable open source software tools that support bioinformatic research and enable parallel use of different methods for the prediction of T cell epitopes, disorder and disordered binding regions and hydropathy calculation. These tools offer a semi-automated installation of chos...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2016.01.016

    authors: Jandrlić DR,Lazić GM,Mitić NS,Pavlović MD

    update_date:2016-04-01 00:00:00

  • Using natural language processing to extract mammographic findings.

    abstract:OBJECTIVE:Structured data on mammographic findings are difficult to obtain without manual review. We developed and evaluated a rule-based natural language processing (NLP) system to extract mammographic findings from free-text mammography reports. MATERIALS AND METHODS:The NLP system extracted four mammographic findin...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2015.01.010

    authors: Gao H,Aiello Bowles EJ,Carrell D,Buist DS

    update_date:2015-04-01 00:00:00

  • Consensus and Meta-analysis regulatory networks for combining multiple microarray gene expression datasets.

    abstract::Microarray data is a key source of experimental data for modelling gene regulatory interactions from expression levels. With the rapid increase of publicly available microarray data comes the opportunity to produce regulatory network models based on multiple datasets. Such models are potentially more robust with great...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article, Meta-Analysis

    doi:10.1016/j.jbi.2008.01.011

    authors: Steele E,Tucker A

    update_date:2008-12-01 00:00:00