Classification of forensic autopsy reports through conceptual graph-based document representation model.

Abstract:

Text categorization has been used extensively in recent years to classify plain-text clinical reports. This study employs text categorization techniques for the classification of open narrative forensic autopsy reports. One of the key steps in text classification is document representation. In document representation, a clinical report is transformed into a format that is suitable for classification. The traditional document representation technique for text categorization is the bag-of-words (BoW) technique. In this study, the traditional BoW technique is ineffective in classifying forensic autopsy reports because it merely extracts frequent but discriminative features from clinical reports. Moreover, this technique fails to capture word inversion, as well as word-level synonymy and polysemy, when classifying autopsy reports. Hence, the BoW technique suffers from low accuracy and low robustness unless it is improved with contextual and application-specific information. To overcome the aforementioned limitations of the BoW technique, this research aims to develop an effective conceptual graph-based document representation (CGDR) technique to classify 1500 forensic autopsy reports from four (4) manners of death (MoD) and sixteen (16) causes of death (CoD). Term-based and Systematized Nomenclature of Medicine-Clinical Terms (SNOMED CT) based conceptual features were extracted and represented through graphs. These features were then used to train a two-level text classifier. The first level classifier was responsible for predicting MoD. In addition, the second level classifier was responsible for predicting CoD using the proposed conceptual graph-based document representation technique. To demonstrate the significance of the proposed technique, its results were compared with those of six (6) state-of-the-art document representation techniques. Lastly, this study compared the effects of one-level classification and two-level classification on the experimental results.
The experimental results indicated that the CGDR technique achieved 12% to 15% improvement in accuracy compared with fully automated document representation baseline techniques. Moreover, two-level classification obtained better results compared with one-level classification. The promising results of the proposed conceptual graph-based document representation technique suggest that pathologists can adopt the proposed system as their basis for second opinion, thereby supporting them in effectively determining CoD.
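
The two-level scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: a tiny naive Bayes over plain bag-of-words counts stands in for the CGDR features, the second level is assumed to hold one CoD classifier per MoD (the abstract does not say how the levels are wired together), and the four toy reports and their labels are invented purely for illustration.

```python
import math
from collections import Counter, defaultdict


def bow(text):
    """Bag-of-words: lowercase token counts (word order is discarded)."""
    return Counter(text.lower().split())


class NaiveBayes:
    """Small multinomial naive Bayes with Laplace smoothing (illustrative stand-in)."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for doc, label in zip(docs, labels):
            counts = bow(doc)
            self.word_counts[label].update(counts)
            self.vocab.update(counts)
        return self

    def predict(self, doc):
        counts = bow(doc)
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n_docs / total_docs)
            for word, n in counts.items():
                seen = self.word_counts[label][word]
                score += n * math.log((seen + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


class TwoLevelClassifier:
    """Level 1 predicts MoD; level 2 predicts CoD among reports sharing that MoD.

    Assumption: one second-level classifier per MoD.
    """

    def fit(self, docs, mods, cods):
        self.level1 = NaiveBayes().fit(docs, mods)
        self.level2 = {}
        for mod in set(mods):
            rows = [(d, c) for d, m, c in zip(docs, mods, cods) if m == mod]
            self.level2[mod] = NaiveBayes().fit(
                [d for d, _ in rows], [c for _, c in rows]
            )
        return self

    def predict(self, doc):
        mod = self.level1.predict(doc)
        return mod, self.level2[mod].predict(doc)


# Invented toy data, not taken from the study.
docs = [
    "stab wound to chest with knife",
    "gunshot wound to head",
    "fall from height with skull fracture",
    "road traffic collision with multiple injuries",
]
mods = ["homicide", "homicide", "accident", "accident"]
cods = ["stab wound", "gunshot wound", "fall", "traffic injury"]

model = TwoLevelClassifier().fit(docs, mods, cods)
print(model.predict("knife stab wound to the chest"))
```

Because `bow` keeps only token counts, two reports containing the same words in a different order map to the same representation, which is the word-inversion limitation the abstract attributes to BoW.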

journal_name

J Biomed Inform

authors

Mujtaba G,Shuib L,Raj RG,Rajandram R,Shaikh K,Al-Garadi MA

doi

10.1016/j.jbi.2018.04.013

subject

Has Abstract

pub_date

2018-06-01 00:00:00

pages

88-105

eissn

1532-0480

issn

1532-0464

pii

S1532-0464(18)30077-7

journal_volume

82

pub_type

Journal Article
  • Risk factor detection for heart disease by applying text analytics in electronic medical records.

    abstract::In the United States, about 600,000 people die of heart disease every year. The annual cost of care services, medications, and lost productivity reportedly exceeds 108.9 billion dollars. Effective disease risk assessment is critical to prevention, care, and treatment planning. Recent advancements in text analytics hav...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2015.08.011

    authors: Torii M,Fan JW,Yang WL,Lee T,Wiley MT,Zisook DS,Huang Y

    updated:2015-12-01 00:00:00

  • A method and software framework for enriching private biomedical sources with data from public online repositories.

    abstract::Modern biomedical research relies on the semantic integration of heterogeneous data sources to find data correlations. Researchers access multiple datasets of disparate origin, and identify elements-e.g. genes, compounds, pathways-that lead to interesting correlations. Normally, they must refer to additional public da...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2016.02.004

    authors: Anguita A,García-Remesal M,Graf N,Maojo V

    updated:2016-04-01 00:00:00

  • Analysis of eligibility criteria representation in industry-standard clinical trial protocols.

    abstract::Previous research on standardization of eligibility criteria and its feasibility has traditionally been conducted on clinical trial protocols from ClinicalTrials.gov (CT). The portability and use of such standardization for full-text industry-standard protocols has not been studied in-depth. Towards this end, in this ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2013.06.001

    authors: Bhattacharya S,Cantor MN

    updated:2013-10-01 00:00:00

  • R.A.P.I.D. (Root Aggregated Prioritized Information Display): A single screen display for efficient digital triaging of medical reports.

    abstract:OBJECTIVE:The timely acknowledgement of critical patient clinical reports is vital for the delivery of safe patient care. With current EHR systems, critical reports reside on different screens. This leads to treatment delays and inefficient work flows. As a remedy, the R.A.P.I.D. (Root Aggregated Prioritized Informatio...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article, Randomized Controlled Trial

    doi:10.1016/j.jbi.2016.04.001

    authors: Ford JP,Huang L,Richards DS,Ambinder EP,Rosenberger JL

    updated:2016-06-01 00:00:00

  • Annotating longitudinal clinical narratives for de-identification: The 2014 i2b2/UTHealth corpus.

    abstract::The 2014 i2b2/UTHealth natural language processing shared task featured a track focused on the de-identification of longitudinal medical records. For this track, we de-identified a set of 1304 longitudinal medical records describing 296 patients. This corpus was de-identified under a broad interpretation of the HIPAA ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2015.07.020

    authors: Stubbs A,Uzuner Ö

    updated:2015-12-01 00:00:00

  • MorphoCol: An ontology-based knowledgebase for the characterisation of clinically significant bacterial colony morphologies.

    abstract:BACKGROUND:One of the major concerns of the biomedical community is the increasing prevalence of antimicrobial resistant microorganisms. Recent findings show that the diversification of colony morphology may be indicative of the expression of virulence factors and increased resistance to antibiotic therapeutics. To tra...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2015.03.007

    authors: Sousa AM,Pereira MO,Lourenço A

    updated:2015-06-01 00:00:00

  • A tiered approach is more cost effective than traditional pharmacist-based review for classifying computer-detected signals as adverse drug events.

    abstract:OBJECTIVE:To develop a cost-efficient method for identifying adverse drug events (ADEs) and medication errors (MEs) identified using outpatient electronic medical records within ambulatory settings. DESIGN:Comparison of sensitivity and cost of "traditional" pharmacist based approach to identifying ADEs and MEs during ...

    journal_title:Journal of biomedical informatics

    pub_type: Clinical Trial, Journal Article, Multicenter Study

    doi:10.1016/s1532-0464(03)00059-5

    authors: Hope C,Overhage JM,Seger A,Teal E,Mills V,Fiskio J,Gandhi TK,Bates DW,Murray MD

    updated:2003-02-01 00:00:00

  • A knowledge-based system to find over-the-counter medicines for self-medication.

    abstract::This study developed a medicine query system based on Semantic Web and open data especially for self-medication users to search over-the-counter (OTC) medicines. Most existing medicine query systems are based on keyword searches. If users are uncertain about the exact search words, these query systems do not offer eff...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2020.103504

    authors: Sung HY,Chi YL

    updated:2020-08-01 00:00:00

  • Automated annotation and classification of BI-RADS assessment from radiology reports.

    abstract::The Breast Imaging Reporting and Data System (BI-RADS) was developed to reduce variation in the descriptions of findings. Manual analysis of breast radiology report data is challenging but is necessary for clinical and healthcare quality assurance activities. The objective of this study is to develop a natural languag...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2017.04.011

    authors: Castro SM,Tseytlin E,Medvedeva O,Mitchell K,Visweswaran S,Bekhuis T,Jacobson RS

    updated:2017-05-01 00:00:00

  • A Bayesian system to detect and characterize overlapping outbreaks.

    abstract::Outbreaks of infectious diseases such as influenza are a significant threat to human health. Because there are different strains of influenza which can cause independent outbreaks, and influenza can affect demographic groups at different rates and times, there is a need to recognize and characterize multiple outbreaks...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2017.08.003

    authors: Aronis JM,Millett NE,Wagner MM,Tsui F,Ye Y,Ferraro JP,Haug PJ,Gesteland PH,Cooper GF

    updated:2017-09-01 00:00:00

  • Mapping high-dimensional data onto a relative distance plane--an exact method for visualizing and characterizing high-dimensional patterns.

    abstract::We introduce a distance (similarity)-based mapping for the visualization of high-dimensional patterns and their relative relationships. The mapping preserves exactly the original distances between points with respect to any two reference patterns in a special two-dimensional coordinate system, the relative distance pl...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2004.07.005

    authors: Somorjai RL,Dolenko B,Demko A,Mandelzweig M,Nikulin AE,Baumgartner R,Pizzi NJ

    updated:2004-10-01 00:00:00

  • Natural language processing systems for capturing and standardizing unstructured clinical information: A systematic review.

    abstract::We followed a systematic approach based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses to identify existing clinical natural language processing (NLP) systems that generate structured information from unstructured free text. Seven literature databases were searched with a query combining the...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article, Review

    doi:10.1016/j.jbi.2017.07.012

    authors: Kreimeyer K,Foster M,Pandey A,Arya N,Halford G,Jones SF,Forshee R,Walderhaug M,Botsis T

    updated:2017-09-01 00:00:00

  • Making sense: sensor-based investigation of clinician activities in complex critical care environments.

    abstract::In many respects, the critical care workplace resembles a paradigmatic complex system: on account of the dynamic and interactive nature of collaborative clinical work, these settings are characterized by non-linear, inter-dependent and emergent activities. Developing a comprehensive understanding of the work activitie...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2011.02.007

    authors: Kannampallil T,Li Z,Zhang M,Cohen T,Robinson DJ,Franklin A,Zhang J,Patel VL

    updated:2011-06-01 00:00:00

  • LGscore: A method to identify disease-related genes using biological literature and Google data.

    abstract::Since the genome project in 1990s, a number of studies associated with genes have been conducted and researchers have confirmed that genes are involved in disease. For this reason, the identification of the relationships between diseases and genes is important in biology. We propose a method called LGscore, which iden...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2015.01.003

    authors: Kim J,Kim H,Yoon Y,Park S

    updated:2015-04-01 00:00:00

  • NCI Thesaurus: a semantic model integrating cancer-related clinical and molecular information.

    abstract::Over the last 8 years, the National Cancer Institute (NCI) has launched a major effort to integrate molecular and clinical cancer-related information within a unified biomedical informatics framework, with controlled terminology as its foundational layer. The NCI Thesaurus is the reference terminology underpinning the...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2006.02.013

    authors: Sioutos N,de Coronado S,Haber MW,Hartel FW,Shaiu WL,Wright LW

    updated:2007-02-01 00:00:00

  • A deep learning approach for predicting the quality of online health expert question-answering services.

    abstract::Recently, online health expert question-answering (HQA) services (systems) have attracted more and more health consumers to ask health-related questions everywhere at any time due to the convenience and effectiveness. However, the quality of answers in existing HQA systems varies in different situations. It is signifi...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2017.06.012

    authors: Hu Z,Zhang Z,Yang H,Chen Q,Zuo D

    updated:2017-07-01 00:00:00

  • A private DNA motif finding algorithm.

    abstract::With the increasing availability of genomic sequence data, numerous methods have been proposed for finding DNA motifs. The discovery of DNA motifs serves a critical step in many biological applications. However, the privacy implication of DNA analysis is normally neglected in the existing methods. In this work, we pro...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2013.12.016

    authors: Chen R,Peng Y,Choi B,Xu J,Hu H

    updated:2014-08-01 00:00:00

  • A comparison of word embeddings for the biomedical natural language processing.

    abstract:BACKGROUND:Word embeddings have been prevalently used in biomedical Natural Language Processing (NLP) applications due to the ability of the vector representations being able to capture useful semantic properties and linguistic relationships between words. Different textual resources (e.g., Wikipedia and biomedical lit...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2018.09.008

    authors: Wang Y,Liu S,Afzal N,Rastegar-Mojarad M,Wang L,Shen F,Kingsbury P,Liu H

    updated:2018-11-01 00:00:00

  • Selecting significant genes by randomization test for cancer classification using gene expression data.

    abstract::Gene selection is an important task in bioinformatics studies, because the accuracy of cancer classification generally depends upon the genes that have biological relevance to the classifying problems. In this work, randomization test (RT) is used as a gene selection method for dealing with gene expression data. In th...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2013.03.009

    authors: Mao Z,Cai W,Shao X

    updated:2013-08-01 00:00:00

  • Unstructured medical image query using big data - An epilepsy case study.

    abstract::Big data technologies are critical to the medical field which requires new frameworks to leverage them. Such frameworks would benefit medical experts to test hypotheses by querying huge volumes of unstructured medical data to provide better patient care. The objective of this work is to implement and examine the feasi...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2015.12.005

    authors: Istephan S,Siadat MR

    updated:2016-02-01 00:00:00

  • Personal discovery in diabetes self-management: Discovering cause and effect using self-monitoring data.

    abstract:OBJECTIVE:To outline new design directions for informatics solutions that facilitate personal discovery with self-monitoring data. We investigate this question in the context of chronic disease self-management with the focus on type 2 diabetes. MATERIALS AND METHODS:We conducted an observational qualitative study of d...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2017.09.013

    authors: Mamykina L,Heitkemper EM,Smaldone AM,Kukafka R,Cole-Lewis HJ,Davidson PG,Mynatt ED,Cassells A,Tobin JN,Hripcsak G

    updated:2017-12-01 00:00:00

  • Predictive modeling of bacterial infections and antibiotic therapy needs in critically ill adults.

    abstract::Unnecessary antibiotic regimens in the intensive care unit (ICU) are associated with adverse patient outcomes and antimicrobial resistance. Bacterial infections (BI) are both common and deadly in ICUs, and as a result, patients with a suspected BI are routinely started on broad-spectrum antibiotics prior to having con...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2020.103540

    authors: Eickelberg G,Sanchez-Pinto LN,Luo Y

    updated:2020-09-01 00:00:00

  • An ontology-based measure to compute semantic similarity in biomedicine.

    abstract::Proper understanding of textual data requires the exploitation and integration of unstructured and heterogeneous clinical sources, healthcare records or scientific literature, which are fundamental aspects in clinical and translational research. The determination of semantic similarity between word pairs is an importa...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2010.09.002

    authors: Batet M,Sánchez D,Valls A

    updated:2011-02-01 00:00:00

  • Algorithms for rapid outbreak detection: a research synthesis.

    abstract::The threat of bioterrorism has stimulated interest in enhancing public health surveillance to detect disease outbreaks more rapidly than is currently possible. To advance research on improving the timeliness of outbreak detection, the Defense Advanced Research Project Agency sponsored the Bio-event Advanced Leading In...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2004.11.007

    authors: Buckeridge DL,Burkom H,Campbell M,Hogan WR,Moore AW

    updated:2005-04-01 00:00:00

  • Deep neural models for ICD-10 coding of death certificates and autopsy reports in free-text.

    abstract::We address the assignment of ICD-10 codes for causes of death by analyzing free-text descriptions in death certificates, together with the associated autopsy reports and clinical bulletins, from the Portuguese Ministry of Health. We leverage a deep neural network that combines word embeddings, recurrent units, and neu...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2018.02.011

    authors: Duarte F,Martins B,Pinto CS,Silva MJ

    updated:2018-04-01 00:00:00

  • Introducing RFID technology in dynamic and time-critical medical settings: requirements and challenges.

    abstract::We describe the process of introducing RFID technology in the trauma bay of a trauma center to support fast-paced and complex teamwork during resuscitation. We analyzed trauma resuscitation tasks, photographs of medical tools, and videos of simulated resuscitations to gain insight into resuscitation tasks, work practi...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2012.04.003

    authors: Parlak S,Sarcevic A,Marsic I,Burd RS

    updated:2012-10-01 00:00:00

  • Quantifying semantic similarity of clinical evidence in the biomedical literature to facilitate related evidence synthesis.

    abstract:OBJECTIVE:Published clinical trials and high quality peer reviewed medical publications are considered as the main sources of evidence used for synthesizing systematic reviews or practicing Evidence Based Medicine (EBM). Finding all relevant published evidence for a particular medical case is a time and labour intensiv...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2019.103321

    authors: Hassanzadeh H,Nguyen A,Verspoor K

    updated:2019-12-01 00:00:00

  • A comprehensive review of feature based methods for drug target interaction prediction.

    abstract::Drug target interaction is a prominent research area in the field of drug discovery. It refers to the recognition of interactions between chemical compounds and the protein targets in the human body. Wet lab experiments to identify these interactions are expensive as well as time consuming. The computational methods o...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article, Review

    doi:10.1016/j.jbi.2019.103159

    authors: Sachdev K,Gupta MK

    updated:2019-05-01 00:00:00

  • Computing with evidence Part II: An evidential approach to predicting metabolic drug-drug interactions.

    abstract::We describe a novel experiment that we conducted with the Drug Interaction Knowledge-base (DIKB) to determine which combinations of evidence enable a rule-based theory of metabolic drug-drug interactions to make the most optimal set of predictions. The focus of the experiment was a group of 16 drugs including six memb...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2009.05.010

    authors: Boyce R,Collins C,Horn J,Kalet I

    updated:2009-12-01 00:00:00

  • Hierarchical data security in a Query-By-Example interface for a shared database.

    abstract::Whenever a shared database resource, containing critical patient data, is created, protecting the contents of the database is a high priority goal. This goal can be achieved by developing a Query-By-Example (QBE) interface, designed to access a shared database, and embedding within the QBE a hierarchical security modu...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/s1532-0464(02)00524-5

    authors: Taylor M

    updated:2002-06-01 00:00:00