Unstructured medical image query using big data - An epilepsy case study.

Abstract:

Big data technologies are critical to the medical field, which requires new frameworks to leverage them. Such frameworks would help medical experts test hypotheses by querying huge volumes of unstructured medical data to provide better patient care. The objective of this work is to implement such a framework and examine its feasibility for efficient querying of unstructured data in unlimited ways. The feasibility study was conducted specifically in the epilepsy field. The proposed framework evaluates a query in two phases. In phase 1, structured data is used to filter the clinical data warehouse. In phase 2, feature extraction modules are executed on the unstructured data in a distributed manner via Hadoop to complete the query. Three modules have been created: volume comparer, surface to volume conversion, and average intensity. The framework allows user-defined modules to be imported to provide unlimited ways to process the unstructured data, potentially extending the application of this framework beyond the epilepsy field. Two criteria were used to validate the feasibility of the proposed framework: the ability and accuracy of fulfilling an advanced medical query, and the efficiency that Hadoop provides. For the first criterion, the framework executed an advanced medical query that spanned both structured and unstructured data with accurate results. For the second criterion, various Hadoop configurations were evaluated and compared to a traditional Single Server Architecture (SSA). The surface to volume conversion module performed up to 40 times faster than the SSA (using a 20-node Hadoop cluster) and the average intensity module performed up to 85 times faster than the SSA (using a 40-node Hadoop cluster). Furthermore, the 40-node Hadoop cluster executed the average intensity module on 10,000 models in 3 h, which was not even practical for the SSA.
The current study is limited to the epilepsy field, and further research and more feature extraction modules are required to show its applicability in other medical domains. The proposed framework advances data-driven medicine by unleashing the content of unstructured medical data in an efficient and unlimited way to be harnessed by medical experts.
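
The two-phase evaluation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy warehouse, the field names, the `run_query` function, and the thresholds are all hypothetical, and a local thread pool stands in for the distributed Hadoop step.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean

# Hypothetical toy warehouse: structured fields plus an "image" stored as a
# flat list of voxel intensities (a stand-in for real unstructured data).
WAREHOUSE = [
    {"patient_id": 1, "diagnosis": "epilepsy", "age": 34, "image": [10, 20, 30]},
    {"patient_id": 2, "diagnosis": "epilepsy", "age": 61, "image": [40, 50, 60]},
    {"patient_id": 3, "diagnosis": "control", "age": 29, "image": [70, 80, 90]},
]


def average_intensity(image):
    """Illustrative feature extraction module: mean voxel intensity."""
    return mean(image)


def run_query(warehouse, structured_filter, module, feature_filter, workers=4):
    # Phase 1: use structured data to narrow the warehouse.
    candidates = [rec for rec in warehouse if structured_filter(rec)]

    # Phase 2: run the feature extraction module on each candidate's
    # unstructured data in parallel. The thread pool is only a local
    # stand-in for the distributed map step that Hadoop would perform.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        features = list(pool.map(lambda rec: module(rec["image"]), candidates))

    # Keep the patients whose extracted feature satisfies the query.
    return [
        (rec["patient_id"], feat)
        for rec, feat in zip(candidates, features)
        if feature_filter(feat)
    ]


result = run_query(
    WAREHOUSE,
    structured_filter=lambda rec: rec["diagnosis"] == "epilepsy",
    module=average_intensity,
    feature_filter=lambda feat: feat > 25,
)
print(result)
```

Because `module` is passed in as an ordinary function, a user-defined module could be swapped in without changing `run_query`, which mirrors the extensibility the abstract describes.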

journal_name

J Biomed Inform

authors

Istephan S,Siadat MR

doi

10.1016/j.jbi.2015.12.005

subject

Has Abstract

pub_date

2016-02-01 00:00:00

pages

218-26

eissn

1532-0480

issn

1532-0464

pii

S1532-0464(15)00285-3

journal_volume

59

pub_type

Journal Article
  • A comprehensive review of feature based methods for drug target interaction prediction.

    abstract::Drug target interaction is a prominent research area in the field of drug discovery. It refers to the recognition of interactions between chemical compounds and the protein targets in the human body. Wet lab experiments to identify these interactions are expensive as well as time consuming. The computational methods o...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article, Review

    doi:10.1016/j.jbi.2019.103159

    authors: Sachdev K,Gupta MK

    updated:2019-05-01 00:00:00

  • Development of the nursing problem list subset of SNOMED CT®.

    abstract:OBJECTIVE:To create an interoperable set of nursing diagnoses for use in the patient problem list in the EHR to support interoperability. DESIGN:Queries for nursing diagnostic concepts were executed against the UMLS Metathesaurus to retrieve all nursing diagnoses across four nursing terminologies where the concept was...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2011.12.003

    authors: Matney SA,Warren JJ,Evans JL,Kim TY,Coenen A,Auld VA

    updated:2012-08-01 00:00:00

  • Use of morphological analysis in protein name recognition.

    abstract::Protein name recognition aims to detect each and every protein name appearing in a PubMed abstract. The task is not simple, as the graphic word boundary (space separator) assumed in conventional preprocessing does not necessarily coincide with the protein name boundary. Such boundary disagreement caused by tokenizati...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2004.08.001

    authors: Yamamoto K,Kudo T,Konagaya A,Matsumoto Y

    updated:2004-12-01 00:00:00

  • Methodological variations in lagged regression for detecting physiologic drug effects in EHR data.

    abstract::We studied how lagged linear regression can be used to detect the physiologic effects of drugs from data in the electronic health record (EHR). We systematically examined the effect of methodological variations ((i) time series construction, (ii) temporal parameterization, (iii) intra-subject normalization, (iv) diffe...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2018.08.014

    authors: Levine ME,Albers DJ,Hripcsak G

    updated:2018-10-01 00:00:00

  • Facilitating pre-operative assessment guidelines representation using SNOMED CT.

    abstract:OBJECTIVE:To investigate whether SNOMED CT covers the terms used in pre-operative assessment guidelines, and if necessary, how the measured content coverage can be improved. METHODS:Pre-operative assessment guidelines were retrieved from the websites of (inter)national anesthesia-related societies. The recommendations...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2010.07.009

    authors: Ahmadian L,Cornet R,de Keizer NF

    updated:2010-12-01 00:00:00

  • Specifying computer-based counseling systems in health care: a new approach to user-interface and interaction design.

    abstract::Computer-based counseling systems in health care play an important role in the toolset available for medical doctors to inform, motivate and challenge their patients according to a well-defined therapeutic goal. The design, development and implementation of such systems require close collaboration between users, i.e. ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2008.10.005

    authors: Herzberg D,Marsden N,Kübler P,Leonhardt C,Thomanek S,Jung H,Becker A

    updated:2009-04-01 00:00:00

  • Benchmarking deep learning models on large healthcare datasets.

    abstract::Deep learning models (aka Deep Neural Networks) have revolutionized many fields, including computer vision, natural language processing, and speech recognition, and are being increasingly used in clinical healthcare applications. However, few works exist which have benchmarked the performance of the deep learning models wit...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2018.04.007

    authors: Purushotham S,Meng C,Che Z,Liu Y

    updated:2018-07-01 00:00:00

  • A method and software framework for enriching private biomedical sources with data from public online repositories.

    abstract::Modern biomedical research relies on the semantic integration of heterogeneous data sources to find data correlations. Researchers access multiple datasets of disparate origin, and identify elements-e.g. genes, compounds, pathways-that lead to interesting correlations. Normally, they must refer to additional public da...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2016.02.004

    authors: Anguita A,García-Remesal M,Graf N,Maojo V

    updated:2016-04-01 00:00:00

  • Comparison with manual registration reveals satisfactory completeness and efficiency of a computerized cancer registration system.

    abstract::Automated software for cancer registration, called Open Registry and developed by ourselves was adopted by the Varese (population-based) Cancer Registry starting from 1997. Since the use of automated cancer registration is increasing, it is important to assess the quality and completeness of the automated data being p...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2007.03.003

    authors: Contiero P,Tittarelli A,Maghini A,Fabiano S,Frassoldi E,Costa E,Gada D,Codazzi T,Crosignani P,Tessandori R,Tagliabue G

    updated:2008-02-01 00:00:00

  • A flexible approach to distributed data anonymization.

    abstract::Sensitive biomedical data is often collected from distributed sources, involving different information systems and different organizational units. Local autonomy and legal reasons lead to the need of privacy preserving integration concepts. In this article, we focus on anonymization, which plays an important role for ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2013.12.002

    authors: Kohlmayer F,Prasser F,Eckert C,Kuhn KA

    updated:2014-08-01 00:00:00

  • Clinical decision support models and frameworks: Seeking to address research issues underlying implementation successes and failures.

    abstract::Computer-based clinical decision support (CDS) has been pursued for more than five decades. Despite notable accomplishments and successes, wide adoption and broad use of CDS in clinical practice has not been achieved. Many issues have been identified as being partially responsible for the relatively slow adoption and ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article, Review

    doi:10.1016/j.jbi.2017.12.005

    authors: Greenes RA,Bates DW,Kawamoto K,Middleton B,Osheroff J,Shahar Y

    updated:2018-02-01 00:00:00

  • Comparison between passive vision-based system and a wearable inertial-based system for estimating temporal gait parameters related to the GAITRite electronic walkway.

    abstract::Quantitative gait analysis allows clinicians to assess the inherent gait variability over time which is a functional marker to aid in the diagnosis of disabilities or diseases such as frailty, the onset of cognitive decline and neurodegenerative diseases, among others. However, despite the accuracy achieved by the cur...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2016.07.009

    authors: González I,López-Nava IH,Fontecha J,Muñoz-Meléndez A,Pérez-SanPablo AI,Quiñones-Urióstegui I

    updated:2016-08-01 00:00:00

  • MorphoCol: An ontology-based knowledgebase for the characterisation of clinically significant bacterial colony morphologies.

    abstract:BACKGROUND:One of the major concerns of the biomedical community is the increasing prevalence of antimicrobial resistant microorganisms. Recent findings show that the diversification of colony morphology may be indicative of the expression of virulence factors and increased resistance to antibiotic therapeutics. To tra...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2015.03.007

    authors: Sousa AM,Pereira MO,Lourenço A

    updated:2015-06-01 00:00:00

  • Automated annotation and classification of BI-RADS assessment from radiology reports.

    abstract::The Breast Imaging Reporting and Data System (BI-RADS) was developed to reduce variation in the descriptions of findings. Manual analysis of breast radiology report data is challenging but is necessary for clinical and healthcare quality assurance activities. The objective of this study is to develop a natural languag...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2017.04.011

    authors: Castro SM,Tseytlin E,Medvedeva O,Mitchell K,Visweswaran S,Bekhuis T,Jacobson RS

    updated:2017-05-01 00:00:00

  • RedMed: Extending drug lexicons for social media applications.

    abstract::Social media has been identified as a promising potential source of information for pharmacovigilance. The adoption of social media data has been hindered by the massive and noisy nature of the data. Initial attempts to use social media data have relied on exact text matches to drugs of interest, and therefore suffer ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2019.103307

    authors: Lavertu A,Altman RB

    updated:2019-11-01 00:00:00

  • A graph-based approach to auditing RxNorm.

    abstract:OBJECTIVES:RxNorm is a standardized nomenclature for clinical drug entities developed by the National Library of Medicine. In this paper, we audit relations in RxNorm for consistency and completeness through the systematic analysis of the graph of its concepts and relationships. METHODS:The representation of multi-ing...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2009.04.004

    authors: Bodenreider O,Peters LB

    updated:2009-06-01 00:00:00

  • Monitoring Obstructive Sleep Apnea by means of a real-time mobile system based on the automatic extraction of sets of rules through Differential Evolution.

    abstract::Real-time Obstructive Sleep Apnea (OSA) episode detection and monitoring are important for society in terms of an improvement in the health of the general population and of a reduction in mortality and healthcare costs. Currently, to diagnose OSA, patients undergo PolySomnoGraphy (PSG), a complicated and invasive test ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2014.02.015

    authors: Sannino G,De Falco I,De Pietro G

    updated:2014-06-01 00:00:00

  • Concept and implementation of a study dashboard module for a continuous monitoring of trial recruitment and documentation.

    abstract:BACKGROUND:The difficulty of managing patient recruitment and documentation for clinical trials prompts a demand for instruments for closely monitoring these critical but unpredictable processes. Increasingly adopted Electronic Data Capture (EDC) applications provide novel opportunities to reutilize stored information ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2016.10.010

    authors: Toddenroth D,Sivagnanasundaram J,Prokosch HU,Ganslandt T

    updated:2016-12-01 00:00:00

  • A kernel-based clustering method for gene selection with gene expression data.

    abstract::Gene selection is important for cancer classification based on gene expression data, because of high dimensionality and small sample size. In this paper, we present a new gene selection method based on clustering, in which dissimilarity measures are obtained through kernel functions. It searches for best weights of ge...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2016.05.007

    authors: Chen H,Zhang Y,Gutman I

    updated:2016-08-01 00:00:00

  • Patient empowerment for cancer patients through a novel ICT infrastructure.

    abstract::As a result of recent advances in cancer research and "precision medicine" approaches, i.e. the idea of treating each patient with the right drug at the right time, more and more cancer patients are being cured, or might have to cope with a life with cancer. For many people, cancer survival today means living with a c...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2019.103342

    authors: Kondylakis H,Bucur A,Crico C,Dong F,Graf N,Hoffman S,Koumakis L,Manenti A,Marias K,Mazzocco K,Pravettoni G,Renzi C,Schera F,Triberti S,Tsiknakis M,Kiefer S

    updated:2020-01-01 00:00:00

  • Quality assurance of chemical ingredient classification for the National Drug File - Reference Terminology.

    abstract::The National Drug File - Reference Terminology (NDF-RT) is a large and complex drug terminology consisting of several classification hierarchies on top of an extensive collection of drug concepts. These hierarchies provide important information about clinical drugs, e.g., their chemical ingredients, mechanisms of acti...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2017.07.013

    authors: Zheng L,Yumak H,Chen L,Ochs C,Geller J,Kapusnik-Uner J,Perl Y

    updated:2017-09-01 00:00:00

  • The Analytic Information Warehouse (AIW): a platform for analytics using electronic health record data.

    abstract:OBJECTIVE:To create an analytics platform for specifying and detecting clinical phenotypes and other derived variables in electronic health record (EHR) data for quality improvement investigations. MATERIALS AND METHODS:We have developed an architecture for an Analytic Information Warehouse (AIW). It supports transfor...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2013.01.005

    authors: Post AR,Kurc T,Cholleti S,Gao J,Lin X,Bornstein W,Cantrell D,Levine D,Hohmann S,Saltz JH

    updated:2013-06-01 00:00:00

  • An evaluation of clinical order patterns machine-learned from clinician cohorts stratified by patient mortality outcomes.

    abstract:OBJECTIVE:Evaluate the quality of clinical order practice patterns machine-learned from clinician cohorts stratified by patient mortality outcomes. MATERIALS AND METHODS:Inpatient electronic health records from 2010 to 2013 were extracted from a tertiary academic hospital. Clinicians (n = 1822) were stratified into lo...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2018.09.005

    authors: Wang JK,Hom J,Balasubramanian S,Schuler A,Shah NH,Goldstein MK,Baiocchi MTM,Chen JH

    updated:2018-10-01 00:00:00

  • GLIF3: a representation format for sharable computer-interpretable clinical practice guidelines.

    abstract::The Guideline Interchange Format (GLIF) is a model for representation of sharable computer-interpretable guidelines. The current version of GLIF (GLIF3) is a substantial update and enhancement of the model since the previous version (GLIF2). GLIF3 enables encoding of a guideline at three levels: a conceptual flowchart...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2004.04.002

    authors: Boxwala AA,Peleg M,Tu S,Ogunyemi O,Zeng QT,Wang D,Patel VL,Greenes RA,Shortliffe EH

    updated:2004-06-01 00:00:00

  • Natural language processing systems for capturing and standardizing unstructured clinical information: A systematic review.

    abstract::We followed a systematic approach based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses to identify existing clinical natural language processing (NLP) systems that generate structured information from unstructured free text. Seven literature databases were searched with a query combining the...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article, Review

    doi:10.1016/j.jbi.2017.07.012

    authors: Kreimeyer K,Foster M,Pandey A,Arya N,Halford G,Jones SF,Forshee R,Walderhaug M,Botsis T

    updated:2017-09-01 00:00:00

  • Spectral-dynamic representation of DNA sequences.

    abstract::A graphical representation of DNA sequences in which the distribution of a particular base B=A,C,G,T is represented by a set of discrete lines has been formulated. The methodology of this approach has been borrowed from two areas of physics: spectroscopy and dynamics. Consequently, the set of discrete lines is referre...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2017.06.001

    authors: Bielińska-Wąż D,Wąż P

    updated:2017-08-01 00:00:00

  • A model-driven methodology for exploring complex disease comorbidities applied to autism spectrum disorder and inflammatory bowel disease.

    abstract::We propose a model-driven methodology aimed to shed light on complex disorders. Our approach enables exploring shared etiologies of comorbid diseases at the molecular pathway level. The method, Comparative Comorbidities Simulation (CCS), uses stochastic Petri net simulation for examining the phenotypic effects of pert...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2016.08.008

    authors: Somekh J,Peleg M,Eran A,Koren I,Feiglin A,Demishtein A,Shiloh R,Heiner M,Kong SW,Elazar Z,Kohane I

    updated:2016-10-01 00:00:00

  • HBLAST: Parallelised sequence similarity--A Hadoop MapReducable basic local alignment search tool.

    abstract::The recent exponential growth of genomic databases has resulted in the common task of sequence alignment becoming one of the major bottlenecks in the field of computational biology. It is typical for these large datasets and complex computations to require cost prohibitive High Performance Computing (HPC) to function....

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2015.01.008

    authors: O'Driscoll A,Belogrudov V,Carroll J,Kropp K,Walsh P,Ghazal P,Sleator RD

    updated:2015-04-01 00:00:00

  • NCI Thesaurus: a semantic model integrating cancer-related clinical and molecular information.

    abstract::Over the last 8 years, the National Cancer Institute (NCI) has launched a major effort to integrate molecular and clinical cancer-related information within a unified biomedical informatics framework, with controlled terminology as its foundational layer. The NCI Thesaurus is the reference terminology underpinning the...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2006.02.013

    authors: Sioutos N,de Coronado S,Haber MW,Hartel FW,Shaiu WL,Wright LW

    updated:2007-02-01 00:00:00

  • Cognitive simulators for medical education and training.

    abstract::Simulators for honing procedural skills (such as surgical skills and central venous catheter placement) have proven to be valuable tools for medical educators and students. While such simulations represent an effective paradigm in surgical education, there is an opportunity to add a layer of cognitive exercises to the...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2009.02.008

    authors: Kahol K,Vankipuram M,Smith ML

    updated:2009-08-01 00:00:00