Abstract:
RATIONALE: Templates in text notes pose challenges for automated information extraction algorithms. We propose a method that identifies novel templates in plain text medical notes. The identification can then be used to either include or exclude templates when processing notes for information extraction.
METHODS: The two-module method is based on the framework of information foraging and addresses the hypothesis that documents containing templates, and the templates within those documents, can be identified by common features. The first module takes documents from the corpus and groups those with common templates. This is accomplished through a binned word count hierarchical clustering algorithm. The second module extracts the templates. It uses the groupings and applies a longest common subsequence (LCS) algorithm to obtain the constituent parts of the templates. The method was developed and tested on a random document corpus of 750 notes derived from a large database of US Department of Veterans Affairs (VA) electronic medical notes.
RESULTS: The grouping module, using hierarchical clustering, identified 23 groups with 3 documents or more, consisting of 120 documents from the 750 documents in our test corpus. Of these, 18 groups had at least one common template that was present in all documents in the group, for a positive predictive value of 78%. The LCS extraction module performed with 100% positive predictive value, 94% sensitivity, and 83% negative predictive value. The human review determined that in 4 groups the template covered the entire document, with the remaining 14 groups containing a common section template. Among documents with templates, the number of templates per document ranged from 1 to 14. The mean and median numbers of templates per group were 5.9 and 5, respectively.
DISCUSSION: The grouping method was successful in finding like documents containing templates. Within the groups of documents containing templates, the LCS module was successful in distinguishing text belonging to the template from extraneous text. Major obstacles to improved performance included documents composed of multiple templates, templates that included other templates embedded within them, and variants of templates. In this pilot study, we demonstrate proof of concept of the grouping and extraction method for identifying templates in electronic medical records, and we propose methods to improve performance and to scale up.
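The two modules described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the bin size, the normalized L1 distance, the average-linkage rule, the cut-off `threshold`, and all function names are assumptions chosen for illustration. Folding a pairwise LCS across a group is also a simple heuristic, not an exact multi-sequence LCS.

```python
from collections import Counter
from itertools import combinations


def binned_counts(text, bin_size=2):
    """Word counts mapped to coarse bins so small count differences are ignored."""
    counts = Counter(text.lower().split())
    return {word: (n + bin_size - 1) // bin_size for word, n in counts.items()}


def distance(a, b):
    """Normalized L1 distance between two binned count vectors (0 = identical)."""
    diff = sum(abs(a.get(w, 0) - b.get(w, 0)) for w in set(a) | set(b))
    total = sum(a.values()) + sum(b.values())
    return diff / total if total else 0.0


def group_documents(docs, threshold=0.5):
    """Naive average-linkage agglomerative clustering, stopped at `threshold`."""
    vectors = [binned_counts(d) for d in docs]
    clusters = [[i] for i in range(len(docs))]

    def linkage(c1, c2):
        pairs = [(i, j) for i in c1 for j in c2]
        return sum(distance(vectors[i], vectors[j]) for i, j in pairs) / len(pairs)

    while len(clusters) > 1:
        # Find the closest pair of clusters; x < y always holds here.
        x, y = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        if linkage(clusters[x], clusters[y]) > threshold:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


def lcs(a, b):
    """Longest common subsequence of two token lists (dynamic programming)."""
    m, n = len(a), len(b)
    # table[i][j] holds the LCS length of a[i:] and b[j:].
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    out, i, j = [], 0, 0
    while i < m and j < n:
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return out


def extract_template(docs):
    """Fold the pairwise LCS across a group; the surviving tokens are the candidate template."""
    tokens = docs[0].split()
    for doc in docs[1:]:
        tokens = lcs(tokens, doc.split())
    return tokens
```

Given three toy notes that share the headers `CHIEF COMPLAINT:`, `VITALS:` and `PLAN:` plus one unrelated note, `group_documents` puts the three templated notes in one group, and `extract_template` on that group keeps the shared header tokens while dropping the patient-specific text.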
journal_name:J Biomed Inform
journal_title:Journal of biomedical informatics
authors:Redd AM, Gundlapalli AV, Divita G, Carter ME, Tran LT, Samore MH
doi:10.1016/j.jbi.2016.07.019
subject:Has Abstract
pub_date:2017-07-01 00:00:00
pages:S68-S76
eissn:1532-0464
issn:1532-0480
pii:S1532-0464(16)30073-9
journal_volume:71S
pub_type: Journal Article
abstract::This paper presents a natural language processing (NLP) system that was designed to participate in the 2014 i2b2 de-identification challenge. The challenge task aims to identify and classify seven main Protected Health Information (PHI) categories and 25 associated sub-categories. A hybrid model was proposed which com...
journal_title:Journal of biomedical informatics
pub_type: Journal Article
doi:10.1016/j.jbi.2015.06.015
update_date:2015-12-01 00:00:00
abstract::Medical error is a leading cause of patient death in the United States. Among the different types of medical errors, harm to patients caused by doctors missing early signs of deterioration is especially challenging to address due to the heterogeneity of patients' physiological patterns. In this study, we implemented r...
journal_title:Journal of biomedical informatics
pub_type: Journal Article
doi:10.1016/j.jbi.2020.103425
update_date:2020-07-01 00:00:00
abstract::Patient recruitment is one of the most important barriers to successful completion of clinical trials and thus to obtaining evidence about new methods for prevention, diagnostics and treatment. The reason is that recruitment is effort consuming. It requires the identification of candidate patients for the trial (the p...
journal_title:Journal of biomedical informatics
pub_type: Journal Article
doi:10.1016/j.jbi.2015.05.005
update_date:2015-08-01 00:00:00
abstract::Cancer surveillance data are collected every year in the United States via the National Program of Cancer Registries (NPCR) and the Surveillance, Epidemiology and End Results (SEER) Program of the National Cancer Institute (NCI). General trends are closely monitored to measure the nation's progress against cancer. The...
journal_title:Journal of biomedical informatics
pub_type: Journal Article
doi:10.1016/j.jbi.2016.03.027
update_date:2016-06-01 00:00:00
abstract::Drug therapeutic indications and side-effects are both measurable patient phenotype changes in response to the treatment. Inferring potential drug therapeutic indications and identifying clinically interesting drug side-effects are both important and challenging tasks. Previous studies have utilized either chemical st...
journal_title:Journal of biomedical informatics
pub_type: Journal Article
doi:10.1016/j.jbi.2014.03.014
update_date:2014-10-01 00:00:00
abstract::In absence of periodic systematic comparisons, biologists/bioinformaticians may be forced to make a subjective selection among the many protein-protein interaction (PPI) databases and tools. We conducted a comprehensive compilation and comparison of such resources. We compiled 375 PPI resources, short-listed 125 impor...
journal_title:Journal of biomedical informatics
pub_type: Journal Article
doi:10.1016/j.jbi.2020.103380
update_date:2020-03-01 00:00:00
abstract::Mereological relations such as part-of and its inverse has-part are fundamental to the description of the structure of living organisms. Whereas classical mereology focuses on individual entities, mereological relations in biomedical ontologies are generally asserted between classes of individuals. In general, this pr...
journal_title:Journal of biomedical informatics
pub_type: Journal Article
doi:10.1016/j.jbi.2005.11.003
update_date:2006-06-01 00:00:00
abstract::We have been investigating registration methods for improving digital subtraction angiography (DSA) images to extract blood vessels by reducing artifacts due to body motion, such as rotation, contraction, and dilation. In this paper, we propose a new and simple DSA registration algorithm with local distortion vectors ...
journal_title:Journal of biomedical informatics
pub_type: Journal Article
doi:10.1006/jbin.2001.1018
update_date:2001-06-01 00:00:00
abstract::Domain reference ontologies represent knowledge about a particular part of the world in a way that is independent from specific objectives, through a theory of the domain. An example of reference ontology in biomedical informatics is the Foundational Model of Anatomy (FMA), an ontology of anatomy that covers the entir...
journal_title:Journal of biomedical informatics
pub_type: Journal Article
doi:10.1016/j.jbi.2005.09.002
update_date:2006-06-01 00:00:00
abstract::End-stage renal disease (ESRD) is the final stage of chronic kidney disease (CKD) and requires hemodialysis (HD) for survival. Intradialytic blood pressure (IBP) measurements are necessary to ensure patient safety during HD treatments and have critical clinical and prognostic significance. Studies on IBP measurements,...
journal_title:Journal of biomedical informatics
pub_type: Journal Article
doi:10.1016/j.jbi.2018.05.013
update_date:2018-07-01 00:00:00
abstract::The Strategic Health IT Advanced Research Projects (SHARP) Program, established by the Office of the National Coordinator for Health Information Technology in 2010 supports research findings that remove barriers for increased adoption of health IT. The improvements envisioned by the SHARP Area 4 Consortium (SHARPn) wi...
journal_title:Journal of biomedical informatics
pub_type: Journal Article
doi:10.1016/j.jbi.2012.01.009
update_date:2012-08-01 00:00:00
abstract::The 2014 i2b2/UTHealth natural language processing shared task featured a track focused on identifying risk factors for heart disease (specifically, Cardiac Artery Disease) in clinical narratives. For this track, we used a "light" annotation paradigm to annotate a set of 1304 longitudinal medical records describing 29...
journal_title:Journal of biomedical informatics
pub_type: Journal Article
doi:10.1016/j.jbi.2015.05.009
update_date:2015-12-01 00:00:00
abstract:MOTIVATION:A challenge in microarray data analysis is to interpret observed changes in terms of biological properties and relationships. One powerful approach is to make associations of gene expression clusters with biomedical ontologies and/or biological pathways. However, this approach evaluates only one cluster at a...
journal_title:Journal of biomedical informatics
pub_type: Journal Article
doi:10.1016/j.jbi.2007.10.003
update_date:2008-04-01 00:00:00
abstract::Whenever a shared database resource, containing critical patient data, is created, protecting the contents of the database is a high priority goal. This goal can be achieved by developing a Query-By-Example (QBE) interface, designed to access a shared database, and embedding within the QBE a hierarchical security modu...
journal_title:Journal of biomedical informatics
pub_type: Journal Article
doi:10.1016/s1532-0464(02)00524-5
update_date:2002-06-01 00:00:00
abstract::We have developed EpDis and MassPred, extendable open source software tools that support bioinformatic research and enable parallel use of different methods for the prediction of T cell epitopes, disorder and disordered binding regions and hydropathy calculation. These tools offer a semi-automated installation of chos...
journal_title:Journal of biomedical informatics
pub_type: Journal Article
doi:10.1016/j.jbi.2016.01.016
update_date:2016-04-01 00:00:00
abstract:OBJECTIVE:To develop an effective and scalable individual-level patient cost prediction method by automatically learning hidden temporal patterns from multivariate time series data in patient insurance claims using a convolutional neural network (CNN) architecture. METHODS:We used three years of medical and pharmacy c...
journal_title:Journal of biomedical informatics
pub_type: Journal Article
doi:10.1016/j.jbi.2020.103565
update_date:2020-11-01 00:00:00
abstract:OBJECTIVE:Clinical pathways (CPs) are widely studied methods to standardize clinical intervention and improve medical quality. However, standard care plans defined in current CPs are too general to execute in a practical healthcare environment. The purpose of this study was to create hospital-specific personalized CPs ...
journal_title:Journal of biomedical informatics
pub_type: Journal Article
doi:10.1016/j.jbi.2014.07.017
update_date:2014-12-01 00:00:00
abstract::With the increasing availability of genomic sequence data, numerous methods have been proposed for finding DNA motifs. The discovery of DNA motifs serves a critical step in many biological applications. However, the privacy implication of DNA analysis is normally neglected in the existing methods. In this work, we pro...
journal_title:Journal of biomedical informatics
pub_type: Journal Article
doi:10.1016/j.jbi.2013.12.016
update_date:2014-08-01 00:00:00
abstract::Information search has changed the way we manage knowledge and the ubiquity of information access has made search a frequent activity, whether via Internet search engines or increasingly via mobile devices. Medical information search is in this respect no different and much research has been devoted to analyzing the w...
journal_title:Journal of biomedical informatics
pub_type: Journal Article
doi:10.1016/j.jbi.2015.04.013
update_date:2015-08-01 00:00:00
abstract:BACKGROUND:Control systems engineering methods, particularly, system identification (system ID), offer an idiographic (i.e., person-specific) approach to develop dynamic models of physical activity (PA) that can be used to personalize interventions in a systematic, scalable way. The purpose of this work is to: (1) appl...
journal_title:Journal of biomedical informatics
pub_type: Journal Article
doi:10.1016/j.jbi.2018.01.010
update_date:2018-03-01 00:00:00
abstract::Warfarin is an effective preventative treatment for arterial and venous thromboembolism, but requires individualised dosing due to its narrow therapeutic range and high individual variation. Many machine learning techniques have been demonstrated in this domain. This study evaluated the accuracy of the most promising ...
journal_title:Journal of biomedical informatics
pub_type: Journal Article
doi:10.1016/j.jbi.2020.103634
update_date:2021-01-01 00:00:00
abstract::To extract biomedical information about bio-entities from the huge amount of biomedical literature, the first key step is recognizing their names in these literatures, which remains a challenging task due to the irregularities and ambiguities in bio-entities nomenclature. The recognition performances of the current po...
journal_title:Journal of biomedical informatics
pub_type: Journal Article
doi:10.1016/j.jbi.2008.01.002
update_date:2008-08-01 00:00:00
abstract:OBJECTIVE:Published clinical trials and high quality peer reviewed medical publications are considered as the main sources of evidence used for synthesizing systematic reviews or practicing Evidence Based Medicine (EBM). Finding all relevant published evidence for a particular medical case is a time and labour intensiv...
journal_title:Journal of biomedical informatics
pub_type: Journal Article
doi:10.1016/j.jbi.2019.103321
update_date:2019-12-01 00:00:00
abstract:BACKGROUND:Data collection and extraction from noisy text sources such as social media typically rely on keyword-based searching/listening. However, health-related terms are often misspelled in such noisy text sources due to their complex morphology, resulting in the exclusion of relevant data for studies. In this pape...
journal_title:Journal of biomedical informatics
pub_type: Journal Article
doi:10.1016/j.jbi.2018.11.007
update_date:2018-12-01 00:00:00
abstract::Game-based interventions (GBI) have been used to promote health-related outcomes, including cognitive functions. Criteria for game-elements (GE) selection are insufficiently characterized in terms of their adequacy to patients' clinical conditions or targeted cognitive outcomes. This study aimed to identify GE applied...
journal_title:Journal of biomedical informatics
pub_type: Journal Article
doi:10.1016/j.jbi.2019.103287
update_date:2019-10-01 00:00:00
abstract::Information extraction is the process of scanning text for information relevant to some interest, including extracting entities, relations, and events. It requires deeper analysis than key word searches, but its aims fall short of the very hard and long-term problem of full text understanding. Information extraction r...
journal_title:Journal of biomedical informatics
pub_type: Journal Article
doi:10.1016/s1532-0464(03)00015-7
update_date:2002-08-01 00:00:00
abstract::Protein name recognition aims to detect each and every protein names appearing in a PubMed abstract. The task is not simple, as the graphic word boundary (space separator) assumed in conventional preprocessing does not necessarily coincide with the protein name boundary. Such boundary disagreement caused by tokenizati...
journal_title:Journal of biomedical informatics
pub_type: Journal Article
doi:10.1016/j.jbi.2004.08.001
update_date:2004-12-01 00:00:00
abstract::We introduce a distance (similarity)-based mapping for the visualization of high-dimensional patterns and their relative relationships. The mapping preserves exactly the original distances between points with respect to any two reference patterns in a special two-dimensional coordinate system, the relative distance pl...
journal_title:Journal of biomedical informatics
pub_type: Journal Article
doi:10.1016/j.jbi.2004.07.005
update_date:2004-10-01 00:00:00
abstract::The explosive growth of biomedical literature has created a rich source of knowledge, such as that on protein-protein interactions (PPIs) and drug-drug interactions (DDIs), locked in unstructured free text. Biomedical relation classification aims to automatically detect and classify biomedical relations, which has gre...
journal_title:Journal of biomedical informatics
pub_type: Journal Article, Review
doi:10.1016/j.jbi.2019.103294
update_date:2019-11-01 00:00:00
abstract::In this paper, a Hidden Semi-Markov Model (HSMM) based approach is proposed to evaluate and monitor body motion during a rehabilitation training program. The approach extracts clinically relevant motion features from skeleton joint trajectories, acquired by the RGB-D camera, and provides a score for the subject's perf...
journal_title:Journal of biomedical informatics
pub_type: Journal Article
doi:10.1016/j.jbi.2017.12.012
update_date:2018-02-01 00:00:00