Using natural language processing to extract mammographic findings.

Abstract:

OBJECTIVE: Structured data on mammographic findings are difficult to obtain without manual review. We developed and evaluated a rule-based natural language processing (NLP) system to extract mammographic findings from free-text mammography reports. MATERIALS AND METHODS: The NLP system extracted four mammographic findings (mass, calcification, asymmetry, and architectural distortion) using a dictionary look-up method on 93,705 mammography reports from Group Health. Status annotations and anatomical location annotations were linked to each NLP-detected finding through association rules. After excluding negated, uncertain, and historical findings, affirmative mentions of detected findings were summarized. Confidence flags were developed to denote reports with highly confident NLP results and reports with possible NLP errors. A random sample of 100 reports was manually abstracted to evaluate the accuracy of the system. RESULTS: The NLP system correctly coded 96-99 of the 100 sampled reports, depending on the finding. Sensitivity, specificity, and negative predictive value exceeded 0.92 for all findings. Positive predictive values were relatively low for some findings due to their low prevalence. DISCUSSION: Our NLP system was implemented entirely in SAS Base, which makes it portable and easy to implement. It performed reasonably well and supports multiple applications, such as using confidence flags as a filter to improve the efficiency of manual review. Refining the term library and association rules and testing on more diverse samples may further improve its performance. CONCLUSION: Our NLP system successfully extracts clinically useful information from mammography reports. Moreover, SAS is a feasible platform for implementing NLP algorithms.
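
The dictionary look-up and status-association steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' SAS implementation: the term lists, the status cue words, the sentence splitter, and the function names (`extract_findings`, `affirmed_findings`) are all hypothetical, chosen only to show the general shape of the technique. Anatomical location annotations and confidence flags are left out.

```python
import re

# Hypothetical mini-dictionary: finding -> surface terms (NOT the authors' library).
FINDING_TERMS = {
    "mass": ["mass", "masses", "nodule"],
    "calcification": ["calcification", "calcifications", "microcalcifications"],
    "asymmetry": ["asymmetry", "asymmetric density"],
    "architectural distortion": ["architectural distortion"],
}

# Hypothetical status cues, looked for in the same sentence before the finding term.
STATUS_CUES = {
    "negated": ["no", "without", "negative for"],
    "uncertain": ["possible", "questionable", "may represent"],
    "historical": ["previously", "prior", "history of"],
}


def split_sentences(report):
    """Naive sentence splitter on periods, semicolons, and newlines."""
    return [s.strip() for s in re.split(r"[.;\n]+", report) if s.strip()]


def status_of(prefix):
    """Return the first status whose cue appears as a whole word in the prefix."""
    for status, cues in STATUS_CUES.items():
        for cue in cues:
            if re.search(r"\b" + re.escape(cue) + r"\b", prefix):
                return status
    return "affirmed"


def extract_findings(report):
    """Dictionary look-up per sentence; attach a status to each detected finding."""
    mentions = []
    for sentence in split_sentences(report.lower()):
        for finding, terms in FINDING_TERMS.items():
            for term in terms:
                match = re.search(r"\b" + re.escape(term) + r"\b", sentence)
                if match:
                    prefix = sentence[:match.start()]
                    mentions.append((finding, status_of(prefix)))
                    break  # one mention per finding per sentence is enough here
    return mentions


def affirmed_findings(report):
    """Keep only affirmative mentions, dropping negated, uncertain, and historical ones."""
    return sorted({f for f, status in extract_findings(report) if status == "affirmed"})


if __name__ == "__main__":
    example = (
        "There is a 5 mm mass in the left breast. "
        "No suspicious calcifications. "
        "Possible asymmetry in the right breast. "
        "Previously noted architectural distortion is stable."
    )
    print(extract_findings(example))
    print(affirmed_findings(example))
```

Matching on whole words keeps a cue such as "no" from firing inside "noted". A production rule set would need a much larger term library and more careful scoping of each cue than this prefix check provides.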

journal_name

J Biomed Inform

authors

Gao H,Aiello Bowles EJ,Carrell D,Buist DS

doi

10.1016/j.jbi.2015.01.010

subject

Has Abstract

pub_date

2015-04-01 00:00:00

pages

77-84

eissn

1532-0480

issn

1532-0464

pii

S1532-0464(15)00012-X

journal_volume

54

pub_type

Journal Article
  • The Counterfactual χ-GAN: Finding comparable cohorts in observational health data.

    abstract::Causal inference often relies on the counterfactual framework, which requires that treatment assignment is independent of the outcome, known as strong ignorability. Approaches to enforcing strong ignorability in causal analyses of observational data include weighting and matching methods. Effect estimates, such as the...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2020.103515

    authors: Averitt AJ,Vanitchanant N,Ranganath R,Perotte AJ

    update_date:2020-09-01 00:00:00

  • The REDCap consortium: Building an international community of software platform partners.

    abstract::The Research Electronic Data Capture (REDCap) data management platform was developed in 2004 to address an institutional need at Vanderbilt University, then shared with a limited number of adopting sites beginning in 2006. Given bi-directional benefit in early sharing experiments, we created a broader consortium shari...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2019.103208

    authors: Harris PA,Taylor R,Minor BL,Elliott V,Fernandez M,O'Neal L,McLeod L,Delacqua G,Delacqua F,Kirby J,Duda SN,REDCap Consortium.

    update_date:2019-07-01 00:00:00

  • Systematic comparison of the protein-protein interaction databases from a user's perspective.

    abstract::In absence of periodic systematic comparisons, biologists/bioinformaticians may be forced to make a subjective selection among the many protein-protein interaction (PPI) databases and tools. We conducted a comprehensive compilation and comparison of such resources. We compiled 375 PPI resources, short-listed 125 impor...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2020.103380

    authors: Bajpai AK,Davuluri S,Tiwary K,Narayanan S,Oguru S,Basavaraju K,Dayalan D,Thirumurugan K,Acharya KK

    update_date:2020-03-01 00:00:00

  • Extending the Fellegi-Sunter probabilistic record linkage method for approximate field comparators.

    abstract::Probabilistic record linkage is a method commonly used to determine whether demographic records refer to the same person. The Fellegi-Sunter method is a probabilistic approach that uses field weights based on log likelihood ratios to determine record similarity. This paper introduces an extension of the Fellegi-Sunter...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2009.08.004

    authors: DuVall SL,Kerber RA,Thomas A

    update_date:2010-02-01 00:00:00

  • The Analytic Information Warehouse (AIW): a platform for analytics using electronic health record data.

    abstract:OBJECTIVE:To create an analytics platform for specifying and detecting clinical phenotypes and other derived variables in electronic health record (EHR) data for quality improvement investigations. MATERIALS AND METHODS:We have developed an architecture for an Analytic Information Warehouse (AIW). It supports transfor...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2013.01.005

    authors: Post AR,Kurc T,Cholleti S,Gao J,Lin X,Bornstein W,Cantrell D,Levine D,Hohmann S,Saltz JH

    update_date:2013-06-01 00:00:00

  • ISeeU: Visually interpretable deep learning for mortality prediction inside the ICU.

    abstract::To improve the performance of Intensive Care Units (ICUs), the field of bio-statistics has developed scores which try to predict the likelihood of negative outcomes. These help evaluate the effectiveness of treatments and clinical practice, and also help to identify patients with unexpected outcomes. However, they hav...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2019.103269

    authors: Caicedo-Torres W,Gutierrez J

    update_date:2019-10-01 00:00:00

  • Tracking a moving user in indoor environments using Bluetooth low energy beacons.

    abstract:BACKGROUND:Bluetooth low energy (BLE) beacons have been used to track the locations of individuals in indoor environments for clinical applications such as workflow analysis and infectious disease modelling. Most current approaches use the received signal strength indicator (RSSI) to track locations. When using the RSS...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2019.103288

    authors: Surian D,Kim V,Menon R,Dunn AG,Sintchenko V,Coiera E

    update_date:2019-10-01 00:00:00

  • Use of an interactive tool to assess patients' willingness-to-pay.

    abstract::Assessment of willingness to pay (WTP) has become an important issue in health care technology assessment and in providing insight into the risks and benefits of treatment options. We have accordingly explored the use of an interactive method for assessment of WTP. To illustrate our methodology, we describe the develo...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1006/jbin.2002.1032

    authors: Matthews D,Rocchi A,Wang EC,Gafni A

    update_date:2001-10-01 00:00:00

  • NCI Thesaurus: a semantic model integrating cancer-related clinical and molecular information.

    abstract::Over the last 8 years, the National Cancer Institute (NCI) has launched a major effort to integrate molecular and clinical cancer-related information within a unified biomedical informatics framework, with controlled terminology as its foundational layer. The NCI Thesaurus is the reference terminology underpinning the...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2006.02.013

    authors: Sioutos N,de Coronado S,Haber MW,Hartel FW,Shaiu WL,Wright LW

    update_date:2007-02-01 00:00:00

  • Game-based interventions for neuropsychological assessment, training and rehabilitation: Which game-elements to use? A systematic review.

    abstract::Game-based interventions (GBI) have been used to promote health-related outcomes, including cognitive functions. Criteria for game-elements (GE) selection are insufficiently characterized in terms of their adequacy to patients' clinical conditions or targeted cognitive outcomes. This study aimed to identify GE applied...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2019.103287

    authors: Ferreira-Brito F,Fialho M,Virgolino A,Neves I,Miranda AC,Sousa-Santos N,Caneiras C,Carriço L,Verdelho A,Santos O

    update_date:2019-10-01 00:00:00

  • Hierarchical data security in a Query-By-Example interface for a shared database.

    abstract::Whenever a shared database resource, containing critical patient data, is created, protecting the contents of the database is a high priority goal. This goal can be achieved by developing a Query-By-Example (QBE) interface, designed to access a shared database, and embedding within the QBE a hierarchical security modu...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/s1532-0464(02)00524-5

    authors: Taylor M

    update_date:2002-06-01 00:00:00

  • Categorizing the world of registries.

    abstract::The term registry is widely used to refer to any database storing clinical information collected as a byproduct of patient care. Despite the use of this single characterizing term (registry), these databases exist in various forms and support functions ranging from biomedical informatics and clinical research, to publ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2008.01.009

    authors: Drolet BC,Johnson KB

    update_date:2008-12-01 00:00:00

  • A survey on literature based discovery approaches in biomedical domain.

    abstract::Literature Based Discovery (LBD) refers to the problem of inferring new and interesting knowledge by logically connecting independent fragments of information units through explicit or implicit means. This area of research, which incorporates techniques from Natural Language Processing (NLP), Information Retrieval and...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article, Review

    doi:10.1016/j.jbi.2019.103141

    authors: Gopalakrishnan V,Jha K,Jin W,Zhang A

    update_date:2019-05-01 00:00:00

  • Challenges in clinical natural language processing for automated disorder normalization.

    abstract:BACKGROUND:Identifying key variables such as disorders within the clinical narratives in electronic health records has wide-ranging applications within clinical practice and biomedical research. Previous research has demonstrated reduced performance of disorder named entity recognition (NER) and normalization (or groun...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2015.07.010

    authors: Leaman R,Khare R,Lu Z

    update_date:2015-10-01 00:00:00

  • Explorative data analysis techniques and unsupervised clustering methods to support clinical assessment of Chronic Obstructive Pulmonary Disease (COPD) phenotypes.

    abstract::Chronic Obstructive Pulmonary Disease (COPD) is the fourth leading cause of death worldwide and represents one of the major causes of chronic morbidity. Cigarette smoking is the most important risk factor for COPD. In these patients, the airflow limitation is caused by a mixture of small airways disease and parenchyma...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2009.05.008

    authors: Paoletti M,Camiciottoli G,Meoni E,Bigazzi F,Cestelli L,Pistolesi M,Marchesi C

    update_date:2009-12-01 00:00:00

  • Comparison with manual registration reveals satisfactory completeness and efficiency of a computerized cancer registration system.

    abstract::Automated software for cancer registration, called Open Registry and developed by ourselves was adopted by the Varese (population-based) Cancer Registry starting from 1997. Since the use of automated cancer registration is increasing, it is important to assess the quality and completeness of the automated data being p...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2007.03.003

    authors: Contiero P,Tittarelli A,Maghini A,Fabiano S,Frassoldi E,Costa E,Gada D,Codazzi T,Crosignani P,Tessandori R,Tagliabue G

    update_date:2008-02-01 00:00:00

  • On the reproducibility of results of pathway analysis in genome-wide expression studies of colorectal cancers.

    abstract::One of the major problems in genomics and medicine is the identification of gene networks and pathways deregulated in complex and polygenic diseases, like cancer. In this paper, we address the problem of assessing the variability of results of pathways analysis identified in different and independent genome wide expre...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2009.09.005

    authors: Maglietta R,Distaso A,Piepoli A,Palumbo O,Carella M,D'Addabbo A,Mukherjee S,Ancona N

    update_date:2010-06-01 00:00:00

  • Evaluation of an Enhanced Role-Based Access Control model to manage information access in collaborative processes for a statewide clinical education program.

    abstract:BACKGROUND:Managing information access in collaborative processes is a critical requirement to team-based biomedical research, clinical education, and patient care. We have previously developed a computation model, Enhanced Role-Based Access Control (EnhancedRBAC), and applied it to coordinate information access in the...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2013.11.007

    authors: Le XH,Doll T,Barbosu M,Luque A,Wang D

    update_date:2014-08-01 00:00:00

  • Introducing RFID technology in dynamic and time-critical medical settings: requirements and challenges.

    abstract::We describe the process of introducing RFID technology in the trauma bay of a trauma center to support fast-paced and complex teamwork during resuscitation. We analyzed trauma resuscitation tasks, photographs of medical tools, and videos of simulated resuscitations to gain insight into resuscitation tasks, work practi...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2012.04.003

    authors: Parlak S,Sarcevic A,Marsic I,Burd RS

    update_date:2012-10-01 00:00:00

  • Heterogeneous database integration in biomedicine.

    abstract::The rapid expansion of biomedical knowledge, reduction in computing costs, and spread of internet access have created an ocean of electronic data. The decentralized nature of our scientific community and healthcare system, however, has resulted in a patchwork of diverse, or heterogeneous, database implementations, mak...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1006/jbin.2001.1024

    authors: Sujansky W

    update_date:2001-08-01 00:00:00

  • A flexible approach to distributed data anonymization.

    abstract::Sensitive biomedical data is often collected from distributed sources, involving different information systems and different organizational units. Local autonomy and legal reasons lead to the need of privacy preserving integration concepts. In this article, we focus on anonymization, which plays an important role for ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2013.12.002

    authors: Kohlmayer F,Prasser F,Eckert C,Kuhn KA

    update_date:2014-08-01 00:00:00

  • Chronic disease modeling and simulation software.

    abstract::Computers allow describing the progress of a disease using computerized models. These models allow aggregating expert and clinical information to allow researchers and decision makers to forecast disease progression. To make this forecast reliable, good models and therefore good modeling tools are required. This paper...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2010.06.003

    authors: Barhak J,Isaman DJ,Ye W,Lee D

    update_date:2010-10-01 00:00:00

  • Medical speciality classification system based on binary particle swarms and ensemble of one vs. rest support vector machines.

    abstract::Nowadays, artificial intelligence plays an integral role in medical and healthcare informatics. Developing an automatic question classification and answering system is essential for coping with constant advancements in science and technology. However, efficient online medical services are required to promote offline m...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2020.103525

    authors: Faris H,Habib M,Faris M,Alomari M,Alomari A

    update_date:2020-09-01 00:00:00

  • Combining glass box and black box evaluations in the identification of heart disease risk factors and their temporal relations from clinical records.

    abstract:BACKGROUND:The determination of risk factors and their temporal relations in natural language patient records is a complex task which has been addressed in the i2b2/UTHealth 2014 shared task. In this context, in most systems it was broadly decomposed into two sub-tasks implemented by two components: entity detection, a...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2015.06.014

    authors: Grouin C,Moriceau V,Zweigenbaum P

    update_date:2015-12-01 00:00:00

  • Risk factor detection for heart disease by applying text analytics in electronic medical records.

    abstract::In the United States, about 600,000 people die of heart disease every year. The annual cost of care services, medications, and lost productivity reportedly exceeds 108.9 billion dollars. Effective disease risk assessment is critical to prevention, care, and treatment planning. Recent advancements in text analytics hav...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2015.08.011

    authors: Torii M,Fan JW,Yang WL,Lee T,Wiley MT,Zisook DS,Huang Y

    update_date:2015-12-01 00:00:00

  • Personal discovery in diabetes self-management: Discovering cause and effect using self-monitoring data.

    abstract:OBJECTIVE:To outline new design directions for informatics solutions that facilitate personal discovery with self-monitoring data. We investigate this question in the context of chronic disease self-management with the focus on type 2 diabetes. MATERIALS AND METHODS:We conducted an observational qualitative study of d...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2017.09.013

    authors: Mamykina L,Heitkemper EM,Smaldone AM,Kukafka R,Cole-Lewis HJ,Davidson PG,Mynatt ED,Cassells A,Tobin JN,Hripcsak G

    update_date:2017-12-01 00:00:00

  • Developing EHR-driven heart failure risk prediction models using CPXR(Log) with the probabilistic loss function.

    abstract::Computerized survival prediction in healthcare identifying the risk of disease mortality, helps healthcare providers to effectively manage their patients by providing appropriate treatment options. In this study, we propose to apply a classification algorithm, Contrast Pattern Aided Logistic Regression (CPXR(Log)) wit...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2016.01.009

    authors: Taslimitehrani V,Dong G,Pereira NL,Panahiazar M,Pathak J

    update_date:2016-04-01 00:00:00

  • Annotating risk factors for heart disease in clinical narratives for diabetic patients.

    abstract::The 2014 i2b2/UTHealth natural language processing shared task featured a track focused on identifying risk factors for heart disease (specifically, Cardiac Artery Disease) in clinical narratives. For this track, we used a "light" annotation paradigm to annotate a set of 1304 longitudinal medical records describing 29...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2015.05.009

    authors: Stubbs A,Uzuner Ö

    update_date:2015-12-01 00:00:00

  • FRR: fair remote retrieval of outsourced private medical records in electronic health networks.

    abstract::Cloud computing is emerging as the next-generation IT architecture. However, cloud computing also raises security and privacy concerns since the users have no physical control over the outsourced data. This paper focuses on fairly retrieving encrypted private medical records outsourced to remote untrusted cloud server...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2014.02.008

    authors: Wang H,Wu Q,Qin B,Domingo-Ferrer J

    update_date:2014-08-01 00:00:00

  • A framework for modeling health behavior protocols and their linkage to behavioral theory.

    abstract::With the rise in chronic, behavior-related disease, computerized behavioral protocols (CBPs) that help individuals improve behaviors have the potential to play an increasing role in the future health of society. To be effective and widely used CBPs should be based on accepted behavioral theory. However, designing CBPs...

    journal_title:Journal of biomedical informatics

    pub_type: Clinical Trial, Journal Article

    doi:10.1016/j.jbi.2004.12.001

    authors: Lenert L,Norman GJ,Mailhot M,Patrick K

    update_date:2005-08-01 00:00:00