Using natural language processing to extract mammographic findings.


OBJECTIVE: Structured data on mammographic findings are difficult to obtain without manual review. We developed and evaluated a rule-based natural language processing (NLP) system to extract mammographic findings from free-text mammography reports.

MATERIALS AND METHODS: The NLP system extracted four mammographic findings (mass, calcification, asymmetry, and architectural distortion) using a dictionary look-up method on 93,705 mammography reports from Group Health. Status annotations and anatomical location annotations were associated with each NLP-detected finding through association rules. After excluding negated, uncertain, and historical findings, affirmative mentions of detected findings were summarized. Confidence flags were developed to denote reports with highly confident NLP results and reports with possible NLP errors. A random sample of 100 reports was manually abstracted to evaluate the accuracy of the system.

RESULTS: The NLP system correctly coded 96-99 of the 100 sampled reports, depending on the finding. Sensitivity, specificity, and negative predictive value exceeded 0.92 for all findings. Positive predictive values were relatively low for some findings because of their low prevalence.

DISCUSSION: Our NLP system was implemented entirely in SAS Base, which makes it portable and easy to implement. It performed reasonably well and has multiple applications, such as using confidence flags as a filter to improve the efficiency of manual review. Refining the library and association rules and testing on more diverse samples may further improve its performance.

CONCLUSION: Our NLP system successfully extracts clinically useful information from mammography reports. Moreover, SAS is a feasible platform for implementing NLP algorithms.
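
The dictionary look-up and status-exclusion steps described in the abstract can be illustrated with a minimal sketch. The authors implemented their system in SAS Base; the Python below is not their code. The term lists, the cue lists, and the function name `extract_findings` are hypothetical and chosen only for illustration, and the rule used here (a cue anywhere before the term in the same sentence) is a simplifying assumption rather than the paper's association rules.

```python
import re

# Hypothetical mini-dictionary: finding -> surface terms (not the authors' library).
FINDING_TERMS = {
    "mass": ["mass", "masses"],
    "calcification": ["calcification", "calcifications", "microcalcifications"],
    "asymmetry": ["asymmetry", "asymmetries"],
    "architectural distortion": ["architectural distortion"],
}

# Hypothetical status cues; a real system would need far richer lists.
NEGATION_CUES = ["no", "without", "negative for"]
UNCERTAIN_CUES = ["possible", "questionable", "may represent"]
HISTORICAL_CUES = ["previously", "prior", "history of"]
EXCLUSION_CUES = NEGATION_CUES + UNCERTAIN_CUES + HISTORICAL_CUES


def _has_cue(text, cues):
    """Return True if any cue occurs in text as a whole word or phrase."""
    return any(re.search(r"\b" + re.escape(cue) + r"\b", text) for cue in cues)


def extract_findings(report):
    """Return the set of findings with at least one affirmative mention."""
    affirmed = set()
    # Naive sentence split on terminal punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", report.lower()):
        for finding, terms in FINDING_TERMS.items():
            for term in terms:
                match = re.search(r"\b" + re.escape(term) + r"\b", sentence)
                if match is None:
                    continue
                # Simplified association rule: a status cue anywhere before
                # the term in the same sentence excludes the mention.
                if _has_cue(sentence[:match.start()], EXCLUSION_CUES):
                    continue
                affirmed.add(finding)
    return affirmed


if __name__ == "__main__":
    text = ("There is a 1 cm mass in the left breast. "
            "No suspicious calcifications. "
            "Previously seen asymmetry is stable.")
    print(extract_findings(text))  # prints {'mass'}
```

In this example only the mass is reported: the calcification mention is negated and the asymmetry mention is historical, so both are excluded before summarizing.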


J Biomed Inform


Gao H, Aiello Bowles EJ, Carrell D, Buist DS




2015-04-01 00:00:00












  • Categorizing the world of registries.

    abstract: The term registry is widely used to refer to any database storing clinical information collected as a byproduct of patient care. Despite the use of this single characterizing term (registry), these databases exist in various forms and support functions ranging from biomedical informatics and clinical research, to publ...

    journal_title: Journal of biomedical informatics

    pub_type: journal article

    authors: Drolet BC, Johnson KB

    updated: 2008-12-01 00:00:00

  • Consensus and Meta-analysis regulatory networks for combining multiple microarray gene expression datasets.

    abstract: Microarray data is a key source of experimental data for modelling gene regulatory interactions from expression levels. With the rapid increase of publicly available microarray data comes the opportunity to produce regulatory network models based on multiple datasets. Such models are potentially more robust with great...

    journal_title: Journal of biomedical informatics

    pub_type: journal article, meta-analysis

    authors: Steele E, Tucker A

    updated: 2008-12-01 00:00:00

  • An optimization based on simulation approach to the patient admission scheduling problem using a linear programing algorithm.

    abstract: BACKGROUND: As patients' length of stay in waiting lists increases, governments are looking for strategies to control the problem. Agreements were created with private providers to diminish the workload in the public sector. However, the growth of the private sector is not following the demand for care. Given this conte...

    journal_title: Journal of biomedical informatics

    pub_type: journal article

    authors: Granja C, Almada-Lobo B, Janela F, Seabra J, Mendes A

    updated: 2014-12-01 00:00:00

  • Analysis of eligibility criteria representation in industry-standard clinical trial protocols.

    abstract: Previous research on standardization of eligibility criteria and its feasibility has traditionally been conducted on clinical trial protocols from (CT). The portability and use of such standardization for full-text industry-standard protocols has not been studied in-depth. Towards this end, in this ...

    journal_title: Journal of biomedical informatics

    pub_type: journal article

    authors: Bhattacharya S, Cantor MN

    updated: 2013-10-01 00:00:00

  • Digital subtraction angiogram registration method with local distortion vectors to decrease motion artifact.

    abstract: We have been investigating registration methods for improving digital subtraction angiography (DSA) images to extract blood vessels by reducing artifacts due to body motion, such as rotation, contraction, and dilation. In this paper, we propose a new and simple DSA registration algorithm with local distortion vectors ...

    journal_title: Journal of biomedical informatics

    pub_type: journal article

    authors: Hiroshima K, Funakami R, Hiratsuka K, Nishino J, Odaka T, Ogura H, Fukushima T, Nishimoto Y, Tanaka M, Ito H, Yamamoto K

    updated: 2001-06-01 00:00:00

  • Semi-supervised medical entity recognition: A study on Spanish and Swedish clinical corpora.

    abstract: OBJECTIVE: The goal of this study is to investigate entity recognition within Electronic Health Records (EHRs) focusing on Spanish and Swedish. Of particular importance is a robust representation of the entities. In our case, we utilized unsupervised methods to generate such representations. METHODS: The significance of...

    journal_title: Journal of biomedical informatics

    pub_type: journal article

    authors: Pérez A, Weegar R, Casillas A, Gojenola K, Oronoz M, Dalianis H

    updated: 2017-07-01 00:00:00

  • Vaidurya: a multiple-ontology, concept-based, context-sensitive clinical-guideline search engine.

    abstract: We designed and implemented a generic search engine (Vaidurya), as part of our Digital clinical-Guideline Library (DeGeL) framework. Two search methods were implemented in addition to full-text search: (1) concept-based search, which relies on pre-indexing the guidelines in a clinically meaningful fashion, and (2) con...

    journal_title: Journal of biomedical informatics

    pub_type: journal article

    authors: Moskovitch R, Shahar Y

    updated: 2009-02-01 00:00:00

  • Medical speciality classification system based on binary particle swarms and ensemble of one vs. rest support vector machines.

    abstract: Nowadays, artificial intelligence plays an integral role in medical and healthcare informatics. Developing an automatic question classification and answering system is essential for coping with constant advancements in science and technology. However, efficient online medical services are required to promote offline m...

    journal_title: Journal of biomedical informatics

    pub_type: journal article

    authors: Faris H, Habib M, Faris M, Alomari M, Alomari A

    updated: 2020-09-01 00:00:00

  • A comprehensive review of feature based methods for drug target interaction prediction.

    abstract: Drug target interaction is a prominent research area in the field of drug discovery. It refers to the recognition of interactions between chemical compounds and the protein targets in the human body. Wet lab experiments to identify these interactions are expensive as well as time consuming. The computational methods o...

    journal_title: Journal of biomedical informatics

    pub_type: journal article, review

    authors: Sachdev K, Gupta MK

    updated: 2019-05-01 00:00:00

  • A Health Surveillance Software Framework to deliver information on preventive healthcare strategies.

    abstract: A software framework can reduce costs related to the development of an application because it allows developers to reuse both design and code. Recently, companies and research groups have announced that they have been employing health software frameworks. This paper presents the design, proof-of-concept implementation...

    journal_title: Journal of biomedical informatics

    pub_type: journal article

    authors: Macedo AA, Pollettini JT, Baranauskas JA, Chaves JC

    updated: 2016-08-01 00:00:00

  • Word sense disambiguation across two domains: biomedical literature and clinical notes.

    abstract: The aim of this study is to explore the word sense disambiguation (WSD) problem across two biomedical domains: biomedical literature and clinical notes. A supervised machine learning technique was used for the WSD task. One of the challenges addressed is the creation of a suitable clinical corpus with manual sense anno...

    journal_title: Journal of biomedical informatics

    pub_type: journal article

    authors: Savova GK, Coden AR, Sominsky IL, Johnson R, Ogren PV, de Groen PC, Chute CG

    updated: 2008-12-01 00:00:00

  • Benchmarking relief-based feature selection methods for bioinformatics data mining.

    abstract: Modern biomedical data mining requires feature selection methods that can (1) be applied to large scale feature spaces (e.g. 'omics' data), (2) function in noisy problems, (3) detect complex patterns of association (e.g. gene-gene interactions), (4) be flexibly adapted to various problem domains and data types (e.g. g...

    journal_title: Journal of biomedical informatics

    pub_type: journal article

    authors: Urbanowicz RJ, Olson RS, Schmitt P, Meeker M, Moore JH

    updated: 2018-09-01 00:00:00

  • Clinical decision support models and frameworks: Seeking to address research issues underlying implementation successes and failures.

    abstract: Computer-based clinical decision support (CDS) has been pursued for more than five decades. Despite notable accomplishments and successes, wide adoption and broad use of CDS in clinical practice has not been achieved. Many issues have been identified as being partially responsible for the relatively slow adoption and ...

    journal_title: Journal of biomedical informatics

    pub_type: journal article, review

    authors: Greenes RA, Bates DW, Kawamoto K, Middleton B, Osheroff J, Shahar Y

    updated: 2018-02-01 00:00:00

  • A knowledge-based system to find over-the-counter medicines for self-medication.

    abstract: This study developed a medicine query system based on Semantic Web and open data especially for self-medication users to search over-the-counter (OTC) medicines. Most existing medicine query systems are based on keyword searches. If users are uncertain about the exact search words, these query systems do not offer eff...

    journal_title: Journal of biomedical informatics

    pub_type: journal article

    authors: Sung HY, Chi YL

    updated: 2020-08-01 00:00:00

  • Emerging medical informatics with case-based reasoning for aiding clinical decision in multi-agent system.

    abstract: This research aims to depict the methodological steps and tools about the combined operation of case-based reasoning (CBR) and multi-agent system (MAS) to expose the ontological application in the field of clinical decision support. The multi-agent architecture works for the consideration of the whole cycle of clinica...

    journal_title: Journal of biomedical informatics

    pub_type: journal article

    authors: Shen Y, Colloc J, Jacquet-Andrieu A, Lei K

    updated: 2015-08-01 00:00:00

  • Comparison of reversible-jump Markov-chain-Monte-Carlo learning approach with other methods for missing enzyme identification.

    abstract: Computational identification of missing enzymes plays a significant role in accurate and complete reconstruction of metabolic network for both newly sequenced and well-studied organisms. For a metabolic reaction, given a set of candidate enzymes identified according to certain biological evidences, a powerful mathemat...

    journal_title: Journal of biomedical informatics

    pub_type: journal article

    authors: Geng B, Zhou X, Zhu J, Hung YS, Wong ST

    updated: 2008-04-01 00:00:00

  • Selecting significant genes by randomization test for cancer classification using gene expression data.

    abstract: Gene selection is an important task in bioinformatics studies, because the accuracy of cancer classification generally depends upon the genes that have biological relevance to the classifying problems. In this work, randomization test (RT) is used as a gene selection method for dealing with gene expression data. In th...

    journal_title: Journal of biomedical informatics

    pub_type: journal article

    authors: Mao Z, Cai W, Shao X

    updated: 2013-08-01 00:00:00

  • Computing with evidence Part II: An evidential approach to predicting metabolic drug-drug interactions.

    abstract: We describe a novel experiment that we conducted with the Drug Interaction Knowledge-base (DIKB) to determine which combinations of evidence enable a rule-based theory of metabolic drug-drug interactions to make the most optimal set of predictions. The focus of the experiment was a group of 16 drugs including six memb...

    journal_title: Journal of biomedical informatics

    pub_type: journal article

    authors: Boyce R, Collins C, Horn J, Kalet I

    updated: 2009-12-01 00:00:00

  • Unsupervised ensemble ranking of terms in electronic health record notes based on their importance to patients.

    abstract: BACKGROUND: Allowing patients to access their own electronic health record (EHR) notes through online patient portals has the potential to improve patient-centered care. However, EHR notes contain abundant medical jargon that can be difficult for patients to comprehend. One way to help patients is to reduce information ...

    journal_title: Journal of biomedical informatics

    pub_type: journal article

    authors: Chen J, Yu H

    updated: 2017-04-01 00:00:00

  • An ontology-based measure to compute semantic similarity in biomedicine.

    abstract: Proper understanding of textual data requires the exploitation and integration of unstructured and heterogeneous clinical sources, healthcare records or scientific literature, which are fundamental aspects in clinical and translational research. The determination of semantic similarity between word pairs is an importa...

    journal_title: Journal of biomedical informatics

    pub_type: journal article

    authors: Batet M, Sánchez D, Valls A

    updated: 2011-02-01 00:00:00

  • Patient similarity for precision medicine: A systematic review.

    abstract: Evidence-based medicine is the most prevalent paradigm adopted by physicians. Clinical practice guidelines typically define a set of recommendations together with eligibility criteria that restrict their applicability to a specific group of patients. The ever-growing size and availability of health-related data is cur...

    journal_title: Journal of biomedical informatics

    pub_type: journal article

    authors: Parimbelli E, Marini S, Sacchi L, Bellazzi R

    updated: 2018-07-01 00:00:00

  • Heterogeneous database integration in biomedicine.

    abstract: The rapid expansion of biomedical knowledge, reduction in computing costs, and spread of internet access have created an ocean of electronic data. The decentralized nature of our scientific community and healthcare system, however, has resulted in a patchwork of diverse, or heterogeneous, database implementations, mak...

    journal_title: Journal of biomedical informatics

    pub_type: journal article

    authors: Sujansky W

    updated: 2001-08-01 00:00:00

  • Lexical patterns, features and knowledge resources for coreference resolution in clinical notes.

    abstract: Generation of entity coreference chains provides a means to extract linked narrative events from clinical notes, but despite being a well-researched topic in natural language processing, general-purpose coreference tools perform poorly on clinical texts. This paper presents a knowledge-centric and pattern-based approa...

    journal_title: Journal of biomedical informatics

    pub_type: journal article

    authors: Gooch P, Roudsari A

    updated: 2012-10-01 00:00:00

  • Risk factor detection for heart disease by applying text analytics in electronic medical records.

    abstract: In the United States, about 600,000 people die of heart disease every year. The annual cost of care services, medications, and lost productivity reportedly exceeds 108.9 billion dollars. Effective disease risk assessment is critical to prevention, care, and treatment planning. Recent advancements in text analytics hav...

    journal_title: Journal of biomedical informatics

    pub_type: journal article

    authors: Torii M, Fan JW, Yang WL, Lee T, Wiley MT, Zisook DS, Huang Y

    updated: 2015-12-01 00:00:00

  • Mapping high-dimensional data onto a relative distance plane--an exact method for visualizing and characterizing high-dimensional patterns.

    abstract: We introduce a distance (similarity)-based mapping for the visualization of high-dimensional patterns and their relative relationships. The mapping preserves exactly the original distances between points with respect to any two reference patterns in a special two-dimensional coordinate system, the relative distance pl...

    journal_title: Journal of biomedical informatics

    pub_type: journal article

    authors: Somorjai RL, Dolenko B, Demko A, Mandelzweig M, Nikulin AE, Baumgartner R, Pizzi NJ

    updated: 2004-10-01 00:00:00

  • Role of OpenEHR as an open source solution for the regional modelling of patient data in obstetrics.

    abstract: This work investigates whether openEHR with its reference model, archetypes and templates is suitable for the digital representation of demographic as well as clinical data. Moreover, it elaborates openEHR as a tool for modelling Hospital Information Systems on a regional level based on a national logical infrastruct...

    journal_title: Journal of biomedical informatics

    pub_type: journal article

    authors: Pahl C, Zare M, Nilashi M, de Faria Borges MA, Weingaertner D, Detschew V, Supriyanto E, Ibrahim O

    updated: 2015-06-01 00:00:00

  • 3D interactive surgical visualization system using mobile spatial information acquisition and autostereoscopic display.

    abstract: Three-dimensional (3D) visualization of preoperative and intraoperative medical information becomes more and more important in minimally invasive surgery. We develop a 3D interactive surgical visualization system using mobile spatial information acquisition and autostereoscopic display for surgeons to observe surgical...

    journal_title: Journal of biomedical informatics

    pub_type: journal article

    authors: Fan Z, Weng Y, Chen G, Liao H

    updated: 2017-07-01 00:00:00

  • A machine-learned knowledge discovery method for associating complex phenotypes with complex genotypes. Application to pain.

    abstract: BACKGROUND: The association of genotyping information with common traits is not satisfactorily solved. One of the most complex traits is pain and association studies have failed so far to provide reproducible predictions of pain phenotypes from genotypes in the general population despite a well-established genetic basis...

    journal_title: Journal of biomedical informatics

    pub_type: journal article

    authors: Lötsch J, Ultsch A

    updated: 2013-10-01 00:00:00

  • RedMed: Extending drug lexicons for social media applications.

    abstract: Social media has been identified as a promising potential source of information for pharmacovigilance. The adoption of social media data has been hindered by the massive and noisy nature of the data. Initial attempts to use social media data have relied on exact text matches to drugs of interest, and therefore suffer ...

    journal_title: Journal of biomedical informatics

    pub_type: journal article

    authors: Lavertu A, Altman RB

    updated: 2019-11-01 00:00:00

  • Concept and implementation of a study dashboard module for a continuous monitoring of trial recruitment and documentation.

    abstract: BACKGROUND: The difficulty of managing patient recruitment and documentation for clinical trials prompts a demand for instruments for closely monitoring these critical but unpredictable processes. Increasingly adopted Electronic Data Capture (EDC) applications provide novel opportunities to reutilize stored information ...

    journal_title: Journal of biomedical informatics

    pub_type: journal article

    authors: Toddenroth D, Sivagnanasundaram J, Prokosch HU, Ganslandt T

    updated: 2016-12-01 00:00:00