A controlled greedy supervised approach for co-reference resolution on clinical text.

Abstract:

:Identification of co-referent entity mentions inside text has significant importance for other natural language processing (NLP) tasks (e.g. event linking). However, this task, known as co-reference resolution, remains a complex problem, partly because of the confusion over different evaluation metrics and partly because the well-researched existing methodologies do not perform well on new domains such as clinical records. This paper presents a variant of the influential mention-pair model for co-reference resolution. Using a series of linguistically and semantically motivated constraints, the proposed approach controls generation of less-informative/sub-optimal training and test instances. Additionally, the approach also introduces some aggressive greedy strategies in chain clustering. The proposed approach has been tested on the official test corpus of the recently held i2b2/VA 2011 challenge. It achieves an unweighted average F1 score of 0.895, calculated from multiple evaluation metrics (MUC, B(3) and CEAF scores). These results are comparable to the best systems of the challenge. What makes our proposed system distinct is that it also achieves high average F1 scores for each individual chain type (Test: 0.897, Person: 0.852, PROBLEM: 0.855, TREATMENT: 0.884). Unlike other works, it obtains good scores for each of the individual metrics rather than being biased towards a particular metric.

journal_name

J Biomed Inform

authors

Chowdhury MF,Zweigenbaum P

doi

10.1016/j.jbi.2013.03.007

subject

Has Abstract

pub_date

2013-06-01 00:00:00

pages

506-15

issue

3

eissn

1532-0464

issn

1532-0480

pii

S1532-0464(13)00041-5

journal_volume

46

pub_type

杂志文章
  • MACE prediction of acute coronary syndrome via boosted resampling classification using electronic medical records.

    abstract:OBJECTIVES:Major adverse cardiac events (MACE) of acute coronary syndrome (ACS) often occur suddenly resulting in high mortality and morbidity. Recently, the rapid development of electronic medical records (EMR) provides the opportunity to utilize the potential of EMR to improve the performance of MACE prediction. In t...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2017.01.001

    authors: Huang Z,Chan TM,Dong W

    更新日期:2017-02-01 00:00:00

  • The REDCap consortium: Building an international community of software platform partners.

    abstract::The Research Electronic Data Capture (REDCap) data management platform was developed in 2004 to address an institutional need at Vanderbilt University, then shared with a limited number of adopting sites beginning in 2006. Given bi-directional benefit in early sharing experiments, we created a broader consortium shari...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2019.103208

    authors: Harris PA,Taylor R,Minor BL,Elliott V,Fernandez M,O'Neal L,McLeod L,Delacqua G,Delacqua F,Kirby J,Duda SN,REDCap Consortium.

    更新日期:2019-07-01 00:00:00

  • Patient-specific learning in real time for adaptive monitoring in critical care.

    abstract::Intensive care monitoring systems are typically developed from population data, but do not take into account the variability among individual patients' characteristics. This study develops patient-specific alarm algorithms in real time. Classification tree and neural network learning were carried out in batch mode on ...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2008.03.011

    authors: Zhang Y,Szolovits P

    更新日期:2008-06-01 00:00:00

  • Inductive creation of an annotation schema for manually indexing clinical conditions from emergency department reports.

    abstract::Evaluating automated indexing applications requires comparing automatically indexed terms against manual reference standard annotations. However, there are no standard guidelines for determining which words from a textual document to include in manual annotations, and the vague task can result in substantial variation...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2005.06.004

    authors: Chapman WW,Dowling JN

    更新日期:2006-04-01 00:00:00

  • A graph-based approach to auditing RxNorm.

    abstract:OBJECTIVES:RxNorm is a standardized nomenclature for clinical drug entities developed by the National Library of Medicine. In this paper, we audit relations in RxNorm for consistency and completeness through the systematic analysis of the graph of its concepts and relationships. METHODS:The representation of multi-ing...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2009.04.004

    authors: Bodenreider O,Peters LB

    更新日期:2009-06-01 00:00:00

  • Intradialytic blood pressure pattern recognition based on density peak clustering.

    abstract::End-stage renal disease (ESRD) is the final stage of chronic kidney disease (CKD) and requires hemodialysis (HD) for survival. Intradialytic blood pressure (IBP) measurements are necessary to ensure patient safety during HD treatments and have critical clinical and prognostic significance. Studies on IBP measurements,...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2018.05.013

    authors: Wang F,Zhou JY,Tian Y,Wang Y,Zhang P,Chen JH,Li JS

    更新日期:2018-07-01 00:00:00

  • Unleashing genotypes in epidemiology - A novel method for managing high throughput information.

    abstract::The large amounts of data generated when high-throughput genotyping methods are used in large-scale epidemiological studies (>10,000 participants) present an enormous challenge to researchers in terms of structured data management. In order to face these challenges, a system has been designed and implemented where gen...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2009.07.005

    authors: Olund G,Brinne A,Lindqvist P,Litton JE

    更新日期:2009-12-01 00:00:00

  • Predicting biomedical metadata in CEDAR: A study of Gene Expression Omnibus (GEO).

    abstract::A crucial and limiting factor in data reuse is the lack of accurate, structured, and complete descriptions of data, known as metadata. Towards improving the quantity and quality of metadata, we propose a novel metadata prediction framework to learn associations from existing metadata that can be used to predict metada...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2017.06.017

    authors: Panahiazar M,Dumontier M,Gevaert O

    更新日期:2017-08-01 00:00:00

  • Leveraging concept-based approaches to identify potential phyto-therapies.

    abstract::The potential of plant-based remedies has been documented in both traditional and contemporary biomedical literature. Such types of text sources may thus be sources from which one might identify potential plant-based therapies ("phyto-therapies"). Concept-based analytic approaches have been shown to uncover knowledge ...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2013.04.008

    authors: Sharma V,Sarkar IN

    更新日期:2013-08-01 00:00:00

  • DEEPEN: A negation detection system for clinical text incorporating dependency relation into NegEx.

    abstract::In Electronic Health Records (EHRs), much of valuable information regarding patients' conditions is embedded in free text format. Natural language processing (NLP) techniques have been developed to extract clinical information from free text. One challenge faced in clinical NLP is that the meaning of clinical entities...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2015.02.010

    authors: Mehrabi S,Krishnan A,Sohn S,Roch AM,Schmidt H,Kesterson J,Beesley C,Dexter P,Max Schmidt C,Liu H,Palakal M

    更新日期:2015-04-01 00:00:00

  • A novel web informatics approach for automated surveillance of cancer mortality trends.

    abstract::Cancer surveillance data are collected every year in the United States via the National Program of Cancer Registries (NPCR) and the Surveillance, Epidemiology and End Results (SEER) Program of the National Cancer Institute (NCI). General trends are closely monitored to measure the nation's progress against cancer. The...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2016.03.027

    authors: Tourassi G,Yoon HJ,Xu S

    更新日期:2016-06-01 00:00:00

  • Risk factor detection for heart disease by applying text analytics in electronic medical records.

    abstract::In the United States, about 600,000 people die of heart disease every year. The annual cost of care services, medications, and lost productivity reportedly exceeds 108.9 billion dollars. Effective disease risk assessment is critical to prevention, care, and treatment planning. Recent advancements in text analytics hav...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2015.08.011

    authors: Torii M,Fan JW,Yang WL,Lee T,Wiley MT,Zisook DS,Huang Y

    更新日期:2015-12-01 00:00:00

  • The development and validation of a simulation tool for health policy decision making.

    abstract::Computer simulations have been used to model infectious diseases to examine the outcomes of alternative strategies for managing their spread. Methicillin resistant Staphylococcus aureus (MRSA) skin and soft tissue infections have become prominent in many communities and efforts are underway to reduce the spread of thi...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2010.03.013

    authors: Panchanathan SS,Petitti DB,Fridsma DB

    更新日期:2010-08-01 00:00:00

  • A hybrid of whale optimization and late acceptance hill climbing based imputation to enhance classification performance in electronic health records.

    abstract::Electronic health records (EHR) are a major source of information in biomedical informatics. Yet, missing values are prominent characteristics of EHR. Prediction on dataset with missing values results in inaccurate inferences. Nearest neighbour imputation based on lazy learning approach is a proven technique for missi...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2019.103190

    authors: Nagarajan G,Dhinesh Babu LD

    更新日期:2019-06-01 00:00:00

  • Deep neural models for ICD-10 coding of death certificates and autopsy reports in free-text.

    abstract::We address the assignment of ICD-10 codes for causes of death by analyzing free-text descriptions in death certificates, together with the associated autopsy reports and clinical bulletins, from the Portuguese Ministry of Health. We leverage a deep neural network that combines word embeddings, recurrent units, and neu...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2018.02.011

    authors: Duarte F,Martins B,Pinto CS,Silva MJ

    更新日期:2018-04-01 00:00:00

  • Applying semantic-based probabilistic context-free grammar to medical language processing--a preliminary study on parsing medication sentences.

    abstract::Semantic-based sublanguage grammars have been shown to be an efficient method for medical language processing. However, given the complexity of the medical domain, parsers using such grammars inevitably encounter ambiguous sentences, which could be interpreted by different groups of production rules and consequently r...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2011.08.009

    authors: Xu H,AbdelRahman S,Lu Y,Denny JC,Doan S

    更新日期:2011-12-01 00:00:00

  • LGscore: A method to identify disease-related genes using biological literature and Google data.

    abstract::Since the genome project in 1990s, a number of studies associated with genes have been conducted and researchers have confirmed that genes are involved in disease. For this reason, the identification of the relationships between diseases and genes is important in biology. We propose a method called LGscore, which iden...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2015.01.003

    authors: Kim J,Kim H,Yoon Y,Park S

    更新日期:2015-04-01 00:00:00

  • Benchmarking deep learning models on large healthcare datasets.

    abstract::Deep learning models (aka Deep Neural Networks) have revolutionized many fields including computer vision, natural language processing, speech recognition, and is being increasingly used in clinical healthcare applications. However, few works exist which have benchmarked the performance of the deep learning models wit...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2018.04.007

    authors: Purushotham S,Meng C,Che Z,Liu Y

    更新日期:2018-07-01 00:00:00

  • Integrated network analysis of symptom clusters across disease conditions.

    abstract::Identifying the symptom clusters (two or more related symptoms) with shared underlying molecular mechanisms has been a vital analysis task to promote the symptom science and precision health. Related studies have applied the clustering algorithms (e.g. k-means, latent class model) to detect the symptom clusters mostly...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2020.103482

    authors: Lu K,Yang K,Niyongabo E,Shu Z,Wang J,Chang K,Zou Q,Jiang J,Jia C,Liu B,Zhou X

    更新日期:2020-07-01 00:00:00

  • A method and software framework for enriching private biomedical sources with data from public online repositories.

    abstract::Modern biomedical research relies on the semantic integration of heterogeneous data sources to find data correlations. Researchers access multiple datasets of disparate origin, and identify elements-e.g. genes, compounds, pathways-that lead to interesting correlations. Normally, they must refer to additional public da...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2016.02.004

    authors: Anguita A,García-Remesal M,Graf N,Maojo V

    更新日期:2016-04-01 00:00:00

  • A reference ontology for biomedical informatics: the Foundational Model of Anatomy.

    abstract::The Foundational Model of Anatomy (FMA), initially developed as an enhancement of the anatomical content of UMLS, is a domain ontology of the concepts and relationships that pertain to the structural organization of the human body. It encompasses the material objects from the molecular to the macroscopic levels that c...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2003.11.007

    authors: Rosse C,Mejino JL Jr

    更新日期:2003-12-01 00:00:00

  • Unsupervised ensemble ranking of terms in electronic health record notes based on their importance to patients.

    abstract:BACKGROUND:Allowing patients to access their own electronic health record (EHR) notes through online patient portals has the potential to improve patient-centered care. However, EHR notes contain abundant medical jargon that can be difficult for patients to comprehend. One way to help patients is to reduce information ...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2017.02.016

    authors: Chen J,Yu H

    更新日期:2017-04-01 00:00:00

  • Scenario-based design: a method for connecting information system design with public health operations and emergency management.

    abstract:UNLABELLED:Responding to public health emergencies requires rapid and accurate assessment of workforce availability under adverse and changing circumstances. However, public health information systems to support resource management during both routine and emergency operations are currently lacking. We applied scenario-...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2011.07.004

    authors: Reeder B,Turner AM

    更新日期:2011-12-01 00:00:00

  • An unsupervised and customizable misspelling generator for mining noisy health-related text sources.

    abstract:BACKGROUND:Data collection and extraction from noisy text sources such as social media typically rely on keyword-based searching/listening. However, health-related terms are often misspelled in such noisy text sources due to their complex morphology, resulting in the exclusion of relevant data for studies. In this pape...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2018.11.007

    authors: Sarker A,Gonzalez-Hernandez G

    更新日期:2018-12-01 00:00:00

  • Systematic comparison of the protein-protein interaction databases from a user's perspective.

    abstract::In absence of periodic systematic comparisons, biologists/bioinformaticians may be forced to make a subjective selection among the many protein-protein interaction (PPI) databases and tools. We conducted a comprehensive compilation and comparison of such resources. We compiled 375 PPI resources, short-listed 125 impor...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2020.103380

    authors: Bajpai AK,Davuluri S,Tiwary K,Narayanan S,Oguru S,Basavaraju K,Dayalan D,Thirumurugan K,Acharya KK

    更新日期:2020-03-01 00:00:00

  • MeSHy: Mining unanticipated PubMed information using frequencies of occurrences and concurrences of MeSH terms.

    abstract:MOTIVATION:PubMed is the most widely used database of biomedical literature. To the detriment of the user though, the ranking of the documents retrieved for a query is not content-based, and important semantic information in the form of assigned Medical Subject Headings (MeSH) terms is not readily presented or producti...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2011.05.009

    authors: Theodosiou T,Vizirianakis IS,Angelis L,Tsaftaris A,Darzentas N

    更新日期:2011-12-01 00:00:00

  • Comparing image search behaviour in the ARRS GoldMiner search engine and a clinical PACS/RIS.

    abstract::Information search has changed the way we manage knowledge and the ubiquity of information access has made search a frequent activity, whether via Internet search engines or increasingly via mobile devices. Medical information search is in this respect no different and much research has been devoted to analyzing the w...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2015.04.013

    authors: De-Arteaga M,Eggel I,Do B,Rubin D,Kahn CE Jr,Müller H

    更新日期:2015-08-01 00:00:00

  • Comparison of orthogonal NLP methods for clinical phenotyping and assessment of bone scan utilization among prostate cancer patients.

    abstract:OBJECTIVE:Clinical care guidelines recommend that newly diagnosed prostate cancer patients at high risk for metastatic spread receive a bone scan prior to treatment and that low risk patients not receive it. The objective was to develop an automated pipeline to interrogate heterogeneous data to evaluate the use of bone...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2019.103184

    authors: Coquet J,Bozkurt S,Kan KM,Ferrari MK,Blayney DW,Brooks JD,Hernandez-Boussard T

    更新日期:2019-06-01 00:00:00

  • A comparison of word embeddings for the biomedical natural language processing.

    abstract:BACKGROUND:Word embeddings have been prevalently used in biomedical Natural Language Processing (NLP) applications due to the ability of the vector representations being able to capture useful semantic properties and linguistic relationships between words. Different textual resources (e.g., Wikipedia and biomedical lit...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2018.09.008

    authors: Wang Y,Liu S,Afzal N,Rastegar-Mojarad M,Wang L,Shen F,Kingsbury P,Liu H

    更新日期:2018-11-01 00:00:00

  • Matching patients to clinical trials using semantically enriched document representation.

    abstract::Recruiting eligible patients for clinical trials is crucial for reliably answering specific questions about medical interventions and evaluation. However, clinical trial recruitment is a bottleneck in clinical research and drug development. Our goal is to provide an approach towards automating this manual and time-con...

    journal_title:Journal of biomedical informatics

    pub_type: 杂志文章

    doi:10.1016/j.jbi.2020.103406

    authors: Hassanzadeh H,Karimi S,Nguyen A

    更新日期:2020-05-01 00:00:00