Abstract:
Identifying co-referent entity mentions in text is important for other natural language processing (NLP) tasks (e.g., event linking). However, this task, known as co-reference resolution, remains a complex problem, partly because of confusion over the different evaluation metrics and partly because well-researched existing methodologies do not perform well on new domains such as clinical records. This paper presents a variant of the influential mention-pair model for co-reference resolution. Using a series of linguistically and semantically motivated constraints, the proposed approach controls the generation of less-informative or sub-optimal training and test instances. The approach also introduces some aggressive greedy strategies for chain clustering. It was tested on the official test corpus of the i2b2/VA 2011 challenge, where it achieves an unweighted average F1 score of 0.895, calculated from multiple evaluation metrics (MUC, B³ and CEAF scores). These results are comparable to those of the best systems in the challenge. What distinguishes the proposed system is that it also achieves high average F1 scores for each individual chain type (Test: 0.897, Person: 0.852, Problem: 0.855, Treatment: 0.884). Unlike other work, it obtains good scores on each individual metric rather than being biased towards a particular one.
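The unweighted average mentioned in the abstract can be illustrated with a minimal sketch. It assumes the average is the plain arithmetic mean of the per-metric F1 scores. The function name `unweighted_average_f1` and the input values are hypothetical and chosen only for illustration; they are not the per-metric scores reported in the paper.

```python
from statistics import mean


def unweighted_average_f1(f1_by_metric):
    """Return the plain (unweighted) mean of per-metric F1 scores."""
    if not f1_by_metric:
        raise ValueError("at least one metric score is required")
    return mean(f1_by_metric.values())


# Hypothetical per-metric F1 values, for illustration only;
# these are NOT the scores reported in the paper.
example_scores = {"MUC": 0.90, "B3": 0.95, "CEAF": 0.85}
average = unweighted_average_f1(example_scores)
print(f"{average:.3f}")
```

Because every metric carries the same weight, a system that scores well on only one metric cannot reach a high average, which is consistent with the abstract's point about not being biased towards a particular metric.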
journal_name: J Biomed Inform
journal_title: Journal of biomedical informatics
authors: Chowdhury MF, Zweigenbaum P
doi: 10.1016/j.jbi.2013.03.007
subject: Has Abstract
pub_date: 2013-06-01 00:00:00
pages: 506-15
issue: 3
eissn: 1532-0464
issn: 1532-0480
pii: S1532-0464(13)00041-5
journal_volume: 46
pub_type: Journal article
abstract:OBJECTIVES:Major adverse cardiac events (MACE) of acute coronary syndrome (ACS) often occur suddenly resulting in high mortality and morbidity. Recently, the rapid development of electronic medical records (EMR) provides the opportunity to utilize the potential of EMR to improve the performance of MACE prediction. In t...
journal_title:Journal of biomedical informatics
pub_type: Journal article
doi:10.1016/j.jbi.2017.01.001
update_date:2017-02-01 00:00:00
abstract::The Research Electronic Data Capture (REDCap) data management platform was developed in 2004 to address an institutional need at Vanderbilt University, then shared with a limited number of adopting sites beginning in 2006. Given bi-directional benefit in early sharing experiments, we created a broader consortium shari...
journal_title:Journal of biomedical informatics
pub_type: Journal article
doi:10.1016/j.jbi.2019.103208
update_date:2019-07-01 00:00:00
abstract::Intensive care monitoring systems are typically developed from population data, but do not take into account the variability among individual patients' characteristics. This study develops patient-specific alarm algorithms in real time. Classification tree and neural network learning were carried out in batch mode on ...
journal_title:Journal of biomedical informatics
pub_type: Journal article
doi:10.1016/j.jbi.2008.03.011
update_date:2008-06-01 00:00:00
abstract::Evaluating automated indexing applications requires comparing automatically indexed terms against manual reference standard annotations. However, there are no standard guidelines for determining which words from a textual document to include in manual annotations, and the vague task can result in substantial variation...
journal_title:Journal of biomedical informatics
pub_type: Journal article
doi:10.1016/j.jbi.2005.06.004
update_date:2006-04-01 00:00:00
abstract:OBJECTIVES:RxNorm is a standardized nomenclature for clinical drug entities developed by the National Library of Medicine. In this paper, we audit relations in RxNorm for consistency and completeness through the systematic analysis of the graph of its concepts and relationships. METHODS:The representation of multi-ing...
journal_title:Journal of biomedical informatics
pub_type: Journal article
doi:10.1016/j.jbi.2009.04.004
update_date:2009-06-01 00:00:00
abstract::End-stage renal disease (ESRD) is the final stage of chronic kidney disease (CKD) and requires hemodialysis (HD) for survival. Intradialytic blood pressure (IBP) measurements are necessary to ensure patient safety during HD treatments and have critical clinical and prognostic significance. Studies on IBP measurements,...
journal_title:Journal of biomedical informatics
pub_type: Journal article
doi:10.1016/j.jbi.2018.05.013
update_date:2018-07-01 00:00:00
abstract::The large amounts of data generated when high-throughput genotyping methods are used in large-scale epidemiological studies (>10,000 participants) present an enormous challenge to researchers in terms of structured data management. In order to face these challenges, a system has been designed and implemented where gen...
journal_title:Journal of biomedical informatics
pub_type: Journal article
doi:10.1016/j.jbi.2009.07.005
update_date:2009-12-01 00:00:00
abstract::A crucial and limiting factor in data reuse is the lack of accurate, structured, and complete descriptions of data, known as metadata. Towards improving the quantity and quality of metadata, we propose a novel metadata prediction framework to learn associations from existing metadata that can be used to predict metada...
journal_title:Journal of biomedical informatics
pub_type: Journal article
doi:10.1016/j.jbi.2017.06.017
update_date:2017-08-01 00:00:00
abstract::The potential of plant-based remedies has been documented in both traditional and contemporary biomedical literature. Such types of text sources may thus be sources from which one might identify potential plant-based therapies ("phyto-therapies"). Concept-based analytic approaches have been shown to uncover knowledge ...
journal_title:Journal of biomedical informatics
pub_type: Journal article
doi:10.1016/j.jbi.2013.04.008
update_date:2013-08-01 00:00:00
abstract::In Electronic Health Records (EHRs), much of the valuable information regarding patients' conditions is embedded in free text format. Natural language processing (NLP) techniques have been developed to extract clinical information from free text. One challenge faced in clinical NLP is that the meaning of clinical entities...
journal_title:Journal of biomedical informatics
pub_type: Journal article
doi:10.1016/j.jbi.2015.02.010
update_date:2015-04-01 00:00:00
abstract::Cancer surveillance data are collected every year in the United States via the National Program of Cancer Registries (NPCR) and the Surveillance, Epidemiology and End Results (SEER) Program of the National Cancer Institute (NCI). General trends are closely monitored to measure the nation's progress against cancer. The...
journal_title:Journal of biomedical informatics
pub_type: Journal article
doi:10.1016/j.jbi.2016.03.027
update_date:2016-06-01 00:00:00
abstract::In the United States, about 600,000 people die of heart disease every year. The annual cost of care services, medications, and lost productivity reportedly exceeds 108.9 billion dollars. Effective disease risk assessment is critical to prevention, care, and treatment planning. Recent advancements in text analytics hav...
journal_title:Journal of biomedical informatics
pub_type: Journal article
doi:10.1016/j.jbi.2015.08.011
update_date:2015-12-01 00:00:00
abstract::Computer simulations have been used to model infectious diseases to examine the outcomes of alternative strategies for managing their spread. Methicillin resistant Staphylococcus aureus (MRSA) skin and soft tissue infections have become prominent in many communities and efforts are underway to reduce the spread of thi...
journal_title:Journal of biomedical informatics
pub_type: Journal article
doi:10.1016/j.jbi.2010.03.013
update_date:2010-08-01 00:00:00
abstract::Electronic health records (EHR) are a major source of information in biomedical informatics. Yet missing values are a prominent characteristic of EHR. Prediction on datasets with missing values results in inaccurate inferences. Nearest neighbour imputation based on a lazy learning approach is a proven technique for missi...
journal_title:Journal of biomedical informatics
pub_type: Journal article
doi:10.1016/j.jbi.2019.103190
update_date:2019-06-01 00:00:00
abstract::We address the assignment of ICD-10 codes for causes of death by analyzing free-text descriptions in death certificates, together with the associated autopsy reports and clinical bulletins, from the Portuguese Ministry of Health. We leverage a deep neural network that combines word embeddings, recurrent units, and neu...
journal_title:Journal of biomedical informatics
pub_type: Journal article
doi:10.1016/j.jbi.2018.02.011
update_date:2018-04-01 00:00:00
abstract::Semantic-based sublanguage grammars have been shown to be an efficient method for medical language processing. However, given the complexity of the medical domain, parsers using such grammars inevitably encounter ambiguous sentences, which could be interpreted by different groups of production rules and consequently r...
journal_title:Journal of biomedical informatics
pub_type: Journal article
doi:10.1016/j.jbi.2011.08.009
update_date:2011-12-01 00:00:00
abstract::Since the genome project in the 1990s, a number of studies associated with genes have been conducted and researchers have confirmed that genes are involved in disease. For this reason, the identification of the relationships between diseases and genes is important in biology. We propose a method called LGscore, which iden...
journal_title:Journal of biomedical informatics
pub_type: Journal article
doi:10.1016/j.jbi.2015.01.003
update_date:2015-04-01 00:00:00
abstract::Deep learning models (aka Deep Neural Networks) have revolutionized many fields, including computer vision, natural language processing, and speech recognition, and are increasingly used in clinical healthcare applications. However, few works have benchmarked the performance of the deep learning models wit...
journal_title:Journal of biomedical informatics
pub_type: Journal article
doi:10.1016/j.jbi.2018.04.007
update_date:2018-07-01 00:00:00
abstract::Identifying the symptom clusters (two or more related symptoms) with shared underlying molecular mechanisms has been a vital analysis task to promote the symptom science and precision health. Related studies have applied the clustering algorithms (e.g. k-means, latent class model) to detect the symptom clusters mostly...
journal_title:Journal of biomedical informatics
pub_type: Journal article
doi:10.1016/j.jbi.2020.103482
update_date:2020-07-01 00:00:00
abstract::Modern biomedical research relies on the semantic integration of heterogeneous data sources to find data correlations. Researchers access multiple datasets of disparate origin, and identify elements-e.g. genes, compounds, pathways-that lead to interesting correlations. Normally, they must refer to additional public da...
journal_title:Journal of biomedical informatics
pub_type: Journal article
doi:10.1016/j.jbi.2016.02.004
update_date:2016-04-01 00:00:00
abstract::The Foundational Model of Anatomy (FMA), initially developed as an enhancement of the anatomical content of UMLS, is a domain ontology of the concepts and relationships that pertain to the structural organization of the human body. It encompasses the material objects from the molecular to the macroscopic levels that c...
journal_title:Journal of biomedical informatics
pub_type: Journal article
doi:10.1016/j.jbi.2003.11.007
update_date:2003-12-01 00:00:00
abstract:BACKGROUND:Allowing patients to access their own electronic health record (EHR) notes through online patient portals has the potential to improve patient-centered care. However, EHR notes contain abundant medical jargon that can be difficult for patients to comprehend. One way to help patients is to reduce information ...
journal_title:Journal of biomedical informatics
pub_type: Journal article
doi:10.1016/j.jbi.2017.02.016
update_date:2017-04-01 00:00:00
abstract:UNLABELLED:Responding to public health emergencies requires rapid and accurate assessment of workforce availability under adverse and changing circumstances. However, public health information systems to support resource management during both routine and emergency operations are currently lacking. We applied scenario-...
journal_title:Journal of biomedical informatics
pub_type: Journal article
doi:10.1016/j.jbi.2011.07.004
update_date:2011-12-01 00:00:00
abstract:BACKGROUND:Data collection and extraction from noisy text sources such as social media typically rely on keyword-based searching/listening. However, health-related terms are often misspelled in such noisy text sources due to their complex morphology, resulting in the exclusion of relevant data for studies. In this pape...
journal_title:Journal of biomedical informatics
pub_type: Journal article
doi:10.1016/j.jbi.2018.11.007
update_date:2018-12-01 00:00:00
abstract::In the absence of periodic systematic comparisons, biologists/bioinformaticians may be forced to make a subjective selection among the many protein-protein interaction (PPI) databases and tools. We conducted a comprehensive compilation and comparison of such resources. We compiled 375 PPI resources, short-listed 125 impor...
journal_title:Journal of biomedical informatics
pub_type: Journal article
doi:10.1016/j.jbi.2020.103380
update_date:2020-03-01 00:00:00
abstract:MOTIVATION:PubMed is the most widely used database of biomedical literature. To the detriment of the user though, the ranking of the documents retrieved for a query is not content-based, and important semantic information in the form of assigned Medical Subject Headings (MeSH) terms is not readily presented or producti...
journal_title:Journal of biomedical informatics
pub_type: Journal article
doi:10.1016/j.jbi.2011.05.009
update_date:2011-12-01 00:00:00
abstract::Information search has changed the way we manage knowledge and the ubiquity of information access has made search a frequent activity, whether via Internet search engines or increasingly via mobile devices. Medical information search is in this respect no different and much research has been devoted to analyzing the w...
journal_title:Journal of biomedical informatics
pub_type: Journal article
doi:10.1016/j.jbi.2015.04.013
update_date:2015-08-01 00:00:00
abstract:OBJECTIVE:Clinical care guidelines recommend that newly diagnosed prostate cancer patients at high risk for metastatic spread receive a bone scan prior to treatment and that low risk patients not receive it. The objective was to develop an automated pipeline to interrogate heterogeneous data to evaluate the use of bone...
journal_title:Journal of biomedical informatics
pub_type: Journal article
doi:10.1016/j.jbi.2019.103184
update_date:2019-06-01 00:00:00
abstract:BACKGROUND:Word embeddings are widely used in biomedical Natural Language Processing (NLP) applications because the vector representations can capture useful semantic properties and linguistic relationships between words. Different textual resources (e.g., Wikipedia and biomedical lit...
journal_title:Journal of biomedical informatics
pub_type: Journal article
doi:10.1016/j.jbi.2018.09.008
update_date:2018-11-01 00:00:00
abstract::Recruiting eligible patients for clinical trials is crucial for reliably answering specific questions about medical interventions and evaluation. However, clinical trial recruitment is a bottleneck in clinical research and drug development. Our goal is to provide an approach towards automating this manual and time-con...
journal_title:Journal of biomedical informatics
pub_type: Journal article
doi:10.1016/j.jbi.2020.103406
update_date:2020-05-01 00:00:00