Extending the Fellegi-Sunter probabilistic record linkage method for approximate field comparators.


Probabilistic record linkage is a method commonly used to determine whether demographic records refer to the same person. The Fellegi-Sunter method is a probabilistic approach that uses field weights based on log likelihood ratios to determine record similarity. This paper introduces an extension of the Fellegi-Sunter method that incorporates approximate field comparators in the calculation of field weights. The data warehouse of a large academic medical center was used as a case study. The approximate comparator extension was compared with the Fellegi-Sunter method in its ability to find duplicate records previously identified in the data warehouse using different demographic fields and matching cutoffs. The approximate comparator extension misclassified 25% fewer pairs and had a larger Welch's T statistic than the Fellegi-Sunter method for all field sets and matching cutoffs. The accuracy gain provided by the approximate comparator extension grew as less information was provided and as the matching cutoff increased. Given the ubiquity of linkage in both clinical and research settings, the incremental improvement of the extension has the potential to make a considerable impact.
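As a rough illustration of the idea in the abstract, the sketch below computes standard Fellegi-Sunter field weights as log likelihood ratios and contrasts them with a weight that uses an approximate comparator. The abstract does not give the extension's formula, so the linear interpolation between the disagreement and agreement weights is an assumption made for illustration only, as are the m and u probabilities, the function names, and the use of `difflib.SequenceMatcher` as the string comparator.

```python
import difflib
import math


def agreement_weight(m: float, u: float) -> float:
    """Fellegi-Sunter weight when a field agrees: log2(m / u).

    m = P(field agrees | records match), u = P(field agrees | records do not match).
    """
    return math.log2(m / u)


def disagreement_weight(m: float, u: float) -> float:
    """Fellegi-Sunter weight when a field disagrees: log2((1 - m) / (1 - u))."""
    return math.log2((1 - m) / (1 - u))


def exact_field_weight(a: str, b: str, m: float, u: float) -> float:
    """Binary comparison: full agreement weight or full disagreement weight."""
    return agreement_weight(m, u) if a == b else disagreement_weight(m, u)


def similarity(a: str, b: str) -> float:
    """Stand-in approximate comparator in [0, 1] (an assumption, not the paper's choice)."""
    return difflib.SequenceMatcher(None, a, b).ratio()


def approximate_field_weight(a: str, b: str, m: float, u: float) -> float:
    """Hypothetical extension: interpolate between the two weights by similarity.

    This is an illustrative assumption, not the formula from the paper.
    """
    w_agree = agreement_weight(m, u)
    w_disagree = disagreement_weight(m, u)
    return w_disagree + similarity(a, b) * (w_agree - w_disagree)


def record_score(rec_a: dict, rec_b: dict, params: dict, weight_fn) -> float:
    """Sum field weights over all fields; params maps field -> (m, u)."""
    return sum(
        weight_fn(rec_a[field], rec_b[field], m, u)
        for field, (m, u) in params.items()
    )


# Illustrative m/u values only; real values are estimated from data.
params = {"last_name": (0.9, 0.1), "first_name": (0.9, 0.1)}
rec_a = {"last_name": "smith", "first_name": "john"}
rec_b = {"last_name": "smyth", "first_name": "john"}

exact_score = record_score(rec_a, rec_b, params, exact_field_weight)
approx_score = record_score(rec_a, rec_b, params, approximate_field_weight)
```

With binary comparison, the near-miss "smith" / "smyth" is penalized as a full disagreement; with an approximate comparator it earns partial credit, so the pair's total score moves closer to a typical matching cutoff.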


J Biomed Inform


DuVall SL,Kerber RA,Thomas A






2010-02-01 00:00:00














  • Challenges in clinical natural language processing for automated disorder normalization.

    abstract:BACKGROUND:Identifying key variables such as disorders within the clinical narratives in electronic health records has wide-ranging applications within clinical practice and biomedical research. Previous research has demonstrated reduced performance of disorder named entity recognition (NER) and normalization (or groun...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Leaman R,Khare R,Lu Z

    updated: 2015-10-01 00:00:00

  • An automated reasoning framework for translational research.

    abstract::In this paper we propose a novel approach to the design and implementation of knowledge-based decision support systems for translational research, specifically tailored to the analysis and interpretation of data from high-throughput experiments. Our approach is based on a general epistemological model of the scientifi...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Riva A,Nuzzo A,Stefanelli M,Bellazzi R

    updated: 2010-06-01 00:00:00

  • Medical diagnosis of atherosclerosis from Carotid Artery Doppler Signals using principal component analysis (PCA), k-NN based weighting pre-processing and Artificial Immune Recognition System (AIRS).

    abstract::In this study, we proposed a new medical diagnosis system based on principal component analysis (PCA), k-NN based weighting pre-processing, and Artificial Immune Recognition System (AIRS) for diagnosis of atherosclerosis from Carotid Artery Doppler Signals. The suggested system consists of four stages. First, in the f...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Latifoğlu F,Polat K,Kara S,Güneş S

    updated: 2008-02-01 00:00:00

  • Unstructured medical image query using big data - An epilepsy case study.

    abstract::Big data technologies are critical to the medical field which requires new frameworks to leverage them. Such frameworks would benefit medical experts to test hypotheses by querying huge volumes of unstructured medical data to provide better patient care. The objective of this work is to implement and examine the feasi...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Istephan S,Siadat MR

    updated: 2016-02-01 00:00:00

  • NCI Thesaurus: a semantic model integrating cancer-related clinical and molecular information.

    abstract::Over the last 8 years, the National Cancer Institute (NCI) has launched a major effort to integrate molecular and clinical cancer-related information within a unified biomedical informatics framework, with controlled terminology as its foundational layer. The NCI Thesaurus is the reference terminology underpinning the...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Sioutos N,de Coronado S,Haber MW,Hartel FW,Shaiu WL,Wright LW

    updated: 2007-02-01 00:00:00

  • A comprehensive review of feature based methods for drug target interaction prediction.

    abstract::Drug target interaction is a prominent research area in the field of drug discovery. It refers to the recognition of interactions between chemical compounds and the protein targets in the human body. Wet lab experiments to identify these interactions are expensive as well as time consuming. The computational methods o...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article, Review


    authors: Sachdev K,Gupta MK

    updated: 2019-05-01 00:00:00

  • Biomedical ontologies: what part-of is and isn't.

    abstract::Mereological relations such as part-of and its inverse has-part are fundamental to the description of the structure of living organisms. Whereas classical mereology focuses on individual entities, mereological relations in biomedical ontologies are generally asserted between classes of individuals. In general, this pr...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Schulz S,Kumar A,Bittner T

    updated: 2006-06-01 00:00:00

  • HBLAST: Parallelised sequence similarity--A Hadoop MapReducable basic local alignment search tool.

    abstract::The recent exponential growth of genomic databases has resulted in the common task of sequence alignment becoming one of the major bottlenecks in the field of computational biology. It is typical for these large datasets and complex computations to require cost prohibitive High Performance Computing (HPC) to function....

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: O'Driscoll A,Belogrudov V,Carroll J,Kropp K,Walsh P,Ghazal P,Sleator RD

    updated: 2015-04-01 00:00:00

  • Temporal phenotyping of medically complex children via PARAFAC2 tensor factorization.

    abstract:OBJECTIVE:Our aim is to extract clinically-meaningful phenotypes from longitudinal electronic health records (EHRs) of medically-complex children. This is a fragile set of patients consuming a disproportionate amount of pediatric care resources but who often end up with sub-optimal clinical outcome. The rise in availab...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Perros I,Papalexakis EE,Vuduc R,Searles E,Sun J

    updated: 2019-05-01 00:00:00

  • Enhancing phylogeography by improving geographical information from GenBank.

    abstract::Phylogeography is a field that focuses on the geographical lineages of species such as vertebrates or viruses. Here, geographical data, such as location of a species or viral host is as important as the sequence information extracted from the species. Together, this information can help illustrate the migration of the...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Scotch M,Sarkar IN,Mei C,Leaman R,Cheung KH,Ortiz P,Singraur A,Gonzalez G

    updated: 2011-12-01 00:00:00

  • Algorithms for rapid outbreak detection: a research synthesis.

    abstract::The threat of bioterrorism has stimulated interest in enhancing public health surveillance to detect disease outbreaks more rapidly than is currently possible. To advance research on improving the timeliness of outbreak detection, the Defense Advanced Research Project Agency sponsored the Bio-event Advanced Leading In...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Buckeridge DL,Burkom H,Campbell M,Hogan WR,Moore AW

    updated: 2005-04-01 00:00:00

  • A Hidden Semi-Markov Model based approach for rehabilitation exercise assessment.

    abstract::In this paper, a Hidden Semi-Markov Model (HSMM) based approach is proposed to evaluate and monitor body motion during a rehabilitation training program. The approach extracts clinically relevant motion features from skeleton joint trajectories, acquired by the RGB-D camera, and provides a score for the subject's perf...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Capecci M,Ceravolo MG,Ferracuti F,Iarlori S,Kyrki V,Monteriù A,Romeo L,Verdini F

    updated: 2018-02-01 00:00:00

  • Modified Needleman-Wunsch algorithm for clinical pathway clustering.

    abstract::Clinical pathways are used to guide clinicians to provide a standardised delivery of care. Because of their standardisation, the aim of clinical pathways is to reduce variation in both care process and patient outcomes. When learning clinical pathways from data through data mining, it is common practice to represent e...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Aspland E,Harper PR,Gartner D,Webb P,Barrett-Lee P

    updated: 2021-01-27 00:00:00

  • Personal health information in research: Perceived risk, trustworthiness and opinions from patients attending a tertiary healthcare facility.

    abstract:BACKGROUND:Personal health information is a valuable resource to the advancement of research. In order to achieve a comprehensive reform of data infrastructure in Australia, both public engagement and building social trust is vital. In light of this, we conducted a study to explore the opinions, perceived risks and tru...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Krahe M,Milligan E,Reilly S

    updated: 2019-07-01 00:00:00

  • Utilizing online stochastic optimization on scheduling of intensity-modulate radiotherapy therapy (IMRT).

    abstract::According to Ministry of Health and Welfare of Taiwan, cancer has been one of the major causes of death in Taiwan since 1982. The Intensive-Modulated Radiation Therapy (IMRT) is one of the most important radiotherapies of cancers, especially for Nasopharyngeal cancers, Digestive system cancers and Cervical cancers. Fo...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Chang WH,Lo SM,Chen TL,Chen JC,Wu HN

    updated: 2020-08-01 00:00:00

  • Desiderata for domain reference ontologies in biomedicine.

    abstract::Domain reference ontologies represent knowledge about a particular part of the world in a way that is independent from specific objectives, through a theory of the domain. An example of reference ontology in biomedical informatics is the Foundational Model of Anatomy (FMA), an ontology of anatomy that covers the entir...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Burgun A

    updated: 2006-06-01 00:00:00

  • MorphoCol: An ontology-based knowledgebase for the characterisation of clinically significant bacterial colony morphologies.

    abstract:BACKGROUND:One of the major concerns of the biomedical community is the increasing prevalence of antimicrobial resistant microorganisms. Recent findings show that the diversification of colony morphology may be indicative of the expression of virulence factors and increased resistance to antibiotic therapeutics. To tra...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Sousa AM,Pereira MO,Lourenço A

    updated: 2015-06-01 00:00:00

  • Making sense: sensor-based investigation of clinician activities in complex critical care environments.

    abstract::In many respects, the critical care workplace resembles a paradigmatic complex system: on account of the dynamic and interactive nature of collaborative clinical work, these settings are characterized by non-linear, inter-dependent and emergent activities. Developing a comprehensive understanding of the work activitie...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Kannampallil T,Li Z,Zhang M,Cohen T,Robinson DJ,Franklin A,Zhang J,Patel VL

    updated: 2011-06-01 00:00:00

  • Feature selection techniques for maximum entropy based biomedical named entity recognition.

    abstract::Named entity recognition is an extremely important and fundamental task of biomedical text mining. Biomedical named entities include mentions of proteins, genes, DNA, RNA, etc which often have complex structures, but it is challenging to identify and classify such entities. Machine learning methods like CRF, MEMM and ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Saha SK,Sarkar S,Mitra P

    updated: 2009-10-01 00:00:00

  • Toward analyzing and synthesizing previous research in early prediction of cardiac arrest using machine learning based on a multi-layered integrative framework.

    abstract:BACKGROUND:One of the significant problems in the field of healthcare is the low survival rate of people who have experienced sudden cardiac arrest. Early prediction of cardiac arrest can provide the time required for intervening and preventing its onset in order to reduce mortality. Traditional statistical methods hav...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Layeghian Javan S,Sepehri MM,Aghajani H

    updated: 2018-12-01 00:00:00

  • Markov blanket-based approach for learning multi-dimensional Bayesian network classifiers: an application to predict the European Quality of Life-5 Dimensions (EQ-5D) from the 39-item Parkinson's Disease Questionnaire (PDQ-39).

    abstract::Multi-dimensional Bayesian network classifiers (MBCs) are probabilistic graphical models recently proposed to deal with multi-dimensional classification problems, where each instance in the data set has to be assigned to more than one class variable. In this paper, we propose a Markov blanket-based approach for learni...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Borchani H,Bielza C,Martínez-Martín P,Larrañaga P

    updated: 2012-12-01 00:00:00

  • The Analytic Information Warehouse (AIW): a platform for analytics using electronic health record data.

    abstract:OBJECTIVE:To create an analytics platform for specifying and detecting clinical phenotypes and other derived variables in electronic health record (EHR) data for quality improvement investigations. MATERIALS AND METHODS:We have developed an architecture for an Analytic Information Warehouse (AIW). It supports transfor...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Post AR,Kurc T,Cholleti S,Gao J,Lin X,Bornstein W,Cantrell D,Levine D,Hohmann S,Saltz JH

    updated: 2013-06-01 00:00:00

  • A Health Surveillance Software Framework to deliver information on preventive healthcare strategies.

    abstract::A software framework can reduce costs related to the development of an application because it allows developers to reuse both design and code. Recently, companies and research groups have announced that they have been employing health software frameworks. This paper presents the design, proof-of-concept implementation...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Macedo AA,Pollettini JT,Baranauskas JA,Chaves JC

    updated: 2016-08-01 00:00:00

  • The use of logic relationships to model colon cancer gene expression networks with mRNA microarray data.

    abstract::The ultimate goal of genomics research is to describe the network of molecules and interactions that govern all biological functions and disease processes in cells. Nonlinear interactions among genes in terms of their logic relationships play a key role for deciphering the networks of molecules that underlie cellular ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Ruan X,Wang J,Li H,Perozzi RE,Perozzi EF

    updated: 2008-08-01 00:00:00

  • glUCModel: a monitoring and modeling system for chronic diseases applied to diabetes.

    abstract::Chronic patients must carry out a rigorous control of diverse factors in their lives. Diet, sport activity, medical analysis or blood glucose levels are some of them. This is a hard task, because some of these controls are performed very often, for instance some diabetics measure their glucose levels several times eve...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Hidalgo JI,Maqueda E,Risco-Martín JL,Cuesta-Infante A,Colmenar JM,Nobel J

    updated: 2014-04-01 00:00:00

  • DyKOSMap: A framework for mapping adaptation between biomedical knowledge organization systems.

    abstract:BACKGROUND:Knowledge Organization Systems (KOS) and their associated mappings play a central role in several decision support systems. However, by virtue of knowledge evolution, KOS entities are modified over time, impacting mappings and potentially turning them invalid. This requires semi-automatic methods to maintain...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Dos Reis JC,Pruski C,Da Silveira M,Reynaud-Delaître C

    updated: 2015-06-01 00:00:00

  • Categorizing the world of registries.

    abstract::The term registry is widely used to refer to any database storing clinical information collected as a byproduct of patient care. Despite the use of this single characterizing term (registry), these databases exist in various forms and support functions ranging from biomedical informatics and clinical research, to publ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Drolet BC,Johnson KB

    updated: 2008-12-01 00:00:00

  • Knowledge-based automated planning system for StereoElectroEncephaloGraphy: A center-based scenario.

    abstract::Surgical planning for StereoElectroEncephaloGraphy (SEEG) is a complex and patient specific task, where the experience and medical workflow of each institution may influence the final planning choices. To account for this variability, we developed a data-based Computer Assisted Planning (CAP) solution able to exploit ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Scorza D,Rizzi M,De Momi E,Cortés C,Bertelsen Á,Cardinale F

    updated: 2020-08-01 00:00:00

  • Description of a method to support public health information management: organizational network analysis.

    abstract::In this case study, we describe a method that has potential to provide systematic support for public health information management. Public health agencies depend on specialized information that travels throughout an organization via communication networks among employees. Interactions that occur within these networks ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Merrill J,Bakken S,Rockoff M,Gebbie K,Carley KM

    updated: 2007-08-01 00:00:00

  • Mapping high-dimensional data onto a relative distance plane--an exact method for visualizing and characterizing high-dimensional patterns.

    abstract::We introduce a distance (similarity)-based mapping for the visualization of high-dimensional patterns and their relative relationships. The mapping preserves exactly the original distances between points with respect to any two reference patterns in a special two-dimensional coordinate system, the relative distance pl...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article


    authors: Somorjai RL,Dolenko B,Demko A,Mandelzweig M,Nikulin AE,Baumgartner R,Pizzi NJ

    updated: 2004-10-01 00:00:00