Mapping high-dimensional data onto a relative distance plane--an exact method for visualizing and characterizing high-dimensional patterns.

Abstract:

We introduce a distance (similarity)-based mapping for the visualization of high-dimensional patterns and their relative relationships. The mapping preserves exactly the original distances between points with respect to any two reference patterns in a special two-dimensional coordinate system, the relative distance plane (RDP). As only a single calculation of a distance matrix is required, this method is computationally efficient, an essential requirement for any exploratory data analysis. The data visualization afforded by this representation permits a rapid assessment of class pattern distributions. In particular, we can determine with a simple statistical test whether both training and validation sets of a 2-class, high-dimensional dataset derive from the same class distributions. We can explore any dataset in detail by identifying the subset of reference pairs whose members belong to different classes, cycling through this subset, and for each pair, mapping the remaining patterns. These multiple viewpoints facilitate the identification and confirmation of outliers. We demonstrate the effectiveness of this method on several complex biomedical datasets. Because of its efficiency, effectiveness, and versatility, one may use the RDP representation as an initial, data mining exploration that precedes classification by some classifier. Once final enhancements to the RDP mapping software are completed, we plan to make it freely available to researchers.
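
The abstract does not spell out the RDP coordinate system, so the following is only a minimal sketch of one construction that has the stated property, assuming Euclidean distances and a simple triangulation: the first reference is placed at the origin, the second on the x-axis, and every other pattern is positioned so that its distances to both references are unchanged. The function name `rdp_coords` and the toy data are hypothetical and are not taken from the paper or its software.

```python
import numpy as np

def rdp_coords(D, i, j):
    """Place every pattern on a plane using references i and j.

    D is a precomputed (n x n) distance matrix. Reference i goes to (0, 0)
    and reference j to (D[i, j], 0). Each pattern keeps its original
    distance to both references. Assumed construction, not the paper's code.
    """
    D = np.asarray(D, dtype=float)
    d1 = D[:, i]          # distances to the first reference
    d2 = D[:, j]          # distances to the second reference
    d12 = D[i, j]         # distance between the two references
    if d12 == 0:
        raise ValueError("reference patterns must be distinct")
    # Law of cosines gives the coordinate along the reference axis.
    x = (d1**2 + d12**2 - d2**2) / (2.0 * d12)
    # Height above the axis; clip tiny negatives caused by rounding.
    y = np.sqrt(np.maximum(d1**2 - x**2, 0.0))
    return np.column_stack([x, y])

# Toy data: 6 patterns with 50 features each (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 50))

# The distance matrix is computed once and reused for any reference pair.
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)

P = rdp_coords(D, 0, 1)
```

Because the distance matrix is computed only once, cycling through many reference pairs, as the abstract describes, amounts to calling the function again with different `i` and `j`.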

journal_name

J Biomed Inform

authors

Somorjai RL,Dolenko B,Demko A,Mandelzweig M,Nikulin AE,Baumgartner R,Pizzi NJ

doi

10.1016/j.jbi.2004.07.005

pub_date

2004-10-01 00:00:00

pages

366-79

issue

5

eissn

1532-0480

issn

1532-0464

pii

S1532-0464(04)00072-3

journal_volume

37

pub_type

Journal Article
  • Benchmarking relief-based feature selection methods for bioinformatics data mining.

    abstract::Modern biomedical data mining requires feature selection methods that can (1) be applied to large scale feature spaces (e.g. 'omics' data), (2) function in noisy problems, (3) detect complex patterns of association (e.g. gene-gene interactions), (4) be flexibly adapted to various problem domains and data types (e.g. g...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2018.07.015

    authors: Urbanowicz RJ,Olson RS,Schmitt P,Meeker M,Moore JH

    update_date:2018-09-01 00:00:00

  • Algorithms for rapid outbreak detection: a research synthesis.

    abstract::The threat of bioterrorism has stimulated interest in enhancing public health surveillance to detect disease outbreaks more rapidly than is currently possible. To advance research on improving the timeliness of outbreak detection, the Defense Advanced Research Project Agency sponsored the Bio-event Advanced Leading In...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2004.11.007

    authors: Buckeridge DL,Burkom H,Campbell M,Hogan WR,Moore AW

    update_date:2005-04-01 00:00:00

  • Toward national comparable nurse practitioner data: proposed data elements, rationale, and methods.

    abstract::Federal funds have supported Nurse Practitioner (NP) education and the establishment of nurse-managed centers. Yet, important questions are raised about the quality and appropriate scope of NP care. Few NP-patient encounters are documented in the largest national surveys of ambulatory care, sponsored by the National C...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2003.09.016

    authors: Jenkins ML

    update_date:2003-08-01 00:00:00

  • Interacting agents through a web-based health serviceflow management system.

    abstract::The management of chronic and out-patients is a complex process which requires the cooperation of different agents belonging to several organizational units. Patients have to move to different locations to access the necessary services and to communicate their health status data. From their point of view there should ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2006.12.002

    authors: Leonardi G,Panzarasa S,Quaglini S,Stefanelli M,van der Aalst WM

    update_date:2007-10-01 00:00:00

  • Heterogeneous database integration in biomedicine.

    abstract::The rapid expansion of biomedical knowledge, reduction in computing costs, and spread of internet access have created an ocean of electronic data. The decentralized nature of our scientific community and healthcare system, however, has resulted in a patchwork of diverse, or heterogeneous, database implementations, mak...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1006/jbin.2001.1024

    authors: Sujansky W

    update_date:2001-08-01 00:00:00

  • A reference ontology for biomedical informatics: the Foundational Model of Anatomy.

    abstract::The Foundational Model of Anatomy (FMA), initially developed as an enhancement of the anatomical content of UMLS, is a domain ontology of the concepts and relationships that pertain to the structural organization of the human body. It encompasses the material objects from the molecular to the macroscopic levels that c...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2003.11.007

    authors: Rosse C,Mejino JL Jr

    update_date:2003-12-01 00:00:00

  • DEEPEN: A negation detection system for clinical text incorporating dependency relation into NegEx.

    abstract::In Electronic Health Records (EHRs), much of valuable information regarding patients' conditions is embedded in free text format. Natural language processing (NLP) techniques have been developed to extract clinical information from free text. One challenge faced in clinical NLP is that the meaning of clinical entities...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2015.02.010

    authors: Mehrabi S,Krishnan A,Sohn S,Roch AM,Schmidt H,Kesterson J,Beesley C,Dexter P,Max Schmidt C,Liu H,Palakal M

    update_date:2015-04-01 00:00:00

  • Quantifying semantic similarity of clinical evidence in the biomedical literature to facilitate related evidence synthesis.

    abstract:OBJECTIVE:Published clinical trials and high quality peer reviewed medical publications are considered as the main sources of evidence used for synthesizing systematic reviews or practicing Evidence Based Medicine (EBM). Finding all relevant published evidence for a particular medical case is a time and labour intensiv...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2019.103321

    authors: Hassanzadeh H,Nguyen A,Verspoor K

    update_date:2019-12-01 00:00:00

  • Predictive modeling of bacterial infections and antibiotic therapy needs in critically ill adults.

    abstract::Unnecessary antibiotic regimens in the intensive care unit (ICU) are associated with adverse patient outcomes and antimicrobial resistance. Bacterial infections (BI) are both common and deadly in ICUs, and as a result, patients with a suspected BI are routinely started on broad-spectrum antibiotics prior to having con...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2020.103540

    authors: Eickelberg G,Sanchez-Pinto LN,Luo Y

    update_date:2020-09-01 00:00:00

  • Creating hospital-specific customized clinical pathways by applying semantic reasoning to clinical data.

    abstract:OBJECTIVE:Clinical pathways (CPs) are widely studied methods to standardize clinical intervention and improve medical quality. However, standard care plans defined in current CPs are too general to execute in a practical healthcare environment. The purpose of this study was to create hospital-specific personalized CPs ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2014.07.017

    authors: Wang HQ,Zhou TS,Tian LL,Qian YM,Li JS

    update_date:2014-12-01 00:00:00

  • Transitive closure of subsumption and causal relations in a large ontology of radiological diagnosis.

    abstract::The Radiology Gamuts Ontology (RGO)-an ontology of diseases, interventions, and imaging findings-was developed to aid in decision support, education, and translational research in diagnostic radiology. The ontology defines a subsumption (is_a) relation between more general and more specific terms, and a causal relatio...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2016.03.015

    authors: Kahn CE Jr

    update_date:2016-06-01 00:00:00

  • A comprehensive review of feature based methods for drug target interaction prediction.

    abstract::Drug target interaction is a prominent research area in the field of drug discovery. It refers to the recognition of interactions between chemical compounds and the protein targets in the human body. Wet lab experiments to identify these interactions are expensive as well as time consuming. The computational methods o...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article,Review

    doi:10.1016/j.jbi.2019.103159

    authors: Sachdev K,Gupta MK

    update_date:2019-05-01 00:00:00

  • A comparison of two methods for retrieving ICD-9-CM data: the effect of using an ontology-based method for handling terminology changes.

    abstract:OBJECTIVE:Most existing controlled terminologies can be characterized as collections of terms, wherein the terms are arranged in a simple list or organized in a hierarchy. These kinds of terminologies are considered useful for standardizing terms and encoding data and are currently used in many existing information sys...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2011.01.005

    authors: Yu AC,Cimino JJ

    update_date:2011-04-01 00:00:00

  • A tiered approach is more cost effective than traditional pharmacist-based review for classifying computer-detected signals as adverse drug events.

    abstract:OBJECTIVE:To develop a cost-efficient method for identifying adverse drug events (ADEs) and medication errors (MEs) identified using outpatient electronic medical records within ambulatory settings. DESIGN:Comparison of sensitivity and cost of "traditional" pharmacist based approach to identifying ADEs and MEs during ...

    journal_title:Journal of biomedical informatics

    pub_type: Clinical Trial,Journal Article,Multicenter Study

    doi:10.1016/s1532-0464(03)00059-5

    authors: Hope C,Overhage JM,Seger A,Teal E,Mills V,Fiskio J,Gandhi TK,Bates DW,Murray MD

    update_date:2003-02-01 00:00:00

  • An evaluation of clinical order patterns machine-learned from clinician cohorts stratified by patient mortality outcomes.

    abstract:OBJECTIVE:Evaluate the quality of clinical order practice patterns machine-learned from clinician cohorts stratified by patient mortality outcomes. MATERIALS AND METHODS:Inpatient electronic health records from 2010 to 2013 were extracted from a tertiary academic hospital. Clinicians (n = 1822) were stratified into lo...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2018.09.005

    authors: Wang JK,Hom J,Balasubramanian S,Schuler A,Shah NH,Goldstein MK,Baiocchi MTM,Chen JH

    update_date:2018-10-01 00:00:00

  • Understanding infusion administration in the ICU through Distributed Cognition.

    abstract::To understand how healthcare technologies are used in practice and evaluate them, researchers have argued for adopting the theoretical framework of Distributed Cognition (DC). This paper describes the methods and results of a study in which a DC methodology, Distributed Cognition for Teamwork (DiCoT), was applied to s...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2012.02.003

    authors: Rajkomar A,Blandford A

    update_date:2012-06-01 00:00:00

  • Classification of ADHD with bi-objective optimization.

    abstract::Attention Deficit Hyperactive Disorder (ADHD) is one of the most common diseases in school aged children. In this paper, we consider using fMRI data with classification techniques to aid the diagnosis of ADHD and propose a bi-objective ADHD classification scheme based on L1-norm support vector machine (SVM). In our cl...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2018.07.011

    authors: Shao L,Xu Y,Fu D

    update_date:2018-08-01 00:00:00

  • Feature selection techniques for maximum entropy based biomedical named entity recognition.

    abstract::Named entity recognition is an extremely important and fundamental task of biomedical text mining. Biomedical named entities include mentions of proteins, genes, DNA, RNA, etc which often have complex structures, but it is challenging to identify and classify such entities. Machine learning methods like CRF, MEMM and ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2008.12.012

    authors: Saha SK,Sarkar S,Mitra P

    update_date:2009-10-01 00:00:00

  • glUCModel: a monitoring and modeling system for chronic diseases applied to diabetes.

    abstract::Chronic patients must carry out a rigorous control of diverse factors in their lives. Diet, sport activity, medical analysis or blood glucose levels are some of them. This is a hard task, because some of these controls are performed very often, for instance some diabetics measure their glucose levels several times eve...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2013.12.015

    authors: Hidalgo JI,Maqueda E,Risco-Martín JL,Cuesta-Infante A,Colmenar JM,Nobel J

    update_date:2014-04-01 00:00:00

  • MorphoCol: An ontology-based knowledgebase for the characterisation of clinically significant bacterial colony morphologies.

    abstract:BACKGROUND:One of the major concerns of the biomedical community is the increasing prevalence of antimicrobial resistant microorganisms. Recent findings show that the diversification of colony morphology may be indicative of the expression of virulence factors and increased resistance to antibiotic therapeutics. To tra...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2015.03.007

    authors: Sousa AM,Pereira MO,Lourenço A

    update_date:2015-06-01 00:00:00

  • Comparison with manual registration reveals satisfactory completeness and efficiency of a computerized cancer registration system.

    abstract::Automated software for cancer registration, called Open Registry and developed by ourselves was adopted by the Varese (population-based) Cancer Registry starting from 1997. Since the use of automated cancer registration is increasing, it is important to assess the quality and completeness of the automated data being p...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2007.03.003

    authors: Contiero P,Tittarelli A,Maghini A,Fabiano S,Frassoldi E,Costa E,Gada D,Codazzi T,Crosignani P,Tessandori R,Tagliabue G

    update_date:2008-02-01 00:00:00

  • A medical treatment based scoring model to detect abusive institutions.

    abstract::Medical abuse refers to a type of abnormal medical practice which is not in compliance with qualitative or ethical standards, such as excessive prescription or overbilling of medical services. Detection of such medical abuses is crucial, especially for the patients and insurance providers, because they become subject ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2020.103423

    authors: Lee J,Shin H,Cho S

    update_date:2020-07-01 00:00:00

  • Comparison of reversible-jump Markov-chain-Monte-Carlo learning approach with other methods for missing enzyme identification.

    abstract::Computational identification of missing enzymes plays a significant role in accurate and complete reconstruction of metabolic network for both newly sequenced and well-studied organisms. For a metabolic reaction, given a set of candidate enzymes identified according to certain biological evidences, a powerful mathemat...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2007.09.002

    authors: Geng B,Zhou X,Zhu J,Hung YS,Wong ST

    update_date:2008-04-01 00:00:00

  • Evaluating warfarin dosing models on multiple datasets with a novel software framework and evolutionary optimisation.

    abstract::Warfarin is an effective preventative treatment for arterial and venous thromboembolism, but requires individualised dosing due to its narrow therapeutic range and high individual variation. Many machine learning techniques have been demonstrated in this domain. This study evaluated the accuracy of the most promising ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2020.103634

    authors: Truda G,Marais P

    update_date:2021-01-01 00:00:00

  • Intradialytic blood pressure pattern recognition based on density peak clustering.

    abstract::End-stage renal disease (ESRD) is the final stage of chronic kidney disease (CKD) and requires hemodialysis (HD) for survival. Intradialytic blood pressure (IBP) measurements are necessary to ensure patient safety during HD treatments and have critical clinical and prognostic significance. Studies on IBP measurements,...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2018.05.013

    authors: Wang F,Zhou JY,Tian Y,Wang Y,Zhang P,Chen JH,Li JS

    update_date:2018-07-01 00:00:00

  • Serum cancer biomarker discovery through analysis of gene expression data sets across multiple tumor and normal tissues.

    abstract::The development of convenient serum bioassays for cancer screening, diagnosis, prognosis, and monitoring of treatment is one of top priorities in cancer research community. Although numerous biomarker candidates have been generated by applying high-throughput technologies such as transcriptomics, proteomics, and metab...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2011.08.010

    authors: Jin H,Lee HC,Park SS,Jeong YS,Kim SY

    update_date:2011-12-01 00:00:00

  • Patient empowerment for cancer patients through a novel ICT infrastructure.

    abstract::As a result of recent advances in cancer research and "precision medicine" approaches, i.e. the idea of treating each patient with the right drug at the right time, more and more cancer patients are being cured, or might have to cope with a life with cancer. For many people, cancer survival today means living with a c...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2019.103342

    authors: Kondylakis H,Bucur A,Crico C,Dong F,Graf N,Hoffman S,Koumakis L,Manenti A,Marias K,Mazzocco K,Pravettoni G,Renzi C,Schera F,Triberti S,Tsiknakis M,Kiefer S

    update_date:2020-01-01 00:00:00

  • Extending the Fellegi-Sunter probabilistic record linkage method for approximate field comparators.

    abstract::Probabilistic record linkage is a method commonly used to determine whether demographic records refer to the same person. The Fellegi-Sunter method is a probabilistic approach that uses field weights based on log likelihood ratios to determine record similarity. This paper introduces an extension of the Fellegi-Sunter...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2009.08.004

    authors: DuVall SL,Kerber RA,Thomas A

    update_date:2010-02-01 00:00:00

  • Enhancing phylogeography by improving geographical information from GenBank.

    abstract::Phylogeography is a field that focuses on the geographical lineages of species such as vertebrates or viruses. Here, geographical data, such as location of a species or viral host is as important as the sequence information extracted from the species. Together, this information can help illustrate the migration of the...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2011.06.005

    authors: Scotch M,Sarkar IN,Mei C,Leaman R,Cheung KH,Ortiz P,Singraur A,Gonzalez G

    update_date:2011-12-01 00:00:00

  • Using natural language processing to extract mammographic findings.

    abstract:OBJECTIVE:Structured data on mammographic findings are difficult to obtain without manual review. We developed and evaluated a rule-based natural language processing (NLP) system to extract mammographic findings from free-text mammography reports. MATERIALS AND METHODS:The NLP system extracted four mammographic findin...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2015.01.010

    authors: Gao H,Aiello Bowles EJ,Carrell D,Buist DS

    update_date:2015-04-01 00:00:00