Wisdom of artificial crowds feature selection in untargeted metabolomics: An application to the development of a blood-based diagnostic test for thrombotic myocardial infarction.

Abstract:

INTRODUCTION: Heart disease remains a leading cause of global mortality. Although acute myocardial infarction (MI; colloquially, heart attack) has multiple proximate causes, the proximate etiology cannot be determined by a blood-based diagnostic test. We enrolled a suitable patient cohort and conducted an untargeted quantification of plasma metabolites by mass spectrometry to develop a test that can differentiate between thrombotic MI, non-thrombotic MI, and stable disease. A significant challenge in developing such a diagnostic test is solving the NP-hard problem of feature selection for constructing an optimal statistical classifier.

OBJECTIVE: We employed a Wisdom of Artificial Crowds (WoAC) strategy to solve the feature selection problem and evaluated the accuracy and parsimony of downstream classifiers in comparison with traditional feature selection techniques, including the Lasso and selection using Random Forest variable importance criteria.

MATERIALS AND METHODS: Artificial crowd wisdom was generated by aggregating the best solutions from independent and diverse genetic algorithm populations that were initialized with bootstrapping and a random subspaces constraint.

RESULTS/CONCLUSIONS: Strong evidence was observed that a statistical classifier utilizing WoAC feature selection can discriminate between human subjects presenting with thrombotic MI, non-thrombotic MI, and stable coronary artery disease given the abundances of selected plasma metabolites. Using the abundances of twenty selected metabolites, a leave-one-out cross-validation estimated misclassification rate of 2.6% was observed. However, the WoAC feature selection strategy did not perform better than the Lasso in the current study.
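The aggregation idea described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the nearest-centroid fitness, the out-of-bag scoring, the parsimony penalty, the GA operators and settings, the vote-count aggregation, and the function names `centroid_accuracy`, `run_ga`, and `woac_select`. Each genetic algorithm population sees its own bootstrap sample and a random subspace of features, and the best solution from each population contributes one vote per selected feature.

```python
import numpy as np

def centroid_accuracy(X_train, y_train, X_test, y_test):
    """Accuracy of a nearest-centroid classifier (stand-in fitness, an assumption)."""
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    dists = ((X_test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return float((classes[dists.argmin(axis=1)] == y_test).mean())

def run_ga(X, y, subspace, rng, pop_size=20, generations=15, penalty=0.01):
    """Evolve binary masks over one random subspace; return the best mask over all features."""
    n, p = X.shape
    boot = rng.integers(0, n, size=n)        # bootstrap sample for this population
    oob = np.setdiff1d(np.arange(n), boot)   # out-of-bag rows used for scoring
    if oob.size == 0:                        # extremely unlikely; fall back to the bootstrap rows
        oob = boot

    def fitness(bits):
        cols = subspace[bits]
        if cols.size == 0:
            return -1.0
        acc = centroid_accuracy(X[boot][:, cols], y[boot], X[oob][:, cols], y[oob])
        return acc - penalty * cols.size     # hypothetical parsimony penalty

    pop = rng.random((pop_size, subspace.size)) < 0.5
    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])
        order = np.argsort(scores)[::-1]
        parents = pop[order[: pop_size // 2]]                    # truncation selection
        ma = parents[rng.integers(0, len(parents), pop_size)]
        pa = parents[rng.integers(0, len(parents), pop_size)]
        children = np.where(rng.random(ma.shape) < 0.5, ma, pa)  # uniform crossover
        children ^= rng.random(children.shape) < 0.05            # bit-flip mutation
        children[0] = pop[order[0]]                              # elitism
        pop = children
    scores = np.array([fitness(ind) for ind in pop])
    mask = np.zeros(p, dtype=bool)
    mask[subspace[pop[np.argmax(scores)]]] = True
    return mask

def woac_select(X, y, n_crowds=30, subspace_size=10, n_keep=3, seed=0):
    """Aggregate the best solution of each independent GA population by vote count."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    votes = np.zeros(p)
    for _ in range(n_crowds):
        subspace = rng.choice(p, size=subspace_size, replace=False)  # random subspace
        votes += run_ga(X, y, subspace, rng)
    return np.argsort(votes)[::-1][:n_keep], votes

# Synthetic demo (not the study data): 3 classes, 20 features, features 0-2 informative.
demo_rng = np.random.default_rng(42)
y_demo = np.repeat(np.arange(3), 20)
X_demo = demo_rng.normal(size=(60, 20))
X_demo[np.arange(60), y_demo] += 3.0
selected_demo, votes_demo = woac_select(X_demo, y_demo)
print("selected features:", selected_demo)
print("votes per feature:", votes_demo)
```

Counting votes is only one plausible way to aggregate; the abstract does not specify the aggregation rule, so a weighted or thresholded scheme could equally fit the description.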

journal_name

J Biomed Inform

authors

Trainor PJ,Yampolskiy RV,DeFilippis AP

doi

10.1016/j.jbi.2018.03.007

subject

Has Abstract

pub_date

2018-05-01 00:00:00

pages

53-60

eissn

1532-0480

issn

1532-0464

pii

S1532-0464(18)30049-2

journal_volume

81

pub_type

Journal Article, Multicenter Study
  • Automatic detection of protected health information from clinic narratives.

    abstract:This paper presents a natural language processing (NLP) system that was designed to participate in the 2014 i2b2 de-identification challenge. The challenge task aims to identify and classify seven main Protected Health Information (PHI) categories and 25 associated sub-categories. A hybrid model was proposed which com...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2015.06.015

    authors: Yang H,Garibaldi JM

    update_date:2015-12-01 00:00:00

  • A comparison of word embeddings for the biomedical natural language processing.

    abstract:BACKGROUND:Word embeddings have been widely used in biomedical Natural Language Processing (NLP) applications because the vector representations can capture useful semantic properties and linguistic relationships between words. Different textual resources (e.g., Wikipedia and biomedical lit...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2018.09.008

    authors: Wang Y,Liu S,Afzal N,Rastegar-Mojarad M,Wang L,Shen F,Kingsbury P,Liu H

    update_date:2018-11-01 00:00:00

  • A tiered approach is more cost effective than traditional pharmacist-based review for classifying computer-detected signals as adverse drug events.

    abstract:OBJECTIVE:To develop a cost-efficient method for identifying adverse drug events (ADEs) and medication errors (MEs) identified using outpatient electronic medical records within ambulatory settings. DESIGN:Comparison of sensitivity and cost of "traditional" pharmacist based approach to identifying ADEs and MEs during ...

    journal_title:Journal of biomedical informatics

    pub_type: Clinical Trial, Journal Article, Multicenter Study

    doi:10.1016/s1532-0464(03)00059-5

    authors: Hope C,Overhage JM,Seger A,Teal E,Mills V,Fiskio J,Gandhi TK,Bates DW,Murray MD

    update_date:2003-02-01 00:00:00

  • Quality assurance of chemical ingredient classification for the National Drug File - Reference Terminology.

    abstract:The National Drug File - Reference Terminology (NDF-RT) is a large and complex drug terminology consisting of several classification hierarchies on top of an extensive collection of drug concepts. These hierarchies provide important information about clinical drugs, e.g., their chemical ingredients, mechanisms of acti...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2017.07.013

    authors: Zheng L,Yumak H,Chen L,Ochs C,Geller J,Kapusnik-Uner J,Perl Y

    update_date:2017-09-01 00:00:00

  • Clinical decision support models and frameworks: Seeking to address research issues underlying implementation successes and failures.

    abstract:Computer-based clinical decision support (CDS) has been pursued for more than five decades. Despite notable accomplishments and successes, wide adoption and broad use of CDS in clinical practice has not been achieved. Many issues have been identified as being partially responsible for the relatively slow adoption and ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article, Review

    doi:10.1016/j.jbi.2017.12.005

    authors: Greenes RA,Bates DW,Kawamoto K,Middleton B,Osheroff J,Shahar Y

    update_date:2018-02-01 00:00:00

  • In defense of the Desiderata.

    abstract:A 1998 paper that delineated desirable characteristics, or desiderata for controlled medical terminologies attempted to summarize emerging consensus regarding structural issues of such terminologies. Among the Desiderata was a call for terminologies to be "concept oriented." Since then, research has trended toward the...

    journal_title:Journal of biomedical informatics

    pub_type: Comment, Journal Article

    doi:10.1016/j.jbi.2005.11.008

    authors: Cimino JJ

    update_date:2006-06-01 00:00:00

  • On the reproducibility of results of pathway analysis in genome-wide expression studies of colorectal cancers.

    abstract:One of the major problems in genomics and medicine is the identification of gene networks and pathways deregulated in complex and polygenic diseases, like cancer. In this paper, we address the problem of assessing the variability of results of pathways analysis identified in different and independent genome wide expre...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2009.09.005

    authors: Maglietta R,Distaso A,Piepoli A,Palumbo O,Carella M,D'Addabbo A,Mukherjee S,Ancona N

    update_date:2010-06-01 00:00:00

  • A novel web informatics approach for automated surveillance of cancer mortality trends.

    abstract:Cancer surveillance data are collected every year in the United States via the National Program of Cancer Registries (NPCR) and the Surveillance, Epidemiology and End Results (SEER) Program of the National Cancer Institute (NCI). General trends are closely monitored to measure the nation's progress against cancer. The...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2016.03.027

    authors: Tourassi G,Yoon HJ,Xu S

    update_date:2016-06-01 00:00:00

  • Decision-making model for early diagnosis of congestive heart failure using rough set and decision tree approaches.

    abstract:The accurate diagnosis of heart failure in emergency room patients is quite important, but can also be quite difficult due to our insufficient understanding of the characteristics of heart failure. The purpose of this study is to design a decision-making model that provides critical factors and knowledge associated wi...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2012.04.013

    authors: Son CS,Kim YN,Kim HS,Park HS,Kim MS

    update_date:2012-10-01 00:00:00

  • A reference ontology for biomedical informatics: the Foundational Model of Anatomy.

    abstract:The Foundational Model of Anatomy (FMA), initially developed as an enhancement of the anatomical content of UMLS, is a domain ontology of the concepts and relationships that pertain to the structural organization of the human body. It encompasses the material objects from the molecular to the macroscopic levels that c...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2003.11.007

    authors: Rosse C,Mejino JL Jr

    update_date:2003-12-01 00:00:00

  • Automated annotation and classification of BI-RADS assessment from radiology reports.

    abstract:The Breast Imaging Reporting and Data System (BI-RADS) was developed to reduce variation in the descriptions of findings. Manual analysis of breast radiology report data is challenging but is necessary for clinical and healthcare quality assurance activities. The objective of this study is to develop a natural languag...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2017.04.011

    authors: Castro SM,Tseytlin E,Medvedeva O,Mitchell K,Visweswaran S,Bekhuis T,Jacobson RS

    update_date:2017-05-01 00:00:00

  • Security and privacy in electronic health records: a systematic literature review.

    abstract:OBJECTIVE:To report the results of a systematic literature review concerning the security and privacy of electronic health record (EHR) systems. DATA SOURCES:Original articles written in English found in MEDLINE, ACM Digital Library, Wiley InterScience, IEEE Digital Library, Science@Direct, MetaPress, ERIC, CINAHL and...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article, Review

    doi:10.1016/j.jbi.2012.12.003

    authors: Fernández-Alemán JL,Señor IC,Lozoya PÁ,Toval A

    update_date:2013-06-01 00:00:00

  • A method and software framework for enriching private biomedical sources with data from public online repositories.

    abstract:Modern biomedical research relies on the semantic integration of heterogeneous data sources to find data correlations. Researchers access multiple datasets of disparate origin, and identify elements-e.g. genes, compounds, pathways-that lead to interesting correlations. Normally, they must refer to additional public da...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2016.02.004

    authors: Anguita A,García-Remesal M,Graf N,Maojo V

    update_date:2016-04-01 00:00:00

  • NCI Thesaurus: a semantic model integrating cancer-related clinical and molecular information.

    abstract:Over the last 8 years, the National Cancer Institute (NCI) has launched a major effort to integrate molecular and clinical cancer-related information within a unified biomedical informatics framework, with controlled terminology as its foundational layer. The NCI Thesaurus is the reference terminology underpinning the...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2006.02.013

    authors: Sioutos N,de Coronado S,Haber MW,Hartel FW,Shaiu WL,Wright LW

    update_date:2007-02-01 00:00:00

  • A Health Surveillance Software Framework to deliver information on preventive healthcare strategies.

    abstract:A software framework can reduce costs related to the development of an application because it allows developers to reuse both design and code. Recently, companies and research groups have announced that they have been employing health software frameworks. This paper presents the design, proof-of-concept implementation...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2016.06.002

    authors: Macedo AA,Pollettini JT,Baranauskas JA,Chaves JC

    update_date:2016-08-01 00:00:00

  • Health information technology adoption: Understanding research protocols and outcome measurements for IT interventions in health care.

    abstract:OBJECTIVE:To classify and characterize the variables commonly used to measure the impact of Information Technology (IT) adoption in health care, as well as settings and IT interventions tested, and to guide future research. MATERIALS AND METHODS:We conducted a descriptive study screening a sample of 236 studies from a...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2016.07.018

    authors: Colicchio TK,Facelli JC,Del Fiol G,Scammon DL,Bowes WA 3rd,Narus SP

    update_date:2016-10-01 00:00:00

  • A novel method to estimate the indirect community benefit of HIV interventions using a microsimulation model of HIV disease.

    abstract:BACKGROUND:Microsimulation models of human immunodeficiency virus (HIV) disease that simulate individual patients one at a time and assess clinical and economic outcomes of HIV interventions often provide key details regarding direct individual clinical benefits ("individual benefit"), but they may lack detail on trans...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2020.103475

    authors: Kazemian P,Costantini S,Neilan AM,Resch SC,Walensky RP,Weinstein MC,Freedberg KA

    update_date:2020-07-01 00:00:00

  • Personal health information in research: Perceived risk, trustworthiness and opinions from patients attending a tertiary healthcare facility.

    abstract:BACKGROUND:Personal health information is a valuable resource for the advancement of research. In order to achieve a comprehensive reform of data infrastructure in Australia, both public engagement and building social trust are vital. In light of this, we conducted a study to explore the opinions, perceived risks and tru...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2019.103222

    authors: Krahe M,Milligan E,Reilly S

    update_date:2019-07-01 00:00:00

  • Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses.

    abstract:Due to the upcoming data deluge of genome data, the need for storing and processing large-scale genome data, easy access to biomedical analyses tools, efficient data sharing and retrieval has presented significant challenges. The variability in data volume results in variable computing and storage requirements, theref...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2014.01.005

    authors: Liu B,Madduri RK,Sotomayor B,Chard K,Lacinski L,Dave UJ,Li J,Liu C,Foster IT

    update_date:2014-06-01 00:00:00

  • Predictive modeling of bacterial infections and antibiotic therapy needs in critically ill adults.

    abstract:Unnecessary antibiotic regimens in the intensive care unit (ICU) are associated with adverse patient outcomes and antimicrobial resistance. Bacterial infections (BI) are both common and deadly in ICUs, and as a result, patients with a suspected BI are routinely started on broad-spectrum antibiotics prior to having con...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2020.103540

    authors: Eickelberg G,Sanchez-Pinto LN,Luo Y

    update_date:2020-09-01 00:00:00

  • BioLattice: a framework for the biological interpretation of microarray gene expression data using concept lattice analysis.

    abstract:MOTIVATION:A challenge in microarray data analysis is to interpret observed changes in terms of biological properties and relationships. One powerful approach is to make associations of gene expression clusters with biomedical ontologies and/or biological pathways. However, this approach evaluates only one cluster at a...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2007.10.003

    authors: Kim J,Chung HJ,Jung Y,Kim KK,Kim JH

    update_date:2008-04-01 00:00:00

  • Use of morphological analysis in protein name recognition.

    abstract:Protein name recognition aims to detect every protein name appearing in a PubMed abstract. The task is not simple, as the graphic word boundary (space separator) assumed in conventional preprocessing does not necessarily coincide with the protein name boundary. Such boundary disagreement caused by tokenizati...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2004.08.001

    authors: Yamamoto K,Kudo T,Konagaya A,Matsumoto Y

    update_date:2004-12-01 00:00:00

  • Induction of comprehensible models for gene expression datasets by subgroup discovery methodology.

    abstract:Finding disease markers (classifiers) from gene expression data by machine learning algorithms is characterized by a high risk of overfitting the data due to the abundance of attributes (simultaneously measured gene expression values) and shortage of available examples (observations). To avoid this pitfall and achieve pr...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2004.07.007

    authors: Gamberger D,Lavrac N,Zelezný F,Tolar J

    update_date:2004-08-01 00:00:00

  • Evaluation of relational and NoSQL database architectures to manage genomic annotations.

    abstract:While the adoption of next generation sequencing has rapidly expanded, the informatics infrastructure used to manage the data generated by this technology has not kept pace. Historically, relational databases have provided much of the framework for data storage and retrieval. Newer technologies based on NoSQL architec...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2016.10.015

    authors: Schulz WL,Nelson BG,Felker DK,Durant TJS,Torres R

    update_date:2016-12-01 00:00:00

  • Building a robust, scalable and standards-driven infrastructure for secondary use of EHR data: the SHARPn project.

    abstract:The Strategic Health IT Advanced Research Projects (SHARP) Program, established by the Office of the National Coordinator for Health Information Technology in 2010 supports research findings that remove barriers for increased adoption of health IT. The improvements envisioned by the SHARP Area 4 Consortium (SHARPn) wi...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2012.01.009

    authors: Rea S,Pathak J,Savova G,Oniki TA,Westberg L,Beebe CE,Tao C,Parker CG,Haug PJ,Huff SM,Chute CG

    update_date:2012-08-01 00:00:00

  • The EU-ADR corpus: annotated drugs, diseases, targets, and their relationships.

    abstract:Corpora with specific entities and relationships annotated are essential to train and evaluate text-mining systems that are developed to extract specific structured information from a large corpus. In this paper we describe an approach where a named-entity recognition system produces a first annotation and annotators ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2012.04.004

    authors: van Mulligen EM,Fourrier-Reglat A,Gurwitz D,Molokhia M,Nieto A,Trifiro G,Kors JA,Furlong LI

    update_date:2012-10-01 00:00:00

  • High-performance implementation and analysis of the Linkmap program.

    abstract:Linkage analysis uses information from family pedigrees to map genes and locate disease genes on particular chromosomes. A recombination fraction denoted as theta is estimated as a measure of crossing over between two loci. Genetic linkage calculations are very time-consuming particularly for large family pedigrees, a...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1006/jbin.2002.1031

    authors: Kothari K,Lopez-Benitez N,Poduslo SE

    update_date:2001-12-01 00:00:00

  • Predicting the function of transplanted kidney in long-term care processes: Application of a hybrid model.

    abstract:BACKGROUND:A tool that can predict the estimated glomerular filtration rate (eGFR) in routine daily care can help clinicians to make better decisions for kidney transplant patients and to improve transplantation outcome. In this paper, we proposed a hybrid prediction model for predicting a future value for eGFR during ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2019.103116

    authors: Rashidi Khazaee P,Bagherzadeh M J,Niazkhani Z,Pirnejad H

    update_date:2019-03-01 00:00:00

  • Analysis of eligibility criteria representation in industry-standard clinical trial protocols.

    abstract:Previous research on standardization of eligibility criteria and its feasibility has traditionally been conducted on clinical trial protocols from ClinicalTrials.gov (CT). The portability and use of such standardization for full-text industry-standard protocols has not been studied in-depth. Towards this end, in this ...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2013.06.001

    authors: Bhattacharya S,Cantor MN

    update_date:2013-10-01 00:00:00

  • Collaborative text-annotation resource for disease-centered relation extraction from biomedical text.

    abstract:Agglomerating results from studies of individual biological components has shown the potential to produce biomedical discovery and the promise of therapeutic development. Such knowledge integration could be tremendously facilitated by automated text mining for relation extraction in the biomedical literature. Relation...

    journal_title:Journal of biomedical informatics

    pub_type: Journal Article

    doi:10.1016/j.jbi.2009.02.001

    authors: Cano C,Monaghan T,Blanco A,Wall DP,Peshkin L

    update_date:2009-10-01 00:00:00