Abstract:
:Finding disease markers (classifiers) from gene expression data by machine learning algorithms is characterized by a high risk of overfitting the data due the abundance of attributes (simultaneously measured gene expression values) and shortage of available examples (observations). To avoid this pitfall and achieve predictor robustness, state-of-the-art approaches construct complex classifiers that combine relatively weak contributions of up to thousands of genes (attributes) to classify a disease. The complexity of such classifiers limits their transparency and consequently the biological insights they can provide. The goal of this study is to apply to this domain the methodology of constructing simple yet robust logic-based classifiers amenable to direct expert interpretation. On two well-known, publicly available gene expression classification problems, the paper shows the feasibility of this approach, employing a recently developed subgroup discovery methodology. Some of the discovered classifiers allow for novel biological interpretations.
journal_name
J Biomed Informjournal_title
Journal of biomedical informaticsauthors
Gamberger D,Lavrac N,Zelezný F,Tolar Jdoi
10.1016/j.jbi.2004.07.007keywords:
subject
Has Abstractpub_date
2004-08-01 00:00:00pages
269-84issue
4eissn
1532-0464issn
1532-0480pii
S1532-0464(04)00075-9journal_volume
37pub_type
杂志文章abstract::Real-time Obstructive Sleep Apnea (OSA) episode detection and monitoring are important for society in terms of an improvement in the health of the general population and of a reduction in mortality and healthcare costs. Currently, to diagnose OSA patients undergo PolySomnoGraphy (PSG), a complicated and invasive test ...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2014.02.015
更新日期:2014-06-01 00:00:00
abstract::Human musculoskeletal system resources of the human body are valuable for the learning and medical purposes. Internet-based information from conventional search engines such as Google or Yahoo cannot response to the need of useful, accurate, reliable and good-quality human musculoskeletal resources related to medical ...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2012.11.001
更新日期:2013-02-01 00:00:00
abstract::Gene selection is important for cancer classification based on gene expression data, because of high dimensionality and small sample size. In this paper, we present a new gene selection method based on clustering, in which dissimilarity measures are obtained through kernel functions. It searches for best weights of ge...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2016.05.007
更新日期:2016-08-01 00:00:00
abstract::In Electronic Health Records (EHRs), much of valuable information regarding patients' conditions is embedded in free text format. Natural language processing (NLP) techniques have been developed to extract clinical information from free text. One challenge faced in clinical NLP is that the meaning of clinical entities...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2015.02.010
更新日期:2015-04-01 00:00:00
abstract::In absence of periodic systematic comparisons, biologists/bioinformaticians may be forced to make a subjective selection among the many protein-protein interaction (PPI) databases and tools. We conducted a comprehensive compilation and comparison of such resources. We compiled 375 PPI resources, short-listed 125 impor...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2020.103380
更新日期:2020-03-01 00:00:00
abstract::Microarray-based gene expression profiling has emerged as an efficient technique for classification, prognosis, diagnosis, and treatment of cancer. Frequent changes in the behavior of this disease generates an enormous volume of data. Microarray data satisfies both the veracity and velocity properties of big data, as ...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2016.03.002
更新日期:2016-04-01 00:00:00
abstract::One of the main reasons that leads to a low adoption rate of telemedicine systems is poor usability. An aspect that influences usability during the reporting of findings is the input mode, e.g., if a free-text (FT) or a structured report (SR) interface is employed. The objective of our study is to compare the usabilit...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2014.07.002
更新日期:2014-12-01 00:00:00
abstract:BACKGROUND:A tool that can predict the estimated glomerular filtration rate (eGFR) in routine daily care can help clinicians to make better decisions for kidney transplant patients and to improve transplantation outcome. In this paper, we proposed a hybrid prediction model for predicting a future value for eGFR during ...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2019.103116
更新日期:2019-03-01 00:00:00
abstract::The explosive growth of biomedical literature has created a rich source of knowledge, such as that on protein-protein interactions (PPIs) and drug-drug interactions (DDIs), locked in unstructured free text. Biomedical relation classification aims to automatically detect and classify biomedical relations, which has gre...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章,评审
doi:10.1016/j.jbi.2019.103294
更新日期:2019-11-01 00:00:00
abstract::We analyze the discriminatory power of k-nearest neighbors, logistic regression, artificial neural networks (ANNs), decision tress, and support vector machines (SVMs) on the task of classifying pigmented skin lesions as common nevi, dysplastic nevi, or melanoma. Three different classification tasks were used as benchm...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1006/jbin.2001.1004
更新日期:2001-02-01 00:00:00
abstract::To understand how healthcare technologies are used in practice and evaluate them, researchers have argued for adopting the theoretical framework of Distributed Cognition (DC). This paper describes the methods and results of a study in which a DC methodology, Distributed Cognition for Teamwork (DiCoT), was applied to s...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2012.02.003
更新日期:2012-06-01 00:00:00
abstract:BACKGROUND:Correlation of data within electronic health records is necessary for implementation of various clinical decision support functions, including patient summarization. A key type of correlation is linking medications to clinical problems; while some databases of problem-medication links are available, they are...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2013.11.010
更新日期:2014-04-01 00:00:00
abstract::Computer simulations have been used to model infectious diseases to examine the outcomes of alternative strategies for managing their spread. Methicillin resistant Staphylococcus aureus (MRSA) skin and soft tissue infections have become prominent in many communities and efforts are underway to reduce the spread of thi...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2010.03.013
更新日期:2010-08-01 00:00:00
abstract::According to Ministry of Health and Welfare of Taiwan, cancer has been one of the major causes of death in Taiwan since 1982. The Intensive-Modulated Radiation Therapy (IMRT) is one of the most important radiotherapies of cancers, especially for Nasopharyngeal cancers, Digestive system cancers and Cervical cancers. Fo...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2020.103499
更新日期:2020-08-01 00:00:00
abstract::To extract biomedical information about bio-entities from the huge amount of biomedical literature, the first key step is recognizing their names in these literatures, which remains a challenging task due to the irregularities and ambiguities in bio-entities nomenclature. The recognition performances of the current po...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2008.01.002
更新日期:2008-08-01 00:00:00
abstract::Phylogeography is a field that focuses on the geographical lineages of species such as vertebrates or viruses. Here, geographical data, such as location of a species or viral host is as important as the sequence information extracted from the species. Together, this information can help illustrate the migration of the...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2011.06.005
更新日期:2011-12-01 00:00:00
abstract::Social media has been identified as a promising potential source of information for pharmacovigilance. The adoption of social media data has been hindered by the massive and noisy nature of the data. Initial attempts to use social media data have relied on exact text matches to drugs of interest, and therefore suffer ...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2019.103307
更新日期:2019-11-01 00:00:00
abstract:OBJECTIVE:Our aim is to extract clinically-meaningful phenotypes from longitudinal electronic health records (EHRs) of medically-complex children. This is a fragile set of patients consuming a disproportionate amount of pediatric care resources but who often end up with sub-optimal clinical outcome. The rise in availab...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2019.103125
更新日期:2019-05-01 00:00:00
abstract::This paper proposes a new framework for the selection of tag SNPs based on haplotypes instead of on a single SNP. The tag SNPs found by this framework form a set of haplotypes completely predictive of the alleles of all untyped SNPs. We refer to this problem as MTMH, which is defined as follows: given a set of SNPs, f...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2008.04.003
更新日期:2008-12-01 00:00:00
abstract::Microarray data is a key source of experimental data for modelling gene regulatory interactions from expression levels. With the rapid increase of publicly available microarray data comes the opportunity to produce regulatory network models based on multiple datasets. Such models are potentially more robust with great...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章,meta分析
doi:10.1016/j.jbi.2008.01.011
更新日期:2008-12-01 00:00:00
abstract:OBJECTIVE:To develop a method for profiling the collective populations targeted for recruitment by multiple clinical studies addressing the same medical condition using one eligibility feature each time. METHODS:Using a previously published database COMPACT as the backend, we designed a scalable method for visual aggr...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2015.01.005
更新日期:2015-04-01 00:00:00
abstract::Over the last 8 years, the National Cancer Institute (NCI) has launched a major effort to integrate molecular and clinical cancer-related information within a unified biomedical informatics framework, with controlled terminology as its foundational layer. The NCI Thesaurus is the reference terminology underpinning the...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2006.02.013
更新日期:2007-02-01 00:00:00
abstract::Evaluating automated indexing applications requires comparing automatically indexed terms against manual reference standard annotations. However, there are no standard guidelines for determining which words from a textual document to include in manual annotations, and the vague task can result in substantial variation...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2005.06.004
更新日期:2006-04-01 00:00:00
abstract:OBJECTIVE:Clinicians face numerous information needs during patient care activities and most of these needs are not met. Infobuttons are information retrieval tools that help clinicians to fulfill their information needs by providing links to on-line health information resources from within an electronic medical record...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2008.07.001
更新日期:2009-02-01 00:00:00
abstract::The development of convenient serum bioassays for cancer screening, diagnosis, prognosis, and monitoring of treatment is one of top priorities in cancer research community. Although numerous biomarker candidates have been generated by applying high-throughput technologies such as transcriptomics, proteomics, and metab...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2011.08.010
更新日期:2011-12-01 00:00:00
abstract::Evidence-based medicine is the most prevalent paradigm adopted by physicians. Clinical practice guidelines typically define a set of recommendations together with eligibility criteria that restrict their applicability to a specific group of patients. The ever-growing size and availability of health-related data is cur...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2018.06.001
更新日期:2018-07-01 00:00:00
abstract::Agglomerating results from studies of individual biological components has shown the potential to produce biomedical discovery and the promise of therapeutic development. Such knowledge integration could be tremendously facilitated by automated text mining for relation extraction in the biomedical literature. Relation...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2009.02.001
更新日期:2009-10-01 00:00:00
abstract::This study examines the ability of nonclinical adverse event observations to predict human clinical adverse events observed in drug development programs. In addition it examines the relationship between nonclinical and clinical adverse event observations to drug withdrawal and proposes a model to predict drug withdraw...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2015.02.008
更新日期:2015-04-01 00:00:00
abstract::Neural embeddings are a popular set of methods for representing words, phrases or text as a low dimensional vector (typically 50-500 dimensions). However, it is difficult to interpret these dimensions in a meaningful manner, and creating neural embeddings requires extensive training and tuning of multiple parameters a...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2019.103096
更新日期:2019-02-01 00:00:00
abstract::Named entity recognition is an extremely important and fundamental task of biomedical text mining. Biomedical named entities include mentions of proteins, genes, DNA, RNA, etc which often have complex structures, but it is challenging to identify and classify such entities. Machine learning methods like CRF, MEMM and ...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2008.12.012
更新日期:2009-10-01 00:00:00