Serum cancer biomarker discovery through analysis of gene expression data sets across multiple tumor and normal tissues.

Abstract:

The development of convenient serum bioassays for cancer screening, diagnosis, prognosis, and monitoring of treatment is one of the top priorities of the cancer research community. Although numerous biomarker candidates have been generated by applying high-throughput technologies such as transcriptomics, proteomics, and metabolomics, few of them have been successfully validated in the clinic. Better strategies to mine omics data for successful biomarker discovery are needed. Using a data set of 22,794 tumor and normal samples across 23 tissues, we systematically analyzed current problems and challenges of serum biomarker discovery from gene expression data. We first performed tissue specificity analysis to identify genes that are both tissue-specific and up-regulated in tumors compared to controls, but identified few novel candidates. Then, we designed a novel computational method, the multiple normal tissues corrected differential analysis (MNTDA), to identify genes that are expected to be significantly up-regulated even after their expression in other normal tissues is considered. In a simulation study, MNTDA outperformed single-tissue differential analysis combined with tissue specificity analysis. By applying MNTDA, we identified some genes as novel biomarker candidates. However, the number of potential candidates was disappointingly small, exemplifying the difficulty of finding serum cancer biomarkers. We discuss a few important points that should be considered during biomarker discovery from omics data.

journal_name

J Biomed Inform

authors

Jin H, Lee HC, Park SS, Jeong YS, Kim SY

doi

10.1016/j.jbi.2011.08.010

pub_date

2011-12-01 00:00:00

pages

1076-85

issue

6

eissn

1532-0480

issn

1532-0464

pii

S1532-0464(11)00137-7

journal_volume

44

pub_type

Journal Article
  • Methodological variations in lagged regression for detecting physiologic drug effects in EHR data.

    abstract: We studied how lagged linear regression can be used to detect the physiologic effects of drugs from data in the electronic health record (EHR). We systematically examined the effect of methodological variations ((i) time series construction, (ii) temporal parameterization, (iii) intra-subject normalization, (iv) diffe...

    journal_title: Journal of Biomedical Informatics

    pub_type: Journal Article

    doi: 10.1016/j.jbi.2018.08.014

    authors: Levine ME, Albers DJ, Hripcsak G

    update_date: 2018-10-01 00:00:00

  • A hybrid knowledge-based and data-driven approach to identifying semantically similar concepts.

    abstract: An open research question when leveraging ontological knowledge is when to treat different concepts separately from each other and when to aggregate them. For instance, concepts for the terms "paroxysmal cough" and "nocturnal cough" might be aggregated in a kidney disease study, but should be left separate in a pneumo...

    journal_title: Journal of Biomedical Informatics

    pub_type: Journal Article

    doi: 10.1016/j.jbi.2012.01.002

    authors: Pivovarov R, Elhadad N

    update_date: 2012-06-01 00:00:00

  • An unsupervised and customizable misspelling generator for mining noisy health-related text sources.

    abstract: BACKGROUND: Data collection and extraction from noisy text sources such as social media typically rely on keyword-based searching/listening. However, health-related terms are often misspelled in such noisy text sources due to their complex morphology, resulting in the exclusion of relevant data for studies. In this pape...

    journal_title: Journal of Biomedical Informatics

    pub_type: Journal Article

    doi: 10.1016/j.jbi.2018.11.007

    authors: Sarker A, Gonzalez-Hernandez G

    update_date: 2018-12-01 00:00:00

  • Algorithms for rapid outbreak detection: a research synthesis.

    abstract: The threat of bioterrorism has stimulated interest in enhancing public health surveillance to detect disease outbreaks more rapidly than is currently possible. To advance research on improving the timeliness of outbreak detection, the Defense Advanced Research Projects Agency sponsored the Bio-event Advanced Leading In...

    journal_title: Journal of Biomedical Informatics

    pub_type: Journal Article

    doi: 10.1016/j.jbi.2004.11.007

    authors: Buckeridge DL, Burkom H, Campbell M, Hogan WR, Moore AW

    update_date: 2005-04-01 00:00:00

  • Deep learning with wearable based heart rate variability for prediction of mental and general health.

    abstract: The ubiquity and commoditisation of wearable biosensors (fitness bands) have led to a deluge of personal healthcare data, but with limited analytics typically fed back to the user. The feasibility of feeding back more complex, seemingly unrelated measures to users was investigated by assessing whether increased levels...

    journal_title: Journal of Biomedical Informatics

    pub_type: Journal Article

    doi: 10.1016/j.jbi.2020.103610

    authors: Coutts LV, Plans D, Brown AW, Collomosse J

    update_date: 2020-12-01 00:00:00

  • Unstructured medical image query using big data - An epilepsy case study.

    abstract: Big data technologies are critical to the medical field, which requires new frameworks to leverage them. Such frameworks would help medical experts test hypotheses by querying huge volumes of unstructured medical data to provide better patient care. The objective of this work is to implement and examine the feasi...

    journal_title: Journal of Biomedical Informatics

    pub_type: Journal Article

    doi: 10.1016/j.jbi.2015.12.005

    authors: Istephan S, Siadat MR

    update_date: 2016-02-01 00:00:00

  • Reflective Random Indexing and indirect inference: a scalable method for discovery of implicit connections.

    abstract: The discovery of implicit connections between terms that do not occur together in any scientific document underlies the model of literature-based knowledge discovery first proposed by Swanson. Corpus-derived statistical models of semantic distance such as Latent Semantic Analysis (LSA) have been evaluated previously a...

    journal_title: Journal of Biomedical Informatics

    pub_type: Journal Article

    doi: 10.1016/j.jbi.2009.09.003

    authors: Cohen T, Schvaneveldt R, Widdows D

    update_date: 2010-04-01 00:00:00

  • Combining glass box and black box evaluations in the identification of heart disease risk factors and their temporal relations from clinical records.

    abstract: BACKGROUND: The determination of risk factors and their temporal relations in natural language patient records is a complex task which has been addressed in the i2b2/UTHealth 2014 shared task. In this context, in most systems it was broadly decomposed into two sub-tasks implemented by two components: entity detection, a...

    journal_title: Journal of Biomedical Informatics

    pub_type: Journal Article

    doi: 10.1016/j.jbi.2015.06.014

    authors: Grouin C, Moriceau V, Zweigenbaum P

    update_date: 2015-12-01 00:00:00

  • The use of logic relationships to model colon cancer gene expression networks with mRNA microarray data.

    abstract: The ultimate goal of genomics research is to describe the network of molecules and interactions that govern all biological functions and disease processes in cells. Nonlinear interactions among genes in terms of their logic relationships play a key role in deciphering the networks of molecules that underlie cellular ...

    journal_title: Journal of Biomedical Informatics

    pub_type: Journal Article

    doi: 10.1016/j.jbi.2007.11.006

    authors: Ruan X, Wang J, Li H, Perozzi RE, Perozzi EF

    update_date: 2008-08-01 00:00:00

  • Benchmarking relief-based feature selection methods for bioinformatics data mining.

    abstract: Modern biomedical data mining requires feature selection methods that can (1) be applied to large scale feature spaces (e.g. 'omics' data), (2) function in noisy problems, (3) detect complex patterns of association (e.g. gene-gene interactions), (4) be flexibly adapted to various problem domains and data types (e.g. g...

    journal_title: Journal of Biomedical Informatics

    pub_type: Journal Article

    doi: 10.1016/j.jbi.2018.07.015

    authors: Urbanowicz RJ, Olson RS, Schmitt P, Meeker M, Moore JH

    update_date: 2018-09-01 00:00:00

  • Computer mediated reality technologies: A conceptual framework and survey of the state of the art in healthcare intervention systems.

    abstract: INTRODUCTION: The trend of an ageing and growing world population, particularly in developed countries, is expected to continue for decades to come, causing an increase in demand for healthcare resources and services. Consequently, demand is growing faster than rises in funding. The UK government, in partnership with the...

    journal_title: Journal of Biomedical Informatics

    pub_type: Journal Article, Review

    doi: 10.1016/j.jbi.2019.103102

    authors: Ibrahim Z, Money AG

    update_date: 2019-02-01 00:00:00

  • A comprehensive review of feature based methods for drug target interaction prediction.

    abstract: Drug target interaction is a prominent research area in the field of drug discovery. It refers to the recognition of interactions between chemical compounds and the protein targets in the human body. Wet lab experiments to identify these interactions are expensive as well as time consuming. The computational methods o...

    journal_title: Journal of Biomedical Informatics

    pub_type: Journal Article, Review

    doi: 10.1016/j.jbi.2019.103159

    authors: Sachdev K, Gupta MK

    update_date: 2019-05-01 00:00:00

  • Combining automatic table classification and relationship extraction in extracting anticancer drug-side effect pairs from full-text articles.

    abstract: Anticancer drug-associated side effect knowledge often exists in multiple heterogeneous and complementary data sources. A comprehensive anticancer drug-side effect (drug-SE) relationship knowledge base is important for computation-based drug target discovery, drug toxicity prediction and drug repositioning. In this s...

    journal_title: Journal of Biomedical Informatics

    pub_type: Journal Article

    doi: 10.1016/j.jbi.2014.10.002

    authors: Xu R, Wang Q

    update_date: 2015-02-01 00:00:00

  • A reference ontology for biomedical informatics: the Foundational Model of Anatomy.

    abstract: The Foundational Model of Anatomy (FMA), initially developed as an enhancement of the anatomical content of UMLS, is a domain ontology of the concepts and relationships that pertain to the structural organization of the human body. It encompasses the material objects from the molecular to the macroscopic levels that c...

    journal_title: Journal of Biomedical Informatics

    pub_type: Journal Article

    doi: 10.1016/j.jbi.2003.11.007

    authors: Rosse C, Mejino JL Jr

    update_date: 2003-12-01 00:00:00

  • Evaluating warfarin dosing models on multiple datasets with a novel software framework and evolutionary optimisation.

    abstract: Warfarin is an effective preventative treatment for arterial and venous thromboembolism, but requires individualised dosing due to its narrow therapeutic range and high individual variation. Many machine learning techniques have been demonstrated in this domain. This study evaluated the accuracy of the most promising ...

    journal_title: Journal of Biomedical Informatics

    pub_type: Journal Article

    doi: 10.1016/j.jbi.2020.103634

    authors: Truda G, Marais P

    update_date: 2021-01-01 00:00:00

  • Transitive closure of subsumption and causal relations in a large ontology of radiological diagnosis.

    abstract: The Radiology Gamuts Ontology (RGO), an ontology of diseases, interventions, and imaging findings, was developed to aid in decision support, education, and translational research in diagnostic radiology. The ontology defines a subsumption (is_a) relation between more general and more specific terms, and a causal relatio...

    journal_title: Journal of Biomedical Informatics

    pub_type: Journal Article

    doi: 10.1016/j.jbi.2016.03.015

    authors: Kahn CE Jr

    update_date: 2016-06-01 00:00:00

  • Modelling and analysing the dynamics of disease progression from cross-sectional studies.

    abstract: Clinical trials are typically conducted over a population within a defined time period in order to illuminate certain characteristics of a health issue or disease process. These cross-sectional studies give us a 'snapshot' of this disease process over a large number of people but do not allow us to model the temporal ...

    journal_title: Journal of Biomedical Informatics

    pub_type: Journal Article

    doi: 10.1016/j.jbi.2012.11.003

    authors: Li Y, Swift S, Tucker A

    update_date: 2013-04-01 00:00:00

  • Classification of ADHD with bi-objective optimization.

    abstract: Attention Deficit Hyperactivity Disorder (ADHD) is one of the most common diseases in school-aged children. In this paper, we consider using fMRI data with classification techniques to aid the diagnosis of ADHD and propose a bi-objective ADHD classification scheme based on L1-norm support vector machine (SVM). In our cl...

    journal_title: Journal of Biomedical Informatics

    pub_type: Journal Article

    doi: 10.1016/j.jbi.2018.07.011

    authors: Shao L, Xu Y, Fu D

    update_date: 2018-08-01 00:00:00

  • Annotating longitudinal clinical narratives for de-identification: The 2014 i2b2/UTHealth corpus.

    abstract: The 2014 i2b2/UTHealth natural language processing shared task featured a track focused on the de-identification of longitudinal medical records. For this track, we de-identified a set of 1304 longitudinal medical records describing 296 patients. This corpus was de-identified under a broad interpretation of the HIPAA ...

    journal_title: Journal of Biomedical Informatics

    pub_type: Journal Article

    doi: 10.1016/j.jbi.2015.07.020

    authors: Stubbs A, Uzuner Ö

    update_date: 2015-12-01 00:00:00

  • Enhancing phylogeography by improving geographical information from GenBank.

    abstract: Phylogeography is a field that focuses on the geographical lineages of species such as vertebrates or viruses. Here, geographical data, such as the location of a species or viral host, is as important as the sequence information extracted from the species. Together, this information can help illustrate the migration of the...

    journal_title: Journal of Biomedical Informatics

    pub_type: Journal Article

    doi: 10.1016/j.jbi.2011.06.005

    authors: Scotch M, Sarkar IN, Mei C, Leaman R, Cheung KH, Ortiz P, Singraur A, Gonzalez G

    update_date: 2011-12-01 00:00:00

  • A kernel-based clustering method for gene selection with gene expression data.

    abstract: Gene selection is important for cancer classification based on gene expression data, because of high dimensionality and small sample size. In this paper, we present a new gene selection method based on clustering, in which dissimilarity measures are obtained through kernel functions. It searches for the best weights of ge...

    journal_title: Journal of Biomedical Informatics

    pub_type: Journal Article

    doi: 10.1016/j.jbi.2016.05.007

    authors: Chen H, Zhang Y, Gutman I

    update_date: 2016-08-01 00:00:00

  • A cascaded approach for Chinese clinical text de-identification with less annotation effort.

    abstract: With the rapid adoption of Electronic Health Records (EHR) in China, an increasing amount of clinical data has become available to support clinical research. Secondary use of clinical data usually requires de-identification of personal information to protect patient privacy. Since manual de-identification of free clinical te...

    journal_title: Journal of Biomedical Informatics

    pub_type: Journal Article

    doi: 10.1016/j.jbi.2017.07.017

    authors: Jian Z, Guo X, Liu S, Ma H, Zhang S, Zhang R, Lei J

    update_date: 2017-09-01 00:00:00

  • The Analytic Information Warehouse (AIW): a platform for analytics using electronic health record data.

    abstract: OBJECTIVE: To create an analytics platform for specifying and detecting clinical phenotypes and other derived variables in electronic health record (EHR) data for quality improvement investigations. MATERIALS AND METHODS: We have developed an architecture for an Analytic Information Warehouse (AIW). It supports transfor...

    journal_title: Journal of Biomedical Informatics

    pub_type: Journal Article

    doi: 10.1016/j.jbi.2013.01.005

    authors: Post AR, Kurc T, Cholleti S, Gao J, Lin X, Bornstein W, Cantrell D, Levine D, Hohmann S, Saltz JH

    update_date: 2013-06-01 00:00:00

  • Markov blanket-based approach for learning multi-dimensional Bayesian network classifiers: an application to predict the European Quality of Life-5 Dimensions (EQ-5D) from the 39-item Parkinson's Disease Questionnaire (PDQ-39).

    abstract: Multi-dimensional Bayesian network classifiers (MBCs) are probabilistic graphical models recently proposed to deal with multi-dimensional classification problems, where each instance in the data set has to be assigned to more than one class variable. In this paper, we propose a Markov blanket-based approach for learni...

    journal_title: Journal of Biomedical Informatics

    pub_type: Journal Article

    doi: 10.1016/j.jbi.2012.07.010

    authors: Borchani H, Bielza C, Martínez-Martín P, Larrañaga P

    update_date: 2012-12-01 00:00:00

  • Quality assurance of chemical ingredient classification for the National Drug File - Reference Terminology.

    abstract: The National Drug File - Reference Terminology (NDF-RT) is a large and complex drug terminology consisting of several classification hierarchies on top of an extensive collection of drug concepts. These hierarchies provide important information about clinical drugs, e.g., their chemical ingredients, mechanisms of acti...

    journal_title: Journal of Biomedical Informatics

    pub_type: Journal Article

    doi: 10.1016/j.jbi.2017.07.013

    authors: Zheng L, Yumak H, Chen L, Ochs C, Geller J, Kapusnik-Uner J, Perl Y

    update_date: 2017-09-01 00:00:00

  • Annotating risk factors for heart disease in clinical narratives for diabetic patients.

    abstract: The 2014 i2b2/UTHealth natural language processing shared task featured a track focused on identifying risk factors for heart disease (specifically, Cardiac Artery Disease) in clinical narratives. For this track, we used a "light" annotation paradigm to annotate a set of 1304 longitudinal medical records describing 29...

    journal_title: Journal of Biomedical Informatics

    pub_type: Journal Article

    doi: 10.1016/j.jbi.2015.05.009

    authors: Stubbs A, Uzuner Ö

    update_date: 2015-12-01 00:00:00

  • Description of a method to support public health information management: organizational network analysis.

    abstract: In this case study, we describe a method that has the potential to provide systematic support for public health information management. Public health agencies depend on specialized information that travels throughout an organization via communication networks among employees. Interactions that occur within these networks ...

    journal_title: Journal of Biomedical Informatics

    pub_type: Journal Article

    doi: 10.1016/j.jbi.2006.09.004

    authors: Merrill J, Bakken S, Rockoff M, Gebbie K, Carley KM

    update_date: 2007-08-01 00:00:00

  • Desiderata for domain reference ontologies in biomedicine.

    abstract: Domain reference ontologies represent knowledge about a particular part of the world in a way that is independent from specific objectives, through a theory of the domain. An example of a reference ontology in biomedical informatics is the Foundational Model of Anatomy (FMA), an ontology of anatomy that covers the entir...

    journal_title: Journal of Biomedical Informatics

    pub_type: Journal Article

    doi: 10.1016/j.jbi.2005.09.002

    authors: Burgun A

    update_date: 2006-06-01 00:00:00

  • A knowledge-based system to find over-the-counter medicines for self-medication.

    abstract: This study developed a medicine query system based on Semantic Web and open data especially for self-medication users to search over-the-counter (OTC) medicines. Most existing medicine query systems are based on keyword searches. If users are uncertain about the exact search words, these query systems do not offer eff...

    journal_title: Journal of Biomedical Informatics

    pub_type: Journal Article

    doi: 10.1016/j.jbi.2020.103504

    authors: Sung HY, Chi YL

    update_date: 2020-08-01 00:00:00

  • Homology assessment and molecular sequence alignment.

    abstract: Hypotheses of homology are the basis of phylogenetic analysis. All character data are considered to be equivalent regardless of the source of those characters. Putative homology statements are designated based on observations of similarity. Pairwise sequence alignment using the Needleman-Wunsch algorithm is the basis ...

    journal_title: Journal of Biomedical Informatics

    pub_type: Journal Article, Review

    doi: 10.1016/j.jbi.2005.11.005

    authors: Phillips AJ

    update_date: 2006-02-01 00:00:00