EC-PGMGR: Ensemble Clustering Based on Probability Graphical Model With Graph Regularization for Single-Cell RNA-seq Data.

Abstract:

Advances in technology have made it convenient to obtain large amounts of single-cell RNA sequencing (scRNA-seq) data. Because clustering is a very important step in identifying or defining cellular phenotypes, many clustering approaches have recently been developed for these applications. These methods can be roughly divided into normal clustering methods and integrated (ensemble) clustering methods, which combine more than two normal clustering methods with the aim of obtaining more informative results. To contrast it with the integrated clustering algorithm, a normal clustering method is often called an individual or base clustering method. Many individual clustering methods are designed to capture only one aspect of the data, and their results depend on initial parameter settings such as the cluster number and the distance metric. Compared with individual clustering, an integrative clustering method may achieve more accurate performance, but its results depend on the base clustering results, and integrated systems are often not self-regulating. Therefore, designing a robust unsupervised clustering method is still a challenge. To tackle the above limitations, we propose a novel Ensemble Clustering algorithm based on a Probability Graphical Model with Graph Regularization, called EC-PGMGR for short. On the one hand, we use parameter control in the Probability Graphical Model (PGM) to automatically determine the cluster number without prior knowledge. On the other hand, we add a regularization term to reduce the effect of weak base clustering results. In particular, the results collected from the base clustering methods are combined using self-regulating weights obtained through a pre-learning process, which efficiently enhances the effect of active clustering methods while weakening the effect of inactive ones.
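The idea of combining base clusterings with per-method weights can be illustrated with a small sketch. This is not the EC-PGMGR algorithm: it uses a plain weighted co-association matrix and connected components, the weights are set by hand rather than learned in a pre-learning step, and the function names and the 0.5 threshold are hypothetical choices made only for illustration.

```python
def weighted_coassociation(partitions, weights):
    """Entry (i, j) is the weighted share of base partitions that put
    cells i and j in the same cluster (weights are normalized to sum to 1)."""
    n = len(partitions[0])
    total = sum(weights)
    matrix = [[0.0] * n for _ in range(n)]
    for labels, w in zip(partitions, weights):
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    matrix[i][j] += w / total
    return matrix


def consensus_clusters(matrix, threshold=0.5):
    """Label the connected components of the graph that links cells whose
    co-association exceeds the (hypothetical) threshold."""
    n = len(matrix)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and matrix[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Three base clusterings of six cells; the third disagrees with the others
# and is given a small, hand-picked weight (an assumption, not a learned value).
base_a = [0, 0, 0, 1, 1, 1]
base_b = [1, 1, 1, 0, 0, 0]
base_c = [0, 1, 0, 1, 0, 1]
coassoc = weighted_coassociation([base_a, base_b, base_c], [0.45, 0.45, 0.10])
consensus = consensus_clusters(coassoc)
print(consensus)
```

In this toy setting the number of consensus clusters is not passed in; it falls out of the thresholded graph, and the down-weighted third clustering is too weak to merge or split the groups on which the other two agree.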
Experiments are carried out on 7 data sets generated by different platforms, with the number of single cells ranging from 822 to 5,132. The results show that EC-PGMGR performs better than 4 alternative individual clustering methods and 2 ensemble methods in terms of accuracy, measured by the Adjusted Rand Index (ARI) and Normalized Mutual Information (NMI), as well as robustness, effectiveness and so on. EC-PGMGR also provides an effective way to integrate different clustering results for more accurate and reliable results in further biological analysis. It may also provide new insights for other applications of clustering.
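For readers unfamiliar with the two accuracy measures, the following is a minimal pure-Python sketch of how ARI and NMI can be computed from a reference labeling and a predicted labeling. The function names are illustrative, and the NMI normalization (arithmetic mean of the two entropies) is an assumption, since the abstract does not say which variant was used.

```python
from collections import Counter
from math import comb, log


def adjusted_rand_index(truth, pred):
    """Adjusted Rand Index computed from the contingency table of two labelings."""
    n = len(truth)
    if n < 2:
        return 1.0
    cells = Counter(zip(truth, pred))
    rows = Counter(truth)
    cols = Counter(pred)
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())
    expected = sum_rows * sum_cols / comb(n, 2)
    maximum = (sum_rows + sum_cols) / 2
    if maximum == expected:
        # Degenerate case, e.g. both labelings put every cell in one cluster.
        return 1.0
    return (sum_cells - expected) / (maximum - expected)


def normalized_mutual_information(truth, pred):
    """Mutual information divided by the arithmetic mean of the two entropies
    (one common normalization; assumed here, not taken from the paper)."""
    n = len(truth)
    cells = Counter(zip(truth, pred))
    rows = Counter(truth)
    cols = Counter(pred)
    mi = sum((c / n) * log(c * n / (rows[a] * cols[b]))
             for (a, b), c in cells.items())
    h_rows = -sum((c / n) * log(c / n) for c in rows.values())
    h_cols = -sum((c / n) * log(c / n) for c in cols.values())
    if h_rows + h_cols == 0:
        return 1.0
    return 2 * mi / (h_rows + h_cols)


reference = [0, 0, 0, 1, 1, 1]
relabeled = [1, 1, 1, 0, 0, 0]   # same grouping, different label names
oversplit = [0, 0, 1, 1, 2, 2]   # three clusters instead of two
print(adjusted_rand_index(reference, relabeled))
print(adjusted_rand_index(reference, oversplit))
print(normalized_mutual_information(reference, oversplit))
```

Both scores depend only on how cells are grouped, not on the label names, so a relabeled but otherwise identical clustering scores as a perfect match, while the over-split clustering scores lower.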

journal_name

Front Genet

journal_title

Frontiers in genetics

authors

Zhu Y,Zhang DX,Zhang XF,Yi M,Ou-Yang L,Wu M

doi

10.3389/fgene.2020.572242

subject

Has Abstract

pub_date

2020-11-04 00:00:00

pages

572242

issn

1664-8021

journal_volume

11

pub_type

Journal Article
  • Indy mutations and Drosophila longevity.

    abstract::Decreased expression of the fly and worm Indy genes extends longevity. The fly Indy gene and its mammalian homolog are transporters of Krebs cycle intermediates, with the highest rate of uptake for citrate. Cytosolic citrate has a role in energy regulation by affecting fatty acid synthesis and glycolysis. Fly, worm, a...

    journal_title:Frontiers in genetics

    pub_type: Journal Article

    doi:10.3389/fgene.2013.00047

    authors: Rogina B,Helfand SL

    update_date:2013-04-08 00:00:00

  • Component-Based Design and Assembly of Heuristic Multiple Sequence Alignment Algorithms.

    abstract::In recent years, there has been an explosive increase in the amount of bioinformatics data produced, but data are not information. The purpose of bioinformatics research is to obtain information with biological significance from large amounts of data. Multiple sequence alignment is widely used in sequence homology det...

    journal_title:Frontiers in genetics

    pub_type: Journal Article

    doi:10.3389/fgene.2020.00105

    authors: Shi H,Zhang X

    update_date:2020-02-27 00:00:00

  • Simple, standardized incorporation of genetic risk into non-genetic risk prediction tools for complex traits: coronary heart disease as an example.

    abstract:PURPOSE:Genetic risk assessment is becoming an important component of clinical decision-making. Genetic Risk Scores (GRSs) allow the composite assessment of genetic risk in complex traits. A technically and clinically pertinent question is how to most easily and effectively combine a GRS with an assessment of clinical ...

    journal_title:Frontiers in genetics

    pub_type: Journal Article

    doi:10.3389/fgene.2014.00254

    authors: Goldstein BA,Knowles JW,Salfati E,Ioannidis JP,Assimes TL

    update_date:2014-08-01 00:00:00

  • Immunohistochemical Evaluation of Histological Change in a Chinese Milroy Disease Family With Venous and Skin Abnormities.

    abstract::Background: Milroy disease (MD) is a rare autosomal dominant disorder resulting from mutations of the vascular endothelial growth factor receptor-3 (VEGFR-3 or FLT4), which lead to dysgenesis of the lymphatic system. Methods: Here we report a Chinese MD family with 2 affected members across two generations. We identified the mu...

    journal_title:Frontiers in genetics

    pub_type:

    doi:10.3389/fgene.2019.00206

    authors: Zhang S,Chen X,Yuan L,Wang S,Moli D,Liu S,Wu Y

    update_date:2019-03-19 00:00:00

  • Identification of Pathogenic Mutations and Investigation of the NOTCH Pathway Activation in Kartagener Syndrome.

    abstract::Primary ciliary dyskinesia (PCD), a rare genetic disorder, is mostly caused by defects in more than 40 known cilia structure-related genes. However, in approximately 20-35% of patients, it is caused by unknown genetic factors, and the inherited pathogenic factors are difficult to confirm. Kartagener syndrome (KTS) is ...

    journal_title:Frontiers in genetics

    pub_type: Journal Article

    doi:10.3389/fgene.2019.00749

    authors: Yue Y,Huang Q,Zhu P,Zhao P,Tan X,Liu S,Li S,Han X,Cheng L,Li B,Fu Y

    update_date:2019-08-22 00:00:00

  • The molecular pathways underlying host resistance and tolerance to pathogens.

    abstract::Breeding livestock that are better able to withstand the onslaught of endemic and exotic pathogens is high on the wish list of breeders and farmers worldwide. However, the defense systems in both pathogens and their hosts are complex and the degree of genetic variation in resistance and tolerance will depend on the ...

    journal_title:Frontiers in genetics

    pub_type: Journal Article

    doi:10.3389/fgene.2012.00263

    authors: Glass EJ

    update_date:2012-12-14 00:00:00

  • Experimental assessment of static and dynamic algorithms for gene regulation inference from time series expression data.

    abstract::Accurate inference of causal gene regulatory networks from gene expression data is an open bioinformatics challenge. Gene interactions are dynamical processes and consequently we can expect that the effect of any regulation action occurs after a certain temporal lag. However such lag is unknown a priori and temporal a...

    journal_title:Frontiers in genetics

    pub_type: Journal Article

    doi:10.3389/fgene.2013.00303

    authors: Lopes M,Bontempi G

    update_date:2013-12-24 00:00:00

  • The Cell Type-Specific Functions of miR-21 in Cardiovascular Diseases.

    abstract::Cardiovascular diseases are one of the prime reasons for disability and death worldwide. Diseases and conditions, such as hypoxia, pressure overload, infection, and hyperglycemia, might initiate cardiac remodeling and dysfunction by inducing hypertrophy or apoptosis in cardiomyocytes and by promoting proliferation in ...

    journal_title:Frontiers in genetics

    pub_type: Journal Article, Review

    doi:10.3389/fgene.2020.563166

    authors: Dai B,Wang F,Nie X,Du H,Zhao Y,Yin Z,Li H,Fan J,Wen Z,Wang DW,Chen C

    update_date:2020-11-20 00:00:00

  • A Novel 12q13.2-q13.3 Microdeletion Syndrome With Combined Features of Diamond Blackfan Anemia, Pierre Robin Sequence and Klippel Feil Deformity.

    abstract::Diamond-Blackfan anemia (DBA) is a rare congenital erythroid aplasia with a highly heterogeneous genetic background; it usually occurs in infancy. Approximately 30-40% of patients have other associated congenital anomalies; in particular, facial anomalies, such as cleft palate, are part of about 10% of the DBA clinica...

    journal_title:Frontiers in genetics

    pub_type:

    doi:10.3389/fgene.2018.00549

    authors: Roberti D,Conforti R,Giugliano T,Brogna B,Tartaglione I,Casale M,Piluso G,Perrotta S

    update_date:2018-11-19 00:00:00

  • Machine Learning on Human Muscle Transcriptomic Data for Biomarker Discovery and Tissue-Specific Drug Target Identification.

    abstract::For the past several decades, research in understanding the molecular basis of human muscle aging has progressed significantly. However, the development of accessible tissue-specific biomarkers of human muscle aging that may be measured to evaluate the effectiveness of therapeutic interventions is still a major challe...

    journal_title:Frontiers in genetics

    pub_type: Journal Article

    doi:10.3389/fgene.2018.00242

    authors: Mamoshina P,Volosnikova M,Ozerov IV,Putin E,Skibina E,Cortese F,Zhavoronkov A

    update_date:2018-07-12 00:00:00

  • Exploring the RNA Gap for Improving Diagnostic Yield in Primary Immunodeficiencies.

    abstract::Challenges in diagnosing primary immunodeficiency are numerous and diverse, with current whole-exome and whole-genome sequencing approaches only able to reach a molecular diagnosis in 25-60% of cases. We assess these problems and discuss how RNA-focused analysis has expanded and improved in recent years and may now be...

    journal_title:Frontiers in genetics

    pub_type: Journal Article, Review

    doi:10.3389/fgene.2019.01204

    authors: Lye JJ,Williams A,Baralle D

    update_date:2019-12-11 00:00:00

  • Multifunctional roles of the mammalian CCR4-NOT complex in physiological phenomena.

    abstract::The carbon catabolite repression 4 (CCR4)-negative on TATA-less (NOT) complex serves as one of the major deadenylases of eukaryotes. Although it was originally identified and characterized in yeast, recent studies have revealed that the CCR4-NOT complex also exerts important functions in mammals, including humans. Ho...

    journal_title:Frontiers in genetics

    pub_type: Journal Article, Review

    doi:10.3389/fgene.2014.00286

    authors: Shirai YT,Suzuki T,Morita M,Takahashi A,Yamamoto T

    update_date:2014-08-21 00:00:00

  • Genome-Wide Association Studies and Genomic Selection in Pearl Millet: Advances and Prospects.

    abstract::Pearl millet is a climate-resilient, drought-tolerant crop capable of growing in marginal environments of arid and semi-arid regions globally. Pearl millet is a staple food for more than 90 million people living in poverty and can address the triple burden of malnutrition substantially. It remained a neglected crop un...

    journal_title:Frontiers in genetics

    pub_type: Journal Article, Review

    doi:10.3389/fgene.2019.01389

    authors: Srivastava RK,Singh RB,Pujarula VL,Bollam S,Pusuluri M,Chellapilla TS,Yadav RS,Gupta R

    update_date:2020-02-28 00:00:00

  • Gradient Boosting Decision Tree-Based Method for Predicting Interactions Between Target Genes and Drugs.

    abstract::Determining the target genes that interact with drugs (drug-target interactions) plays an important role in drug discovery. Identification of drug-target interactions through biological experiments is time consuming, laborious, and costly. Therefore, using computational approaches to predict candidate targets is a good ...

    journal_title:Frontiers in genetics

    pub_type: Journal Article

    doi:10.3389/fgene.2019.00459

    authors: Xuan P,Sun C,Zhang T,Ye Y,Shen T,Dong Y

    update_date:2019-05-31 00:00:00

  • Bayesian, Likelihood-Free Modelling of Phenotypic Plasticity and Variability in Individuals and Populations.

    abstract::There is a paradigm shift from the traditional focus on the "average" individual towards the definition and analysis of trait variation within individual life-history and among individuals in populations. This is a result of increasing availability of individual phenotypic data. The shift allows the use of genetic and...

    journal_title:Frontiers in genetics

    pub_type: Journal Article

    doi:10.3389/fgene.2019.00727

    authors: Filipe JAN,Kyriazakis I

    update_date:2019-09-20 00:00:00

  • Experimental and Field Data Support Range Expansion in an Allopolyploid Arabidopsis Owing to Parental Legacy of Heavy Metal Hyperaccumulation.

    abstract::Empirical evidence is limited on whether allopolyploid species combine or merge parental adaptations to broaden habitats. The allopolyploid Arabidopsis kamchatica is a hybrid of the two diploid parents Arabidopsis halleri and Arabidopsis lyrata. A. halleri is a facultative heavy metal hyperaccumulator, and may be foun...

    journal_title:Frontiers in genetics

    pub_type: Journal Article

    doi:10.3389/fgene.2020.565854

    authors: Paape T,Akiyama R,Cereghetti T,Onda Y,Hirao AS,Kenta T,Shimizu KK

    update_date:2020-09-30 00:00:00

  • Dysregulation of MicroRNA Regulatory Network in Lower Extremities Arterial Disease.

    abstract::Atherosclerosis and its comorbidities are the major contributors to the global burden of death worldwide. Lower extremities arterial disease (LEAD) is a common manifestation of atherosclerotic disease of arteries of lower extremities. MicroRNAs belong to epigenetic factors that regulate gene expression and have not ye...

    journal_title:Frontiers in genetics

    pub_type: Journal Article

    doi:10.3389/fgene.2019.01200

    authors: Bogucka-Kocka A,Zalewski DP,Ruszel KP,Stępniewski A,Gałkowski D,Bogucki J,Komsta Ł,Kołodziej P,Zubilewicz T,Feldo M,Kocki J

    update_date:2019-11-22 00:00:00

  • The Challenges of Microbial Control of Mosquito-Borne Diseases Due to the Gut Microbiome.

    abstract::Mosquitoes are one of the deadliest animals on earth because of their ability to transmit a wide range of human pathogens. Traditional mosquito control methods use chemical insecticides, but with dwindling long-term effectiveness and negative effects on the environment, microbial forms of control have become common al...

    journal_title:Frontiers in genetics

    pub_type: Journal Article, Review

    doi:10.3389/fgene.2020.504354

    authors: Dacey DP,Chain FJJ

    update_date:2020-10-07 00:00:00

  • Genome-Wide Association Study Uncovers Novel Genomic Regions Associated With Coleoptile Length in Hard Winter Wheat.

    abstract::Successful seedling establishment depends on the optimum depth of seed placement especially in drought-prone conditions, providing an opportunity to exploit subsoil water and increase winter survival in winter wheat. Coleoptile length is a key determinant for the appropriate depth at which seed can be sown. Thus, unde...

    journal_title:Frontiers in genetics

    pub_type: Journal Article

    doi:10.3389/fgene.2019.01345

    authors: Sidhu JS,Singh D,Gill HS,Brar NK,Qiu Y,Halder J,Al Tameemi R,Turnipseed B,Sehgal SK

    update_date:2020-02-05 00:00:00

  • Putative Epigenetic Biomarkers of Stress in Red Blood Cells of Chickens Reared Across Different Biomes.

    abstract::Production animals are constantly subjected to early adverse environmental conditions that influence the adult phenotype and produce epigenetic effects. CpG dinucleotide methylation in red blood cells (RBC) could be a useful epigenetic biomarker to identify animals subjected to chronic stress in the production environ...

    journal_title:Frontiers in genetics

    pub_type: Journal Article

    doi:10.3389/fgene.2020.508809

    authors: Pértille F,Ibelli AMG,Sharif ME,Poleti MD,Fröhlich AS,Rezaei S,Ledur MC,Jensen P,Guerrero-Bosagna C,Coutinho LL

    update_date:2020-11-02 00:00:00

  • Inherited and Acquired Determinants of Hepatic CYP3A Activity in Humans.

    abstract::Human CYP3A enzymes (including CYP3A4 and CYP3A5) metabolize about 40% of all drugs and numerous other environmental and endogenous substances. CYP3A activity is highly variable within and between humans. As a consequence, therapy with standard doses often results in too low or too high blood and tissue concentrations...

    journal_title:Frontiers in genetics

    pub_type: Journal Article

    doi:10.3389/fgene.2020.00944

    authors: Matthaei J,Bonat WH,Kerb R,Tzvetkov MV,Strube J,Brunke S,Sachse-Seeboth C,Sehrt D,Hofmann U,von Bornemann Hjelmborg J,Schwab M,Brockmöller J

    update_date:2020-08-21 00:00:00

  • TMEM205 Is an Independent Prognostic Factor and Is Associated With Immune Cell Infiltrates in Hepatocellular Carcinoma.

    abstract::Hepatocellular carcinoma (HCC) is the second leading cause of cancer-related death worldwide despite the availability of diverse treatment strategies. Much research progress has been made regarding immunotherapy but the effects remain unsatisfactory, highlighting the urgent need for novel immune-related therapy target...

    journal_title:Frontiers in genetics

    pub_type: Journal Article

    doi:10.3389/fgene.2020.575776

    authors: Rao J,Wu X,Zhou X,Deng R,Ma Y

    update_date:2020-10-14 00:00:00

  • MDR1 Gene Polymorphisms and Its Association With Expression as a Clinical Relevance in Terms of Response to Chemotherapy and Prognosis in Ovarian Cancer.

    abstract::In spite of the significant advancements in treatment modalities, 30% of advanced stage ovarian cancer (OC) patients do not respond to the standard chemotherapeutic regimen and most of the responders finally relapse over time due to the escalation of the multidrug resistance (MDR) phenomenon. Our present study evaluat...

    journal_title:Frontiers in genetics

    pub_type: Journal Article

    doi:10.3389/fgene.2020.00516

    authors: Haque A,Sait KHW,Alam Q,Alam MZ,Anfinan N,Wali AWN,Rasool M

    update_date:2020-05-26 00:00:00

  • Ultra-Sensitive Automated Profiling of EpCAM Expression on Tumor-Derived Extracellular Vesicles.

    abstract::Extracellular vesicles (EVs) are abundant in most biological fluids and considered promising biomarker candidates, but the development of EV biomarker assays is hindered, in part, by their requirement for prior EV purification and the lack of standardized and reproducible EV isolation methods. We now describe a far-fi...

    journal_title:Frontiers in genetics

    pub_type: Journal Article

    doi:10.3389/fgene.2019.01273

    authors: Amrollahi P,Rodrigues M,Lyon CJ,Goel A,Han H,Hu TY

    update_date:2019-12-17 00:00:00

  • C6orf10 Low-Frequency and Rare Variants in Italian Multiple Sclerosis Patients.

    abstract::In light of the complex nature of multiple sclerosis (MS) and the recently estimated contribution of low-frequency variants into disease, decoding its genetic risk components requires novel variant prioritization strategies. We selected, by reviewing MS Genome Wide Association Studies (GWAS), 107 candidate loci marked...

    journal_title:Frontiers in genetics

    pub_type: Journal Article

    doi:10.3389/fgene.2019.00573

    authors: Ziliotto N,Marchetti G,Scapoli C,Bovolenta M,Meneghetti S,Benazzo A,Lunghi B,Balestra D,Laino LA,Bozzini N,Guidi I,Salvi F,Straudi S,Gemmati D,Menegatti E,Zamboni P,Bernardi F

    update_date:2019-06-26 00:00:00

  • Cancer as a Tissue Anomaly: Classifying Tumor Transcriptomes Based Only on Healthy Data.

    abstract::Since the turn of the century, researchers have sought to diagnose cancer based on gene expression signatures measured from the blood or biopsy as biomarkers. This task, known as classification, is typically solved using a suite of algorithms that learn a mathematical rule capable of discriminating one group ("cases")...

    journal_title:Frontiers in genetics

    pub_type: Journal Article

    doi:10.3389/fgene.2019.00599

    authors: Quinn TP,Nguyen T,Lee SC,Venkatesh S

    update_date:2019-07-02 00:00:00

  • Microsatellite-Based Genetic Structure and Diversity of Local Arabian Sheep Breeds.

    abstract::The genetic diversity of the sheep breeds in the Arab countries might be considered to be a mirror of the ecology of the region. In this study, the genetic structure and diversity of sheep breeds from Saudi Arabia (Harri, Najdi, Naemi, Arb, and Rufidi) and Awassi sheep from Jordan as an out-group were investigated usi...

    journal_title:Frontiers in genetics

    pub_type: Journal Article

    doi:10.3389/fgene.2018.00408

    authors: Al-Atiyat RM,Aljumaah RS,Alshaikh MA,Abudabos AM

    update_date:2018-09-25 00:00:00

  • Conserved Disease Modules Extracted From Multilayer Heterogeneous Disease and Gene Networks for Understanding Disease Mechanisms and Predicting Disease Treatments.

    abstract::Disease relationship studies for understanding the pathogenesis of complex diseases, diagnosis, prognosis, and drug development are important. Traditional approaches consider one type of disease data or aggregating multiple types of disease data into a single network, which results in important temporal- or context-re...

    journal_title:Frontiers in genetics

    pub_type: Journal Article

    doi:10.3389/fgene.2018.00745

    authors: Yu L,Yao S,Gao L,Zha Y

    update_date:2019-01-18 00:00:00

  • The Precise Diagnosis of Wolfram Syndrome Type 1 Based on Next-Generation Sequencing.

    abstract::Purpose: To explore a method for the early, rapid and accurate diagnosis of Wolfram syndrome 1 (WS1) and further enrich the spectrum of WFS1 mutations in the Chinese population. Methods: We analyzed 279 patients with unexplained optic atrophy using next-generation sequencing. All patients underwent detailed clinical e...

    journal_title:Frontiers in genetics

    pub_type: Journal Article

    doi:10.3389/fgene.2019.01217

    authors: Wang DD,Hu FY,Gao FJ,Zhang SH,Xu P,Tian GH,Wu JH

    update_date:2019-11-26 00:00:00

  • Screening and Identification of Potential Prognostic Biomarkers in Adrenocortical Carcinoma.

    abstract::Objective: Adrenocortical carcinoma (ACC) is a rare but aggressive malignant cancer that has been attracting growing attention over recent decades. This study aims to integrate protein interaction networks with gene expression profiles to identify potential biomarkers with prognostic value in silico. Methods: Three mi...

    journal_title:Frontiers in genetics

    pub_type: Journal Article

    doi:10.3389/fgene.2019.00821

    authors: Xu WH,Wu J,Wang J,Wan FN,Wang HK,Cao DL,Qu YY,Zhang HL,Ye DW

    update_date:2019-09-11 00:00:00