Abstract:
The prevalence of dropout events is a serious problem for single-cell Hi-C (scHiC) data due to insufficient sequencing depth and data coverage, which brings difficulties in downstream studies such as clustering and structural analysis. Complicating things further is the fact that dropouts are confounded with structural zeros due to underlying properties, leading to observed zeros being a mixture of both types of events. Although a great deal of progress has been made in imputing dropout events for single-cell RNA-sequencing (RNA-seq) data, little has been done in identifying structural zeros and imputing dropouts for scHiC data. In this paper, we adapted several methods from the single-cell RNA-seq literature for inference on observed zeros in scHiC data and evaluated their effectiveness. Through an extensive simulation study and real data analysis, we have shown that a couple of the adapted single-cell RNA-seq algorithms can be powerful for correctly identifying structural zeros and accurately imputing dropout values. Downstream analysis using the imputed values showed considerable improvement for clustering cells of the same types together over clustering results before imputation.
journal_name:Brief Bioinform
journal_title:Briefings in bioinformatics
authors:Han C, Xie Q, Lin S
doi:10.1093/bib/bbaa289
subject:Has Abstract
pub_date:2020-11-17 00:00:00
eissn:1467-5463
issn:1477-4054
pii:5985294
pub_type: Journal Article
abstract::Information Integrator is an extension to IBM's relational database DB2, which uses data federation to provide benefits to molecular biology researchers through two unique capabilities: increased flexibility in combining data from disparate sources, and SQL access to non-SQL data, easing the task of automating data an...
journal_title:Briefings in bioinformatics
pub_type: Journal Article
doi:10.1093/bib/4.4.375
update_date:2003-12-01 00:00:00
abstract::Circular RNAs (circRNAs) are a family of RNAs generated by RNA circularization and have been found ubiquitously across different species and tissues. However, there is no global view of tissue specificity for circRNAs to date. Here we performed a comprehensive analysis to characterize the features of human and m...
journal_title:Briefings in bioinformatics
pub_type: Journal Article
doi:10.1093/bib/bbw081
update_date:2017-11-01 00:00:00
abstract::The use of multiple testing procedures in the context of gene-set testing is an important but relatively underexposed topic. If a multiple testing method is used, this is usually a standard familywise error rate (FWER) or false discovery rate (FDR) controlling procedure in which the logical relationships that exist be...
journal_title:Briefings in bioinformatics
pub_type: Journal Article
doi:10.1093/bib/bbv091
update_date:2016-09-01 00:00:00
abstract::The discipline of bioinformatics has developed rapidly since the complete sequencing of the first genomes in the 1990s. The development of many high-throughput techniques during the last decades has ensured that bioinformatics has grown into a discipline that overlaps with, and is required for, the modern practice of ...
journal_title:Briefings in bioinformatics
pub_type: Historical Article, Journal Article
doi:10.1093/bib/bbu022
update_date:2015-03-01 00:00:00
abstract::Occurrence and development of cancers are governed by complex networks of interacting intercellular and intracellular signals. The technology of single-cell RNA sequencing (scRNA-seq) provides an unprecedented opportunity for dissecting the interplay between the cancer cells and the associated microenvironment. Here w...
journal_title:Briefings in bioinformatics
pub_type: Journal Article
doi:10.1093/bib/bbz040
update_date:2020-05-21 00:00:00
abstract::Phylogenomic databases provide orthology predictions for species with fully sequenced genomes. Although the goal seems well-defined, the content of these databases differs greatly. Seven ortholog databases (Ensembl Compara, eggNOG, HOGENOM, InParanoid, OMA, OrthoDB, Panther) were compared on the basis of reference tre...
journal_title:Briefings in bioinformatics
pub_type: Journal Article
doi:10.1093/bib/bbr034
update_date:2011-09-01 00:00:00
abstract::As a group of important plant species in agriculture and biology, polyploids have been increasingly studied in terms of their genome structure and organization. There are two types of polyploids, allopolyploids and autopolyploids, each resulting from a different genetic origin, which undergo meiotic divisions of a dis...
journal_title:Briefings in bioinformatics
pub_type: Journal Article
doi:10.1093/bib/bbt075
update_date:2015-01-01 00:00:00
abstract::Gene expression data have played an essential role in many biomedical studies. When the number of genes is large and sample size is limited, there is a 'lack of information' problem, leading to low-quality findings. To tackle this problem, both horizontal and vertical data integrations have been developed, where verti...
journal_title:Briefings in bioinformatics
pub_type: Journal Article
doi:10.1093/bib/bbaa169
update_date:2020-08-14 00:00:00
abstract::Mediation analysis has been a useful tool for investigating the effect of mediators that lie in the path from the independent variable to the outcome. With the increasing dimensionality of mediators such as in (epi)genomics studies, a high-dimensional mediation model is needed. In this work, we focus on epigenetic studi...
journal_title:Briefings in bioinformatics
pub_type: Journal Article
doi:10.1093/bib/bbaa113
update_date:2020-07-01 00:00:00
abstract::In recent years, high-throughput genomic technologies like chromatin immunoprecipitation sequencing (ChIP-seq) and transcriptome sequencing (RNA-seq) have become both more refined and less expensive, making them more accessible. Many circular RNAs (circRNAs) that originate from back-spliced exons have been iden...
journal_title:Briefings in bioinformatics
pub_type: Journal Article
doi:10.1093/bib/bby083
update_date:2019-11-27 00:00:00
abstract:MOTIVATION:Long noncoding RNAs (lncRNAs) correspond to a eukaryotic noncoding RNA class that gained great attention in the past years as a higher layer of regulation for gene expression in cells. There is, however, a lack of specific computational approaches to reliably predict lncRNA in plants, which contrast the vari...
journal_title:Briefings in bioinformatics
pub_type: Journal Article, Review
doi:10.1093/bib/bby034
update_date:2019-03-25 00:00:00
abstract::Broader functional annotation of known as well as putative genetic variations is a valuable means for prioritizing targets in disease studies and large-scale genotyping projects. In this article, we present a practical guide to SNPnexus, a web-based tool that provides an aggregate set of functional annotations for geno...
journal_title:Briefings in bioinformatics
pub_type: Journal Article
doi:10.1093/bib/bbt004
update_date:2013-07-01 00:00:00
abstract::Most current gene expression signatures for cancer prognosis are based on risk scores, usually calculated as some summaries of expression levels of the signature genes, whose applications require presetting risk score thresholds and data normalization. In this study, we demonstrate the critical limitations of such ...
journal_title:Briefings in bioinformatics
pub_type: Journal Article
doi:10.1093/bib/bbv064
update_date:2016-03-01 00:00:00
abstract::Patients with spinal muscular atrophy (SMA) are susceptible to respiratory infections and might be at a heightened risk of poor clinical outcomes upon contracting coronavirus disease 2019 (COVID-19). In the face of the COVID-19 pandemic, the potential associations of SMA with the susceptibility to and prognosticat...
journal_title:Briefings in bioinformatics
pub_type: Journal Article
doi:10.1093/bib/bbaa285
update_date:2020-11-14 00:00:00
abstract::Precision and personalized medicine will be increasingly based on the integration of various types of information, particularly electronic health records and genome sequences. The availability of cheap genome sequencing services and the information interoperability will increase the role of online bioinformatics analys...
journal_title:Briefings in bioinformatics
pub_type: Journal Article
doi:10.1093/bib/bbw078
update_date:2017-11-01 00:00:00
abstract::Phase separation is an important mechanism that mediates the spatial distribution of proteins in different cellular compartments. While phase-separated proteins share certain sequence characteristics, including intrinsically disordered regions (IDRs) and prion-like domains, such characteristics are insufficient for ma...
journal_title:Briefings in bioinformatics
pub_type: Journal Article
doi:10.1093/bib/bbaa187
update_date:2020-09-02 00:00:00
abstract::Despite gene expression programs being notoriously complex, RNA abundance is usually assumed as a proxy for transcriptional activity. Recently developed approaches, able to disentangle transcriptional and post-transcriptional regulatory processes, have revealed a more complex scenario. It is now possible to work out h...
journal_title:Briefings in bioinformatics
pub_type: Journal Article
doi:10.1093/bib/bbaa389
update_date:2020-12-22 00:00:00
abstract::Precision medicine promises to revolutionize treatment, shifting therapeutic approaches from the classical one-size-fits-all to those more tailored to the patient's individual genomic profile, lifestyle and environmental exposures. Yet, to advance precision medicine's main objective-ensuring the optimum diagnosis, tre...
journal_title:Briefings in bioinformatics
pub_type: Journal Article
doi:10.1093/bib/bbaa033
update_date:2021-01-18 00:00:00
abstract::The residence of spliceosomal introns within protein-coding genes can fluctuate over time, with genes gaining, losing or conserving introns in a complex process that is not entirely understood. One approach for studying intron evolution is to compare introns with respect to position and type within closely related gen...
journal_title:Briefings in bioinformatics
pub_type: Journal Article
doi:10.1093/bib/bbp051
update_date:2009-11-01 00:00:00
abstract::This Briefing reviews the widely used, currently active, up-to-date databases derived from the worldwide Protein Data Bank (PDB) to facilitate browsing, finding and exploring its entries. These databases contain visualization and analysis tools tailored to specific kinds of molecules and interactions, often including ...
journal_title:Briefings in bioinformatics
pub_type: Journal Article
doi:10.1093/bib/bbw049
update_date:2017-07-01 00:00:00
abstract::Accurate inference of orthologous genes is a pre-requisite for most comparative genomics studies, and is also important for functional annotation of new genomes. Identification of orthologous gene sets typically involves phylogenetic tree analysis, heuristic algorithms based on sequence conservation, synteny analysis,...
journal_title:Briefings in bioinformatics
pub_type: Journal Article
doi:10.1093/bib/bbr030
update_date:2011-09-01 00:00:00
abstract::Prognostic tests using expression profiles of several dozen genes help provide treatment choices for prostate cancer (PCa). However, these tests require improvement to meet the clinical need for resolving overtreatment, which continues to be a pervasive problem in PCa management. Genomic selection (GS) methodology, wh...
journal_title:Briefings in bioinformatics
pub_type: Journal Article
doi:10.1093/bib/bbaa197
update_date:2020-09-08 00:00:00
abstract::SARS-CoV-2 is an intensively investigated virus from the order Nidovirales (Coronaviridae family) that causes COVID-19 disease in humans. Through enormous scientific effort, thousands of viral strains have been sequenced to date, thereby creating a strong background for deep bioinformatics studies of the SARS-CoV-2 ge...
journal_title:Briefings in bioinformatics
pub_type: Journal Article
doi:10.1093/bib/bbaa385
update_date:2020-12-21 00:00:00
abstract::Glycosylation of proteins is involved in immune defense, cell-cell adhesion, cellular recognition and pathogen binding and is one of the most common and complex post-translational modifications. Science is still struggling to assign detailed mechanisms and functions to this form of conjugation. Even the structural ana...
journal_title:Briefings in bioinformatics
pub_type: Journal Article, Review
doi:10.1093/bib/bbs045
update_date:2013-05-01 00:00:00
abstract::Integrated modelling of biological systems is challenged by composing components with sufficient kinetic data and components with insufficient kinetic data or components built only using experts' experience and knowledge. Fuzzy continuous Petri nets (FCPNs) combine continuous Petri nets with fuzzy inference systems, a...
journal_title:Briefings in bioinformatics
pub_type: Journal Article
doi:10.1093/bib/bbz114
update_date:2021-01-18 00:00:00
abstract::A key challenge of the post-genomic era is the identification of the function(s) of all the molecules in a given organism. Here, we review the status of sequence and structure-based approaches to protein function inference and ligand screening that can provide functional insights for a significant fraction of the appr...
journal_title:Briefings in bioinformatics
pub_type: Journal Article, Review
doi:10.1093/bib/bbp017
update_date:2009-07-01 00:00:00
abstract::The outbreak caused by the novel coronavirus SARS-CoV-2 has been declared a global health emergency. G-quadruplex structures in genomes have long been considered essential for regulating a number of biological processes in a plethora of organisms. We have analyzed and identified 25 four contiguous GG runs (G2NxG2NyG2N...
journal_title:Briefings in bioinformatics
pub_type: Journal Article
doi:10.1093/bib/bbaa114
update_date:2020-06-01 00:00:00
abstract::As the number of completely sequenced genomes rapidly increases, including now the complete Human Genome sequence, the post-genomic problems of genome-scale protein structure determination and the issue of gene function identification become ever more pressing. In fact, these problems can be seen as interrelated in th...
journal_title:Briefings in bioinformatics
pub_type: Journal Article, Review
doi:10.1093/bib/2.2.111
update_date:2001-05-01 00:00:00
abstract::Alternative polyadenylation (APA) in breast tumor samples results in the removal/addition of cis-regulatory elements such as microRNA (miRNA) target sites in the 3'-untranslated region (3'-UTRs) of genes. Although previous computational APA studies focused on a subset of genes strongly affected by APA (APA genes), we ...
journal_title:Briefings in bioinformatics
pub_type: Journal Article
doi:10.1093/bib/bbaa191
update_date:2020-08-26 00:00:00
abstract:BACKGROUND:Whole genome sequencing (WGS) is increasingly used for Mycobacterium tuberculosis (Mtb) research. Countries with the highest tuberculosis (TB) burden face important challenges to integrate WGS into surveillance and research. METHODS:We assessed the global status of Mtb WGS and developed a 3-week training co...
journal_title:Briefings in bioinformatics
pub_type: Journal Article
doi:10.1093/bib/bbaa246
update_date:2020-10-03 00:00:00