HiCRep: assessing the reproducibility of Hi-C data using a stratum-adjusted correlation coefficient.

Abstract:

Hi-C is a powerful technology for studying genome-wide chromatin interactions. However, current methods for assessing Hi-C data reproducibility can produce misleading results because they ignore spatial features in Hi-C data, such as domain structure and distance dependence. We present HiCRep, a framework for assessing the reproducibility of Hi-C data that systematically accounts for these features. In particular, we introduce a novel similarity measure, the stratum-adjusted correlation coefficient (SCC), for quantifying the similarity between Hi-C interaction matrices. Not only does it provide a statistically sound and reliable evaluation of reproducibility, SCC can also be used to quantify differences between Hi-C contact matrices and to determine the optimal sequencing depth for a desired resolution. The measure consistently shows higher accuracy than existing approaches in distinguishing subtle differences in reproducibility and depicting interrelationships of cell lineages. The proposed measure is straightforward to interpret and easy to compute, making it well-suited for providing standardized, interpretable, automatable, and scalable quality control. The freely available R package HiCRep implements our approach.
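
The idea of adjusting a correlation for distance dependence can be illustrated with a minimal Python sketch. It is not the HiCRep implementation: the function name `stratified_correlation`, the `max_dist` parameter, and the choice to weight each stratum by its size are assumptions made for illustration only. The sketch groups contacts by genomic distance (one matrix diagonal per stratum), computes a Pearson correlation within each stratum, and averages the results.

```python
import numpy as np

def stratified_correlation(a, b, max_dist):
    """Weighted mean of per-diagonal Pearson correlations (illustrative only)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    num = 0.0
    den = 0.0
    for k in range(1, max_dist + 1):
        # Stratum k: all bin pairs separated by exactly k bins.
        x = np.diagonal(a, offset=k)
        y = np.diagonal(b, offset=k)
        if x.size < 3 or x.std() == 0 or y.std() == 0:
            continue  # correlation is undefined for this stratum
        r = np.corrcoef(x, y)[0, 1]
        w = x.size  # assumption: weight each stratum by its number of bin pairs
        num += w * r
        den += w
    return num / den if den > 0 else float("nan")
```

Because each correlation is computed within a single distance stratum, the strong decay of contact frequency with distance cannot by itself inflate the score. For two identical matrices the function returns 1 up to floating-point error.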

journal_name

Genome Res

journal_title

Genome research

authors

Yang T,Zhang F,Yardımcı GG,Song F,Hardison RC,Noble WS,Yue F,Li Q

doi

10.1101/gr.220640.117

pub_date

2017-11-01 00:00:00

pages

1939-1949

issue

11

eissn

1549-5469

issn

1088-9051

pii

gr.220640.117

journal_volume

27

pub_type

Journal Article
  • H3K27me3 forms BLOCs over silent genes and intergenic regions and specifies a histone banding pattern on a mouse autosomal chromosome.

    abstract::In mammals, genome-wide chromatin maps and immunofluorescence studies show that broad domains of repressive histone modifications are present on pericentromeric and telomeric repeats and on the inactive X chromosome. However, only a few autosomal loci such as silent Hox gene clusters have been shown to lie in broad do...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.080861.108

    authors: Pauler FM,Sloane MA,Huang R,Regha K,Koerner MV,Tamir I,Sommer A,Aszodi A,Jenuwein T,Barlow DP

    update_date:2009-02-01 00:00:00

  • Arabidopsis-rice: will colinearity allow gene prediction across the eudicot-monocot divide?

    abstract::With the genomic sequencing of Arabidopsis nearing completion and rice sequencing very much in its infancy, a key question is whether we can exploit the Arabidopsis sequence to identify candidate genes for traits in cereal crops using a map-based approach. This requires the existence of colinearity between the Arabido...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.9.9.825

    authors: Devos KM,Beales J,Nagamura Y,Sasaki T

    update_date:1999-09-01 00:00:00

  • Realizing the potential of blockchain technologies in genomics.

    abstract::Genomics data introduce a substantial computational burden as well as data privacy and ownership issues. Data sets generated by high-throughput sequencing platforms require immense amounts of computational resources to align to reference genomes and to call and annotate genomic variants. This problem is even more pron...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.207464.116

    authors: Ozercan HI,Ileri AM,Ayday E,Alkan C

    update_date:2018-09-01 00:00:00

  • Alternative approach to a heavy weight problem.

    abstract::Obesity is reaching epidemic proportions in developed countries and represents a significant risk factor for hypertension, heart disease, diabetes, and dyslipidemia. Splicing mutations constitute at least 14% of disease-causing mutations, thus implicating polymorphisms that affect splicing as likely candidates for dis...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.6661308

    authors: Goren A,Kim E,Amit M,Bochner R,Lev-Maor G,Ahituv N,Ast G

    update_date:2008-02-01 00:00:00

  • Relationship between histone modifications and transcription factor binding is protein family specific.

    abstract::The very small fraction of putative binding sites (BSs) that are occupied by transcription factors (TFs) in vivo can be highly variable across different cell types. This observation has been partly attributed to changes in chromatin accessibility and histone modification (HM) patterns surrounding BSs. Previous studies...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.220079.116

    authors: Xin B,Rohs R

    update_date:2018-01-11 00:00:00

  • Discovery of high-confidence human protein-coding genes and exons by whole-genome PhyloCSF helps elucidate 118 GWAS loci.

    abstract::The most widely appreciated role of DNA is to encode protein, yet the exact portion of the human genome that is translated remains to be ascertained. We previously developed PhyloCSF, a widely used tool to identify evolutionary signatures of protein-coding regions using multispecies genome alignments. Here, we present...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.246462.118

    authors: Mudge JM,Jungreis I,Hunt T,Gonzalez JM,Wright JC,Kay M,Davidson C,Fitzgerald S,Seal R,Tweedie S,He L,Waterhouse RM,Li Y,Bruford E,Choudhary JS,Frankish A,Kellis M

    update_date:2019-12-01 00:00:00

  • 2C-Cas9: a versatile tool for clonal analysis of gene function.

    abstract::CRISPR/Cas9-mediated targeted mutagenesis allows efficient generation of loss-of-function alleles in zebrafish. To date, this technology has been primarily used to generate genetic knockout animals. Nevertheless, the study of the function of certain loci might require tight spatiotemporal control of gene inactivation....

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.196170.115

    authors: Di Donato V,De Santis F,Auer TO,Testa N,Sánchez-Iranzo H,Mercader N,Concordet JP,Del Bene F

    update_date:2016-05-01 00:00:00

  • Optical mapping of BAC clones from the human Y chromosome DAZ locus.

    abstract::The accurate mapping of clones derived from genomic regions containing complex arrangements of repeated elements presents special problems for DNA sequencers. Recent advances in the automation of optical mapping have enabled us to map a set of 16 BAC clones derived from the DAZ locus of the human Y chromosome long arm...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.112100

    authors: Giacalone J,Delobette S,Gibaja V,Ni L,Skiadas Y,Qi R,Edington J,Lai Z,Gebauer D,Zhao H,Anantharaman T,Mishra B,Brown LG,Saxena R,Page DC,Schwartz DC

    update_date:2000-09-01 00:00:00

  • Integrated mapping, chromosomal sequencing and sequence analysis of Cryptosporidium parvum.

    abstract::The apicomplexan Cryptosporidium parvum is one of the most prevalent protozoan parasites of humans. We report the physical mapping of the genome of the Iowa isolate, sequencing and analysis of chromosome 6, and approximately 0.9 Mbp of sequence sampled from the remainder of the genome. To construct a robust physical m...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.1555203

    authors: Bankier AT,Spriggs HF,Fartmann B,Konfortov BA,Madera M,Vogel C,Teichmann SA,Ivens A,Dear PH

    update_date:2003-08-01 00:00:00

  • A GC-rich sequence feature in the 3' UTR directs UPF1-dependent mRNA decay in mammalian cells.

    abstract::Up-frameshift protein 1 (UPF1) is an ATP-dependent RNA helicase that has essential roles in RNA surveillance and in post-transcriptional gene regulation by promoting the degradation of mRNAs. Previous studies revealed that UPF1 is associated with the 3' untranslated region (UTR) of target mRNAs via as-yet-unknown sequ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.206060.116

    authors: Imamachi N,Salam KA,Suzuki Y,Akimitsu N

    update_date:2017-03-01 00:00:00

  • Sequencing of cDNA clones from the genetic map of tomato (Lycopersicon esculentum).

    abstract::The dense RFLP linkage map of tomato (Lycopersicon esculentum) contains >300 anonymous cDNA clones. Of those clones, 272 were partially or completely sequenced. The sequences were compared at the DNA and protein level to known genes in databases. For 57% of the clones, a significant match to previously described genes...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.8.8.842

    authors: Ganal MW,Czihal R,Hannappel U,Kloos DU,Polley A,Ling HQ

    update_date:1998-08-01 00:00:00

  • A generic, cost-effective, and scalable cell lineage analysis platform.

    abstract::Advances in single-cell genomics enable commensurate improvements in methods for uncovering lineage relations among individual cells. Current sequencing-based methods for cell lineage analysis depend on low-resolution bulk analysis or rely on extensive single-cell sequencing, which is not scalable and could be biased ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.202903.115

    authors: Biezuner T,Spiro A,Raz O,Amir S,Milo L,Adar R,Chapal-Ilani N,Berman V,Fried Y,Ainbinder E,Cohen G,Barr HM,Halaban R,Shapiro E

    update_date:2016-11-01 00:00:00

  • The region surrounding the PKD1 gene: a 700-kb P1 contig from a YAC-deficient interval.

    abstract::As part of an effort to identify the gene responsible for the predominant form of polycystic kidney disease (PKD1), we used a gridded human P1 library for contig assembly. The interval of interest, a 700-kb segment on chromosome 16p13.3, can be physically delineated by the genetic markers D16S125 and D16S84 and chromo...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.6.6.515

    authors: Dackowski WR,Connors TD,Bowe AE,Stanton V Jr,Housman D,Doggett NA,Landes GM,Klinger KW

    update_date:1996-06-01 00:00:00

  • metaSPAdes: a new versatile metagenomic assembler.

    abstract::While metagenomics has emerged as a technology of choice for analyzing bacterial populations, the assembly of metagenomic data remains challenging, thus stifling biological discoveries. Moreover, recent studies revealed that complex bacterial populations may be composed from dozens of related strains, thus further amp...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.213959.116

    authors: Nurk S,Meleshko D,Korobeynikov A,Pevzner PA

    update_date:2017-05-01 00:00:00

  • A positive but complex association between meiotic double-strand break hotspots and open chromatin in Saccharomyces cerevisiae.

    abstract::During meiosis, chromatin undergoes extensive changes to facilitate recombination, homolog pairing, and chromosome segregation. To investigate the relationship between chromatin organization and meiotic processes, we used formaldehyde-assisted isolation of regulatory elements (FAIRE) to map open chromatin during the t...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.096297.109

    authors: Berchowitz LE,Hanlon SE,Lieb JD,Copenhaver GP

    update_date:2009-12-01 00:00:00

  • The (r)evolution of SINE versus LINE distributions in primate genomes: sex chromosomes are important.

    abstract::The densities of transposable elements (TEs) in the human genome display substantial variation both within individual chromosomes and among chromosome types (autosomes and the two sex chromosomes). Finding an explanation for this variability has been challenging, especially in light of genome landscapes unique to the ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.099044.109

    authors: Kvikstad EM,Makova KD

    update_date:2010-05-01 00:00:00

  • Predicting deleterious amino acid substitutions.

    abstract::Many missense substitutions are identified in single nucleotide polymorphism (SNP) data and large-scale random mutagenesis projects. Each amino acid substitution potentially affects protein function. We have constructed a tool that uses sequence homology to predict whether a substitution affects protein function. SIFT...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.176601

    authors: Ng PC,Henikoff S

    update_date:2001-05-01 00:00:00

  • Theories and applications for sequencing randomly selected clones.

    abstract::Theory is developed for the process of sequencing randomly selected large-insert clones. Genome size, library depth, clone size, and clone distribution are considered relevant properties and perfect overlap detection for contig assembly is assumed. Genome-specific and nonrandom effects are neglected. Order of magnitud...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.gr-1339r

    authors: Wendl MC,Marra MA,Hillier LW,Chinwalla AT,Wilson RK,Waterston RH

    update_date:2001-02-01 00:00:00

  • Immune signatures correlate with L1 retrotransposition in gastrointestinal cancers.

    abstract::Long interspersed nuclear element-1 (LINE-1 or L1) retrotransposons are normally suppressed in somatic tissues mainly due to DNA methylation and antiviral defense. However, the mechanism to suppress L1s may be disrupted in cancers, thus allowing L1s to act as insertional mutagens and cause genomic rearrangement and in...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.231837.117

    authors: Jung H,Choi JK,Lee EA

    update_date:2018-08-01 00:00:00

  • A-to-I RNA editing promotes developmental stage-specific gene and lncRNA expression.

    abstract::A-to-I RNA editing is a conserved widespread phenomenon in which adenosine (A) is converted to inosine (I) by adenosine deaminases (ADARs) in double-stranded RNA regions, mainly noncoding. Mutations in ADAR enzymes in Caenorhabditis elegans cause defects in normal development but are not lethal as in human and mouse. ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.211169.116

    authors: Goldstein B,Agranat-Tamir L,Light D,Ben-Naim Zgayer O,Fishman A,Lamm AT

    update_date:2017-03-01 00:00:00

  • DNA methylation profiling in human B cells reveals immune regulatory elements and epigenetic plasticity at Alu elements during B-cell activation.

    abstract::Memory is a hallmark of adaptive immunity, wherein lymphocytes mount a superior response to a previously encountered antigen. It has been speculated that epigenetic alterations in memory lymphocytes contribute to their functional distinction from their naive counterparts. However, the nature and extent of epigenetic a...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.155473.113

    authors: Lai AY,Mav D,Shah R,Grimm SA,Phadke D,Hatzi K,Melnick A,Geigerman C,Sobol SE,Jaye DL,Wade PA

    update_date:2013-12-01 00:00:00

  • Multimeric threading-based prediction of protein-protein interactions on a genomic scale: application to the Saccharomyces cerevisiae proteome.

    abstract::MULTIPROSPECTOR, a multimeric threading algorithm for the prediction of protein-protein interactions, is applied to the genome of Saccharomyces cerevisiae. Each possible pairwise interaction among more than 6000 encoded proteins is evaluated against a dimer database of 768 complex structures by using a confidence esti...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.1145203

    authors: Lu L,Arakaki AK,Lu H,Skolnick J

    update_date:2003-06-01 00:00:00

  • Adenoviral vectors expressing siRNAs for discovery and validation of gene function.

    abstract::RNA interference is a powerful tool for studying gene function and for drug target discovery in diverse organisms and cell types. In mammalian systems, small interfering RNAs (siRNAs), or DNA plasmids expressing these siRNAs, have been used to down-modulate gene expression. However, inefficient transfection protocols,...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.1332603

    authors: Arts GJ,Langemeijer E,Tissingh R,Ma L,Pavliska H,Dokic K,Dooijes R,Mesić E,Clasen R,Michiels F,van der Schueren J,Lambrecht M,Herman S,Brys R,Thys K,Hoffmann M,Tomme P,van Es H

    update_date:2003-10-01 00:00:00

  • The nonessentiality of essential genes in yeast provides therapeutic insights into a human disease.

    abstract::Essential genes refer to those whose null mutation leads to lethality or sterility. Theoretical reasoning and empirical data both suggest that the fatal effect of inactivating an essential gene can be attributed to either the loss of indispensable core cellular function (Type I), or the gain of fatal side effects afte...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.205955.116

    authors: Chen P,Wang D,Chen H,Zhou Z,He X

    update_date:2016-10-01 00:00:00

  • Core promoter T-blocks correlate with gene expression levels in C. elegans.

    abstract::Core promoters mediate transcription initiation by the integration of diverse regulatory signals encoded in the proximal promoter and enhancers. It has been suggested that genes under simple regulation may have low-complexity permissive promoters. For these genes, the core promoter may serve as the principal regulator...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.113381.110

    authors: Grishkevich V,Hashimshony T,Yanai I

    update_date:2011-05-01 00:00:00

  • The complete genome and proteome of Mycoplasma mobile.

    abstract::Although often considered "minimal" organisms, mycoplasmas show a wide range of diversity with respect to host environment, phenotypic traits, and pathogenicity. Here we report the complete genomic sequence and proteogenomic map for the piscine mycoplasma Mycoplasma mobile, noted for its robust gliding motility. For t...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.2674004

    authors: Jaffe JD,Stange-Thomann N,Smith C,DeCaprio D,Fisher S,Butler J,Calvo S,Elkins T,FitzGerald MG,Hafez N,Kodira CD,Major J,Wang S,Wilkinson J,Nicol R,Nusbaum C,Birren B,Berg HC,Church GM

    update_date:2004-08-01 00:00:00

  • Dynamic changes in replication timing and gene expression during lineage specification of human pluripotent stem cells.

    abstract::Duplication of the genome in mammalian cells occurs in a defined temporal order referred to as its replication-timing (RT) program. RT changes dynamically during development, regulated in units of 400-800 kb referred to as replication domains (RDs). Changes in RT are generally coordinated with transcriptional competen...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.187989.114

    authors: Rivera-Mulia JC,Buckley Q,Sasaki T,Zimmerman J,Didier RA,Nazor K,Loring JF,Lian Z,Weissman S,Robins AJ,Schulz TC,Menendez L,Kulik MJ,Dalton S,Gabr H,Kahveci T,Gilbert DM

    update_date:2015-08-01 00:00:00

  • Systematic insertional mutagenesis of a streptomycete genome: a link between osmoadaptation and antibiotic production.

    abstract::The model organism Streptomyces coelicolor represents a genus that produces a vast range of bioactive secondary metabolites. We describe a versatile procedure for systematic and comprehensive mutagenesis of the S. coelicolor genome. The high-throughput process relies on in vitro transposon mutagenesis of an ordered co...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.1710304

    authors: Bishop A,Fielding S,Dyson P,Herron P

    update_date:2004-05-01 00:00:00

  • From first base: the sequence of the tip of the X chromosome of Drosophila melanogaster, a comparison of two sequencing strategies.

    abstract::We present the sequence of a contiguous 2.63 Mb of DNA extending from the tip of the X chromosome of Drosophila melanogaster. Within this sequence, we predict 277 protein coding genes, of which 94 had been sequenced already in the course of studying the biology of their gene products, and examples of 12 different tran...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.173801

    authors: Benos PV,Gatt MK,Murphy L,Harris D,Barrell B,Ferraz C,Vidal S,Brun C,Demaille J,Cadieu E,Dreano S,Gloux S,Lelaure V,Mottier S,Galibert F,Borkova D,Miñana B,Kafatos FC,Bolshakov S,Sidén-Kiamos I,Papagiannakis G,S

    update_date:2001-05-01 00:00:00

  • Next-generation tag sequencing for cancer gene expression profiling.

    abstract::We describe a new method, Tag-seq, which employs ultra high-throughput sequencing of 21 base pair cDNA tags for sensitive and cost-effective gene expression profiling. We compared Tag-seq data to LongSAGE data and observed improved representation of several classes of rare transcripts, including transcription factors,...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.094482.109

    authors: Morrissy AS,Morin RD,Delaney A,Zeng T,McDonald H,Jones S,Zhao Y,Hirst M,Marra MA

    update_date:2009-10-01 00:00:00