Systematic analysis of dark and camouflaged genes reveals disease-relevant genes hiding in plain sight.

Abstract:

BACKGROUND: The human genome contains "dark" gene regions that cannot be adequately assembled or aligned using standard short-read sequencing technologies, preventing researchers from identifying mutations within these gene regions that may be relevant to human disease. Here, we identify regions with few mappable reads that we call dark by depth, and others that have ambiguous alignment, called camouflaged. We assess how well long-read or linked-read technologies resolve these regions. RESULTS: Based on standard whole-genome Illumina sequencing data, we identify 36,794 dark regions in 6054 gene bodies from pathways important to human health, development, and reproduction. Of these gene bodies, 8.7% are completely dark and 35.2% are ≥ 5% dark. We identify dark regions that are present in protein-coding exons across 748 genes. Linked-read or long-read sequencing technologies from 10x Genomics, PacBio, and Oxford Nanopore Technologies reduce dark protein-coding regions to approximately 50.5%, 35.6%, and 9.6%, respectively. We present an algorithm to resolve most camouflaged regions and apply it to the Alzheimer's Disease Sequencing Project. We rescue a rare ten-nucleotide frameshift deletion in CR1, a top Alzheimer's disease gene, found in disease cases but not in controls. CONCLUSIONS: While we could not formally assess the association of the CR1 frameshift mutation with Alzheimer's disease due to insufficient sample size, we believe it merits investigating in a larger cohort. There remain thousands of potentially important genomic regions overlooked by short-read sequencing that are largely resolved by long-read technologies.

journal_name

Genome Biol

journal_title

Genome biology

authors

Ebbert MTW,Jensen TD,Jansen-West K,Sens JP,Reddy JS,Ridge PG,Kauwe JSK,Belzil V,Pregent L,Carrasquillo MM,Keene D,Larson E,Crane P,Asmann YW,Ertekin-Taner N,Younkin SG,Ross OA,Rademakers R,Petrucelli L,Fryer JD

doi

10.1186/s13059-019-1707-2

subject

Has Abstract

pub_date

2019-05-20 00:00:00

pages

97

issue

1

eissn

1474-760X

issn

1474-7596

pii

10.1186/s13059-019-1707-2

journal_volume

20

pub_type

Journal Article
  • A Keystone for ncRNA.

    abstract: A report on the Keystone symposium 'Non-coding RNAs' held at Snowbird, Utah, USA, 31 March to 5 April 2012. ...

    journal_title:Genome biology

    pub_type:

    doi:10.1186/gb-2012-13-5-315

    authors: Hacisuleyman E,Cabili MN,Rinn JL

    update_date: 2012-05-25 00:00:00

  • Mining physical protein-protein interactions from the literature.

    abstract: BACKGROUND: Deciphering physical protein-protein interactions is fundamental to elucidating both the functions of proteins and biological processes. The development of high-throughput experimental technologies such as the yeast two-hybrid screening has produced an explosion in data relating to interactions. Since manual...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2008-9-s2-s12

    authors: Huang M,Ding S,Wang H,Zhu X

    update_date: 2008-01-01 00:00:00

  • Chemical genomics in yeast.

    abstract: Many drugs have unknown, controversial or multiple mechanisms of action. Four recent 'chemical genomic' studies, using genome-scale collections of yeast gene deletions that were either arrayed or barcoded, have presented complementary approaches to identifying gene-drug and pathway-drug interactions. ...

    journal_title:Genome biology

    pub_type: Journal Article, Review

    doi:10.1186/gb-2004-5-9-240

    authors: Brenner C

    update_date: 2004-01-01 00:00:00

  • Accuracy and quality of massively parallel DNA pyrosequencing.

    abstract: BACKGROUND: Massively parallel pyrosequencing systems have increased the efficiency of DNA sequencing, although the published per-base accuracy of a Roche GS20 is only 96%. In genome projects, highly redundant consensus assemblies can compensate for sequencing errors. In contrast, studies of microbial diversity that cat...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2007-8-7-r143

    authors: Huse SM,Huber JA,Morrison HG,Sogin ML,Welch DM

    update_date: 2007-01-01 00:00:00

  • CEL-Seq2: sensitive highly-multiplexed single-cell RNA-Seq.

    abstract: Single-cell transcriptomics requires a method that is sensitive, accurate, and reproducible. Here, we present CEL-Seq2, a modified version of our CEL-Seq method, with threefold higher sensitivity, lower costs, and less hands-on time. We implemented CEL-Seq2 on Fluidigm's C1 system, providing its first single-cell, on-...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-016-0938-8

    authors: Hashimshony T,Senderovich N,Avital G,Klochendler A,de Leeuw Y,Anavy L,Gennert D,Li S,Livak KJ,Rozenblatt-Rosen O,Dor Y,Regev A,Yanai I

    update_date: 2016-04-28 00:00:00

  • Chromatin accessibility reveals insights into androgen receptor activation and transcriptional specificity.

    abstract: BACKGROUND: Epigenetic mechanisms such as chromatin accessibility impact transcription factor binding to DNA and transcriptional specificity. The androgen receptor (AR), a master regulator of the male phenotype and prostate cancer pathogenesis, acts primarily through ligand-activated transcription of target genes. Altho...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2012-13-10-r88

    authors: Tewari AK,Yardimci GG,Shibata Y,Sheffield NC,Song L,Taylor BS,Georgiev SG,Coetzee GA,Ohler U,Furey TS,Crawford GE,Febbo PG

    update_date: 2012-10-03 00:00:00

  • Muscular expressions: profiling genes in complex tissues.

    abstract: Gene-expression profiling has yielded important information about simple systems, but complex tissues have not yet been widely profiled. Four recent studies of mammalian skeletal muscles have added to the catalogs of their gene expression differences, but have yet to lead to better understanding of the molecular proce...

    journal_title:Genome biology

    pub_type: Journal Article, Review

    doi:10.1186/gb-2001-2-12-reviews1033

    authors: Hampson R,Hughes SM

    update_date: 2001-01-01 00:00:00

  • New tricks for old NODs.

    abstract: Recent work has identified the human NOD-like receptor NLRX1 as a negative regulator of intracellular signaling leading to type I interferon production. Here we discuss these findings and the questions and implications they raise regarding the function of NOD-like receptors in the antiviral response. ...

    journal_title:Genome biology

    pub_type: Journal Article, Review

    doi:10.1186/gb-2008-9-4-217

    authors: Pietras EM,Cheng G

    update_date: 2008-04-25 00:00:00

  • Comparative and functional genomics provide insights into the pathogenicity of dermatophytic fungi.

    abstract: BACKGROUND: Millions of humans and animals suffer from superficial infections caused by a group of highly specialized filamentous fungi, the dermatophytes, which exclusively infect keratinized host structures. To provide broad insights into the molecular basis of the pathogenicity-associated traits, we report the first ...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2011-12-1-r7

    authors: Burmester A,Shelest E,Glöckner G,Heddergott C,Schindler S,Staib P,Heidel A,Felder M,Petzold A,Szafranski K,Feuermann M,Pedruzzi I,Priebe S,Groth M,Winkler R,Li W,Kniemeyer O,Schroeckh V,Hertweck C,Hube B,White TC

    update_date: 2011-01-01 00:00:00

  • A compendium of Caenorhabditis elegans regulatory transcription factors: a resource for mapping transcription regulatory networks.

    abstract: BACKGROUND: Transcription regulatory networks are composed of interactions between transcription factors and their target genes. Whereas unicellular networks have been studied extensively, metazoan transcription regulatory networks remain largely unexplored. Caenorhabditis elegans provides a powerful model to study such...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2005-6-13-r110

    authors: Reece-Hoyes JS,Deplancke B,Shingles J,Grove CA,Hope IA,Walhout AJ

    update_date: 2005-01-01 00:00:00

  • Frequent intra- and inter-species introgression shapes the landscape of genetic variation in bread wheat.

    abstract: BACKGROUND: Bread wheat is one of the most important and broadly studied crops. However, due to the complexity of its genome and incomplete genome collection of wild populations, the bread wheat genome landscape and domestication history remain elusive. RESULTS: By investigating the whole-genome resequencing data of 93 ...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-019-1744-x

    authors: Cheng H,Liu J,Wen J,Nie X,Xu L,Chen N,Li Z,Wang Q,Zheng Z,Li M,Cui L,Liu Z,Bian J,Wang Z,Xu S,Yang Q,Appels R,Han D,Song W,Sun Q,Jiang Y

    update_date: 2019-07-12 00:00:00

  • DNA polymerase epsilon is required for heterochromatin maintenance in Arabidopsis.

    abstract: BACKGROUND: Chromatin organizes DNA and regulates its transcriptional activity through epigenetic modifications. Heterochromatic regions of the genome are generally transcriptionally silent, while euchromatin is more prone to transcription. During DNA replication, both genetic information and chromatin modifications mus...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-020-02190-1

    authors: Bourguet P,López-González L,Gómez-Zambrano Á,Pélissier T,Hesketh A,Potok ME,Pouch-Pélissier MN,Perez M,Da Ines O,Latrasse D,White CI,Jacobsen SE,Benhamed M,Mathieu O

    update_date: 2020-11-25 00:00:00

  • Investigating enhancer evolution with massively parallel reporter assays.

    abstract: A recent study in Genome Biology has characterized the evolution of candidate hominoid-specific liver enhancers by using massively parallel reporter assays (MPRAs). ...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-018-1502-5

    authors: Kwon SB,Ernst J

    update_date: 2018-08-14 00:00:00

  • Wrangling for microRNAs provokes much crosstalk.

    abstract: Levels of transcripts sharing microRNA response elements are co-regulated. These RNA-RNA interactions imply that combinations of microRNAs modulate cell-specific transcript networks. ...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2011-12-11-132

    authors: Marques AC,Tan J,Ponting CP

    update_date: 2011-11-21 00:00:00

  • The R-spondin protein family.

    abstract: The four vertebrate R-spondin proteins are secreted agonists of the canonical Wnt/β-catenin signaling pathway. These proteins are approximately 35 kDa, and are characterized by two amino-terminal furin-like repeats, which are necessary and sufficient for Wnt signal potentiation, and a thrombospondin domain situated mo...

    journal_title:Genome biology

    pub_type: Journal Article, Review

    doi:10.1186/gb-2012-13-3-242

    authors: de Lau WB,Snel B,Clevers HC

    update_date: 2012-01-01 00:00:00

  • Comparative genomics of the pathogenic ciliate Ichthyophthirius multifiliis, its free-living relatives and a host species provide insights into adoption of a parasitic lifestyle and prospects for disease control.

    abstract: BACKGROUND: Ichthyophthirius multifiliis, commonly known as Ich, is a highly pathogenic ciliate responsible for 'white spot', a disease causing significant economic losses to the global aquaculture industry. Options for disease control are extremely limited, and Ich's obligate parasitic lifestyle makes experimental stud...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2011-12-10-r100

    authors: Coyne RS,Hannick L,Shanmugam D,Hostetler JB,Brami D,Joardar VS,Johnson J,Radune D,Singh I,Badger JH,Kumar U,Saier M,Wang Y,Cai H,Gu J,Mather MW,Vaidya AB,Wilkes DE,Rajagopalan V,Asai DJ,Pearson CG,Findly RC,Di

    update_date: 2011-10-17 00:00:00

  • Cis-acting lnc-eRNA SEELA directly binds histone H4 to promote histone recognition and leukemia progression.

    abstract: BACKGROUND: Long noncoding enhancer RNAs (lnc-eRNAs) are a subset of stable eRNAs identified from annotated lncRNAs. They might act as enhancer activity-related therapeutic targets in cancer. However, the underlying mechanism of epigenetic activation and their function in cancer initiation and progression remain largely...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-020-02186-x

    authors: Fang K,Huang W,Sun YM,Chen TQ,Zeng ZC,Yang QQ,Pan Q,Han C,Sun LY,Luo XQ,Wang WT,Chen YQ

    update_date: 2020-11-03 00:00:00

  • Networks for all.

    abstract: A report on the Cold Spring Harbor Laboratory/Wellcome Trust conference on Network Biology, Hinxton, UK, 27-31 August 2008. ...

    journal_title:Genome biology

    pub_type:

    doi:10.1186/gb-2008-9-10-324

    authors: Ahnert SE,Teichmann SA

    update_date: 2008-10-27 00:00:00

  • Minimal genome-wide human CRISPR-Cas9 library.

    abstract: CRISPR guide RNA libraries have been iteratively improved to provide increasingly efficient reagents, although their large size is a barrier for many applications. We design an optimised minimal genome-wide human CRISPR-Cas9 library (MinLibCas9) by mining existing large-scale gene loss-of-function datasets, resulting ...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-021-02268-4

    authors: Gonçalves E,Thomas M,Behan FM,Picco G,Pacini C,Allen F,Vinceti A,Sharma M,Jackson DA,Price S,Beaver CM,Dovey O,Parry-Smith D,Iorio F,Parts L,Yusa K,Garnett MJ

    update_date: 2021-01-21 00:00:00

  • Epigenetic modifications of histones in cancer.

    abstract: The epigenetic modifications of histones are versatile marks that are intimately connected to development and disease pathogenesis including human cancers. In this review, we will discuss the many different types of histone modifications and the biological processes with which they are involved. Specifically, we revie...

    journal_title:Genome biology

    pub_type: Journal Article, Review

    doi:10.1186/s13059-019-1870-5

    authors: Zhao Z,Shilatifard A

    update_date: 2019-11-20 00:00:00

  • Decontamination of ambient RNA in single-cell RNA-seq with DecontX.

    abstract: Droplet-based microfluidic devices have become widely used to perform single-cell RNA sequencing (scRNA-seq). However, ambient RNA present in the cell suspension can be aberrantly counted along with a cell's native mRNA and result in cross-contamination of transcripts between different cell populations. DecontX is a n...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-020-1950-6

    authors: Yang S,Corbett SE,Koga Y,Wang Z,Johnson WE,Yajima M,Campbell JD

    update_date: 2020-03-05 00:00:00

  • Localizing the proteome.

    abstract: The subcellular localization of the entire proteome of an organism, the yeast Saccharomyces cerevisiae, has been revealed for the first time. Comparison with less comprehensive studies of mammalian cells provides insights into the localization of the mammalian proteome. ...

    journal_title:Genome biology

    pub_type: Journal Article, Review

    doi:10.1186/gb-2003-4-12-240

    authors: Simpson JC,Pepperkok R

    update_date: 2003-01-01 00:00:00

  • The amazing world of bacterial structured RNAs.

    abstract: The discovery of several new structured non-coding RNAs in bacterial and archaeal genomes and metagenomes raises burning questions about their biological and biochemical functions. ...

    journal_title:Genome biology

    pub_type: Journal Article, Review

    doi:10.1186/gb-2010-11-3-108

    authors: Westhof E

    update_date: 2010-01-01 00:00:00

  • A Drosophila protein-interaction map centered on cell-cycle regulators.

    abstract: BACKGROUND: Maps depicting binary interactions between proteins can be powerful starting points for understanding biological systems. A proven technology for generating such maps is high-throughput yeast two-hybrid screening. In the most extensive screen to date, a Gal4-based two-hybrid system was used recently to detec...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2004-5-12-r96

    authors: Stanyon CA,Liu G,Mangiola BA,Patel N,Giot L,Kuang B,Zhang H,Zhong J,Finley RL Jr

    update_date: 2004-01-01 00:00:00

  • Mapping human pluripotent stem cell differentiation pathways using high throughput single-cell RNA-sequencing.

    abstract: BACKGROUND: Human pluripotent stem cells (hPSCs) provide powerful models for studying cellular differentiations and unlimited sources of cells for regenerative medicine. However, a comprehensive single-cell level differentiation roadmap for hPSCs has not been achieved. RESULTS: We use high throughput single-cell RNA-seq...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-018-1426-0

    authors: Han X,Chen H,Huang D,Chen H,Fei L,Cheng C,Huang H,Yuan GC,Guo G

    update_date: 2018-04-05 00:00:00

  • A comparison of automatic cell identification methods for single-cell RNA sequencing data.

    abstract: BACKGROUND: Single-cell transcriptomics is rapidly advancing our understanding of the cellular composition of complex tissues and organisms. A major limitation in most analysis pipelines is the reliance on manual annotations to determine cell identities, which are time-consuming and irreproducible. The exponential growt...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-019-1795-z

    authors: Abdelaal T,Michielsen L,Cats D,Hoogduin D,Mei H,Reinders MJT,Mahfouz A

    update_date: 2019-09-09 00:00:00

  • EpiTEome: Simultaneous detection of transposable element insertion sites and their DNA methylation levels.

    abstract: The genome-wide investigation of DNA methylation levels has been limited to reference transposable element positions. The methylation analysis of non-reference and mobile transposable elements has only recently been performed, but required both genome resequencing and MethylC-seq datasets. We have created epiTEome, a ...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-017-1232-0

    authors: Daron J,Slotkin RK

    update_date: 2017-05-12 00:00:00

  • Uncovering metabolic pathways relevant to phenotypic traits of microbial genomes.

    abstract: Identifying the biochemical basis of microbial phenotypes is a main objective of comparative genomics. Here we present a novel method using multivariate machine learning techniques for comparing automatically derived metabolic reconstructions of sequenced genomes on a large scale. Applying our method to 266 genomes di...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2009-10-3-r28

    authors: Kastenmüller G,Schenk ME,Gasteiger J,Mewes HW

    update_date: 2009-01-01 00:00:00

  • Deep sequencing reveals cell-type-specific patterns of single-cell transcriptome variation.

    abstract: BACKGROUND: Differentiation of metazoan cells requires execution of different gene expression programs but recent single-cell transcriptome profiling has revealed considerable variation within cells of seeming identical phenotype. This brings into question the relationship between transcriptome states and cell phenotype...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-015-0683-4

    authors: Dueck H,Khaladkar M,Kim TK,Spaethling JM,Francis C,Suresh S,Fisher SA,Seale P,Beck SG,Bartfai T,Kuhn B,Eberwine J,Kim J

    update_date: 2015-06-09 00:00:00

  • MOABS: model based analysis of bisulfite sequencing data.

    abstract: Bisulfite sequencing (BS-seq) is the gold standard for studying genome-wide DNA methylation. We developed MOABS to increase the speed, accuracy, statistical power and biological relevance of BS-seq data analysis. MOABS detects differential methylation with 10-fold coverage at single-CpG resolution based on a Beta-Bino...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2014-15-2-r38

    authors: Sun D,Xi Y,Rodriguez B,Park HJ,Tong P,Meong M,Goodell MA,Li W

    update_date: 2014-02-24 00:00:00