Reconstructing complex regions of genomes using long-read sequencing technology.

Abstract:

Obtaining high-quality sequence continuity of complex regions of recent segmental duplication remains one of the major challenges of finishing genome assemblies. In the human and mouse genomes, this was achieved by targeting large-insert clones using costly and laborious capillary-based sequencing approaches. Sanger shotgun sequencing of clone inserts, however, has now been largely abandoned, leaving most of these regions unresolved in newer genome assemblies generated primarily by next-generation sequencing hybrid approaches. Here we show that it is possible to resolve regions that are complex in a genome-wide context but simple in isolation for a fraction of the time and cost of traditional methods using long-read single molecule, real-time (SMRT) sequencing and assembly technology from Pacific Biosciences (PacBio). We sequenced and assembled BAC clones corresponding to a 1.3-Mbp complex region of chromosome 17q21.31, demonstrating 99.994% identity to Sanger assemblies of the same clones. We targeted 44 differences using Illumina sequencing and find that PacBio and Sanger assemblies share a comparable number of validated variants, albeit with different sequence context biases. Finally, we targeted a poorly assembled 766-kbp duplicated region of the chimpanzee genome and resolved the structure and organization for a fraction of the cost and time of traditional finishing approaches. Our data suggest a straightforward path for upgrading genomes to a higher quality finished state.

journal_name

Genome Res

journal_title

Genome research

authors

Huddleston J,Ranade S,Malig M,Antonacci F,Chaisson M,Hon L,Sudmant PH,Graves TA,Alkan C,Dennis MY,Wilson RK,Turner SW,Korlach J,Eichler EE

doi

10.1101/gr.168450.113

subject

Has Abstract

pub_date

2014-04-01 00:00:00

pages

688-96

issue

4

eissn

1549-5469

issn

1088-9051

pii

gr.168450.113

journal_volume

24

pub_type

Journal Article
  • Ancient duplicated conserved noncoding elements in vertebrates: a genomic and functional analysis.

    abstract::Fish-mammal genomic comparisons have proved powerful in identifying conserved noncoding elements likely to be cis-regulatory in nature, and the majority of those tested in vivo have been shown to act as tissue-specific enhancers associated with genes involved in transcriptional regulation of development. Although most...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.4143406

    authors: McEwen GK,Woolfe A,Goode D,Vavouri T,Callaway H,Elgar G

    updated:2006-04-01 00:00:00

  • Unamplified cap analysis of gene expression on a single-molecule sequencer.

    abstract::We report the development of a simplified cap analysis of gene expression (CAGE) protocol adapted for single-molecule sequencers that avoids second strand synthesis, ligation, digestion, and PCR. HeliScopeCAGE directly sequences the 3' end of cap trapped first-strand cDNAs. As with previous versions of CAGE, we better...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.115469.110

    authors: Kanamori-Katayama M,Itoh M,Kawaji H,Lassmann T,Katayama S,Kojima M,Bertin N,Kaiho A,Ninomiya N,Daub CO,Carninci P,Forrest AR,Hayashizaki Y

    updated:2011-07-01 00:00:00

  • The amphioxus genome illuminates vertebrate origins and cephalochordate biology.

    abstract::Cephalochordates, urochordates, and vertebrates evolved from a common ancestor over 520 million years ago. To improve our understanding of chordate evolution and the origin of vertebrates, we intensively searched for particular genes, gene families, and conserved noncoding elements in the sequenced genome of the cepha...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.073676.107

    authors: Holland LZ,Albalat R,Azumi K,Benito-Gutiérrez E,Blow MJ,Bronner-Fraser M,Brunet F,Butts T,Candiani S,Dishaw LJ,Ferrier DE,Garcia-Fernàndez J,Gibson-Brown JJ,Gissi C,Godzik A,Hallböök F,Hirose D,Hosomichi K,Ikuta T,I

    updated:2008-07-01 00:00:00

  • A large database of chicken bursal ESTs as a resource for the analysis of vertebrate gene function.

    abstract::Chicken B cells create their immunoglobulin repertoire within the Bursa of Fabricius by gene conversion. The high homologous recombination activity is shared by the bursal B-cell-derived DT40 cell line, which integrates transfected DNA constructs at high rates into its endogenous loci. Targeted integration in DT40 is ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.10.12.2062

    authors: Abdrakhmanov I,Lodygin D,Geroth P,Arakawa H,Law A,Plachy J,Korn B,Buerstedde JM

    updated:2000-12-01 00:00:00

  • Two large families of chemoreceptor genes in the nematodes Caenorhabditis elegans and Caenorhabditis briggsae reveal extensive gene duplication, diversification, movement, and intron loss.

    abstract::The str family of genes encoding seven-transmembrane G-protein-coupled or serpentine receptors related to the ODR-10 diacetyl chemoreceptor is very large, with at least 197 members in the Caenorhabditis elegans genome. The closely related stl family has 43 genes, and both families are distantly related to the srd fami...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.8.5.449

    authors: Robertson HM

    updated:1998-05-01 00:00:00

  • Transposon expression in the Drosophila brain is driven by neighboring genes and diversifies the neural transcriptome.

    abstract::Somatic transposon expression in neural tissue is commonly considered as a measure of mobilization and has therefore been linked to neuropathology and organismal individuality. We combined genome sequencing data with single-cell mRNA sequencing of the same inbred fly strain to map transposon expression in the Drosophi...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.259200.119

    authors: Treiber CD,Waddell S

    updated:2020-11-01 00:00:00

  • Nonrandom domain organization of the Arabidopsis genome at the nuclear periphery.

    abstract::The nuclear space is not a homogeneous biochemical environment. Many studies have demonstrated that the transcriptional activity of a gene is linked to its positioning within the nuclear space. Following the discovery of lamin-associated domains (LADs), which are transcriptionally repressed chromatin regions, the nonr...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.215186.116

    authors: Bi X,Cheng YJ,Hu B,Ma X,Wu R,Wang JW,Liu C

    updated:2017-07-01 00:00:00

  • BRAFV600E remodels the melanocyte transcriptome and induces BANCR to regulate melanoma cell migration.

    abstract::Aberrations of protein-coding genes are a focus of cancer genomics; however, the impact of oncogenes on expression of the ~50% of transcripts without protein-coding potential, including long noncoding RNAs (lncRNAs), has been largely uncharacterized. Activating mutations in the BRAF oncogene are present in >70% of mel...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.140061.112

    authors: Flockhart RJ,Webster DE,Qu K,Mascarenhas N,Kovalski J,Kretz M,Khavari PA

    updated:2012-06-01 00:00:00

  • A network of transcriptionally coordinated functional modules in Saccharomyces cerevisiae.

    abstract::Recent computational and experimental work suggests that functional modules underlie much of cellular physiology and are a useful unit of cellular organization from the perspective of systems biology. Because interactions among modules can give rise to higher-level properties that are essential to cellular function, a...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.3847105

    authors: Petti AA,Church GM

    updated:2005-09-01 00:00:00

  • Gene3D: structural assignment for whole genes and genomes using the CATH domain structure database.

    abstract::We present a novel web-based resource, Gene3D, of precalculated structural assignments to gene sequences and whole genomes. This resource assigns structural domains from the CATH database to whole genes and links these to their curated functional and structural annotations within the CATH domain structure database, th...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.213802

    authors: Buchan DW,Shepherd AJ,Lee D,Pearl FM,Rison SC,Thornton JM,Orengo CA

    updated:2002-03-01 00:00:00

  • Long RT-PCR of the entire 8.5-kb NF1 open reading frame and mutation detection on agarose gels.

    abstract::Previous approaches to mutation detection in mRNA from the neurofibromatosis 1 (NF1) locus have required the PCR amplification of five or more overlapping cDNA segments to screen the entire 8.5-kb open reading frame (ORF). Systematically, these assays do not detect deletions that span the region of overlap (usually 1-...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.6.1.58

    authors: Martinez JM,Breidenbach HH,Cawthon R

    updated:1996-01-01 00:00:00

  • Impact of genomic structural variation in Drosophila melanogaster based on population-scale sequencing.

    abstract::Genomic structural variation (SV) is a major determinant for phenotypic variation. Although it has been extensively studied in humans, the nucleotide resolution structure of SVs within the widely used model organism Drosophila remains unknown. We report a highly accurate, densely validated map of unbalanced SVs compri...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.142646.112

    authors: Zichner T,Garfield DA,Rausch T,Stütz AM,Cannavó E,Braun M,Furlong EE,Korbel JO

    updated:2013-03-01 00:00:00

  • Discovery of regulatory elements by a computational method for phylogenetic footprinting.

    abstract::Phylogenetic footprinting is a method for the discovery of regulatory elements in a set of orthologous regulatory regions from multiple species. It does so by identifying the best conserved motifs in those orthologous regions. We describe a computer algorithm designed specifically for this purpose, making use of the p...

    journal_title:Genome research

    pub_type: Letter

    doi:10.1101/gr.6902

    authors: Blanchette M,Tompa M

    updated:2002-05-01 00:00:00

  • The human obese (OB) gene: RNA expression pattern and mapping on the physical, cytogenetic, and genetic maps of chromosome 7.

    abstract::The recently identified mouse obese (ob) gene apparently encodes a secreted protein that may function in the signaling pathway of adipose tissue. Mutations in the mouse ob gene are associated with the early development of gross obesity. A detailed knowledge concerning the RNA expression pattern and precise genomic loc...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.5.1.5

    authors: Green ED,Maffei M,Braden VV,Proenca R,DeSilva U,Zhang Y,Chua SC Jr,Leibel RL,Weissenbach J,Friedman JM

    updated:1995-08-01 00:00:00

  • Genomic localization of RNA binding proteins reveals links between pre-mRNA processing and transcription.

    abstract::Pre-mRNA processing often occurs in coordination with transcription thereby coupling these two key regulatory events. As such, many proteins involved in mRNA processing associate with the transcriptional machinery and are in proximity to DNA. This proximity allows for the mapping of the genomic associations of RNA bin...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.5211806

    authors: Swinburne IA,Meyer CA,Liu XS,Silver PA,Brodsky AS

    updated:2006-07-01 00:00:00

  • Yeast genetic interaction screen of human genes associated with amyotrophic lateral sclerosis: identification of MAP2K5 kinase as a potential drug target.

    abstract::To understand disease mechanisms, a large-scale analysis of human-yeast genetic interactions was performed. Of 1305 human disease genes assayed, 20 genes exhibited strong toxicity in yeast. Human-yeast genetic interactions were identified by en masse transformation of the human disease genes into a pool of 4653 homozy...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.211649.116

    authors: Jo M,Chung AY,Yachie N,Seo M,Jeon H,Nam Y,Seo Y,Kim E,Zhong Q,Vidal M,Park HC,Roth FP,Suk K

    updated:2017-09-01 00:00:00

  • Pervasive polymorphic imprinted methylation in the human placenta.

    abstract::The maternal and paternal copies of the genome are both required for mammalian development, and this is primarily due to imprinted genes, those that are monoallelically expressed based on parent-of-origin. Typically, this pattern of expression is regulated by differentially methylated regions (DMRs) that are establish...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.196139.115

    authors: Hanna CW,Peñaherrera MS,Saadeh H,Andrews S,McFadden DE,Kelsey G,Robinson WP

    updated:2016-06-01 00:00:00

  • High mutational rates of large-scale duplication and deletion in Daphnia pulex.

    abstract::Knowledge of the genome-wide rate and spectrum of mutations is necessary to understand the origin of disease and the genetic variation driving all evolutionary processes. Here, we provide a genome-wide analysis of the rate and spectrum of mutations obtained in two Daphnia pulex genotypes via separate mutation-accumula...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.191338.115

    authors: Keith N,Tucker AE,Jackson CE,Sung W,Lucas Lledó JI,Schrider DR,Schaack S,Dudycha JL,Ackerman M,Younge AJ,Shaw JR,Lynch M

    updated:2016-01-01 00:00:00

  • High resolution mapping of modified DNA nucleobases using excision repair enzymes.

    abstract::The incorporation and creation of modified nucleobases in DNA have profound effects on genome function. We describe methods for mapping positions and local content of modified DNA nucleobases in genomic DNA. We combined in vitro nucleobase excision with massively parallel DNA sequencing (Excision-seq) to determine the...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.174052.114

    authors: Bryan DS,Ransom M,Adane B,York K,Hesselberth JR

    updated:2014-09-01 00:00:00

  • Transcriptional fates of human-specific segmental duplications in brain.

    abstract::Despite the importance of duplicate genes for evolutionary adaptation, accurate gene annotation is often incomplete, incorrect, or lacking in regions of segmental duplication. We developed an approach combining long-read sequencing and hybridization capture to yield full-length transcript information and confidently d...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.237610.118

    authors: Dougherty ML,Underwood JG,Nelson BJ,Tseng E,Munson KM,Penn O,Nowakowski TJ,Pollen AA,Eichler EE

    updated:2018-10-01 00:00:00

  • Comparative analysis of gene-expression patterns in human and African great ape cultured fibroblasts.

    abstract::Although much is known about genetic variation in human and African great ape (chimpanzee, bonobo, and gorilla) genomes, substantially less is known about variation in gene-expression profiles within and among these species. This information is necessary for defining transcriptional regulatory networks that contribute...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.1289803

    authors: Karaman MW,Houck ML,Chemnick LG,Nagpal S,Chawannakul D,Sudano D,Pike BL,Ho VV,Ryder OA,Hacia JG

    updated:2003-07-01 00:00:00

  • A dynamic H3K27ac signature identifies VEGFA-stimulated endothelial enhancers and requires EP300 activity.

    abstract::Histone modifications are now well-established mediators of transcriptional programs that distinguish cell states. However, the kinetics of histone modification and their role in mediating rapid, signal-responsive gene expression changes has been little studied on a genome-wide scale. Vascular endothelial growth facto...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.149674.112

    authors: Zhang B,Day DS,Ho JW,Song L,Cao J,Christodoulou D,Seidman JG,Crawford GE,Park PJ,Pu WT

    updated:2013-06-01 00:00:00

  • The repetitive landscape of the chicken genome.

    abstract::Cot-based cloning and sequencing (CBCS) is a powerful tool for isolating and characterizing the various repetitive components of any genome, combining the established principles of DNA reassociation kinetics with high-throughput sequencing. CBCS was used to generate sequence libraries representing the high, middle, an...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.2438004

    authors: Wicker T,Robertson JS,Schulze SR,Feltus FA,Magrini V,Morrison JA,Mardis ER,Wilson RK,Peterson DG,Paterson AH,Ivarie R

    updated:2005-01-01 00:00:00

  • Retrotransposon Ty1 integration targets specifically positioned asymmetric nucleosomal DNA segments in tRNA hotspots.

    abstract::The Saccharomyces cerevisiae genome contains about 35 copies of dispersed retrotransposons called Ty1 elements. Ty1 elements target regions upstream of tRNA genes and other Pol III-transcribed genes when retrotransposing to new sites. We used deep sequencing of Ty1-flanking sequence amplicons to characterize Ty1 integ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.129460.111

    authors: Mularoni L,Zhou Y,Bowen T,Gangadharan S,Wheelan SJ,Boeke JD

    updated:2012-04-01 00:00:00

  • Broad-spectrum respiratory tract pathogen identification using resequencing DNA microarrays.

    abstract::The exponential growth of pathogen nucleic acid sequences available in public domain databases has invited their direct use in pathogen detection, identification, and surveillance strategies. DNA microarray technology has offered the potential for the direct DNA sequence analysis of a broad spectrum of pathogens of in...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.4337206

    authors: Lin B,Wang Z,Vora GJ,Thornton JA,Schnur JM,Thach DC,Blaney KM,Ligler AG,Malanoski AP,Santiago J,Walter EA,Agan BK,Metzgar D,Seto D,Daum LT,Kruzelock R,Rowley RK,Hanson EH,Tibbetts C,Stenger DA

    updated:2006-04-01 00:00:00

  • Copy number variation at the 7q11.23 segmental duplications is a susceptibility factor for the Williams-Beuren syndrome deletion.

    abstract::Large copy number variants (CNVs) have been recently found as structural polymorphisms of the human genome of still unknown biological significance. CNVs are significantly enriched in regions with segmental duplications or low-copy repeats (LCRs). Williams-Beuren syndrome (WBS) is a neurodevelopmental disorder caused ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.073197.107

    authors: Cuscó I,Corominas R,Bayés M,Flores R,Rivera-Brugués N,Campuzano V,Pérez-Jurado LA

    updated:2008-05-01 00:00:00

  • Rapid molecular assays to study human centromere genomics.

    abstract::The centromere is the structural unit responsible for the faithful segregation of chromosomes. Although regulation of centromeric function by epigenetic factors has been well-studied, the contributions of the underlying DNA sequences have been much less well defined, and existing methodologies for studying centromere ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.219709.116

    authors: Contreras-Galindo R,Fischer S,Saha AK,Lundy JD,Cervantes PW,Mourad M,Wang C,Qian B,Dai M,Meng F,Chinnaiyan A,Omenn GS,Kaplan MH,Markovitz DM

    updated:2017-12-01 00:00:00

  • A comprehensive transcript map of the mouse Gnas imprinted complex.

    abstract::The recent publication of the FANTOM mouse transcriptome has provided a unique opportunity to study the diversity of transcripts arising from a single gene locus. We have focused on the Gnas complex, as imprinting loci themselves provide unique insights into transcriptional regulation. Thirteen full-length cDNAs from ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.955503

    authors: Holmes R,Williamson C,Peters J,Denny P,Wells C,RIKEN GER Group.,GSL Members.

    updated:2003-06-01 00:00:00

  • Interactome mapping suggests new mechanistic details underlying Alzheimer's disease.

    abstract::Recent advances toward the characterization of Alzheimer's disease (AD) have permitted the identification of a dozen of genetic risk factors, although many more remain undiscovered. In parallel, works in the field of network biology have shown a strong link between protein connectivity and disease. In this manuscript,...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.114280.110

    authors: Soler-López M,Zanzoni A,Lluís R,Stelzl U,Aloy P

    updated:2011-03-01 00:00:00

  • Exploring the human genome with functional maps.

    abstract::Human genomic data of many types are readily available, but the complexity and scale of human molecular biology make it difficult to integrate this body of data, understand it from a systems level, and apply it to the study of specific pathways or genetic disorders. An investigator could best explore a particular prot...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.082214.108

    authors: Huttenhower C,Haley EM,Hibbs MA,Dumeaux V,Barrett DR,Coller HA,Troyanskaya OG

    updated:2009-06-01 00:00:00