Abstract:
Obtaining high-quality sequence continuity of complex regions of recent segmental duplication remains one of the major challenges of finishing genome assemblies. In the human and mouse genomes, this was achieved by targeting large-insert clones using costly and laborious capillary-based sequencing approaches. Sanger shotgun sequencing of clone inserts, however, has now been largely abandoned, leaving most of these regions unresolved in newer genome assemblies generated primarily by next-generation sequencing hybrid approaches. Here we show that it is possible to resolve regions that are complex in a genome-wide context but simple in isolation for a fraction of the time and cost of traditional methods using long-read single molecule, real-time (SMRT) sequencing and assembly technology from Pacific Biosciences (PacBio). We sequenced and assembled BAC clones corresponding to a 1.3-Mbp complex region of chromosome 17q21.31, demonstrating 99.994% identity to Sanger assemblies of the same clones. We targeted 44 differences using Illumina sequencing and find that PacBio and Sanger assemblies share a comparable number of validated variants, albeit with different sequence context biases. Finally, we targeted a poorly assembled 766-kbp duplicated region of the chimpanzee genome and resolved the structure and organization for a fraction of the cost and time of traditional finishing approaches. Our data suggest a straightforward path for upgrading genomes to a higher quality finished state.
journal_name: Genome Res
journal_title: Genome research
authors: Huddleston J, Ranade S, Malig M, Antonacci F, Chaisson M, Hon L, Sudmant PH, Graves TA, Alkan C, Dennis MY, Wilson RK, Turner SW, Korlach J, Eichler EE
doi: 10.1101/gr.168450.113
subject: Has Abstract
pub_date: 2014-04-01 00:00:00
pages: 688-96
issue: 4
eissn: 1088-9051
issn: 1549-5469
pii: gr.168450.113
journal_volume: 24
pub_type: Journal Article

Related literature
Genome Research literature collection
abstract::Fish-mammal genomic comparisons have proved powerful in identifying conserved noncoding elements likely to be cis-regulatory in nature, and the majority of those tested in vivo have been shown to act as tissue-specific enhancers associated with genes involved in transcriptional regulation of development. Although most...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.4143406
update_date: 2006-04-01 00:00:00
abstract::We report the development of a simplified cap analysis of gene expression (CAGE) protocol adapted for single-molecule sequencers that avoids second strand synthesis, ligation, digestion, and PCR. HeliScopeCAGE directly sequences the 3' end of cap trapped first-strand cDNAs. As with previous versions of CAGE, we better...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.115469.110
update_date: 2011-07-01 00:00:00
abstract::Cephalochordates, urochordates, and vertebrates evolved from a common ancestor over 520 million years ago. To improve our understanding of chordate evolution and the origin of vertebrates, we intensively searched for particular genes, gene families, and conserved noncoding elements in the sequenced genome of the cepha...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.073676.107
update_date: 2008-07-01 00:00:00
abstract::Chicken B cells create their immunoglobulin repertoire within the Bursa of Fabricius by gene conversion. The high homologous recombination activity is shared by the bursal B-cell-derived DT40 cell line, which integrates transfected DNA constructs at high rates into its endogenous loci. Targeted integration in DT40 is ...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.10.12.2062
update_date: 2000-12-01 00:00:00
abstract::The str family of genes encoding seven-transmembrane G-protein-coupled or serpentine receptors related to the ODR-10 diacetyl chemoreceptor is very large, with at least 197 members in the Caenorhabditis elegans genome. The closely related stl family has 43 genes, and both families are distantly related to the srd fami...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.8.5.449
update_date: 1998-05-01 00:00:00
abstract::Somatic transposon expression in neural tissue is commonly considered as a measure of mobilization and has therefore been linked to neuropathology and organismal individuality. We combined genome sequencing data with single-cell mRNA sequencing of the same inbred fly strain to map transposon expression in the Drosophi...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.259200.119
update_date: 2020-11-01 00:00:00
abstract::The nuclear space is not a homogeneous biochemical environment. Many studies have demonstrated that the transcriptional activity of a gene is linked to its positioning within the nuclear space. Following the discovery of lamin-associated domains (LADs), which are transcriptionally repressed chromatin regions, the nonr...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.215186.116
update_date: 2017-07-01 00:00:00
abstract::Aberrations of protein-coding genes are a focus of cancer genomics; however, the impact of oncogenes on expression of the ~50% of transcripts without protein-coding potential, including long noncoding RNAs (lncRNAs), has been largely uncharacterized. Activating mutations in the BRAF oncogene are present in >70% of mel...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.140061.112
update_date: 2012-06-01 00:00:00
abstract::Recent computational and experimental work suggests that functional modules underlie much of cellular physiology and are a useful unit of cellular organization from the perspective of systems biology. Because interactions among modules can give rise to higher-level properties that are essential to cellular function, a...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.3847105
update_date: 2005-09-01 00:00:00
abstract::We present a novel web-based resource, Gene3D, of precalculated structural assignments to gene sequences and whole genomes. This resource assigns structural domains from the CATH database to whole genes and links these to their curated functional and structural annotations within the CATH domain structure database, th...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.213802
update_date: 2002-03-01 00:00:00
abstract::Previous approaches to mutation detection in mRNA from the neurofibromatosis 1 (NF1) locus have required the PCR amplification of five or more overlapping cDNA segments to screen the entire 8.5-kb open reading frame (ORF). Systematically, these assays do not detect deletions that span the region of overlap (usually 1-...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.6.1.58
update_date: 1996-01-01 00:00:00
abstract::Genomic structural variation (SV) is a major determinant for phenotypic variation. Although it has been extensively studied in humans, the nucleotide resolution structure of SVs within the widely used model organism Drosophila remains unknown. We report a highly accurate, densely validated map of unbalanced SVs compri...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.142646.112
update_date: 2013-03-01 00:00:00
abstract::Phylogenetic footprinting is a method for the discovery of regulatory elements in a set of orthologous regulatory regions from multiple species. It does so by identifying the best conserved motifs in those orthologous regions. We describe a computer algorithm designed specifically for this purpose, making use of the p...
journal_title:Genome research
pub_type: Letter
doi:10.1101/gr.6902
update_date: 2002-05-01 00:00:00
abstract::The recently identified mouse obese (ob) gene apparently encodes a secreted protein that may function in the signaling pathway of adipose tissue. Mutations in the mouse ob gene are associated with the early development of gross obesity. A detailed knowledge concerning the RNA expression pattern and precise genomic loc...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.5.1.5
update_date: 1995-08-01 00:00:00
abstract::Pre-mRNA processing often occurs in coordination with transcription thereby coupling these two key regulatory events. As such, many proteins involved in mRNA processing associate with the transcriptional machinery and are in proximity to DNA. This proximity allows for the mapping of the genomic associations of RNA bin...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.5211806
update_date: 2006-07-01 00:00:00
abstract::To understand disease mechanisms, a large-scale analysis of human-yeast genetic interactions was performed. Of 1305 human disease genes assayed, 20 genes exhibited strong toxicity in yeast. Human-yeast genetic interactions were identified by en masse transformation of the human disease genes into a pool of 4653 homozy...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.211649.116
update_date: 2017-09-01 00:00:00
abstract::The maternal and paternal copies of the genome are both required for mammalian development, and this is primarily due to imprinted genes, those that are monoallelically expressed based on parent-of-origin. Typically, this pattern of expression is regulated by differentially methylated regions (DMRs) that are establish...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.196139.115
update_date: 2016-06-01 00:00:00
abstract::Knowledge of the genome-wide rate and spectrum of mutations is necessary to understand the origin of disease and the genetic variation driving all evolutionary processes. Here, we provide a genome-wide analysis of the rate and spectrum of mutations obtained in two Daphnia pulex genotypes via separate mutation-accumula...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.191338.115
update_date: 2016-01-01 00:00:00
abstract::The incorporation and creation of modified nucleobases in DNA have profound effects on genome function. We describe methods for mapping positions and local content of modified DNA nucleobases in genomic DNA. We combined in vitro nucleobase excision with massively parallel DNA sequencing (Excision-seq) to determine the...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.174052.114
update_date: 2014-09-01 00:00:00
abstract::Despite the importance of duplicate genes for evolutionary adaptation, accurate gene annotation is often incomplete, incorrect, or lacking in regions of segmental duplication. We developed an approach combining long-read sequencing and hybridization capture to yield full-length transcript information and confidently d...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.237610.118
update_date: 2018-10-01 00:00:00
abstract::Although much is known about genetic variation in human and African great ape (chimpanzee, bonobo, and gorilla) genomes, substantially less is known about variation in gene-expression profiles within and among these species. This information is necessary for defining transcriptional regulatory networks that contribute...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.1289803
update_date: 2003-07-01 00:00:00
abstract::Histone modifications are now well-established mediators of transcriptional programs that distinguish cell states. However, the kinetics of histone modification and their role in mediating rapid, signal-responsive gene expression changes has been little studied on a genome-wide scale. Vascular endothelial growth facto...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.149674.112
update_date: 2013-06-01 00:00:00
abstract::Cot-based cloning and sequencing (CBCS) is a powerful tool for isolating and characterizing the various repetitive components of any genome, combining the established principles of DNA reassociation kinetics with high-throughput sequencing. CBCS was used to generate sequence libraries representing the high, middle, an...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.2438004
update_date: 2005-01-01 00:00:00
abstract::The Saccharomyces cerevisiae genome contains about 35 copies of dispersed retrotransposons called Ty1 elements. Ty1 elements target regions upstream of tRNA genes and other Pol III-transcribed genes when retrotransposing to new sites. We used deep sequencing of Ty1-flanking sequence amplicons to characterize Ty1 integ...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.129460.111
update_date: 2012-04-01 00:00:00
abstract::The exponential growth of pathogen nucleic acid sequences available in public domain databases has invited their direct use in pathogen detection, identification, and surveillance strategies. DNA microarray technology has offered the potential for the direct DNA sequence analysis of a broad spectrum of pathogens of in...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.4337206
update_date: 2006-04-01 00:00:00
abstract::Large copy number variants (CNVs) have been recently found as structural polymorphisms of the human genome of still unknown biological significance. CNVs are significantly enriched in regions with segmental duplications or low-copy repeats (LCRs). Williams-Beuren syndrome (WBS) is a neurodevelopmental disorder caused ...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.073197.107
update_date: 2008-05-01 00:00:00
abstract::The centromere is the structural unit responsible for the faithful segregation of chromosomes. Although regulation of centromeric function by epigenetic factors has been well-studied, the contributions of the underlying DNA sequences have been much less well defined, and existing methodologies for studying centromere ...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.219709.116
update_date: 2017-12-01 00:00:00
abstract::The recent publication of the FANTOM mouse transcriptome has provided a unique opportunity to study the diversity of transcripts arising from a single gene locus. We have focused on the Gnas complex, as imprinting loci themselves provide unique insights into transcriptional regulation. Thirteen full-length cDNAs from ...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.955503
update_date: 2003-06-01 00:00:00
abstract::Recent advances toward the characterization of Alzheimer's disease (AD) have permitted the identification of a dozen of genetic risk factors, although many more remain undiscovered. In parallel, works in the field of network biology have shown a strong link between protein connectivity and disease. In this manuscript,...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.114280.110
update_date: 2011-03-01 00:00:00
abstract::Human genomic data of many types are readily available, but the complexity and scale of human molecular biology make it difficult to integrate this body of data, understand it from a systems level, and apply it to the study of specific pathways or genetic disorders. An investigator could best explore a particular prot...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.082214.108
update_date: 2009-06-01 00:00:00