Computational comparison of human genomic sequence assemblies for a region of chromosome 4.

Abstract:

Much of the available human genomic sequence data exist in a fragmentary draft state following the completion of the initial high-volume sequencing performed by the International Human Genome Sequencing Consortium (IHGSC) and Celera Genomics (CG). We compared six draft genome assemblies over a region of chromosome 4p (D4S394-D4S403): two consecutive releases by the IHGSC at University of California, Santa Cruz (UCSC), two consecutive releases from the National Center for Biotechnology Information (NCBI), the public release from CG, and a hybrid assembly we have produced using IHGSC and CG sequence data. This region presents particular problems for genomic sequence assembly algorithms as it contains a large tandem repeat and is sparsely covered by draft sequences. The six assemblies differed both in terms of their relative coverage of sequence data from the region and in their estimated rates of misassembly. The CG assembly method attained the lowest level of misassembly, whereas NCBI and UCSC assemblies had the highest levels of coverage. All assemblies examined included <60% of the publicly available sequence from the region. At least 6% of the sequence data within the CG assembly for the D4S394-D4S403 region was not present in publicly available sequence data. We also show that even in a problematic region, existing software tools can be used with high-quality mapping data to produce genomic sequence contigs with a low rate of rearrangements.

journal_name

Genome Res

journal_title

Genome research

authors

Semple CA,Morris SW,Porteous DJ,Evans KL

doi

10.1101/gr.207902

subject

Has Abstract

pub_date

2002-03-01 00:00:00

pages

424-9

issue

3

eissn

1549-5469

issn

1088-9051

journal_volume

12

pub_type

Journal Article
  • Complete genomic sequence and analysis of the prion protein gene region from three mammalian species.

    abstract::The prion protein (PrP), first identified in scrapie-infected rodents, is encoded by a single exon of a single-copy chromosomal gene. In addition to the protein-coding exon, PrP genes in mammals contain one or two 5'-noncoding exons. To learn more about the genomic organization of regions surrounding the PrP exons, we...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.8.10.1022

    authors: Lee IY,Westaway D,Smit AF,Wang K,Seto J,Chen L,Acharya C,Ankener M,Baskin D,Cooper C,Yao H,Prusiner SB,Hood LE

    update_date:1998-10-01 00:00:00

  • Massive reshaping of genome-nuclear lamina interactions during oncogene-induced senescence.

    abstract::Cellular senescence is a mechanism that virtually irreversibly suppresses the proliferative capacity of cells in response to various stress signals. This includes the expression of activated oncogenes, which causes Oncogene-Induced Senescence (OIS). A body of evidence points to the involvement in OIS of chromatin reor...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.225763.117

    authors: Lenain C,de Graaf CA,Pagie L,Visser NL,de Haas M,de Vries SS,Peric-Hupkes D,van Steensel B,Peeper DS

    update_date:2017-10-01 00:00:00

  • Copy number variation at the 7q11.23 segmental duplications is a susceptibility factor for the Williams-Beuren syndrome deletion.

    abstract::Large copy number variants (CNVs) have been recently found as structural polymorphisms of the human genome of still unknown biological significance. CNVs are significantly enriched in regions with segmental duplications or low-copy repeats (LCRs). Williams-Beuren syndrome (WBS) is a neurodevelopmental disorder caused ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.073197.107

    authors: Cuscó I,Corominas R,Bayés M,Flores R,Rivera-Brugués N,Campuzano V,Pérez-Jurado LA

    update_date:2008-05-01 00:00:00

  • Whole-genome sequence assembly for mammalian genomes: Arachne 2.

    abstract::We previously described the whole-genome assembly program Arachne, presenting assemblies of simulated data for small to mid-sized genomes. Here we describe algorithmic adaptations to the program, allowing for assembly of mammalian-size genomes, and also improving the assembly of smaller genomes. Three principal change...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.828403

    authors: Jaffe DB,Butler J,Gnerre S,Mauceli E,Lindblad-Toh K,Mesirov JP,Zody MC,Lander ES

    update_date:2003-01-01 00:00:00

  • Construction of an approximately 700-kb transcript map around the familial Mediterranean fever locus on human chromosome 16p13.3.

    abstract::We used a combination of cDNA selection, exon amplification, and computational prediction from genomic sequence to isolate transcribed sequences from genomic DNA surrounding the familial Mediterranean fever (FMF) locus. Eighty-seven kb of genomic DNA around D16S3370, a marker showing a high degree of linkage disequili...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.8.11.1172

    authors: Centola M,Chen X,Sood R,Deng Z,Aksentijevich I,Blake T,Ricke DO,Chen X,Wood G,Zaks N,Richards N,Krizman D,Mansfield E,Apostolou S,Liu J,Shafran N,Vedula A,Hamon M,Cercek A,Kahan T,Gumucio D,Callen DF,Richards

    update_date:1998-11-01 00:00:00

  • GeneID in Drosophila.

    abstract::GeneID is a program to predict genes in anonymous genomic sequences designed with a hierarchical structure. In the first step, splice sites, and start and stop codons are predicted and scored along the sequence using position weight matrices (PWMs). In the second step, exons are built from the sites. Exons are scored ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.10.4.511

    authors: Parra G,Blanco E,Guigó R

    update_date:2000-04-01 00:00:00

  • Genes and transposons are differentially methylated in plants, but not in mammals.

    abstract::DNA methylation is found in many eukaryotes, but its function is still controversial. We have studied the methylation of plant and animal genomes using a PCR-based technique amenable for high throughput. Repetitive elements are methylated in both organisms, but whereas most mammalian exons are methylated, plant exons ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.1784803

    authors: Rabinowicz PD,Palmer LE,May BP,Hemann MT,Lowe SW,McCombie WR,Martienssen RA

    update_date:2003-12-01 00:00:00

  • The landscape of histone modifications across 1% of the human genome in five human cell lines.

    abstract::We generated high-resolution maps of histone H3 lysine 9/14 acetylation (H3ac), histone H4 lysine 5/8/12/16 acetylation (H4ac), and histone H3 at lysine 4 mono-, di-, and trimethylation (H3K4me1, H3K4me2, H3K4me3, respectively) across the ENCODE regions. Studying each modification in five human cell lines including th...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.5704207

    authors: Koch CM,Andrews RM,Flicek P,Dillon SC,Karaöz U,Clelland GK,Wilcox S,Beare DM,Fowler JC,Couttet P,James KD,Lefebvre GC,Bruce AW,Dovey OM,Ellis PD,Dhami P,Langford CF,Weng Z,Birney E,Carter NP,Vetrie D,Dunham I

    update_date:2007-06-01 00:00:00

  • Transcriptional alterations in glioma result primarily from DNA methylation-independent mechanisms.

    abstract::In cancer cells, aberrant DNA methylation is commonly associated with transcriptional alterations, including silencing of tumor suppressor genes. However, multiple epigenetic mechanisms, including polycomb repressive marks, contribute to gene deregulation in cancer. To dissect the relative contribution of DNA methylat...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.249219.119

    authors: Court F,Le Boiteux E,Fogli A,Müller-Barthélémy M,Vaurs-Barrière C,Chautard E,Pereira B,Biau J,Kemeny JL,Khalil T,Karayan-Tapon L,Verrelle P,Arnaud P

    update_date:2019-10-01 00:00:00

  • Genomic evolution, patterns of global dissemination, and interspecies transmission of human and simian T-cell leukemia/lymphotropic viruses.

    abstract::Using both env and long terminal repeat (LTR) sequences, with maximal representation of genetic diversity within primate strains, we revise and expand the unique evolutionary history of human and simian T-cell leukemia/lymphotropic viruses (HTLV/STLV). Based on the robust application of three different phylogenetic al...

    journal_title:Genome research

    pub_type: Journal Article, Review

    doi:

    authors: Slattery JP,Franchini G,Gessain A

    update_date:1999-06-01 00:00:00

  • New class of microRNA targets containing simultaneous 5'-UTR and 3'-UTR interaction sites.

    abstract::MicroRNAs (miRNAs) are known to post-transcriptionally regulate target mRNAs through the 3'-UTR, which interacts mainly with the 5'-end of miRNA in animals. Here we identify many endogenous motifs within human 5'-UTRs specific to the 3'-ends of miRNAs. The 3'-end of conserved miRNAs in particular has significant inter...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.089367.108

    authors: Lee I,Ajay SS,Yook JI,Kim HS,Hong SH,Kim NH,Dhanasekaran SM,Chinnaiyan AM,Athey BD

    update_date:2009-07-01 00:00:00

  • Interactome mapping suggests new mechanistic details underlying Alzheimer's disease.

    abstract::Recent advances toward the characterization of Alzheimer's disease (AD) have permitted the identification of a dozen of genetic risk factors, although many more remain undiscovered. In parallel, works in the field of network biology have shown a strong link between protein connectivity and disease. In this manuscript,...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.114280.110

    authors: Soler-López M,Zanzoni A,Lluís R,Stelzl U,Aloy P

    update_date:2011-03-01 00:00:00

  • Evolution and comparative genomics of odorant- and pheromone-associated genes in rodents.

    abstract::Chemical cues influence a range of behavioral responses in rodents. The involvement of protein odorants and odorant receptors in mediating reproductive behavior, foraging, and predator avoidance suggests that their genes may have been subject to adaptive evolution. We have estimated the consequences of selection on ro...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.1940604

    authors: Emes RD,Beatson SA,Ponting CP,Goodstadt L

    update_date:2004-04-01 00:00:00

  • Whole genome shotgun sequencing of Brassica oleracea and its application to gene discovery and annotation in Arabidopsis.

    abstract::Through comparative studies of the model organism Arabidopsis thaliana and its close relative Brassica oleracea, we have identified conserved regions that represent potentially functional sequences overlooked by previous Arabidopsis genome annotation methods. A total of 454,274 whole genome shotgun sequences covering ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.3176505

    authors: Ayele M,Haas BJ,Kumar N,Wu H,Xiao Y,Van Aken S,Utterback TR,Wortman JR,White OR,Town CD

    update_date:2005-04-01 00:00:00

  • Dynamic changes in replication timing and gene expression during lineage specification of human pluripotent stem cells.

    abstract::Duplication of the genome in mammalian cells occurs in a defined temporal order referred to as its replication-timing (RT) program. RT changes dynamically during development, regulated in units of 400-800 kb referred to as replication domains (RDs). Changes in RT are generally coordinated with transcriptional competen...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.187989.114

    authors: Rivera-Mulia JC,Buckley Q,Sasaki T,Zimmerman J,Didier RA,Nazor K,Loring JF,Lian Z,Weissman S,Robins AJ,Schulz TC,Menendez L,Kulik MJ,Dalton S,Gabr H,Kahveci T,Gilbert DM

    update_date:2015-08-01 00:00:00

  • 2C-Cas9: a versatile tool for clonal analysis of gene function.

    abstract::CRISPR/Cas9-mediated targeted mutagenesis allows efficient generation of loss-of-function alleles in zebrafish. To date, this technology has been primarily used to generate genetic knockout animals. Nevertheless, the study of the function of certain loci might require tight spatiotemporal control of gene inactivation....

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.196170.115

    authors: Di Donato V,De Santis F,Auer TO,Testa N,Sánchez-Iranzo H,Mercader N,Concordet JP,Del Bene F

    update_date:2016-05-01 00:00:00

  • EbEST: an automated tool using expressed sequence tags to delineate gene structure.

    abstract::Large numbers of expressed sequence tags (ESTs) continue to fill public and private databases with partial cDNA sequences. However, using this huge amount of ESTs to facilitate gene finding in genomic sequence imposes a challenge, especially to wet-lab scientists who often have limited computing resources. In an effor...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.8.3.268

    authors: Jiang J,Jacob HJ

    update_date:1998-03-01 00:00:00

  • Rapid evolution of mouse Y centromere repeat DNA belies recent sequence stability.

    abstract::The Y centromere sequence of house mouse, Mus musculus, remains unknown despite our otherwise significant knowledge of the genome sequence of this important mammalian model organism. Here, we report the complete molecular characterization of the C57BL/6J chromosome Y centromere, which comprises a highly diverged minor...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.092080.109

    authors: Pertile MD,Graham AN,Choo KH,Kalitsis P

    update_date:2009-12-01 00:00:00

  • Characterization and dynamics of pericentromere-associated domains in mice.

    abstract::Despite recent progress in genome topology knowledge, the role of repeats, which make up the majority of mammalian genomes, remains elusive. Satellite repeats are highly abundant sequences that cluster around centromeres, attract pericentromeric heterochromatin, and aggregate into nuclear chromocenters. These nuclear ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.186643.114

    authors: Wijchers PJ,Geeven G,Eyres M,Bergsma AJ,Janssen M,Verstegen M,Zhu Y,Schell Y,Vermeulen C,de Wit E,de Laat W

    update_date:2015-07-01 00:00:00

  • Evolution of a genomic regulatory domain: the role of gene co-option and gene duplication in the Enhancer of split complex.

    abstract::The Drosophila Enhancer of split complex [E(spl)-C] is a remarkable complex of genes many of which are effectors or modulators of Notch signaling. The complex contains different classes of genes including four bearded genes and seven basic helix-loop-helix (bHLH) genes. We examined the evolution of this unusual comple...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.104794.109

    authors: Duncan EJ,Dearden PK

    update_date:2010-07-01 00:00:00

  • Construction of a linkage map of the medaka (Oryzias latipes) and mapping of the Da mutant locus defective in dorsoventral patterning.

    abstract::Double anal fin (Da) is a medaka with an autosomal semidominant mutation that causes mirror image duplication of the ventral region concentrating on the caudal region. The chromosomal location of the Da gene and its sequence have remained unknown. We constructed a medaka linkage map as a first step to approach positio...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.9.12.1277

    authors: Ohtsuka M,Makino S,Yoda K,Wada H,Naruse K,Mitani H,Shima A,Ozato K,Kimura M,Inoko H

    update_date:1999-12-01 00:00:00

  • Ancient duplicated conserved noncoding elements in vertebrates: a genomic and functional analysis.

    abstract::Fish-mammal genomic comparisons have proved powerful in identifying conserved noncoding elements likely to be cis-regulatory in nature, and the majority of those tested in vivo have been shown to act as tissue-specific enhancers associated with genes involved in transcriptional regulation of development. Although most...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.4143406

    authors: McEwen GK,Woolfe A,Goode D,Vavouri T,Callaway H,Elgar G

    update_date:2006-04-01 00:00:00

  • General gene movement off the X chromosome in the Drosophila genus.

    abstract::In Drosophila melanogaster, there is an excess of genes duplicated by retroposition from the X chromosome to the autosomes. Most of those retrogenes that originated on the X chromosome have testis expression pattern. These observations could be explained by natural selection favoring genes that avoided spermatogenesis...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.088609.108

    authors: Vibranovski MD,Zhang Y,Long M

    update_date:2009-05-01 00:00:00

  • The use of exome capture RNA-seq for highly degraded RNA with application to clinical cancer sequencing.

    abstract::RNA-seq by poly(A) selection is currently the most common protocol for whole transcriptome sequencing as it provides a broad, detailed, and accurate view of the RNA landscape. Unfortunately, the utility of poly(A) libraries is greatly limited when the input RNA is degraded, which is the norm for research tissues and c...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.189621.115

    authors: Cieslik M,Chugh R,Wu YM,Wu M,Brennan C,Lonigro R,Su F,Wang R,Siddiqui J,Mehra R,Cao X,Lucas D,Chinnaiyan AM,Robinson D

    update_date:2015-09-01 00:00:00

  • metaSPAdes: a new versatile metagenomic assembler.

    abstract::While metagenomics has emerged as a technology of choice for analyzing bacterial populations, the assembly of metagenomic data remains challenging, thus stifling biological discoveries. Moreover, recent studies revealed that complex bacterial populations may be composed from dozens of related strains, thus further amp...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.213959.116

    authors: Nurk S,Meleshko D,Korobeynikov A,Pevzner PA

    update_date:2017-05-01 00:00:00

  • A role for palindromic structures in the cis-region of maize Sirevirus LTRs in transposable element evolution and host epigenetic response.

    abstract::Transposable elements (TEs) proliferate within the genome of their host, which responds by silencing them epigenetically. Much is known about the mechanisms of silencing in plants, particularly the role of siRNAs in guiding DNA methylation. In contrast, little is known about siRNA targeting patterns along the length o...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.193763.115

    authors: Bousios A,Diez CM,Takuno S,Bystry V,Darzentas N,Gaut BS

    update_date:2016-02-01 00:00:00

  • The first five years of single-cell cancer genomics and beyond.

    abstract::Single-cell sequencing (SCS) is a powerful new tool for investigating evolution and diversity in cancer and understanding the role of rare cells in tumor progression. These methods have begun to unravel key questions in cancer biology that have been difficult to address with bulk tumor measurements. Over the past five...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.191098.115

    authors: Navin NE

    update_date:2015-10-01 00:00:00

  • A bioinformatics-based strategy identifies c-Myc and Cdc25A as candidates for the Apmt mammary tumor latency modifiers.

    abstract::The epistatically interacting modifier loci (Apmt1 and Apmt2) accelerate the polyoma Middle-T (PyVT)-induced mammary tumor. To identify potential candidate genes loci, a combined bioinformatics and genomics strategy was used. On the basis of the assumption that the loci were functioning in the same or intersecting pat...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.210502

    authors: Cozma D,Lukes L,Rouse J,Qiu TH,Liu ET,Hunter KW

    update_date:2002-06-01 00:00:00

  • Evidence for widespread subfunctionalization of splice forms in vertebrate genomes.

    abstract::Gene duplication and alternative splicing are important sources of proteomic diversity. Despite research indicating that gene duplication and alternative splicing are negatively correlated, the evolutionary relationship between the two remains unclear. One manner in which alternative splicing and gene duplication may ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.184473.114

    authors: Lambert MJ,Cochran WO,Wilde BM,Olsen KG,Cooper CD

    update_date:2015-05-01 00:00:00

  • Rapid comparative genomic analysis for clinical microbiology: the Francisella tularensis paradigm.

    abstract::It is critical to avoid delays in detecting strain manipulations, such as the addition/deletion of a gene or modification of genes for increased virulence or antibiotic resistance, using genome analysis during an epidemic outbreak or a bioterrorist attack. Our objective was to evaluate the efficiency of genome analysi...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.071266.107

    authors: La Scola B,Elkarkouri K,Li W,Wahab T,Fournous G,Rolain JM,Biswas S,Drancourt M,Robert C,Audic S,Löfdahl S,Raoult D

    update_date:2008-05-01 00:00:00