Abstract:
We previously described the whole-genome assembly program Arachne, presenting assemblies of simulated data for small to mid-sized genomes. Here we describe algorithmic adaptations to the program, allowing for assembly of mammalian-size genomes, and also improving the assembly of smaller genomes. Three principal changes were simultaneously made and applied to the assembly of the mouse genome, during a six-month period of development: (1) Supercontigs (scaffolds) were iteratively broken and rejoined using several criteria, yielding a 64-fold increase in length (N50), and apparent elimination of all global misjoins; (2) gaps between contigs in supercontigs were filled (partially or completely) by insertion of reads, as suggested by pairing within the supercontig, increasing the N50 contig length by 50%; (3) memory usage was reduced fourfold. The outcome of this mouse assembly and its analysis are described in (Mouse Genome Sequencing Consortium 2002).
journal_name:Genome Res
journal_title:Genome research
authors:Jaffe DB, Butler J, Gnerre S, Mauceli E, Lindblad-Toh K, Mesirov JP, Zody MC, Lander ES
doi:10.1101/gr.828403
subject:Has Abstract
pub_date:2003-01-01 00:00:00
pages:91-6
issue:1
eissn:1088-9051
issn:1549-5469
journal_volume:13
pub_type: journal article
Related articles
Genome Research article collection
abstract::The representation and discovery of transcription factor (TF) sequence binding specificities is critical for understanding gene regulatory networks and interpreting the impact of disease-associated noncoding genetic variants. We present a novel TF binding motif representation, the k-mer set memory (KSM), which consist...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.226852.117
updated:2018-06-01 00:00:00
abstract::Mirtrons are intronic hairpin substrates of the dicing machinery that generate functional microRNAs. In this study, we describe experimental assays that defined the essential requirements for entry of introns into the mirtron pathway. These data informed a bioinformatic screen that effectively identified functional mi...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.113050.110
updated:2011-02-01 00:00:00
abstract::It is widely accepted that newly arisen duplicate gene pairs experience an altered selective regime that is often manifested as an increase in the rate of protein sequence evolution. Many details about the nature of the rate acceleration remain unknown, however, including its typical magnitude and duration, and whethe...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.6341207
updated:2008-01-01 00:00:00
abstract::Comparative genomics is a promising approach to the challenging problem of eukaryotic regulatory element identification, because functional noncoding sequences may be conserved across species from evolutionary constraints. We systematically analyzed known human and Saccharomyces cerevisiae regulatory elements and disc...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.1327604
updated:2004-03-01 00:00:00
abstract::Inbred strains of the laboratory rat are widely used for identifying genetic regions involved in the control of complex quantitative phenotypes of biomedical importance. The draft genomic sequence of the rat now provides essential information for annotating rat quantitative trait locus (QTL) maps. Following the survey...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.2001604
updated:2004-04-01 00:00:00
abstract::To accelerate the molecular analysis of behavior in the honey bee (Apis mellifera), we created expressed sequence tag (EST) and cDNA microarray resources for the bee brain. Over 20,000 cDNA clones were partially sequenced from a normalized (and subsequently subtracted) library generated from adult A. mellifera brains....
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.5302
updated:2002-04-01 00:00:00
abstract::Highly overlapping patterns of genome-wide binding of many distinct transcription factors have been observed in worms, insects, and mammals, but the origins and consequences of this overlapping binding remain unclear. While analyzing chromatin immunoprecipitation data sets from 21 sequence-specific transcription facto...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.130682.111
updated:2012-04-01 00:00:00
abstract::Single-nucleotide polymorphisms (SNPs) are the most frequently found DNA sequence variations in the human genome. It has been argued that a dense set of SNP markers can be used to identify genetic factors associated with complex disease traits. Because all high-throughput genotyping methods require precise sequence kn...
journal_title:Genome research
pub_type: journal article
doi:
updated:1999-05-01 00:00:00
abstract::The exponential growth of pathogen nucleic acid sequences available in public domain databases has invited their direct use in pathogen detection, identification, and surveillance strategies. DNA microarray technology has offered the potential for the direct DNA sequence analysis of a broad spectrum of pathogens of in...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.4337206
updated:2006-04-01 00:00:00
abstract::Targeted genotyping of transcriptome-scale genetic markers is highly attractive for genetic, ecological, and evolutionary studies, but achieving this goal in a cost-effective manner remains a major challenge, especially for laboratories working on nonmodel organisms. Here, we develop a high-throughput, sequencing-base...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.235820.118
updated:2018-12-01 00:00:00
abstract::It is known that sequencing error can bias estimation of evolutionary or population genetic parameters. This problem is more prominent in deep resequencing studies because of their large sample size n, and a higher probability of error at each nucleotide site. We propose a new method based on the composite likelihood ...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.097543.109
updated:2010-01-01 00:00:00
abstract::The rate of transcription elongation plays an important role in the timing of expression of full-length transcripts as well as in the regulation of alternative splicing. In this study, we coupled Bru-seq technology with 5,6-dichlorobenzimidazole 1-β-D-ribofuranoside (DRB) to estimate the elongation rates of over 2000 ...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.171405.113
updated:2014-06-01 00:00:00
abstract::We have developed a mutation-scanning approach suitable for whole population screening for unknown mutations. The method, meltMADGE, combines thermal ramp electrophoresis with MADGE to achieve suitable cost efficiency and throughput. The sensitivity was tested in blind trials using 54 amplicons representing the BRCA1 ...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.3313405
updated:2005-07-01 00:00:00
abstract::Histone modifications are now well-established mediators of transcriptional programs that distinguish cell states. However, the kinetics of histone modification and their role in mediating rapid, signal-responsive gene expression changes has been little studied on a genome-wide scale. Vascular endothelial growth facto...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.149674.112
updated:2013-06-01 00:00:00
abstract::Increasing evidence suggests that interactions between regulatory genomic elements play an important role in regulating gene expression. We generated a genome-wide interaction map of regulatory elements in human cells (ENCODE tier 1 cells, K562, GM12878) using Chromatin Interaction Analysis by Paired-End Tag sequencin...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.176586.114
updated:2014-12-01 00:00:00
abstract::Here we use a chromosome-level genome assembly of a prairie rattlesnake (Crotalus viridis), together with Hi-C, RNA-seq, and whole-genome resequencing data, to study key features of genome biology and evolution in reptiles. We identify the rattlesnake Z Chromosome, including the recombining pseudoautosomal region, and...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.240952.118
updated:2019-04-01 00:00:00
abstract::Identity-by-descent (IBD) inference is the problem of establishing a genetic connection between two individuals through a genomic segment that is inherited by both individuals from a recent common ancestor. IBD inference is an important preceding step in a variety of population genomic studies, ranging from demographi...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.173641.114
updated:2015-02-01 00:00:00
abstract::The functional classification of genes on a genome-wide scale is now in its infancy, and we make a first attempt to assess existing methods and identify sources of error. To this end, we compared two independent efforts for associating proteins with functions, one implemented by FlyBase and the other by PANTHER at Cel...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.771603
updated:2003-09-01 00:00:00
abstract::Here we describe software tools for the automated detection of DNA restriction fragments resolved on agarose fingerprinting gels. We present a mathematical model for the location and shape of the restriction fragments as a function of fragment size, with model parameters determined empirically from "marker" lanes cont...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.904303
updated:2003-05-01 00:00:00
abstract::The capacity of the honey bee to produce three phenotypically distinct organisms (two female castes; queens and sterile workers, and haploid male drones) from one genotype represents one of the most remarkable examples of developmental plasticity in any phylum. The queen-worker morphological and reproductive divide is...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.236497.118
updated:2018-10-01 00:00:00
abstract::Interindividual variability in response to chemicals and drugs is a common regulatory concern. It is assumed that xenobiotic-induced adverse reactions have a strong genetic basis, but many mechanism-based investigations have not been successful in identifying susceptible individuals. While recent advances in pharmacog...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.090241.108
updated:2009-09-01 00:00:00
abstract::Essential genes refer to those whose null mutation leads to lethality or sterility. Theoretical reasoning and empirical data both suggest that the fatal effect of inactivating an essential gene can be attributed to either the loss of indispensable core cellular function (Type I), or the gain of fatal side effects afte...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.205955.116
updated:2016-10-01 00:00:00
abstract::Low-copy repeats, or segmental duplications, are highly dynamic regions in the genome. The low-copy repeats on chromosome 22q11.2 (LCR22) are a complex mosaic of genes and pseudogenes formed by duplication processes; they mediate chromosome rearrangements associated with velo-cardio-facial syndrome/DiGeorge syndrome, ...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.1549503
updated:2003-12-01 00:00:00
abstract::Contamination by present-day human and microbial DNA is one of the major hindrances for large-scale genomic studies using ancient biological material. We describe a new molecular method, U selection, which exploits one of the most distinctive features of ancient DNA--the presence of deoxyuracils--for selective enrichm...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.174201.114
updated:2014-09-01 00:00:00
abstract::Caenorhabditis elegans was the first multicellular eukaryotic genome sequenced to apparent completion. Although this assembly employed a standard C. elegans strain (N2), it used sequence data from several laboratories, with DNA propagated in bacteria and yeast. Thus, the N2 assembly has many differences from any C. el...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.244830.118
updated:2019-06-01 00:00:00
abstract::Large scale gene perturbation experiments generate information about the number of genes whose activity is directly or indirectly affected by a gene perturbation. From this information, one can numerically estimate coarse structural network features such as the total number of direct regulatory interactions and the nu...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.193902
updated:2002-02-01 00:00:00
abstract::Disturbance of DNA methylation leading to aberrant gene expression has been implicated in the etiology of many diseases. Whereas variation at the genetic level has been studied extensively, less is known about the extent and function of epigenetic variation. To explore variation and heritability of DNA methylation, we...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.119685.110
updated:2011-11-01 00:00:00
abstract::Hi-C is a powerful technology for studying genome-wide chromatin interactions. However, current methods for assessing Hi-C data reproducibility can produce misleading results because they ignore spatial features in Hi-C data, such as domain structure and distance dependence. We present HiCRep, a framework for assessin...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.220640.117
updated:2017-11-01 00:00:00
abstract::After completion of the Schizosaccharomyces pombe genome sequence, we have carried out a pilot gene deletion project to assess the feasibility of a genome-wide deletion project and to estimate the percentage of essential genes. Using a PCR-based gene deletion procedure, we investigated 100 genes within a 253-kb region...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.636103
updated:2003-03-01 00:00:00
abstract::The initial comparison of the human and chimpanzee genome sequences revealed 16 genomic regions with an unusually high density of rapidly evolving genes. One such region is the whey acidic protein (WAP) four-disulfide core domain locus (or WFDC locus), which contains 14 WFDC genes organized in two subloci on human chr...
journal_title:Genome research
pub_type: journal article
doi:10.1101/gr.6004607
updated:2007-03-01 00:00:00