Abstract:
Next-generation sequencing is a powerful approach for discovering genetic variation. Sensitive variant calling and haplotype inference from population sequencing data remain challenging. We describe methods for high-quality discovery, genotyping, and phasing of SNPs for low-coverage (approximately 5×) sequencing of populations, implemented in a pipeline called SNPTools. Our pipeline contains several innovations that specifically address challenges caused by low-coverage population sequencing: (1) effective base depth (EBD), a nonparametric statistic that enables more accurate statistical modeling of sequencing data; (2) variance ratio scoring, a variance-based statistic that discovers polymorphic loci with high sensitivity and specificity; and (3) BAM-specific binomial mixture modeling (BBMM), a clustering algorithm that generates robust genotype likelihoods from heterogeneous sequencing data. Last, we develop an imputation engine that refines raw genotype likelihoods to produce high-quality phased genotypes/haplotypes. Designed for large population studies, SNPTools' input/output (I/O)- and storage-aware design leads to improved computing performance on large sequencing data sets. We apply SNPTools to the International 1000 Genomes Project (1000G) Phase 1 low-coverage data set and obtain genotyping accuracy comparable to that of SNP microarray.
journal_name:Genome Res
journal_title:Genome research
authors:Wang Y, Lu J, Yu J, Gibbs RA, Yu F
doi:10.1101/gr.146084.112
subject:Has Abstract
pub_date:2013-05-01 00:00:00
pages:833-42
issue:5
eissn:1088-9051
issn:1549-5469
pii:gr.146084.112
journal_volume:23
pub_type: Journal Article
Related literature
GENOME RESEARCH literature collection
abstract::DNA microarrays produced by deposition (or 'spotting') of a single long oligonucleotide probe for each gene may be an attractive alternative to other types of arrays. We produced spotted oligonucleotide arrays using two large collections of approximately 70-mer probes, and used these arrays to analyze gene expression i...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.1048803
update_date:2003-07-01 00:00:00
abstract::Molecular evolution studies are usually based on the analysis of individual genes and thus reflect only small-range variations in genomic sequences. A complementary approach is to study the evolutionary history of rearrangements in entire genomes based on the analysis of gene orders. The progress in whole genome seque...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.3002305
update_date:2005-01-01 00:00:00
abstract::Short tandem repeats (STRs) have a wide range of applications, including medical genetics, forensics, and genetic genealogy. High-throughput sequencing (HTS) has the potential to profile hundreds of thousands of STR loci. However, mainstream bioinformatics pipelines are inadequate for the task. These pipelines treat S...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.135780.111
update_date:2012-06-01 00:00:00
abstract::Extracellular cues play critical roles in the establishment of the epigenome during development and may also contribute to epigenetic perturbations found in disease states. The direct role of the local tissue environment on the post-development human epigenome, however, remains unclear due to limitations in studies of...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.166439.113
update_date:2014-04-01 00:00:00
abstract::Here we describe software tools for the automated detection of DNA restriction fragments resolved on agarose fingerprinting gels. We present a mathematical model for the location and shape of the restriction fragments as a function of fragment size, with model parameters determined empirically from "marker" lanes cont...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.904303
update_date:2003-05-01 00:00:00
abstract::We have developed a novel quantitative method for rapidly assessing the CpG methylation density of a DNA region in mammalian cells. After bisulfite modification of genomic DNA, the region of interest is PCR amplified with primers containing two dam sites (GATC). The purified PCR products are then incubated with 14C-la...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.202501
update_date:2002-01-01 00:00:00
abstract::Thousands of long noncoding RNAs (lncRNAs) have been found in vertebrate animals, a few of which have known biological roles. To better understand the genomics and features of lncRNAs in invertebrates, we used available RNA-seq, poly(A)-site, and ribosome-mapping data to identify lncRNAs of Caenorhabditis elegans. We ...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.140475.112
update_date:2012-12-01 00:00:00
abstract::Contamination by present-day human and microbial DNA is one of the major hindrances for large-scale genomic studies using ancient biological material. We describe a new molecular method, U selection, which exploits one of the most distinctive features of ancient DNA--the presence of deoxyuracils--for selective enrichm...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.174201.114
update_date:2014-09-01 00:00:00
abstract::In-gel competitive reassociation (IGCR) is a method of differential subtraction to enrich polymorphic DNA restriction fragments between two DNA samples without probes or specific sequence information. Here, we show that by combining IGCR and expressed sequence tags (EST) array hybridization, polymorphic DNA fragments ...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.434103
update_date:2003-03-01 00:00:00
abstract::Upon invasion of the erythrocyte cell, the malaria parasite remodels its environment; in particular, it establishes a complex membrane network, which connects the parasitophorous vacuole to the host plasma membrane and is involved in protein transport and trafficking. We have identified a novel subtelomeric gene famil...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.2126104
update_date:2004-06-01 00:00:00
abstract::Caenorhabditis elegans was the first multicellular eukaryotic genome sequenced to apparent completion. Although this assembly employed a standard C. elegans strain (N2), it used sequence data from several laboratories, with DNA propagated in bacteria and yeast. Thus, the N2 assembly has many differences from any C. el...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.244830.118
update_date:2019-06-01 00:00:00
abstract::Primate pericentromeric regions recently have been shown to exhibit extraordinary evolutionary plasticity. In this paper we report an additional peculiar feature of these regions that we discovered while analyzing, by FISH, the evolutionary conservation of primate phylogenetic chromosome IX. If the position of the cen...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.9.12.1184
update_date:1999-12-01 00:00:00
abstract::All individuals in a finite population are related if traced back long enough and will, therefore, share regions of their genomes identical by descent (IBD). Detection of such regions has several important applications, from answering questions about human evolution to locating regions in the human genome containing di...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.115360.110
update_date:2011-07-01 00:00:00
abstract::A time course experiment is a widely used design in the study of cellular processes such as differentiation or response to stimuli. In this paper, we propose time course regulatory analysis (TimeReg) as a method for the analysis of gene regulatory networks based on paired gene expression and chromatin accessibility da...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.257063.119
update_date:2020-04-01 00:00:00
abstract::The need to translate genes to function has positioned the rat as an invaluable animal model for genomic research. The significant increase in genomic resources in recent years has had an immediate functional application in the rat. Many of the resources for translational research are already in place and are ready to...
journal_title:Genome research
pub_type: Journal Article, Review
doi:10.1101/gr.3744005
update_date:2005-12-01 00:00:00
abstract::Low-copy repeats, or segmental duplications, are highly dynamic regions in the genome. The low-copy repeats on chromosome 22q11.2 (LCR22) are a complex mosaic of genes and pseudogenes formed by duplication processes; they mediate chromosome rearrangements associated with velo-cardio-facial syndrome/DiGeorge syndrome, ...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.1549503
update_date:2003-12-01 00:00:00
abstract::Increasing evidence suggests that interactions between regulatory genomic elements play an important role in regulating gene expression. We generated a genome-wide interaction map of regulatory elements in human cells (ENCODE tier 1 cells, K562, GM12878) using Chromatin Interaction Analysis by Paired-End Tag sequencin...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.176586.114
update_date:2014-12-01 00:00:00
abstract::Cephalochordates, urochordates, and vertebrates evolved from a common ancestor over 520 million years ago. To improve our understanding of chordate evolution and the origin of vertebrates, we intensively searched for particular genes, gene families, and conserved noncoding elements in the sequenced genome of the cepha...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.073676.107
update_date:2008-07-01 00:00:00
abstract::Histone modifications are now well-established mediators of transcriptional programs that distinguish cell states. However, the kinetics of histone modification and their role in mediating rapid, signal-responsive gene expression changes has been little studied on a genome-wide scale. Vascular endothelial growth facto...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.149674.112
update_date:2013-06-01 00:00:00
abstract::In contrast to other animal cell lines, the chicken pre-B cell lymphoma line, DT40, exhibits a high level of homologous recombination, which can be exploited to generate site-specific alterations in defined target genes or regions. In addition, the ability to generate human/chicken monochromosomal hybrids in the DT40 ...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.8.6.666
update_date:1998-06-01 00:00:00
abstract::CTCF is a ubiquitously expressed regulator of fundamental genomic processes including transcription, intra- and interchromosomal interactions, and chromatin structure. Because of its critical role in genome function, CTCF binding patterns have long been assumed to be largely invariant across different cellular environ...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.136101.111
update_date:2012-09-01 00:00:00
abstract::The very small fraction of putative binding sites (BSs) that are occupied by transcription factors (TFs) in vivo can be highly variable across different cell types. This observation has been partly attributed to changes in chromatin accessibility and histone modification (HM) patterns surrounding BSs. Previous studies...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.220079.116
update_date:2018-01-11 00:00:00
abstract::The reported human genome sequence includes about 400 gaps of unknown sequence that were not found in the bacterial artificial chromosome (BAC) and cosmid libraries used for sequencing of the genome. These missing sequences correspond to approximately 1% of euchromatic regions of the human genome. Gap filling is a lab...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.1929904
update_date:2004-02-01 00:00:00
abstract::Double anal fin (Da) is a medaka with an autosomal semidominant mutation that causes mirror image duplication of the ventral region concentrating on the caudal region. The chromosomal location of the Da gene and its sequence have remained unknown. We constructed a medaka linkage map as a first step to approach positio...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.9.12.1277
update_date:1999-12-01 00:00:00
abstract::The detailed genomic organization of a gene-dense region at human chromosome 12p13, spanning 223 kb of contiguous sequence, was determined. This region is composed of 20 genes and several other expressed sequences. Experimental tools including RT-PCR and cDNA sequencing, combined with gene prediction programs, were ut...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.7.3.268
update_date:1997-03-01 00:00:00
abstract::Chicken B cells create their immunoglobulin repertoire within the Bursa of Fabricius by gene conversion. The high homologous recombination activity is shared by the bursal B-cell-derived DT40 cell line, which integrates transfected DNA constructs at high rates into its endogenous loci. Targeted integration in DT40 is ...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.10.12.2062
update_date:2000-12-01 00:00:00
abstract::The dense RFLP linkage map of tomato (Lycopersicon esculentum) contains >300 anonymous cDNA clones. Of those clones, 272 were partially or completely sequenced. The sequences were compared at the DNA and protein level to known genes in databases. For 57% of the clones, a significant match to previously described genes...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.8.8.842
update_date:1998-08-01 00:00:00
abstract::DNA is a universal language encrypted with biological instruction for life. In higher organisms, the genetic information is preserved predominantly in an organized exon/intron structure. When a gene is expressed, the exons are spliced together to form the transcript for protein synthesis. We have developed a complexit...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.313703
update_date:2003-02-01 00:00:00
abstract::Sequence-specific DNA-binding transcription factors have widespread biological significance in the regulation of gene expression. However, in lower prokaryotes and eukaryotic metazoans, it is usually difficult to find transcription regulatory factors that recognize specific target promoters. To address this, we have d...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.086595.108
update_date:2009-07-01 00:00:00
abstract::How pathogens evolve their virulence to humans in nature is a scientific issue of great medical and biological importance. Shiga toxin (Stx)-producing Escherichia coli (STEC) and enteropathogenic E. coli (EPEC) are the major foodborne pathogens that can cause hemolytic uremic syndrome and infantile diarrhea, respectiv...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.249268.119
update_date:2019-09-01 00:00:00