Accurate detection and genotyping of SNPs utilizing population sequencing data.

Abstract:

Next-generation sequencing technologies have made it possible to sequence targeted regions of the human genome in hundreds of individuals. Deep sequencing represents a powerful approach for the discovery of the complete spectrum of DNA sequence variants in functionally important genomic intervals. Current methods for single nucleotide polymorphism (SNP) detection are designed to detect SNPs from single individual sequence data sets. Here, we describe a novel method, SNIP-Seq (single nucleotide polymorphism identification from population sequence data), that leverages sequence data from a population of individuals to detect SNPs and assign genotypes to individuals. To evaluate our method, we utilized sequence data from a 200-kilobase (kb) region on chromosome 9p21 of the human genome. This region was sequenced in 48 individuals (five sequenced in duplicate) using the Illumina GA platform. Using this data set, we demonstrate that our method is highly accurate for detecting variants and can filter out false SNPs that are attributable to sequencing errors. The concordance of sequencing-based genotype assignments between duplicate samples was 98.8%. The 200-kb region was independently sequenced to a high depth of coverage using two sequence pools containing the 48 individuals. Many of the novel SNPs identified by SNIP-Seq from the individual sequencing were validated by the pooled sequencing data and were subsequently confirmed by Sanger sequencing. We estimate that SNIP-Seq achieves a low false-positive rate of approximately 2%, improving upon the higher false-positive rate for existing methods that do not utilize population sequence data. Collectively, these results suggest that analysis of population sequencing data is a powerful approach for the accurate detection of SNPs and the assignment of genotypes to individual samples.

journal_name

Genome Res

journal_title

Genome research

authors

Bansal V,Harismendy O,Tewhey R,Murray SS,Schork NJ,Topol EJ,Frazer KA

doi

10.1101/gr.100040.109

pub_date

2010-04-01 00:00:00

pages

537-45

issue

4

eissn

1549-5469

issn

1088-9051

pii

gr.100040.109

journal_volume

20

pub_type

Journal Article
  • Toward the development of a gene index to the human genome: an assessment of the nature of high-throughput EST sequence data.

    abstract::A rigorous analysis of the Merck-sponsored EST data with respect to known gene sequences increases the utility of the data set and helps refine methods for building a gene index. A highly curated human transcript data base was used as a reference data set of known genes. A detailed analysis of EST sequences derived fr...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.6.9.829

    authors: Aaronson JS,Eckman B,Blevins RA,Borkowski JA,Myerson J,Imran S,Elliston KO

    updated:1996-09-01 00:00:00

  • Alterations in TCF7L2 expression define its role as a key regulator of glucose metabolism.

    abstract::Genome-wide association studies (GWAS) have consistently implicated noncoding variation within the TCF7L2 locus with type 2 diabetes (T2D) risk. While this locus represents the strongest genetic determinant for T2D risk in humans, it remains unclear how these noncoding variants affect disease etiology. To test the hyp...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.123745.111

    authors: Savic D,Ye H,Aneas I,Park SY,Bell GI,Nobrega MA

    updated:2011-09-01 00:00:00

  • Ancestral grass karyotype reconstruction unravels new mechanisms of genome shuffling as a source of plant evolution.

    abstract::The comparison of the chromosome numbers of today's species with common reconstructed paleo-ancestors has led to intense speculation of how chromosomes have been rearranged over time in mammals. However, similar studies in plants with respect to genome evolution as well as molecular mechanisms leading to mosaic synten...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.109744.110

    authors: Murat F,Xu JH,Tannier E,Abrouk M,Guilhot N,Pont C,Messing J,Salse J

    updated:2010-11-01 00:00:00

  • The genome sequence of Mycoplasma mycoides subsp. mycoides SC type strain PG1T, the causative agent of contagious bovine pleuropneumonia (CBPP).

    abstract::Mycoplasma mycoides subsp. mycoides SC (MmymySC) is the etiological agent of contagious bovine pleuropneumonia (CBPP), a highly contagious respiratory disease in cattle. The genome of MmymySC type strain PG1(T) has been sequenced to map all the genes and to facilitate further studies regarding the cell function of the ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.1673304

    authors: Westberg J,Persson A,Holmberg A,Goesmann A,Lundeberg J,Johansson KE,Pettersson B,Uhlén M

    updated:2004-02-01 00:00:00

  • Probing genomic diversity and evolution of Escherichia coli O157 by single nucleotide polymorphisms.

    abstract::Infections by Shiga toxin-producing Escherichia coli O157:H7 (STEC O157) are the predominant cause of bloody diarrhea and hemolytic uremic syndrome in the United States. In silico comparison of the two complete STEC O157 genomes (Sakai and EDL933) revealed a strikingly high level of sequence identity in orthologous pr...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.4759706

    authors: Zhang W,Qi W,Albert TJ,Motiwala AS,Alland D,Hyytia-Trees EK,Ribot EM,Fields PI,Whittam TS,Swaminathan B

    updated:2006-06-01 00:00:00

  • Spatial enhancer clustering and regulation of enhancer-proximal genes by cohesin.

    abstract::In addition to mediating sister chromatid cohesion during the cell cycle, the cohesin complex associates with CTCF and with active gene regulatory elements to form long-range interactions between its binding sites. Genome-wide chromosome conformation capture had shown that cohesin's main role in interphase genome orga...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.184986.114

    authors: Ing-Simmons E,Seitan VC,Faure AJ,Flicek P,Carroll T,Dekker J,Fisher AG,Lenhard B,Merkenschlager M

    updated:2015-04-01 00:00:00

  • A scalable high-throughput chemical synthesizer.

    abstract::A machine that employs a novel reagent delivery technique for biomolecular synthesis has been developed. This machine separates the addressing of individual synthesis sites from the actual process of reagent delivery by using masks placed over the sites. Because of this separation, this machine is both cost-effective ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.359002

    authors: Livesay EA,Liu YH,Luebke KJ,Irick J,Belosludtsev Y,Rayner S,Balog R,Johnston SA

    updated:2002-12-01 00:00:00

  • Genome-scale cloning and expression of individual open reading frames using topoisomerase I-mediated ligation.

    abstract::The in vitro cloning of DNA molecules traditionally uses PCR amplification or site-specific restriction endonucleases to generate linear DNA inserts with defined termini and requires DNA ligase to covalently join those inserts to vectors with the corresponding ends. We have used the properties of Vaccinia DNA topoisom...

    journal_title:Genome research

    pub_type: Journal Article

    doi:

    authors: Heyman JA,Cornthwaite J,Foncerrada L,Gilmore JR,Gontang E,Hartman KJ,Hernandez CL,Hood R,Hull HM,Lee WY,Marcil R,Marsh EJ,Mudd KM,Patino MJ,Purcell TJ,Rowland JJ,Sindici ML,Hoeffler JP

    updated:1999-04-01 00:00:00

  • The discovery of integrated gene networks for autism and related disorders.

    abstract::Despite considerable genetic heterogeneity underlying neurodevelopmental diseases, there is compelling evidence that many disease genes will map to a much smaller number of biological subnetworks. We developed a computational method, termed MAGI (merging affected genes into integrated networks), that simultaneously in...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.178855.114

    authors: Hormozdiari F,Penn O,Borenstein E,Eichler EE

    updated:2015-01-01 00:00:00

  • Function and evolution of a gene family encoding odorant binding-like proteins in a social insect, the honey bee (Apis mellifera).

    abstract::The remarkable olfactory power of insect species is thought to be generated by a combinatorial action of two large protein families, G protein-coupled olfactory receptors (ORs) and odorant binding proteins (OBPs). In olfactory sensilla, OBPs deliver hydrophobic airborne molecules to ORs, but their expression in nonolf...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.5075706

    authors: Forêt S,Maleszka R

    updated:2006-11-01 00:00:00

  • A novel k-mer set memory (KSM) motif representation improves regulatory variant prediction.

    abstract::The representation and discovery of transcription factor (TF) sequence binding specificities is critical for understanding gene regulatory networks and interpreting the impact of disease-associated noncoding genetic variants. We present a novel TF binding motif representation, the k-mer set memory (KSM), which consist...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.226852.117

    authors: Guo Y,Tian K,Zeng H,Guo X,Gifford DK

    updated:2018-06-01 00:00:00

  • Massive reshaping of genome-nuclear lamina interactions during oncogene-induced senescence.

    abstract::Cellular senescence is a mechanism that virtually irreversibly suppresses the proliferative capacity of cells in response to various stress signals. This includes the expression of activated oncogenes, which causes Oncogene-Induced Senescence (OIS). A body of evidence points to the involvement in OIS of chromatin reor...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.225763.117

    authors: Lenain C,de Graaf CA,Pagie L,Visser NL,de Haas M,de Vries SS,Peric-Hupkes D,van Steensel B,Peeper DS

    updated:2017-10-01 00:00:00

  • High-throughput genotyping by whole-genome resequencing.

    abstract::The next-generation sequencing technology coupled with the growing number of genome sequences opens the opportunity to redesign genotyping strategies for more effective genetic mapping and genome analysis. We have developed a high-throughput method for genotyping recombinant populations utilizing whole-genome resequen...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.089516.108

    authors: Huang X,Feng Q,Qian Q,Zhao Q,Wang L,Wang A,Guan J,Fan D,Weng Q,Huang T,Dong G,Sang T,Han B

    updated:2009-06-01 00:00:00

  • Pattern of sequence variation across 213 environmental response genes.

    abstract::To promote the clinical and epidemiological studies that improve our understanding of human genetic susceptibility to environmental exposure, the Environmental Genome Project (EGP) has scanned 213 environmental response genes involved in DNA repair, cell cycle regulation, apoptosis, and metabolism for single nucleotid...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.2730004

    authors: Livingston RJ,von Niederhausern A,Jegga AG,Crawford DC,Carlson CS,Rieder MJ,Gowrisankar S,Aronow BJ,Weiss RB,Nickerson DA

    updated:2004-10-01 00:00:00

  • Functional DNA methylation differences between tissues, cell types, and across individuals discovered using the M&M algorithm.

    abstract::DNA methylation plays key roles in diverse biological processes such as X chromosome inactivation, transposable element repression, genomic imprinting, and tissue-specific gene expression. Sequencing-based DNA methylation profiling provides an unprecedented opportunity to map and compare complete DNA methylomes. This ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.156539.113

    authors: Zhang B,Zhou Y,Lin N,Lowdon RF,Hong C,Nagarajan RP,Cheng JB,Li D,Stevens M,Lee HJ,Xing X,Zhou J,Sundaram V,Elliott G,Gu J,Shi T,Gascard P,Sigaroudinia M,Tlsty TD,Kadlecek T,Weiss A,O'Geen H,Farnham PJ,Maire

    updated:2013-09-01 00:00:00

  • Arabidopsis-rice: will colinearity allow gene prediction across the eudicot-monocot divide?

    abstract::With the genomic sequencing of Arabidopsis nearing completion and rice sequencing very much in its infancy, a key question is whether we can exploit the Arabidopsis sequence to identify candidate genes for traits in cereal crops using a map-based approach. This requires the existence of colinearity between the Arabido...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.9.9.825

    authors: Devos KM,Beales J,Nagamura Y,Sasaki T

    updated:1999-09-01 00:00:00

  • Allele-specific methylation is prevalent and is contributed by CpG-SNPs in the human genome.

    abstract::In diploid mammalian genomes, parental alleles can exhibit different methylation patterns (allele-specific DNA methylation, ASM), which have been documented in a small number of cases except for the imprinted regions and X chromosomes in females. We carried out a chromosome-wide survey of ASM across 16 human pluripote...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.104695.109

    authors: Shoemaker R,Deng J,Wang W,Zhang K

    updated:2010-07-01 00:00:00

  • Translation initiation downstream from annotated start codons in human mRNAs coevolves with the Kozak context.

    abstract::Eukaryotic translation initiation involves preinitiation ribosomal complex 5'-to-3' directional probing of mRNA for codons suitable for starting protein synthesis. The recognition of codons as starts depends on the codon identity and on its immediate nucleotide context known as Kozak context. When the context is weak ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.257352.119

    authors: Benitez-Cantos MS,Yordanova MM,O'Connor PBF,Zhdanov AV,Kovalchuk SI,Papkovsky DB,Andreev DE,Baranov PV

    updated:2020-07-01 00:00:00

  • Mutation detection using mass spectrometric separation of tiny oligonucleotide fragments.

    abstract::A DNA mutation detection protocol able to identify and characterize a previously unknown change in a given sequence in a rapid, efficient, sensitive, and inexpensive manner is required to take advantage of the resources now available to researchers through the genome sequencing projects. We have developed a method bas...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.gr-1578r

    authors: Elso C,Toohey B,Reid GE,Poetter K,Simpson RJ,Foote SJ

    updated:2002-09-01 00:00:00

  • Polygenic cis-regulatory adaptation in the evolution of yeast pathogenicity.

    abstract::The acquisition of new genes, via horizontal transfer or gene duplication/diversification, has been the dominant mechanism thus far implicated in the evolution of microbial pathogenicity. In contrast, the role of many other modes of evolution, such as changes in gene expression regulation, remains unknown. A transition...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.134080.111

    authors: Fraser HB,Levy S,Chavan A,Shah HB,Perez JC,Zhou Y,Siegal ML,Sinha H

    updated:2012-10-01 00:00:00

  • CG dinucleotides enhance promoter activity independent of DNA methylation.

    abstract::Most mammalian RNA polymerase II initiation events occur at CpG islands, which are rich in CpGs and devoid of DNA methylation. Despite their relevance for gene regulation, it is unknown to what extent the CpG dinucleotide itself actually contributes to promoter activity. To address this question, we determined the tra...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.241653.118

    authors: Hartl D,Krebs AR,Grand RS,Baubec T,Isbel L,Wirbelauer C,Burger L,Schübeler D

    updated:2019-04-01 00:00:00

  • Genes and transposons are differentially methylated in plants, but not in mammals.

    abstract::DNA methylation is found in many eukaryotes, but its function is still controversial. We have studied the methylation of plant and animal genomes using a PCR-based technique amenable for high throughput. Repetitive elements are methylated in both organisms, but whereas most mammalian exons are methylated, plant exons ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.1784803

    authors: Rabinowicz PD,Palmer LE,May BP,Hemann MT,Lowe SW,McCombie WR,Martienssen RA

    updated:2003-12-01 00:00:00

  • The origins and evolution of chromosomes, dosage compensation, and mechanisms underlying venom regulation in snakes.

    abstract::Here we use a chromosome-level genome assembly of a prairie rattlesnake (Crotalus viridis), together with Hi-C, RNA-seq, and whole-genome resequencing data, to study key features of genome biology and evolution in reptiles. We identify the rattlesnake Z Chromosome, including the recombining pseudoautosomal region, and...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.240952.118

    authors: Schield DR,Card DC,Hales NR,Perry BW,Pasquesi GM,Blackmon H,Adams RH,Corbin AB,Smith CF,Ramesh B,Demuth JP,Betrán E,Tollis M,Meik JM,Mackessy SP,Castoe TA

    updated:2019-04-01 00:00:00

  • An MCMC algorithm for haplotype assembly from whole-genome sequence data.

    abstract::In comparison to genotypes, knowledge about haplotypes (the combination of alleles present on a single chromosome) is much more useful for whole-genome association studies and for making inferences about human evolutionary history. Haplotypes are typically inferred from population genotype data using computational met...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.077065.108

    authors: Bansal V,Halpern AL,Axelrod N,Bafna V

    updated:2008-08-01 00:00:00

  • X chromosome cDNA microarray screening identifies a functional PLP2 promoter polymorphism enriched in patients with X-linked mental retardation.

    abstract::X-linked Mental Retardation (XLMR) occurs in 1 in 600 males and is highly genetically heterogeneous. We used a novel human X chromosome cDNA microarray (XCA) to survey the expression profile of X-linked genes in lymphoblasts of XLMR males. Genes with altered expression verified by Northern blot and/or quantitative PCR...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.5336307

    authors: Zhang L,Jie C,Obie C,Abidi F,Schwartz CE,Stevenson RE,Valle D,Wang T

    updated:2007-05-01 00:00:00

  • Pervasive polymorphic imprinted methylation in the human placenta.

    abstract::The maternal and paternal copies of the genome are both required for mammalian development, and this is primarily due to imprinted genes, those that are monoallelically expressed based on parent-of-origin. Typically, this pattern of expression is regulated by differentially methylated regions (DMRs) that are establish...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.196139.115

    authors: Hanna CW,Peñaherrera MS,Saadeh H,Andrews S,McFadden DE,Kelsey G,Robinson WP

    updated:2016-06-01 00:00:00

  • An assessment of gene prediction accuracy in large DNA sequences.

    abstract::One of the first useful products from the human genome will be a set of predicted genes. Besides its intrinsic scientific interest, the accuracy and completeness of this data set is of considerable importance for human health and medicine. Though progress has been made on computational gene identification in terms of ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.122800

    authors: Guigó R,Agarwal P,Abril JF,Burset M,Fickett JW

    updated:2000-10-01 00:00:00

  • Coding and noncoding variants in HFM1, MLH3, MSH4, MSH5, RNF212, and RNF212B affect recombination rate in cattle.

    abstract::We herein study genetic recombination in three cattle populations from France, New Zealand, and the Netherlands. We identify 2,395,177 crossover (CO) events in 94,516 male gametes, and 579,996 CO events in 25,332 female gametes. The average number of COs was found to be larger in males (23.3) than in females (21.4). T...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.204214.116

    authors: Kadri NK,Harland C,Faux P,Cambisano N,Karim L,Coppieters W,Fritz S,Mullaart E,Baurain D,Boichard D,Spelman R,Charlier C,Georges M,Druet T

    updated:2016-10-01 00:00:00

  • A transposon-based strategy for sequencing repetitive DNA in eukaryotic genomes.

    abstract::Repetitive DNA is a significant component of eukaryotic genomes. We have developed a strategy to efficiently and accurately sequence repetitive DNA in the nematode Caenorhabditis elegans using integrated artificial transposons and automated fluorescent sequencing. Mapping and assembly tools represent important compone...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.7.5.551

    authors: Devine SE,Chissoe SL,Eby Y,Wilson RK,Boeke JD

    updated:1997-05-01 00:00:00

  • A first-generation whole genome-radiation hybrid map spanning the mouse genome.

    abstract::We have assembled a first-generation anchor map of the mouse genome using a panel of 94 whole-genome-radiation hybrids (WG-RHs) and 271 sequence-tagged sites (STSs). This is the first genome-wide RH anchor map of a model organism. All of the STSs have been previously localized on the genetic map and are located 8.8 Mb...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.7.12.1153

    authors: McCarthy LC,Terrett J,Davis ME,Knights CJ,Smith AL,Critcher R,Schmitt K,Hudson J,Spurr NK,Goodfellow PN

    updated:1997-12-01 00:00:00