Inference and analysis of haplotypes from combined genotyping studies deposited in dbSNP.

Abstract:

In the attempt to understand human variation and the genetic basis of complex disease, a tremendous number of single nucleotide polymorphisms (SNPs) have been discovered and deposited into NCBI's dbSNP public database. More than 2.7 million SNPs in the database have genotype information. This data provides an invaluable resource for understanding the structure of human variation and the design of genetic association studies. The genotypes deposited to dbSNP are unphased, and thus, the haplotype information is unknown. We applied the phasing method HAP to obtain the haplotype information, block partitions, and tag SNPs for all publicly available genotype data and deposited this information into the dbSNP database. We also deposited the orthologous chimpanzee reference sequence for each predicted haplotype block computed using the UCSC BLASTZ alignments of human and chimpanzee. Using dbSNP, researchers can now easily perform analyses using multiple genotype data sets from the same genomic regions. Dense and sparse genotype data sets from the same region were combined to show that the number of common haplotypes is significantly underestimated in whole genome data sets, while the predicted haplotypes over the common SNPs are consistent between studies. To validate the accuracy of the predictions, we benchmarked HAP's running time and phasing accuracy against PHASE. Although HAP is slightly less accurate than PHASE, HAP is over 1000 times faster than PHASE, making it suitable for application to the entire set of genotypes in dbSNP.

journal_name

Genome Res

journal_title

Genome research

authors

Zaitlen NA,Kang HM,Feolo ML,Sherry ST,Halperin E,Eskin E

doi

10.1101/gr.4297805

subject

Has Abstract

pub_date

2005-11-01 00:00:00

pages

1594-600

issue

11

eissn

1549-5469

issn

1088-9051

pii

15/11/1594

journal_volume

15

pub_type

Journal Article
  • A cross-platform analysis of 14,177 expression quantitative trait loci derived from lymphoblastoid cell lines.

    abstract::Gene expression levels can be an important link between DNA variation and phenotypic manifestations. Our previous map of global gene expression, based on ~400K single nucleotide polymorphisms (SNPs) and 50K transcripts in 400 sib pairs from the MRCA family panel, has been widely used to interpret the results of genome...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.142521.112

    authors: Liang L,Morar N,Dixon AL,Lathrop GM,Abecasis GR,Moffatt MF,Cookson WO

    update_date:2013-04-01 00:00:00

  • Short-insert libraries as a method of problem solving in genome sequencing.

    abstract::As the Human Genome Project moves into its sequencing phase, a serious problem has arisen. The same problem has been increasingly vexing in the closing phase of the Caenorhabditis elegans project. The difficulty lies in sequencing efficiently through certain regions in which the templates (DNA substrates for the seque...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.8.5.562

    authors: McMurray AA,Sulston JE,Quail MA

    update_date:1998-05-01 00:00:00

  • DNA enrichment by allele-specific hybridization (DEASH): a novel method for haplotyping and for detecting low-frequency base substitutional variants and recombinant DNA molecules.

    abstract::Detecting rare sequence variants in genomic DNA is central to the analysis of de novo mutation and recombination events and the detection of rare pathological mutations in mixed cell populations. Current PCR techniques suffer from noise that limits detection to variants present at a frequency of at least 10^(-4) to 10^(-5)...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.1214603

    authors: Jeffreys AJ,May CA

    update_date:2003-10-01 00:00:00

  • Evolutionary features of the 4-Mb Xq21.3 XY homology region revealed by a map at 60-kb resolution.

    abstract::Forty-three yeast artificial chromosomes (YACs) from the X chromosome have been overlapped across the 4-Mb Xq21.3 region, which is homologous to a segment in Yp11.1. The region is formatted to 60-kb resolution with 57 STSs and is merged at its edges with contigs specific for X. This allows a direct comparison of marke...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.7.4.307

    authors: Mumm S,Molini B,Terrell J,Srivastava A,Schlessinger D

    update_date:1997-04-01 00:00:00

  • 1-Mb resolution array-based comparative genomic hybridization using a BAC clone set optimized for cancer gene analysis.

    abstract::Array-based comparative genomic hybridization (aCGH) is a recently developed tool for genome-wide determination of DNA copy number alterations. This technology has tremendous potential for disease-gene discovery in cancer and developmental disorders as well as numerous other applications. However, widespread utilizati...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.1847304

    authors: Greshock J,Naylor TL,Margolin A,Diskin S,Cleaver SH,Futreal PA,deJong PJ,Zhao S,Liebman M,Weber BL

    update_date:2004-01-01 00:00:00

  • BRAFV600E remodels the melanocyte transcriptome and induces BANCR to regulate melanoma cell migration.

    abstract::Aberrations of protein-coding genes are a focus of cancer genomics; however, the impact of oncogenes on expression of the ~50% of transcripts without protein-coding potential, including long noncoding RNAs (lncRNAs), has been largely uncharacterized. Activating mutations in the BRAF oncogene are present in >70% of mel...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.140061.112

    authors: Flockhart RJ,Webster DE,Qu K,Mascarenhas N,Kovalski J,Kretz M,Khavari PA

    update_date:2012-06-01 00:00:00

  • An assessment of gene prediction accuracy in large DNA sequences.

    abstract::One of the first useful products from the human genome will be a set of predicted genes. Besides its intrinsic scientific interest, the accuracy and completeness of this data set is of considerable importance for human health and medicine. Though progress has been made on computational gene identification in terms of ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.122800

    authors: Guigó R,Agarwal P,Abril JF,Burset M,Fickett JW

    update_date:2000-10-01 00:00:00

  • The nonessentiality of essential genes in yeast provides therapeutic insights into a human disease.

    abstract::Essential genes refer to those whose null mutation leads to lethality or sterility. Theoretical reasoning and empirical data both suggest that the fatal effect of inactivating an essential gene can be attributed to either the loss of indispensable core cellular function (Type I), or the gain of fatal side effects afte...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.205955.116

    authors: Chen P,Wang D,Chen H,Zhou Z,He X

    update_date:2016-10-01 00:00:00

  • Gene amplification as double minutes or homogeneously staining regions in solid tumors: origin and structure.

    abstract::Double minutes (dmin) and homogeneously staining regions (hsr) are the cytogenetic hallmarks of genomic amplification in cancer. Different mechanisms have been proposed to explain their genesis. Recently, our group showed that the MYC-containing dmin in leukemia cases arise by excision and amplification (episome model...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.106252.110

    authors: Storlazzi CT,Lonoce A,Guastadisegni MC,Trombetta D,D'Addabbo P,Daniele G,L'Abbate A,Macchia G,Surace C,Kok K,Ullmann R,Purgato S,Palumbo O,Carella M,Ambros PF,Rocchi M

    update_date:2010-09-01 00:00:00

  • DNA methylation at hepatitis B viral integrants is associated with methylation at flanking human genomic sequences.

    abstract::Integration of DNA viruses into the human genome plays an important role in various types of tumors, including hepatitis B virus (HBV)-related hepatocellular carcinoma. However, the molecular details and clinical impact of HBV integration on either human or HBV epigenomes are unknown. Here, we show that methylation of...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.175240.114

    authors: Watanabe Y,Yamamoto H,Oikawa R,Toyota M,Yamamoto M,Kokudo N,Tanaka S,Arii S,Yotsuyanagi H,Koike K,Itoh F

    update_date:2015-03-01 00:00:00

  • Genomics and hearing impairment.

    abstract::Hearing impairment is clinically and genetically heterogeneous. There are >400 disorders in which hearing impairment is a characteristic of the syndrome, and family studies demonstrate that there are at least 30 autosomal loci for nonsyndromic hearing impairment. The genes that have been identified encode diaphanous (...

    journal_title:Genome research

    pub_type: Historical Article, Journal Article, Review

    doi:

    authors: Keats BJ,Berlin CI

    update_date:1999-01-01 00:00:00

  • Copy-number-aware differential analysis of quantitative DNA sequencing data.

    abstract::Developments in microarray and high-throughput sequencing (HTS) technologies have resulted in a rapid expansion of research into epigenomic changes that occur in normal development and in the progression of disease, such as cancer. Not surprisingly, copy number variation (CNV) has a direct effect on HTS read densities...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.139055.112

    authors: Robinson MD,Strbenac D,Stirzaker C,Statham AL,Song J,Speed TP,Clark SJ

    update_date:2012-12-01 00:00:00

  • The evolution of evolvability in microRNA target sites in vertebrates.

    abstract::The lack of long-term evolutionary conservation of microRNA (miRNA) target sites appears to contradict many analyses of their functions. Several hypotheses have been offered, but an attractive one, that the conservation may be a function of taxonomic hierarchy (vertebrates, mammals, primates, etc.), has rarely been disc...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.148916.112

    authors: Xu J,Zhang R,Shen Y,Liu G,Lu X,Wu CI

    update_date:2013-11-01 00:00:00

  • Obligate ligation-gated recombination (ObLiGaRe): custom-designed nuclease-mediated targeted integration through nonhomologous end joining.

    abstract::Custom-designed nucleases (CDNs) greatly facilitate genetic engineering by generating a targeted DNA double-strand break (DSB) in the genome. Once a DSB is created, specific modifications can be introduced around the breakage site during its repair by two major DNA damage repair (DDR) mechanisms: the dominant but erro...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.145441.112

    authors: Maresca M,Lin VG,Guo N,Yang Y

    update_date:2013-03-01 00:00:00

  • The mouse Aire gene: comparative genomic sequencing, gene organization, and expression.

    abstract::Mutations in the human AIRE gene (hAIRE) result in the development of an autoimmune disease named APECED (autoimmune polyendocrinopathy candidiasis ectodermal dystrophy; OMIM 240300). Previously, we have cloned hAIRE and shown that it codes for a putative transcription-associated factor. Here we report the cloning and...

    journal_title:Genome research

    pub_type: Journal Article

    doi:

    authors: Blechschmidt K,Schweiger M,Wertz K,Poulson R,Christensen HM,Rosenthal A,Lehrach H,Yaspo ML

    update_date:1999-02-01 00:00:00

  • Accurate detection and genotyping of SNPs utilizing population sequencing data.

    abstract::Next-generation sequencing technologies have made it possible to sequence targeted regions of the human genome in hundreds of individuals. Deep sequencing represents a powerful approach for the discovery of the complete spectrum of DNA sequence variants in functionally important genomic intervals. Current methods for ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.100040.109

    authors: Bansal V,Harismendy O,Tewhey R,Murray SS,Schork NJ,Topol EJ,Frazer KA

    update_date:2010-04-01 00:00:00

  • Comparative analysis of mammalian Y chromosomes illuminates ancestral structure and lineage-specific evolution.

    abstract::Although more than thirty mammalian genomes have been sequenced to draft quality, very few of these include the Y chromosome. This has limited our understanding of the evolutionary dynamics of gene persistence and loss, our ability to identify conserved regulatory elements, as well as our knowledge of the extent to which...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.154286.112

    authors: Li G,Davis BW,Raudsepp T,Pearks Wilkerson AJ,Mason VC,Ferguson-Smith M,O'Brien PC,Waters PD,Murphy WJ

    update_date:2013-09-01 00:00:00

  • Arabidopsis-rice: will colinearity allow gene prediction across the eudicot-monocot divide?

    abstract::With the genomic sequencing of Arabidopsis nearing completion and rice sequencing very much in its infancy, a key question is whether we can exploit the Arabidopsis sequence to identify candidate genes for traits in cereal crops using a map-based approach. This requires the existence of colinearity between the Arabido...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.9.9.825

    authors: Devos KM,Beales J,Nagamura Y,Sasaki T

    update_date:1999-09-01 00:00:00

  • Identification and analysis of internal promoters in Caenorhabditis elegans operons.

    abstract::The current Caenorhabditis elegans genomic annotation has many genes organized in operons. Using directionally stitched promoterGFP methodology, we have conducted the largest survey to date on the regulatory regions of annotated C. elegans operons and identified 65, over 25% of those studied, with internal promoters. ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.6824707

    authors: Huang P,Pleasance ED,Maydan JS,Hunt-Newbury R,O'Neil NJ,Mah A,Baillie DL,Marra MA,Moerman DG,Jones SJ

    update_date:2007-10-01 00:00:00

  • Why do human diversity levels vary at a megabase scale?

    abstract::Levels of diversity vary across the human genome. This variation is caused by two forces: differences in mutation rates and the differential impact of natural selection. Pertinent to the question of the relative importance of these two forces is the observation that both diversity within species and interspecies diver...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.3461105

    authors: Hellmann I,Prüfer K,Ji H,Zody MC,Pääbo S,Ptak SE

    update_date:2005-09-01 00:00:00

  • Spatial enhancer clustering and regulation of enhancer-proximal genes by cohesin.

    abstract::In addition to mediating sister chromatid cohesion during the cell cycle, the cohesin complex associates with CTCF and with active gene regulatory elements to form long-range interactions between its binding sites. Genome-wide chromosome conformation capture had shown that cohesin's main role in interphase genome orga...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.184986.114

    authors: Ing-Simmons E,Seitan VC,Faure AJ,Flicek P,Carroll T,Dekker J,Fisher AG,Lenhard B,Merkenschlager M

    update_date:2015-04-01 00:00:00

  • Transposon expression in the Drosophila brain is driven by neighboring genes and diversifies the neural transcriptome.

    abstract::Somatic transposon expression in neural tissue is commonly considered as a measure of mobilization and has therefore been linked to neuropathology and organismal individuality. We combined genome sequencing data with single-cell mRNA sequencing of the same inbred fly strain to map transposon expression in the Drosophi...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.259200.119

    authors: Treiber CD,Waddell S

    update_date:2020-11-01 00:00:00

  • Genomic localization of RNA binding proteins reveals links between pre-mRNA processing and transcription.

    abstract::Pre-mRNA processing often occurs in coordination with transcription thereby coupling these two key regulatory events. As such, many proteins involved in mRNA processing associate with the transcriptional machinery and are in proximity to DNA. This proximity allows for the mapping of the genomic associations of RNA bin...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.5211806

    authors: Swinburne IA,Meyer CA,Liu XS,Silver PA,Brodsky AS

    update_date:2006-07-01 00:00:00

  • Single-cell strand sequencing of a macaque genome reveals multiple nested inversions and breakpoint reuse during primate evolution.

    abstract::Rhesus macaque is an Old World monkey that shared a common ancestor with human ∼25 Myr ago and is an important animal model for human disease studies. A deep understanding of its genetics is therefore required for both biomedical and evolutionary studies. Among structural variants, inversions represent a driving force...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.265322.120

    authors: Maggiolini FAM,Sanders AD,Shew CJ,Sulovari A,Mao Y,Puig M,Catacchio CR,Dellino M,Palmisano D,Mercuri L,Bitonto M,Porubský D,Cáceres M,Eichler EE,Ventura M,Dennis MY,Korbel JO,Antonacci F

    update_date:2020-11-01 00:00:00

  • The amphioxus genome illuminates vertebrate origins and cephalochordate biology.

    abstract::Cephalochordates, urochordates, and vertebrates evolved from a common ancestor over 520 million years ago. To improve our understanding of chordate evolution and the origin of vertebrates, we intensively searched for particular genes, gene families, and conserved noncoding elements in the sequenced genome of the cepha...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.073676.107

    authors: Holland LZ,Albalat R,Azumi K,Benito-Gutiérrez E,Blow MJ,Bronner-Fraser M,Brunet F,Butts T,Candiani S,Dishaw LJ,Ferrier DE,Garcia-Fernàndez J,Gibson-Brown JJ,Gissi C,Godzik A,Hallböök F,Hirose D,Hosomichi K,Ikuta T,I

    update_date:2008-07-01 00:00:00

  • Arabidopsis thaliana centromere regions: genetic map positions and repetitive DNA structure.

    abstract::The genetic positions of the five Arabidopsis thaliana centromere regions have been identified by mapping size polymorphisms in the centromeric 180-bp repeat arrays. Structural and genetic analysis indicates that 180-bp repeat arrays of up to 1000 kb are found in the centromere region of each chromosome. The genetic b...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.7.11.1045

    authors: Round EK,Flowers SK,Richards EJ

    update_date:1997-11-01 00:00:00

  • Phenotypic diversity and genotypic flexibility of Burkholderia cenocepacia during long-term chronic infection of cystic fibrosis lungs.

    abstract::Chronic bacterial infections of the lung are the leading cause of morbidity and mortality in cystic fibrosis patients. Tracking bacterial evolution during chronic infections can provide insights into how host selection pressures, including immune responses and therapeutic interventions, shape bacterial genomes. We carri...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.213363.116

    authors: Lee AH,Flibotte S,Sinha S,Paiero A,Ehrlich RL,Balashov S,Ehrlich GD,Zlosnik JE,Mell JC,Nislow C

    update_date:2017-04-01 00:00:00

  • A dynamic H3K27ac signature identifies VEGFA-stimulated endothelial enhancers and requires EP300 activity.

    abstract::Histone modifications are now well-established mediators of transcriptional programs that distinguish cell states. However, the kinetics of histone modification and their role in mediating rapid, signal-responsive gene expression changes has been little studied on a genome-wide scale. Vascular endothelial growth facto...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.149674.112

    authors: Zhang B,Day DS,Ho JW,Song L,Cao J,Christodoulou D,Seidman JG,Crawford GE,Park PJ,Pu WT

    update_date:2013-06-01 00:00:00

  • Schizosaccharomyces pombe essential genes: a pilot study.

    abstract::After completion of the Schizosaccharomyces pombe genome sequence, we have carried out a pilot gene deletion project to assess the feasibility of a genome-wide deletion project and to estimate the percentage of essential genes. Using a PCR-based gene deletion procedure, we investigated 100 genes within a 253-kb region...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.636103

    authors: Decottignies A,Sanchez-Perez I,Nurse P

    update_date:2003-03-01 00:00:00

  • Mapping the pericentric heterochromatin by comparative genomic hybridization analysis and chromosome deletions in Drosophila melanogaster.

    abstract::Heterochromatin represents a significant portion of eukaryotic genomes and has essential structural and regulatory functions. Its molecular organization is largely unknown due to difficulties in sequencing through and assembling repetitive sequences enriched in the heterochromatin. Here we developed a novel strategy u...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.137406.112

    authors: He B,Caudy A,Parsons L,Rosebrock A,Pane A,Raj S,Wieschaus E

    update_date:2012-12-01 00:00:00