Sequences of 95 human MHC haplotypes reveal extreme coding variation in genes other than highly polymorphic HLA class I and II.

Abstract:

The most polymorphic part of the human genome, the MHC, encodes over 160 proteins of diverse function. Half of them, including the HLA class I and II genes, are directly involved in immune responses. Consequently, the MHC region strongly associates with numerous diseases and clinical therapies. Notoriously, the MHC region has been intractable to high-throughput analysis at complete sequence resolution, and current reference haplotypes are inadequate for large-scale studies. To address these challenges, we developed a method that specifically captures and sequences the 4.8-Mbp MHC region from genomic DNA. For 95 MHC homozygous cell lines we assembled, de novo, a set of high-fidelity contigs and a sequence scaffold, representing a mean 98% of the target region. Included are six alternative MHC reference sequences of the human genome that we completed and refined. Characterization of the sequence and structural diversity of the MHC region shows the approach accurately determines the sequences of the highly polymorphic HLA class I and HLA class II genes and the complex structural diversity of complement factor C4A/C4B. It has also uncovered extensive and unexpected diversity in other MHC genes; an example is MUC22, which encodes a lung mucin and exhibits more coding sequence alleles than any HLA class I or II gene studied here. More than 60% of the coding sequence alleles analyzed were previously uncharacterized. We have created a substantial database of robust reference MHC haplotype sequences that will enable future population-scale studies of this complicated and clinically important region of the human genome.

journal_name

Genome Res

journal_title

Genome research

authors

Norman PJ,Norberg SJ,Guethlein LA,Nemat-Gorgani N,Royce T,Wroblewski EE,Dunn T,Mann T,Alicata C,Hollenbach JA,Chang W,Shults Won M,Gunderson KL,Abi-Rached L,Ronaghi M,Parham P

doi

10.1101/gr.213538.116

subject

Has Abstract

pub_date

2017-05-01 00:00:00

pages

813-823

issue

5

eissn

1549-5469

issn

1088-9051

pii

gr.213538.116

journal_volume

27

pub_type

Journal Article
  • The ClinSeq Project: piloting large-scale genome sequencing for research in genomic medicine.

    abstract::ClinSeq is a pilot project to investigate the use of whole-genome sequencing as a tool for clinical research. By piloting the acquisition of large amounts of DNA sequence data from individual human subjects, we are fostering the development of hypothesis-generating approaches for performing research in genomic medicin...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.092841.109

    authors: Biesecker LG,Mullikin JC,Facio FM,Turner C,Cherukuri PF,Blakesley RW,Bouffard GG,Chines PS,Cruz P,Hansen NF,Teer JK,Maskeri B,Young AC,NISC Comparative Sequencing Program.,Manolio TA,Wilson AF,Finkel T,Hwang P,Arai A

    update_date:2009-09-01 00:00:00

  • Impact of genomics on research in the rat.

    abstract::The need to translate genes to function has positioned the rat as an invaluable animal model for genomic research. The significant increase in genomic resources in recent years has had an immediate functional application in the rat. Many of the resources for translational research are already in place and are ready to...

    journal_title:Genome research

    pub_type: Journal Article, Review

    doi:10.1101/gr.3744005

    authors: Lazar J,Moreno C,Jacob HJ,Kwitek AE

    update_date:2005-12-01 00:00:00

  • Discovery of regulatory elements by a computational method for phylogenetic footprinting.

    abstract::Phylogenetic footprinting is a method for the discovery of regulatory elements in a set of orthologous regulatory regions from multiple species. It does so by identifying the best conserved motifs in those orthologous regions. We describe a computer algorithm designed specifically for this purpose, making use of the p...

    journal_title:Genome research

    pub_type: Letter

    doi:10.1101/gr.6902

    authors: Blanchette M,Tompa M

    update_date:2002-05-01 00:00:00

  • Estimating population genetic parameters and comparing model goodness-of-fit using DNA sequences with error.

    abstract::It is known that sequencing error can bias estimation of evolutionary or population genetic parameters. This problem is more prominent in deep resequencing studies because of their large sample size n, and a higher probability of error at each nucleotide site. We propose a new method based on the composite likelihood ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.097543.109

    authors: Liu X,Fu YX,Maxwell TJ,Boerwinkle E

    update_date:2010-01-01 00:00:00

  • Broad-spectrum respiratory tract pathogen identification using resequencing DNA microarrays.

    abstract::The exponential growth of pathogen nucleic acid sequences available in public domain databases has invited their direct use in pathogen detection, identification, and surveillance strategies. DNA microarray technology has offered the potential for the direct DNA sequence analysis of a broad spectrum of pathogens of in...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.4337206

    authors: Lin B,Wang Z,Vora GJ,Thornton JA,Schnur JM,Thach DC,Blaney KM,Ligler AG,Malanoski AP,Santiago J,Walter EA,Agan BK,Metzgar D,Seto D,Daum LT,Kruzelock R,Rowley RK,Hanson EH,Tibbetts C,Stenger DA

    update_date:2006-04-01 00:00:00

  • Copy number and targeted mutational analysis reveals novel somatic events in metastatic prostate tumors.

    abstract::Advanced prostate cancer can progress to systemic metastatic tumors, which are generally androgen insensitive and ultimately lethal. Here, we report a comprehensive genomic survey for somatic events in systemic metastatic prostate tumors using both high-resolution copy number analysis and targeted mutational survey of...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.107961.110

    authors: Robbins CM,Tembe WA,Baker A,Sinari S,Moses TY,Beckstrom-Sternberg S,Beckstrom-Sternberg J,Barrett M,Long J,Chinnaiyan A,Lowey J,Suh E,Pearson JV,Craig DW,Agus DB,Pienta KJ,Carpten JD

    update_date:2011-01-01 00:00:00

  • Predicting deleterious amino acid substitutions.

    abstract::Many missense substitutions are identified in single nucleotide polymorphism (SNP) data and large-scale random mutagenesis projects. Each amino acid substitution potentially affects protein function. We have constructed a tool that uses sequence homology to predict whether a substitution affects protein function. SIFT...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.176601

    authors: Ng PC,Henikoff S

    update_date:2001-05-01 00:00:00

  • Detecting ancient positive selection in humans using extended lineage sorting.

    abstract::Natural selection that affected modern humans early in their evolution has likely shaped some of the traits that set present-day humans apart from their closest extinct and living relatives. The ability to detect ancient natural selection in the human genome could provide insights into the molecular basis for these hu...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.219493.116

    authors: Peyrégne S,Boyle MJ,Dannemann M,Prüfer K

    update_date:2017-09-01 00:00:00

  • Widespread genome duplications throughout the history of flowering plants.

    abstract::Genomic comparisons provide evidence for ancient genome-wide duplications in a diverse array of animals and plants. We developed a birth-death model to identify evidence for genome duplication in EST data, and applied a mixture model to estimate the age distribution of paralogous pairs identified in EST sets for speci...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.4825606

    authors: Cui L,Wall PK,Leebens-Mack JH,Lindsay BG,Soltis DE,Doyle JJ,Soltis PS,Carlson JE,Arumuganathan K,Barakat A,Albert VA,Ma H,dePamphilis CW

    update_date:2006-06-01 00:00:00

  • Nonrandom domain organization of the Arabidopsis genome at the nuclear periphery.

    abstract::The nuclear space is not a homogeneous biochemical environment. Many studies have demonstrated that the transcriptional activity of a gene is linked to its positioning within the nuclear space. Following the discovery of lamin-associated domains (LADs), which are transcriptionally repressed chromatin regions, the nonr...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.215186.116

    authors: Bi X,Cheng YJ,Hu B,Ma X,Wu R,Wang JW,Liu C

    update_date:2017-07-01 00:00:00

  • Comparative genomics of the Archaea (Euryarchaeota): evolution of conserved protein families, the stable core, and the variable shell.

    abstract::Comparative analysis of the protein sequences encoded in the four euryarchaeal species whose genomes have been sequenced completely (Methanococcus jannaschii, Methanobacterium thermoautotrophicum, Archaeoglobus fulgidus, and Pyrococcus horikoshii) revealed 1326 orthologous sets, of which 543 are represented in all fou...

    journal_title:Genome research

    pub_type: Journal Article

    doi:

    authors: Makarova KS,Aravind L,Galperin MY,Grishin NV,Tatusov RL,Wolf YI,Koonin EV

    update_date:1999-07-01 00:00:00

  • Parallel radiation hybrid mapping: a powerful tool for high-resolution genomic comparison.

    abstract::Comparative gene mapping in mammals typically involves identification of segments of conserved synteny in diverse genomes. The development of maps that permit comparison of gene order within conserved synteny has not advanced beyond the mouse map that takes advantage of linkage analysis in interspecific backcrosses. R...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.8.7.731

    authors: Yang YP,Womack JE

    update_date:1998-07-01 00:00:00

  • Conservation, regulation, synteny, and introns in a large-scale C. briggsae-C. elegans genomic alignment.

    abstract::A new algorithm, WABA, was developed for doing large-scale alignments between genomic DNA of different species. WABA was used to align 8 million bases of Caenorhabditis briggsae genomic DNA against the entire 97-million-base Caenorhabditis elegans genome. The alignment, including C. briggsae homologs of 154 geneticall...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.10.8.1115

    authors: Kent WJ,Zahler AM

    update_date:2000-08-01 00:00:00

  • Copy number variation at the breakpoint region of isochromosome 17q.

    abstract::Isochromosome 17q, or i(17q), is one of the most frequent nonrandom changes occurring in human neoplasia. Most of the i(17q) breakpoints cluster within an approximately 240-kb interval located in the Smith-Magenis syndrome common deletion region in 17p11.2. The breakpoint cluster region is characterized by a complex ar...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.080697.108

    authors: Carvalho CM,Lupski JR

    update_date:2008-11-01 00:00:00

  • Copy-number-aware differential analysis of quantitative DNA sequencing data.

    abstract::Developments in microarray and high-throughput sequencing (HTS) technologies have resulted in a rapid expansion of research into epigenomic changes that occur in normal development and in the progression of disease, such as cancer. Not surprisingly, copy number variation (CNV) has a direct effect on HTS read densities...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.139055.112

    authors: Robinson MD,Strbenac D,Stirzaker C,Statham AL,Song J,Speed TP,Clark SJ

    update_date:2012-12-01 00:00:00

  • Recent segmental duplications in the working draft assembly of the brown Norway rat.

    abstract::We assessed the content, structure, and distribution of segmental duplications (≥90% sequence identity, ≥5 kb length) within the published version of the Rattus norvegicus genome assembly (v.3.1). The overall fraction of duplicated sequence within the rat assembly (2.92%) is greater than that of the mouse (1...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.1907504

    authors: Tuzun E,Bailey JA,Eichler EE

    update_date:2004-04-01 00:00:00

  • Rapid comparative genomic analysis for clinical microbiology: the Francisella tularensis paradigm.

    abstract::It is critical to avoid delays in detecting strain manipulations, such as the addition/deletion of a gene or modification of genes for increased virulence or antibiotic resistance, using genome analysis during an epidemic outbreak or a bioterrorist attack. Our objective was to evaluate the efficiency of genome analysi...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.071266.107

    authors: La Scola B,Elkarkouri K,Li W,Wahab T,Fournous G,Rolain JM,Biswas S,Drancourt M,Robert C,Audic S,Löfdahl S,Raoult D

    update_date:2008-05-01 00:00:00

  • Bacillus subtilis during feast and famine: visualization of the overall regulation of protein synthesis during glucose starvation by proteome analysis.

    abstract::Dual channel imaging and warping of two-dimensional (2D) protein gels were used to visualize global changes of the gene expression patterns in growing Bacillus subtilis cells during entry into the stationary phase as triggered by glucose exhaustion. The 2D gels only depict single moments during the cells' growth cycle...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.905003

    authors: Bernhardt J,Weibezahn J,Scharf C,Hecker M

    update_date:2003-02-01 00:00:00

  • Core promoter T-blocks correlate with gene expression levels in C. elegans.

    abstract::Core promoters mediate transcription initiation by the integration of diverse regulatory signals encoded in the proximal promoter and enhancers. It has been suggested that genes under simple regulation may have low-complexity permissive promoters. For these genes, the core promoter may serve as the principal regulator...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.113381.110

    authors: Grishkevich V,Hashimshony T,Yanai I

    update_date:2011-05-01 00:00:00

  • Alternative approach to a heavy weight problem.

    abstract::Obesity is reaching epidemic proportions in developed countries and represents a significant risk factor for hypertension, heart disease, diabetes, and dyslipidemia. Splicing mutations constitute at least 14% of disease-causing mutations, thus implicating polymorphisms that affect splicing as likely candidates for dis...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.6661308

    authors: Goren A,Kim E,Amit M,Bochner R,Lev-Maor G,Ahituv N,Ast G

    update_date:2008-02-01 00:00:00

  • Discovery of high-confidence human protein-coding genes and exons by whole-genome PhyloCSF helps elucidate 118 GWAS loci.

    abstract::The most widely appreciated role of DNA is to encode protein, yet the exact portion of the human genome that is translated remains to be ascertained. We previously developed PhyloCSF, a widely used tool to identify evolutionary signatures of protein-coding regions using multispecies genome alignments. Here, we present...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.246462.118

    authors: Mudge JM,Jungreis I,Hunt T,Gonzalez JM,Wright JC,Kay M,Davidson C,Fitzgerald S,Seal R,Tweedie S,He L,Waterhouse RM,Li Y,Bruford E,Choudhary JS,Frankish A,Kellis M

    update_date:2019-12-01 00:00:00

  • Widespread somatic L1 retrotransposition occurs early during gastrointestinal cancer evolution.

    abstract::Somatic L1 retrotransposition events have been shown to occur in epithelial cancers. Here, we attempted to determine how early somatic L1 insertions occurred during the development of gastrointestinal (GI) cancers. Using L1-targeted resequencing (L1-seq), we studied different stages of four colorectal cancers arising ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.196238.115

    authors: Ewing AD,Gacita A,Wood LD,Ma F,Xing D,Kim MS,Manda SS,Abril G,Pereira G,Makohon-Moore A,Looijenga LH,Gillis AJ,Hruban RH,Anders RA,Romans KE,Pandey A,Iacobuzio-Donahue CA,Vogelstein B,Kinzler KW,Kazazian HH Jr,Sol

    update_date:2015-10-01 00:00:00

  • Identification of protein features encoded by alternative exons using Exon Ontology.

    abstract::Transcriptomic genome-wide analyses demonstrate massive variation of alternative splicing in many physiological and pathological situations. One major challenge is now to establish the biological contribution of alternative splicing variation in physiological- or pathological-associated cellular phenotypes. Toward thi...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.212696.116

    authors: Tranchevent LC,Aubé F,Dulaurier L,Benoit-Pilven C,Rey A,Poret A,Chautard E,Mortada H,Desmet FO,Chakrama FZ,Moreno-Garcia MA,Goillot E,Janczarski S,Mortreux F,Bourgeois CF,Auboeuf D

    update_date:2017-06-01 00:00:00

  • Evolution of transcript modification by N6-methyladenosine in primates.

    abstract::Phenotypic differences within populations and between closely related species are often driven by variation and evolution of gene expression. However, most analyses have focused on the effects of genomic variation at cis-regulatory elements such as promoters and enhancers that control transcriptional activity, and lit...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.212563.116

    authors: Ma L,Zhao B,Chen K,Thomas A,Tuteja JH,He X,He C,White KP

    update_date:2017-03-01 00:00:00

  • Distinct contributions of DNA methylation and histone acetylation to the genomic occupancy of transcription factors.

    abstract::Epigenetic modifications on chromatin play important roles in regulating gene expression. Although chromatin states are often governed by multilayered structure, how individual pathways contribute to gene expression remains poorly understood. For example, DNA methylation is known to regulate transcription factor bindi...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.257576.119

    authors: Cusack M,King HW,Spingardi P,Kessler BM,Klose RJ,Kriaucionis S

    update_date:2020-10-01 00:00:00

  • Massive reshaping of genome-nuclear lamina interactions during oncogene-induced senescence.

    abstract::Cellular senescence is a mechanism that virtually irreversibly suppresses the proliferative capacity of cells in response to various stress signals. This includes the expression of activated oncogenes, which causes Oncogene-Induced Senescence (OIS). A body of evidence points to the involvement in OIS of chromatin reor...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.225763.117

    authors: Lenain C,de Graaf CA,Pagie L,Visser NL,de Haas M,de Vries SS,Peric-Hupkes D,van Steensel B,Peeper DS

    update_date:2017-10-01 00:00:00

  • Systematic recovery and analysis of full-ORF human cDNA clones.

    abstract::The Mammalian Gene Collection (MGC) consortium (http://mgc.nci.nih.gov) seeks to establish publicly available collections of full-ORF cDNAs for several organisms of significance to biomedical research, including human. To date over 15,200 human cDNA clones containing full-length open reading frames (ORFs) have been id...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.2473704

    authors: Baross A,Butterfield YS,Coughlin SM,Zeng T,Griffith M,Griffith OL,Petrescu AS,Smailus DE,Khattra J,McDonald HL,McKay SJ,Moksa M,Holt RA,Marra MA

    update_date:2004-10-01 00:00:00

  • Integrative functional genomics identifies an enhancer looping to the SOX9 gene disrupted by the 17q24.3 prostate cancer risk locus.

    abstract::Genome-wide association studies (GWAS) are identifying genetic predisposition to various diseases. The 17q24.3 locus harbors the single nucleotide polymorphism (SNP) rs1859962 that is statistically associated with prostate cancer (PCa). It defines a 130-kb linkage disequilibrium (LD) block that lies in an ∼2-Mb gene d...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.135665.111

    authors: Zhang X,Cowper-Sal·lari R,Bailey SD,Moore JH,Lupien M

    update_date:2012-08-01 00:00:00

  • Gene amplification as double minutes or homogeneously staining regions in solid tumors: origin and structure.

    abstract::Double minutes (dmin) and homogeneously staining regions (hsr) are the cytogenetic hallmarks of genomic amplification in cancer. Different mechanisms have been proposed to explain their genesis. Recently, our group showed that the MYC-containing dmin in leukemia cases arise by excision and amplification (episome model...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.106252.110

    authors: Storlazzi CT,Lonoce A,Guastadisegni MC,Trombetta D,D'Addabbo P,Daniele G,L'Abbate A,Macchia G,Surace C,Kok K,Ullmann R,Purgato S,Palumbo O,Carella M,Ambros PF,Rocchi M

    update_date:2010-09-01 00:00:00

  • Unamplified cap analysis of gene expression on a single-molecule sequencer.

    abstract::We report the development of a simplified cap analysis of gene expression (CAGE) protocol adapted for single-molecule sequencers that avoids second strand synthesis, ligation, digestion, and PCR. HeliScopeCAGE directly sequences the 3' end of cap trapped first-strand cDNAs. As with previous versions of CAGE, we better...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.115469.110

    authors: Kanamori-Katayama M,Itoh M,Kawaji H,Lassmann T,Katayama S,Kojima M,Bertin N,Kaiho A,Ninomiya N,Daub CO,Carninci P,Forrest AR,Hayashizaki Y

    update_date:2011-07-01 00:00:00