Accounting for population stratification in DNA methylation studies.

Abstract:

DNA methylation is an important epigenetic mechanism that has been linked to complex diseases and is of great interest to researchers as a potential link between genome, environment, and disease. As the scale of DNA methylation association studies approaches that of genome-wide association studies, issues such as population stratification will need to be addressed. It is well-documented that failure to adjust for population stratification can lead to false positives in genetic association studies, but population stratification is often unaccounted for in DNA methylation studies. Here, we propose several approaches to correct for population stratification using principal components (PCs) from different subsets of genome-wide methylation data. We first illustrate the potential for confounding due to population stratification by demonstrating widespread associations between DNA methylation and race in 388 individuals (365 African American and 23 Caucasian). We subsequently evaluate the performance of our PC-based approaches and other methods in adjusting for confounding due to population stratification. Our simulations show that (1) all of the methods considered are effective at removing inflation due to population stratification, and (2) maximum power can be obtained with single-nucleotide polymorphism (SNP)-based PCs, followed by methylation-based PCs, which outperform both surrogate variable analysis and genomic control. Among our different approaches to computing methylation-based PCs, we find that PCs based on CpG sites chosen for their potential to proxy nearby SNPs can provide a powerful and computationally efficient approach to adjust for population stratification in DNA methylation studies when genome-wide SNP data are unavailable.
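
The PC-based adjustment described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the simulated data, the function names `top_pcs` and `adjusted_effect`, the choice of two PCs, and all effect sizes are assumptions made purely for illustration. The sketch computes PCs from a methylation matrix and includes them as covariates when testing one CpG site against a phenotype that is driven by ancestry alone.

```python
import numpy as np


def top_pcs(meth, k):
    """Top-k principal component scores (samples x k) of a samples x CpG matrix."""
    centered = meth - meth.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :k] * s[:k]


def adjusted_effect(y, x, covars):
    """OLS coefficient of x on y with an intercept and the given covariates."""
    design = np.column_stack([np.ones(len(y)), x, covars])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]


# Simulated data (assumption): two ancestry groups, 400 samples, 200 CpG sites.
rng = np.random.default_rng(0)
n, m = 400, 200
ancestry = rng.integers(0, 2, size=n)

# Each CpG site is shifted by ancestry, mimicking ancestry-associated methylation.
shifts = rng.normal(0.0, 1.0, size=m)
meth = rng.normal(0.0, 1.0, size=(n, m)) + np.outer(ancestry, shifts)

# The phenotype depends on ancestry only, so any CpG association is confounding.
pheno = 2.0 * ancestry + rng.normal(0.0, 1.0, size=n)

pcs = top_pcs(meth, k=2)

# Test the most ancestry-associated site with and without PC adjustment.
site = int(np.argmax(np.abs(shifts)))
naive = adjusted_effect(pheno, meth[:, site], np.empty((n, 0)))
adjusted = adjusted_effect(pheno, meth[:, site], pcs)

print(f"unadjusted effect: {naive:.3f}")
print(f"PC-adjusted effect: {adjusted:.3f}")
```

In this toy setting the first PC tracks the ancestry label closely, so including the PCs as covariates shrinks the spurious CpG effect toward zero. Real analyses would also need to choose the CpG subset and the number of PCs, which the abstract discusses only at a high level.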

journal_name

Genet Epidemiol

journal_title

Genetic epidemiology

authors

Barfield RT,Almli LM,Kilaru V,Smith AK,Mercer KB,Duncan R,Klengel T,Mehta D,Binder EB,Epstein MP,Ressler KJ,Conneely KN

doi

10.1002/gepi.21789

subject

Has Abstract

pub_date

2014-04-01 00:00:00

pages

231-41

issue

3

eissn

1098-2272

issn

0741-0395

journal_volume

38

pub_type

Journal Article
  • A flexible and parallelizable approach to genome-wide polygenic risk scores.

    abstract::The heritability of most complex traits is driven by variants throughout the genome. Consequently, polygenic risk scores, which combine information on multiple variants genome-wide, have demonstrated improved accuracy in genetic risk prediction. We present a new two-step approach to constructing genome-wide polygenic ...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.22245

    authors: Newcombe PJ,Nelson CP,Samani NJ,Dudbridge F

    update_date:2019-10-01 00:00:00

  • The insulin gene and susceptibility to IDDM.

    abstract::The association between insulin-dependent diabetes mellitus (IDDM) and an allele of a restriction fragment length polymorphism (RFLP) 5' to the coding region of the insulin gene has raised the possibility that variation in the vicinity of the insulin gene confers susceptibility to IDDM. To test this hypothesis, the di...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.1370060113

    authors: Cox NJ,Spielman RS

    update_date:1989-01-01 00:00:00

  • Improving estimates of genetic maps: a meta-analysis-based approach.

    abstract::Inaccurate genetic (or linkage) maps can reduce the power to detect linkage, increase type I error, and distort haplotype and relationship inference. To improve the accuracy of existing maps, I propose a meta-analysis-based method that combines independent map estimates into a single estimate of the linkage map. The m...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.20221

    authors: Stewart WC

    update_date:2007-07-01 00:00:00

  • Analysis of two-locus traits under heterogeneity for recessive versus dominant inheritance.

    abstract::Complex traits have been modeled under various modes of two-locus inheritance. One example of a two-locus threshold model is the situation where an individual is susceptible to a disease trait if he or she carries three or more disease alleles. Under this model, if each locus is examined individually the inheritance a...

    journal_title:Genetic epidemiology

    pub_type: Clinical Trial, Journal Article, Randomized Controlled Trial

    doi:10.1002/(SICI)1098-2272(1997)14:6<1097::AID-GEPI89

    authors: Leal SM,Ott J

    update_date:1997-01-01 00:00:00

  • Robust inference for variance components models in families ascertained through probands: I. Conditioning on proband's phenotype.

    abstract::A robust approach for estimating standard errors of variance components by using quantitative phenotypes from families ascertained through a proband with an extreme phenotypic value is presented. Estimators that use the multivariate normal distribution as a "working likelihood" are obtained by computing conditional ln...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.1370040305

    authors: Beaty TH,Liang KY

    update_date:1987-01-01 00:00:00

  • Bayesian variable and model selection methods for genetic association studies.

    abstract::Variable selection is growing in importance with the advent of high throughput genotyping methods requiring analysis of hundreds to thousands of single nucleotide polymorphisms (SNPs) and the increased interest in using these genetic studies to better understand common, complex diseases. Up to now, the standard approa...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.20353

    authors: Fridley BL

    update_date:2009-01-01 00:00:00

  • Use of variable marker density, principal components, and neural networks in the dissection of disease etiology.

    abstract::Several approaches were taken to identify the loci contributing to the quantitative and qualitative phenotypes in the Genetic Analysis Workshop 12 simulated data set. To identify possible quantitative trait loci (QTL), the quantitative traits were analyzed using SOLAR. The four replicates identified as the "best repli...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.2001.21.s1.s732

    authors: Pankratz N,Kirkwood SC,Flury L,Koller DL,Foroud T

    update_date:2001-01-01 00:00:00

  • Association mapping, using a mixture model for complex traits.

    abstract::Association mapping for complex diseases using unrelated individuals can be more powerful than family-based analysis in many settings. In addition, this approach has major practical advantages, including greater efficiency in sample recruitment. Association mapping may lead to false-positive findings, however, if popu...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.210

    authors: Zhu X,Zhang S,Zhao H,Cooper RS

    update_date:2002-08-01 00:00:00

  • Detecting epistatic interactions contributing to quantitative traits.

    abstract::The restricted partition method (RPM) is a partitioning algorithm for examining multi-locus genotypes as (potentially non-additive) predictors of a quantitative trait. The motivating application was to develop a robust method to examine quantitative phenotypes for epistasis (gene-gene interactions), but the method can...

    journal_title:Genetic epidemiology

    pub_type: Journal Article, Review

    doi:10.1002/gepi.20006

    authors: Culverhouse R,Klein T,Shannon W

    update_date:2004-09-01 00:00:00

  • A general autosomal/X-linked model.

    abstract::This paper describes a general genetic model which encompasses both autosomal and X-linked inheritance as submodels. It allows one to test for X-linked inheritance of a trait by comparing the likelihood of X-linked inheritance to the likelihood of the general genetic model. The general model is formulated as two loci,...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.1370010105

    authors: Hasstedt SJ,Skolnick M

    update_date:1984-01-01 00:00:00

  • Statistical considerations for the analysis of massively parallel reporter assays data.

    abstract::Noncoding DNA contains gene regulatory elements that alter gene expression, and the function of these elements can be modified by genetic variation. Massively parallel reporter assays (MPRA) enable high-throughput identification and characterization of functional genetic variants, but the statistical methods to identi...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.22337

    authors: Qiao D,Zigler CM,Cho MH,Silverman EK,Zhou X,Castaldi PJ,Laird NH

    update_date:2020-10-01 00:00:00

  • A novel association test for multiple secondary phenotypes from a case-control GWAS.

    abstract::In the past decade, many genome-wide association studies (GWASs) have been conducted to explore association of single nucleotide polymorphisms (SNPs) with complex diseases using a case-control design. These GWASs not only collect information on the disease status (primary phenotype, D) and the SNPs (genotypes, X), but...

    journal_title:Genetic epidemiology

    pub_type: Journal Article, Randomized Controlled Trial

    doi:10.1002/gepi.22045

    authors: Ray D,Basu S

    update_date:2017-07-01 00:00:00

  • Efficient strategy for detecting gene × gene joint action and its application in schizophrenia.

    abstract::We propose a new approach to detect gene × gene joint action in genome-wide association studies (GWASs) for case-control designs. This approach offers an exhaustive search for all two-way joint action (including, as a special case, single gene action) that is computationally feasible at the genome-wide level and has r...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.21779

    authors: Won S,Kwon MS,Mattheisen M,Park S,Park C,Kihara D,Cichon S,Ophoff R,Nöthen MM,Rietschel M,Baur M,Uitterlinden AG,Hofmann A,GROUP Investigators.,Lange C

    update_date:2014-01-01 00:00:00

  • Genome-wide detection and characterization of mating asymmetry in human populations.

    abstract::The study of the genetic component of early-onset diseases requires investigation into parental genetic effects, particularly those mediated by the mother who can influence the offspring's risk of disease through the effects of her genes acting directly on the intrauterine milieu or indirectly through maternal-gene ch...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.20602

    authors: Bourgey M,Healy J,Saint-Onge P,Massé H,Sinnett D,Roy-Gagnon MH

    update_date:2011-09-01 00:00:00

  • Sib-pair linkage tests for disease susceptibility loci: common tests vs. the asymptotically most powerful test.

    abstract::Several statistical tests for linkage between a disease susceptibility locus and a marker locus for sib-pair data are examined analytically. Two common statistics, a test based on the mean number of marker alleles shared identical by descent by sib-pairs, and a test based on the proportion of sib-pairs sharing exactly...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.1370070506

    authors: Schaid DJ,Nick TG

    update_date:1990-01-01 00:00:00

  • Path analysis under generalized marital resemblance: evaluation of the assumptions underlying the mixed homogamy model by the Monte Carlo method.

    abstract::Path analysis of nuclear family data has been widely applied to resolve genetic and environmental sources of familial resemblance. Here we report the results of a systematic evaluation of the effects of departures from five modeling assumptions often made when analyzing nuclear family data; i) the observed environment...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.1370060207

    authors: McGue M,Wette R,Rao DC

    update_date:1989-01-01 00:00:00

  • Familial resemblance of bone mass in adult women.

    abstract::Bone mass may be so reduced in some individuals as to be characterized as osteoporotic, with resulting fracture, particularly of the proximal femur, vertebrae, or wrist. We identified 34 mother-daughter sets (n = 70) and 29 sibling sets (n = 59) from a community study of bone mass correlates to assess the degree of re...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.1370030204

    authors: Sowers MR,Burns TL,Wallace RB

    update_date:1986-01-01 00:00:00

  • Parental transmission and D18S37 allele sharing in bipolar affective disorder.

    abstract::We combined the five chromosome 18 bipolar affective disorder data sets provided by GAW10, totaling 185 families with 3,394 individuals, and performed analysis of differential parental transmission and chromosome 18 marker allele sharing in families with transmission through fathers vs those through mothers. Results i...

    journal_title:Genetic epidemiology

    pub_type: Clinical Trial, Journal Article

    doi:10.1002/(SICI)1098-2272(1997)14:6<665::AID-GEPI19>

    authors: Lin JP,Bale SJ

    update_date:1997-01-01 00:00:00

  • Linkage analysis in alcohol dependence.

    abstract::Alcohol dependence often is a familial disorder and has a genetic component. Research in causative factors of alcoholism is coordinated by a multi-center program, COGA [The Collaborative Study on the Genetics of Alcoholism, Begleiter et al., 1995]. We analyzed a subset of the COGA family sample, 84 pedigrees of Caucas...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.1370170768

    authors: Windemuth C,Hahn A,Strauch K,Baur MP,Wienker TF

    update_date:1999-01-01 00:00:00

  • Heritability analysis of nontraditional glycemic biomarkers in the Atherosclerosis Risk in Communities Study.

    abstract::Nontraditional glycemic biomarkers, including fructosamine, glycated albumin, and 1,5-anhydroglucitol (1,5-AG) are potential alternatives or complement to traditional measures of hyperglycemia. Genetic variants are associated with these biomarkers, but the heritability, or extent to which genetics control their variat...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.22243

    authors: Loomis SJ,Tin A,Coresh J,Boerwinkle E,Pankow JS,Köttgen A,Selvin E,Duggal P

    update_date:2019-10-01 00:00:00

  • Power of the linkage test for a heterogeneous disorder due to two independent inherited causes: a simulation study.

    abstract::We have conducted a simulation study in small pedigrees to investigate the power to detect linkage and heterogeneity for a disorder due to either one of two independent disease loci. We have considered a highly polymorphic marker locus (PIC = 70%) linked to one disease locus and unlinked to the second. The power to de...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.1370070306

    authors: Martinez M,Goldin LR

    update_date:1990-01-01 00:00:00

  • Relevance of the genes for bone mass variation to susceptibility to osteoporotic fractures and its implications to gene search for complex human diseases.

    abstract::We investigate the relevance of the genetic determination of bone mineral density (BMD) variation to that of differential risk to osteoporotic fractures (OF). The high heritability (h(2)) of BMD and the significant phenotypic correlations between high BMD and low risk to OF are well known. Little is reported on h(2) f...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.1040

    authors: Deng HW,Mahaney MC,Williams JT,Li J,Conway T,Davies KM,Li JL,Deng H,Recker RR

    update_date:2002-01-01 00:00:00

  • Using single nucleotide polymorphisms to investigate association between a candidate gene and disease.

    abstract::A range of study designs, using unrelated or family controls, were used to investigate the pattern of association with disease of single nucleotide polymorphisms (SNPs) within candidate gene 1 (simulated data). Strong evidence of disease association at the functional locus was detected using all study designs, and in ...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.2001.21.s1.s415

    authors: Saunders CL,Crockford GP,Bishop DT,Barrett JH

    update_date:2001-01-01 00:00:00

  • Method for calculating risk associated with family history of a disease.

    abstract::A method is described for estimating excess relative risks of a disease from familial factors. Beginning with population-based series of cases and controls, a cohort of each subject's relatives is formed and checked for disease against a population based registry. The disease experience of the cohort formed from each ...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.1370120306

    authors: Kerber RA

    update_date:1995-01-01 00:00:00

  • Haplotype sharing analysis in affected individuals from nuclear families with at least one affected offspring.

    abstract::In diseases with a complex mode of inheritance, families with multiple affected individuals are difficult to ascertain. The haplotype sharing statistic (HSS) uses (hidden) co-ancestry between affected individuals from a founder population. These affected individuals will likely not only share the same mutation(s), but...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/(SICI)1098-2272(1997)14:6<915::AID-GEPI59>

    authors: Van der Meulen MA,te Meerman GJ

    update_date:1997-01-01 00:00:00

  • Identifying genetic interactions in genome-wide data using Bayesian networks.

    abstract::It is believed that interactions among genes (epistasis) may play an important role in susceptibility to common diseases (Moore and Williams [2002]. Ann Med 34:88-95; Ritchie et al. [2001]. Am J Hum Genet 69:138-147). To study the underlying genetic variants of diseases, genome-wide association studies (GWAS) that sim...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.20514

    authors: Jiang X,Barmada MM,Visweswaran S

    update_date:2010-09-01 00:00:00

  • Generalization of the extended transmission disequilibrium test to two unlinked disease loci.

    abstract::The extended transmission disequilibrium test (ETDT) of Sham and Curtis [1995] is a powerful test of the null hypothesis of no linkage between a multi-allelic marker locus and a disease susceptibility locus of unknown location in the presence of association between alleles at the two loci. We propose a generalization ...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.13701707108

    authors: Morris A,Whittaker J

    update_date:1999-01-01 00:00:00

  • SNP selection in genome-wide and candidate gene studies via penalized logistic regression.

    abstract::Penalized regression methods offer an attractive alternative to single marker testing in genetic association analysis. Penalized regression methods shrink down to zero the coefficient of markers that have little apparent effect on the trait of interest, resulting in a parsimonious subset of what we hope are true perti...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.20543

    authors: Ayers KL,Cordell HJ

    update_date:2010-12-01 00:00:00

  • Exploiting pleiotropy to map genes for oligogenic phenotypes using extended pedigree data.

    abstract::We investigated the utility of two approaches for exploiting pleiotropy to search for genes influencing related traits. To do this we first assessed the genetic correlations among a set of five closely related quantitative traits (Q1, Q2, Q3, Q4, Q5). We then used the genetic correlations among these five traits both ...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/(SICI)1098-2272(1997)14:6<975::AID-GEPI69>

    authors: Comuzzie AG,Mahaney MC,Almasy L,Dyer TD,Blangero J

    update_date:1997-01-01 00:00:00

  • Identification of susceptibility loci contributing to a complex disease using conventional segregation, linkage, and association methods.

    abstract::We set out to apply conventional analytic methods to a GAW data set of nuclear families with an oligogenic disease that has a population prevalence of 0.023. We chose methods generally applied to disorders with at least one major gene. Our approaches included: 1) complex segregation analysis under two models of ascert...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.1370120613

    authors: Falk CT,Ashley A,Lamb N,Sherman SL

    update_date:1995-01-01 00:00:00