Genetic background comparison using distance-based regression, with applications in population stratification evaluation and adjustment.

Abstract:

Population stratification (PS) can lead to an inflated rate of false-positive findings in genome-wide association studies (GWAS). The commonly used approach of adjusting for a fixed number of principal components (PCs) can reduce power when the selected PCs are equally distributed in cases and controls, or when adjustment for certain covariates already included in the association analyses, such as self-identified ethnicity or recruitment center, correctly maps to the major axes of genetic heterogeneity. We propose a computationally efficient procedure, PC-Finder, to identify a minimal set of PCs while permitting an effective correction for PS. A general pseudo F statistic, derived from a non-parametric multivariate regression model, can be used to assess whether PS exists or has been adequately corrected by a set of selected PCs. Empirical data from two GWAS conducted as part of the Cancer Genetic Markers of Susceptibility (CGEMS) project demonstrate the application of the procedure. Furthermore, simulation studies show the power advantage of the proposed procedure in GWAS over currently used PS correction strategies, particularly when the PCs with substantial genetic variation are distributed similarly in cases and controls and therefore do not induce PS.
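The pseudo F statistic mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula, so the code below assumes the widely used distance-based regression form (Gower-centered distance matrix projected onto a design matrix) with a permutation p-value; it is not necessarily the exact statistic or the PC-Finder procedure from the paper. The function names `pseudo_f` and `permutation_p_value`, the Manhattan genotype distance, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np


def pseudo_f(dist, design):
    """Pseudo-F for distance-based regression (assumed form, not the paper's code)."""
    n = dist.shape[0]
    # Gower-centre the squared distances: G = C (-0.5 * D^2) C
    centre = np.eye(n) - np.ones((n, n)) / n
    g = centre @ (-0.5 * dist ** 2) @ centre
    # Design matrix with an intercept; hat matrix projects onto its columns
    x = np.column_stack([np.ones(n), design])
    hat = x @ np.linalg.pinv(x)
    m = np.linalg.matrix_rank(x)
    resid = np.eye(n) - hat
    explained = np.trace(hat @ g @ hat) / (m - 1)
    residual = np.trace(resid @ g @ resid) / (n - m)
    return explained / residual


def permutation_p_value(dist, design, n_perm=199, seed=0):
    """Permute the rows of the design to build a null distribution for pseudo-F."""
    rng = np.random.default_rng(seed)
    design = np.asarray(design)
    observed = pseudo_f(dist, design)
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(design.shape[0])
        if pseudo_f(dist, design[perm]) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy data (simulated, illustrative only): two subpopulations with different
# allele frequencies, and case status that tracks subpopulation membership.
rng = np.random.default_rng(1)
n_per_group, n_snps = 20, 50
freq_a = rng.uniform(0.1, 0.9, n_snps)
freq_b = np.clip(freq_a + rng.normal(0.0, 0.15, n_snps), 0.05, 0.95)
genotypes = np.vstack([
    rng.binomial(2, freq_a, size=(n_per_group, n_snps)),
    rng.binomial(2, freq_b, size=(n_per_group, n_snps)),
])
status = np.concatenate([
    rng.binomial(1, 0.2, n_per_group),  # mostly controls in subpopulation A
    rng.binomial(1, 0.8, n_per_group),  # mostly cases in subpopulation B
])
# Manhattan distance between genotype vectors (an assumed distance measure)
distance = np.abs(genotypes[:, None, :] - genotypes[None, :, :]).sum(axis=2)
f_stat, p_value = permutation_p_value(distance, status)
print(f"pseudo-F = {f_stat:.3f}, permutation p = {p_value:.3f}")
```

Under these assumptions, a small permutation p-value suggests that genetic background differs between cases and controls. Adding candidate PCs to the design and re-testing case status on the residual variation would be one way to check whether a chosen set of PCs has absorbed the stratification.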

journal_name

Genet Epidemiol

journal_title

Genetic epidemiology

authors

Li Q,Wacholder S,Hunter DJ,Hoover RN,Chanock S,Thomas G,Yu K

doi

10.1002/gepi.20396

subject

Has Abstract

pub_date

2009-07-01 00:00:00

pages

432-41

issue

5

eissn

1098-2272

issn

0741-0395

journal_volume

33

pub_type

Journal Article
  • Linkage disequilibrium structure and its impact on the localization of a candidate functional mutation.

    abstract::We have used the unblinded MG1/Q1 Genetic Analysis Workshop 12 simulated data as a model system for investigating the use of linkage disequilibrium structure and simple genotype-phenotype associations to identify candidate functional mutations within a gene of interest. Analysis of the pattern of pairwise linkage dise...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.2001.21.s1.s620

    authors: Huang Q,Morrison AC,Boerwinkle E

    updated: 2001-01-01 00:00:00

  • Estimation of a significance threshold for epigenome-wide association studies.

    abstract::Epigenome-wide association studies (EWAS) are designed to characterise population-level epigenetic differences across the genome and link them to disease. Most commonly, they assess DNA-methylation status at cytosine-guanine dinucleotide (CpG) sites, using platforms such as the Illumina 450k array that profile a subse...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.22086

    authors: Saffari A,Silver MJ,Zavattari P,Moi L,Columbano A,Meaburn EL,Dudbridge F

    updated: 2018-02-01 00:00:00

  • Score tests for familial correlation in genotyped-proband designs.

    abstract::In the genotyped-proband design, a proband is selected based on an observed phenotype, the genotype of the proband is observed, and then the phenotypes of all first-degree relatives are obtained. The genotypes of these first-degree relatives are not observed. Gail et al. [(1999) Genet Epidemiol] discuss likelihood ana...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/(SICI)1098-2272(200004)18:4<293::AID-GEPI3

    authors: Carroll RJ,Gail MH,Benichou J,Pee D

    updated: 2000-04-01 00:00:00

  • Allelic association patterns for a dense SNP map.

    abstract::A dense set of 5,000 SNPs on a 10-Mb region of human chromosome 20 has been typed on samples of African Americans, East Asians, and United Kingdom Caucasians. There are departures from Hardy-Weinberg equilibrium beyond the level at which markers are often discarded because of possible genotyping errors. The observatio...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.20038

    authors: Weir BS,Hill WG,Cardon LR,SNP Consortium.

    updated: 2004-12-01 00:00:00

  • Genetic prediction in the Genetic Analysis Workshop 18 sequencing data.

    abstract::High-throughput sequencing data can be used to predict phenotypes from genotypes, and this corresponds to establishing a prognostic model. In extended pedigrees the relatedness of subjects provides additional information so that genetic values, fixed or random genetic components, and heritability can be estimated. At ...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.21826

    authors: Ziegler A,Bohossian N,Diego VP,Yao C

    updated: 2014-09-01 00:00:00

  • Gene-dropping vs. empirical variance estimation for allele-sharing linkage statistics.

    abstract::In this study, we compare the statistical properties of a number of methods for estimating P-values for allele-sharing statistics in non-parametric linkage analysis. Some of the methods are based on the normality assumption, using different variance estimation methods, and others use simulation (gene-dropping) to find...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.20177

    authors: Jung J,Weeks DE,Feingold E

    updated: 2006-12-01 00:00:00

  • Kernel Approach for Modeling Interaction Effects in Genetic Association Studies of Complex Quantitative Traits.

    abstract::The etiology of complex traits likely involves the effects of genetic and environmental factors, along with complicated interaction effects between them. Consequently, there has been interest in applying genetic association tests of complex traits that account for potential modification of the genetic effect in the pr...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.21901

    authors: Broadaway KA,Duncan R,Conneely KN,Almli LM,Bradley B,Ressler KJ,Epstein MP

    updated: 2015-07-01 00:00:00

  • Genome-wide family-based linkage analysis of exome chip variants and cardiometabolic risk.

    abstract::Linkage analysis of complex traits has had limited success in identifying trait-influencing loci. Recently, coding variants have been implicated as the basis for some biomedical associations. We tested whether coding variants are the basis for linkage peaks of complex traits in 42 African-American (n = 596) and 90 His...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.21801

    authors: Hellwege JN,Palmer ND,Raffield LM,Ng MC,Hawkins GA,Long J,Lorenzo C,Norris JM,Ida Chen YD,Speliotes EK,Rotter JI,Langefeld CD,Wagenknecht LE,Bowden DW

    updated: 2014-05-01 00:00:00

  • Rare-variant association tests in longitudinal studies, with an application to the Multi-Ethnic Study of Atherosclerosis (MESA).

    abstract::Over the past few years, an increasing number of studies have identified rare variants that contribute to trait heritability. Due to the extreme rarity of some individual variants, gene-based association tests have been proposed to aggregate the genetic variants within a gene, pathway, or specific genomic region as op...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.22081

    authors: He Z,Lee S,Zhang M,Smith JA,Guo X,Palmas W,Kardia SLR,Ionita-Laza I,Mukherjee B

    updated: 2017-12-01 00:00:00

  • Effect of physical activity on lipid levels in a population-based sample of men with and without the Arg192 variant of the human paraoxonase gene.

    abstract::The prevalence of cardiovascular risk factors in Gerona, Spain, is high for the low myocardial infarction incidence and mortality rates in the province. Physical activity is a protective factor against coronary heart disease. We investigated whether the genetic variants Q and R of the paraoxonase Gln-Arg 192 polymorph...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/(SICI)1098-2272(200003)18:3<276::AID-GEPI6

    authors: Sentí M,Aubó C,Elosua R,Sala J,Tomás M,Marrugat J

    updated: 2000-03-01 00:00:00

  • Evaluation of path analysis through computer simulation: effect of incorrectly assuming independent distribution of familial correlations.

    abstract::Path analysis of family data has been widely applied to resolve genetic and environmental patterns of familial resemblance. A prevalent statistical approach in path analysis has been, first, to estimate the familial correlations and, second, by assuming these estimates to be independently distributed, define a likelih...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.1370010305

    authors: McGue M,Wette R,Rao DC

    updated: 1984-01-01 00:00:00

  • Multiethnic polygenic risk scores improve risk prediction in diverse populations.

    abstract::Methods for genetic risk prediction have been widely investigated in recent years. However, most available training data involves European samples, and it is currently unclear how to accurately predict disease risk in other populations. Previous studies have used either training data from European samples in large sam...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.22083

    authors: Márquez-Luna C,Loh PR,South Asian Type 2 Diabetes (SAT2D) Consortium.,SIGMA Type 2 Diabetes Consortium.,Price AL

    updated: 2017-12-01 00:00:00

  • Data mining and computationally intensive methods: summary of Group 7 contributions to Genetic Analysis Workshop 13.

    abstract::The Framingham Heart Study data, as well as a related simulated data set, were generously provided to the participants of the Genetic Analysis Workshop 13 in order that newly developed and emerging statistical methodologies could be tested on that well-characterized data set. The impetus driving the development of nov...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.10285

    authors: Costello TJ,Falk CT,Ye KQ

    updated: 2003-01-01 00:00:00

  • Variance component models for X-linked QTLs.

    abstract::This paper discusses the theory and implementation of a model for mapping X-linked quantitative trait loci (QTL). As a result of X inactivation, a female's body is subdivided into a number of patches. In each patch one of her two X chromosomes is randomly switched off. This smooths the allelic contributions in a heter...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.20158

    authors: Lange K,Sobel E

    updated: 2006-07-01 00:00:00

  • A novel association test for multiple secondary phenotypes from a case-control GWAS.

    abstract::In the past decade, many genome-wide association studies (GWASs) have been conducted to explore association of single nucleotide polymorphisms (SNPs) with complex diseases using a case-control design. These GWASs not only collect information on the disease status (primary phenotype, D) and the SNPs (genotypes, X), but...

    journal_title:Genetic epidemiology

    pub_type: Journal Article, Randomized Controlled Trial

    doi:10.1002/gepi.22045

    authors: Ray D,Basu S

    updated: 2017-07-01 00:00:00

  • APO B 3' HVR polymorphism in healthy population: relationships to serum lipid levels.

    abstract::We have analyzed allele frequency distribution at the hypervariable locus 3' to the apolipoprotein B gene in a healthy population sample (241 women and 246 men) from the Belgrade area. The bimodal distribution of sixteen different hypervariable region (HVR) alleles and the heterozygosity index (average 0.76) in both s...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/(SICI)1098-2272(1998)15:2<113::AID-GEPI1>3

    authors: Alavantić D,Glisić S,Kandić I

    updated: 1998-01-01 00:00:00

  • Ordered multinomial regression for genetic association analysis of ordinal phenotypes at Biobank scale.

    abstract::Logistic regression is the primary analysis tool for binary traits in genome-wide association studies (GWAS). Multinomial regression extends logistic regression to multiple categories. However, many phenotypes more naturally take ordered, discrete values. Examples include (a) subtypes defined from multiple sources of ...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.22276

    authors: German CA,Sinsheimer JS,Klimentidis YC,Zhou H,Zhou JJ

    updated: 2020-04-01 00:00:00

  • Improving power for rare-variant tests by integrating external controls.

    abstract::Due to the drop in sequencing cost, the number of sequenced genomes is increasing rapidly. To improve power of rare-variant tests, these sequenced samples could be used as external control samples in addition to control samples from the study itself. However, when using external controls, possible batch effects due to...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.22057

    authors: Lee S,Kim S,Fuchsberger C

    updated: 2017-11-01 00:00:00

  • Increased risk for familial ovarian cancer among Jewish women: a population-based case-control study.

    abstract::Jewish women have been reported to have a higher risk for familial breast cancer than non-Jewish women and to be more likely to carry mutations in breast cancer genes such as BRCA1. Because BRCA1 mutations also increase women's risk for ovarian cancer, we asked whether Jewish women are at higher risk for familial ovar...

    journal_title:Genetic epidemiology

    pub_type: Clinical Trial, Journal Article, Randomized Controlled Trial

    doi:10.1002/(SICI)1098-2272(1998)15:1<51::AID-GEPI4>3.

    authors: Steinberg KK,Pernarelli JM,Marcus M,Khoury MJ,Schildkraut JM,Marchbanks PA

    updated: 1998-01-01 00:00:00

  • Comparison of the QTDT analysis for IgE in the CSGA data set.

    abstract::Over the past few years at least 13 transmission/disequilibrium test (TDT)-based tests have been developed for quantitative (Q) traits for the assessment of association or linkage in the presence of the other. A total of six of these QTDT methods were used to analyze log10IgE in the Collaborative Study on the Genetics...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.2001.21.s1.s312

    authors: Page GP,Wilcox MA,Occhiuto J,Adak S,Neuberg D,Bajorunaite R,George V

    updated: 2001-01-01 00:00:00

  • Bayesian meta-analysis across genome-wide association studies of diverse phenotypes.

    abstract::Genome-wide association studies (GWAS) are a powerful tool for understanding the genetic basis of diseases and traits, but most studies have been conducted in isolation, with a focus on either a single or a set of closely related phenotypes. We describe MetABF, a simple Bayesian framework for performing integrative me...

    journal_title:Genetic epidemiology

    pub_type: Journal Article, Meta-Analysis

    doi:10.1002/gepi.22202

    authors: Trochet H,Pirinen M,Band G,Jostins L,McVean G,Spencer CCA

    updated: 2019-07-01 00:00:00

  • Scope and strategies of genetic epidemiology: analysis of articles published in Genetic Epidemiology, 1984-1991.

    abstract::Genetic epidemiology is a relatively new discipline that seeks to unravel the role of genetic factors and their interactions with environmental factors in the etiology of diseases, using population and family study approaches. To characterize the overall direction and emphasis of research strategies used in this field...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.1370100505

    authors: Khoury MJ,Beaty TH,Cohen BH

    updated: 1993-01-01 00:00:00

  • The impact of improved microarray coverage and larger sample sizes on future genome-wide association studies.

    abstract::Genome-wide association studies (GWAS) have identified many single nucleotide polymorphisms (SNPs) associated with complex traits. However, the genetic heritability of most of these traits remains unexplained. To help guide future studies, we address the crucial question of whether future GWAS can detect new SNP assoc...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.21724

    authors: Lindquist KJ,Jorgenson E,Hoffmann TJ,Witte JS

    updated: 2013-05-01 00:00:00

  • Trends in prenatal diagnosis of Down syndrome and other autosomal trisomies in Scotland 1990 to 1994, with associated cytogenetic and epidemiological findings.

    abstract::The present report summarizes findings on 670 cases of autosomal trisomy diagnosed in Scotland, with actual or expected dates of delivery in 1990 to 1994 inclusive. Cases were notified by cytogenetic service laboratories. There were 277 prenatal and 369 postnatal diagnoses and 24 spontaneous losses. Excluding the latt...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/(SICI)1098-2272(1999)16:2<179::AID-GEPI5>3

    authors: Carothers AD,Boyd E,Lowther G,Ellis PM,Couzin DA,Faed MJ,Robb A

    updated: 1999-01-01 00:00:00

  • Comparison of variance components, ANOVA and regression of offspring on midparent (ROMP) methods for SNP markers.

    abstract::An extension of the traditional regression of offspring on midparent (ROMP) method was used to estimate the heritability of the trait, test for marker association, and estimate the heritability attributable to a marker locus. The fifty replicates of the Genetic Analysis Workshop (GAW) 12 simulated general population d...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.2001.21.s1.s794

    authors: Pugh EW,Papanicolaou GJ,Justice CM,Roy-Gagnon MH,Sorant AJ,Kingman A,Wilson AF

    updated: 2001-01-01 00:00:00

  • Familial resemblance of bone mass in adult women.

    abstract::Bone mass may be so reduced in some individuals as to be characterized as osteoporotic, with resulting fracture, particularly of the proximal femur, vertebrae, or wrist. We identified 34 mother-daughter sets (n = 70) and 29 sibling sets (n = 59) from a community study of bone mass correlates to assess the degree of re...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.1370030204

    authors: Sowers MR,Burns TL,Wallace RB

    updated: 1986-01-01 00:00:00

  • Measuring the inflation of the lod score due to its maximization over model parameter values in human linkage analysis.

    abstract::A computer-simulation method is presented for determining and correcting for the effect of maximizing the lod score over disease definitions, penetrance values, and perhaps other model parameters. The method consists of simulating the complete analysis using marker genotypes randomly generated under the assumption of ...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.1370070402

    authors: Weeks DE,Lehner T,Squires-Wheeler E,Kaufmann C,Ott J

    updated: 1990-01-01 00:00:00

  • Power of non-parametric linkage analysis in mapping genes contributing to human longevity in long-lived sib-pairs.

    abstract::This report investigates the power issue in applying the non-parametric linkage analysis of affected sib-pairs (ASP) [Kruglyak and Lander, 1995: Am J Hum Genet 57:439-454] to localize genes that contribute to human longevity using long-lived sib-pairs. Data were simulated by introducing a recently developed statistica...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.10304

    authors: Tan Q,Zhao JH,Iachine I,Hjelmborg J,Vach W,Vaupel JW,Christensen K,Kruse TA

    updated: 2004-04-01 00:00:00

  • POLARIS: Polygenic LD-adjusted risk score approach for set-based analysis of GWAS data.

    abstract::Polygenic risk scores (PRSs) are a method to summarize the additive trait variance captured by a set of SNPs, and can increase the power of set-based analyses by leveraging public genome-wide association study (GWAS) datasets. PRS aims to assess the genetic liability to some phenotype on the basis of polygenic risk fo...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.22117

    authors: Baker E,Schmidt KM,Sims R,O'Donovan MC,Williams J,Holmans P,Escott-Price V,Consortium WTG

    updated: 2018-06-01 00:00:00

  • A two-locus model for familial Alzheimer's disease?

    abstract::The present findings for familial Alzheimer's disease suggest a possible linkage to gene(s) on chromosome 21 for the early onset form and to chromosome 19 for the late onset. Since these results are not unequivocal, possible alternative hypotheses include the effect of genetic heterogeneity or of an oligogenic model o...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.1370100618

    authors: Macciardi F,Cavallini MC

    updated: 1993-01-01 00:00:00