Abstract:
Methods for genetic risk prediction have been widely investigated in recent years. However, most available training data involve European samples, and it is currently unclear how to accurately predict disease risk in other populations. Previous studies have used either training data from European samples in large sample size or training data from the target population in small sample size, but not both. Here, we introduce a multiethnic polygenic risk score that combines training data from European samples and training data from the target population. We applied this approach to predict type 2 diabetes (T2D) in a Latino cohort using both publicly available European summary statistics in large sample size (Neff = 40k) and Latino training data in small sample size (Neff = 8k). We attained a >70% relative improvement in prediction accuracy (from R2 = 0.027 to 0.047) compared to methods that use only one source of training data, consistent with large relative improvements in simulations. We observed a systematically lower load of T2D risk alleles in Latino individuals with more European ancestry, which could be explained by polygenic selection in ancestral European and/or Native American populations. Application to predict T2D in a South Asian UK Biobank cohort using European (Neff = 40k) and South Asian (Neff = 16k) training data attained a >70% relative improvement in prediction accuracy, and application to predict height in an African UK Biobank cohort using European (N = 113k) and African (N = 2k) training data attained a 30% relative improvement. Our work reduces the gap in polygenic risk prediction accuracy between European and non-European target populations.
journal_name: Genet Epidemiol
journal_title: Genetic epidemiology
authors: Márquez-Luna C, Loh PR, South Asian Type 2 Diabetes (SAT2D) Consortium, SIGMA Type 2 Diabetes Consortium, Price AL
doi: 10.1002/gepi.22083
subject: Has Abstract
pub_date: 2017-12-01 00:00:00
pages: 811-823
issue: 8
eissn: 0741-0395
issn: 1098-2272
journal_volume: 41
pub_type: Journal Article
abstract::Testing association between a genetic marker and multiple-dependent traits is a challenging task when both binary and quantitative traits are involved. The inverted regression model is a convenient method, in which the traits are treated as predictors although the genetic marker is an ordinal response. It is known tha...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.21738
update_date: 2013-09-01 00:00:00
abstract::Genome-wide association (GWA) studies have proved extremely successful in identifying novel genetic loci contributing effects to complex human diseases. In doing so, they have highlighted the fact that many potential loci of modest effect remain undetected, partly due to the need for samples consisting of many thousan...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.20482
update_date: 2010-05-01 00:00:00
abstract::We develop a Bayesian multi-SNP Markov chain Monte Carlo approach that allows published functional significance scores to objectively inform single nucleotide polymorphism (SNP) prior effect sizes in expression quantitative trait locus (eQTL) studies. We developed the Normal Gamma prior to allow the inclusion of funct...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.21961
update_date: 2016-05-01 00:00:00
abstract::Healthy male monozygotic (MZ) and dizygotic (DZ) twin pairs (MZ pairs = 77; DZ pairs = 88) were studied to assess the effect of dietary intake, physical activity, physical fitness, body mass index (BMI), sum of the triceps and subscapular skinfold measurements, alcohol and caffeine consumption, and smoking patterns on...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.1370050409
update_date: 1988-01-01 00:00:00
abstract::Twin pairs are sometimes included in studies because at least one of them is a proband, and conventionally the analysis of the data is based on the conditional distribution of the co-twin given the proband. In the case of more than one proband in each pair, an often used "ad hoc" method of analysis is to allow each tw...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.10253
update_date: 2003-11-01 00:00:00
abstract::Given the rapid pace with which genomics and other -omics disciplines are evolving, it is sometimes necessary to shift down a gear to consider more general scientific questions. In this line, in my presidential address I formulate six questions for genetic epidemiologists to ponder on. These cover the areas of reprodu...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.22191
update_date: 2019-04-01 00:00:00
abstract::Regressive models that incorporate measured variables and assumed genetic parameters were used to detect interactions between gene, research site, and environmental variables in GAW11 Problem 2. Replicates 1 to 5 were used in the analyses. Significant three-way gene x environment x site interactions were seen for all ...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.13701707118
update_date: 1999-01-01 00:00:00
abstract::We investigated a variety of methods for pooling data from eight data sets (n = 5,424 subjects) to validate evidence for linkage of markers in the cytokine cluster on chromosome 5q31-33 to asthma and asthma-associated phenotypes. Chromosome 5 markers were integrated into current genetic linkage and physical maps, and ...
journal_title:Genetic epidemiology
pub_type: Journal Article, Meta-Analysis
doi:10.1002/gepi.2001.21.s1.s103
update_date: 2001-01-01 00:00:00
abstract::Recently, Liang et al. ([2001b] Genet. Epidemiol. 21:105-122) proposed a conditional approach to assess linkage evidence on the target region by incorporating linkage information from an unlinked (reference) region using allele shared IBD (identity-by-descent) from affected sib pairs. This is carried out by conditionin...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.10305
update_date: 2004-02-01 00:00:00
abstract::Multipoint linkage analysis using sibpair designs remains a common approach to help investigators to narrow chromosomal regions for traits (either qualitative or quantitative) of interest. Despite its popularity, the success of this approach depends heavily on how issues such as genetic heterogeneity, gene-gene, and g...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.20036
update_date: 2005-01-01 00:00:00
abstract::The aim of this population-based study was to determine whether asthma aggregates in families, and if so, whether aggregation was consistent with environmental and/or genetic etiologies. Data were from 7,394 nuclear families (41,506 individuals) from the 1968 Tasmanian Asthma Survey, in which all Tasmanian schoolchild...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/(SICI)1098-2272(1997)14:3<317::AID-GEPI9>3
update_date: 1997-01-01 00:00:00
abstract::Population stratification (PS) can lead to an inflated rate of false-positive findings in genome-wide association studies (GWAS). The commonly used approach of adjustment for a fixed number of principal components (PCs) could have a deleterious impact on power when selected PCs are equally distributed in cases and con...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.20396
update_date: 2009-07-01 00:00:00
abstract::This paper discusses the theory and implementation of a model for mapping X-linked quantitative trait loci (QTL). As a result of X inactivation, a female's body is subdivided into a number of patches. In each patch one of her two X chromosomes is randomly switched off. This smooths the allelic contributions in a heter...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.20158
update_date: 2006-07-01 00:00:00
abstract::Path analysis of family data has been widely applied to resolve genetic and environmental patterns of familial resemblance. A prevalent statistical approach in path analysis has been, first, to estimate the familial correlations and, second, by assuming these estimates to be independently distributed, define a likelih...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.1370010305
update_date: 1984-01-01 00:00:00
abstract::We construct data exploration tools for recognizing important covariate patterns associated with a phenotype, with particular focus on searching for association with gene-gene patterns. To this end, we propose a new variable selection procedure that employs latent selection weights and compare it to an alternative for...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.21661
update_date: 2012-09-01 00:00:00
abstract::To evaluate the risk of a disease associated with the joint effects of genetic susceptibility and environmental exposures, epidemiologic researchers often test for non-multiplicative gene-environment effects from case-control studies. In this article, we present a comparative study of four alternative tests for intera...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.20337
update_date: 2008-11-01 00:00:00
abstract::Genes with imprinting (parent-of-origin) effects express differently when inheriting from the mother or from the father. Some genes for development and behavior in mammals are known to be imprinted. We developed parametric linkage analysis that accounts for imprinting effects for continuous traits, implementing it in ...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.20321
update_date: 2008-07-01 00:00:00
abstract::Artificial neural networks were applied to the alcoholism data to reveal nonlinear relationships between intermediate phenotypes, marker identity-by-descent sharing, and the affection status. A variable number of hidden units were considered to achieve a balance between the minimal mean-squared error and over-fitting ...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.1370170738
update_date: 1999-01-01 00:00:00
abstract::Human apolipoprotein A-IV (APO A-IV) exhibits a common protein polymorphism detectable by isoelectric focusing (IEF) due to a single base substitution at codon 360 which replaces the frequently occurring glutamine residue (allele 1) with histidine (allele 2). Recently, sequence analysis of the APO A-IV coding region h...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.1370090503
update_date: 1992-01-01 00:00:00
abstract::The role of a gene in a disease may be hidden by the presence of another risk factor such as an environmental factor. In that case, stratifying the data according to this factor strengthens power to detect linkage or association. We followed this strategy on the simulated data provided by GAW11. The transmission/diseq...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.1370170788
update_date: 1999-01-01 00:00:00
abstract::We have analyzed allele frequency distribution at the hypervariable locus 3' to the apolipoprotein B gene in a healthy population sample (241 women and 246 men) from the Belgrade area. The bimodal distribution of sixteen different hypervariable region (HVR) alleles and the heterozygosity index (average 0.76) in both s...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/(SICI)1098-2272(1998)15:2<113::AID-GEPI1>3
update_date: 1998-01-01 00:00:00
abstract::Recently, testing for anticipation has received renewed interest. It is well known that standard statistical methods are inappropriate for this purpose due to problems of sampling bias. Few statistical tests have been proposed for comparing mean age of onset in affected parents with mean age of onset in affected child...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.20057
update_date: 2005-04-01 00:00:00
abstract::Variable selection is growing in importance with the advent of high throughput genotyping methods requiring analysis of hundreds to thousands of single nucleotide polymorphisms (SNPs) and the increased interest in using these genetic studies to better understand common, complex diseases. Up to now, the standard approa...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.20353
update_date: 2009-01-01 00:00:00
abstract::Genetic association studies of obstetric complications may genotype case and control mothers, or their respective newborns, or both case-control mothers and their children. The relatively high prevalence of many obstetric complications and the availability of both maternal and offspring's genotype data have provided m...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.20406
update_date: 2009-09-01 00:00:00
abstract::We investigate the relevance of the genetic determination of bone mineral density (BMD) variation to that of differential risk to osteoporotic fractures (OF). The high heritability (h(2)) of BMD and the significant phenotypic correlations between high BMD and low risk to OF are well known. Little is reported on h(2) f...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.1040
update_date: 2002-01-01 00:00:00
abstract::Variance component linkage analysis is commonly used to map quantitative trait loci (QTLs) in general pedigrees. Large pedigrees are especially attractive for these studies because they provide greater power per genotyped individual than small pedigrees. We propose accurate and computationally efficient methods to cal...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.20160
update_date: 2006-09-01 00:00:00
abstract::A family cancer database was constructed from the nationwide Swedish registries and includes approximately 6 million persons and >30,000 cancers in offspring diagnosed at ages 15-51 years and their parents. A particular advantage of the database is that the contribution of both parental lineages on cancer risk can be ...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/(SICI)1098-2272(1998)15:3<225::AID-GEPI2>3
update_date: 1998-01-01 00:00:00
abstract::The etiology of complex traits likely involves the effects of genetic and environmental factors, along with complicated interaction effects between them. Consequently, there has been interest in applying genetic association tests of complex traits that account for potential modification of the genetic effect in the pr...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.21901
update_date: 2015-07-01 00:00:00
abstract::Power estimations are important for optimizing genotype-phenotype association study designs. However, existing frameworks are designed for common disorders, and thus ill-suited for the inherent challenges of studies for low-prevalence conditions such as rare diseases and infrequent adverse drug reactions. These challe...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.22129
update_date: 2018-07-01 00:00:00
abstract::Case-only studies are often used to identify interactions between a genetic factor and an environmental factor under the assumption both factors are independent in the population. However, interpreting a statistical association between the genetic and the environmental factors among the cases, as evidence of a mechani...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.20484
update_date: 2010-05-01 00:00:00