Hierarchical Bayesian model for rare variant association analysis integrating genotype uncertainty in human sequence data.

Abstract:

Next-generation sequencing (NGS) has led to the study of rare genetic variants, which possibly explain the missing heritability for complex diseases. Most existing methods for rare variant (RV) association detection do not account for the common presence of sequencing errors in NGS data. The errors can largely affect the power and perturb the accuracy of association tests due to rare observations of minor alleles. We developed a hierarchical Bayesian approach to estimate the association between RVs and complex diseases. Our integrated framework combines the misclassification probability with shrinkage-based Bayesian variable selection. It allows for flexibility in handling neutral and protective RVs with measurement error, and is robust enough for detecting causal RVs with a wide spectrum of minor allele frequency (MAF). Imputation uncertainty and MAF are incorporated into the integrated framework to achieve the optimal statistical power. We demonstrate that sequencing error does significantly affect the findings, and our proposed model can take advantage of it to improve statistical power in both simulated and real data. We further show that our model outperforms existing methods, such as sequence kernel association test (SKAT). Finally, we illustrate the behavior of the proposed method using a Finnish low-density lipoprotein cholesterol study, and show that it identifies an RV known as FH North Karelia in LDLR gene with three carriers in 1,155 individuals, which is missed by both SKAT and Granvil.

journal_name

Genet Epidemiol

journal_title

Genetic epidemiology

authors

He L,Pitkäniemi J,Sarin AP,Salomaa V,Sillanpää MJ,Ripatti S

doi

10.1002/gepi.21871

pub_date

2015-02-01 00:00:00

pages

89-100

issue

2

eissn

1098-2272

issn

0741-0395

journal_volume

39

pub_type

Journal Article
  • Apolipoprotein E phenotype, arterial disease, and mortality among older women: the study of osteoporotic fractures.

    abstract::This study is an investigation of the relationship between apolipoprotein E (apoE) phenotype, arterial disease, and mortality in a group of women (n = 1,751) aged 65 years and older enrolled in the Study of Osteoporotic Fractures. Crude mortality rates were highest among women with the 4-3 and 4-4 phenotypes but age-a...

    journal_title:Genetic epidemiology

    pub_type: Clinical Trial, Journal Article, Multicenter Study

    doi:10.1002/(SICI)1098-2272(1997)14:2<147::AID-GEPI4>3

    authors: Vogt MT,Cauley JA,Kuller LH

    update_date:1997-01-01 00:00:00

  • Allelic association patterns for a dense SNP map.

    abstract::A dense set of 5,000 SNPs on a 10-Mb region of human chromosome 20 has been typed on samples of African Americans, East Asians, and United Kingdom Caucasians. There are departures from Hardy-Weinberg equilibrium beyond the level at which markers are often discarded because of possible genotyping errors. The observatio...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.20038

    authors: Weir BS,Hill WG,Cardon LR,SNP Consortium.

    update_date:2004-12-01 00:00:00

  • Allelic association in large pedigrees.

    abstract::We subjected the first replication of the simulated isolated population data set to a novel analysis for association between marker alleles and either disease phenotypes or quantitative variable. The analysis depends on being able to reliably reconstruct all haplotypes in the pedigree. This was achieved using the MCLI...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.2001.21.s1.s571

    authors: Gutin A,Abkevich V,Camp NJ,Farnham JM,Cannon-Albright L,Thomas A

    update_date:2001-01-01 00:00:00

  • The impact of improved microarray coverage and larger sample sizes on future genome-wide association studies.

    abstract::Genome-wide association studies (GWAS) have identified many single nucleotide polymorphisms (SNPs) associated with complex traits. However, the genetic heritability of most of these traits remains unexplained. To help guide future studies, we address the crucial question of whether future GWAS can detect new SNP assoc...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.21724

    authors: Lindquist KJ,Jorgenson E,Hoffmann TJ,Witte JS

    update_date:2013-05-01 00:00:00

  • Linkage analysis of Alzheimer's disease with methods using relative pairs.

    abstract::Four relative-pair methods for detecting genetic linkage were applied to familial Alzheimer's disease data. Results obtained using an extended Haseman-Elston test and a weighted rank pairwise correlation test, which both use information from all relative pairs, were consistent with previously published likelihood resu...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.1370100608

    authors: Blossey H,Commenges D,Olson JM

    update_date:1993-01-01 00:00:00

  • Segregation analysis of juvenile myoclonic epilepsy.

    abstract::We examined the inheritance of juvenile myoclonic epilepsy (JME). We looked at both the trait of "epilepsy" and the trait of "epilepsy-plus-EEG abnormalities," since EEG abnormalities are frequently found in the clinically unaffected sibs of JME patients. We tested several modes of inheritance including the fully pene...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.1370050204

    authors: Greenberg DA,Delgado-Escueta AV,Maldonado HM,Widelitz H

    update_date:1988-01-01 00:00:00

  • Demonstration of a common major gene with pleiotropic effects on immunoglobulin E levels and allergy.

    abstract::Atopic disease is generally recognized to be familial, although specific genetic components have yet to be identified. High levels of a unique class of immunoglobulins, immunoglobulin E (IgE), have been shown to be associated with allergies. Several investigators have reported evidence indicating a recessive regulator...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.1370020402

    authors: Borecki IB,Rao DC,Lalouel JM,McGue M,Gerrard JW

    update_date:1985-01-01 00:00:00

  • Increasing the power of identifying gene x gene interactions in genome-wide association studies.

    abstract::In this paper we investigate the power to identify gene x gene interactions in genome-wide association studies. In our analysis we focus on two-stage analyses: analyses in which we only test for interactions between single nucleotide polymorphisms that show some marginal effect. We give two algorithms to compute signi...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.20300

    authors: Kooperberg C,Leblanc M

    update_date:2008-04-01 00:00:00

  • Investigation of a candidate gene, environment, and G x E interaction using case-control and case-parent study designs.

    abstract::We investigated the independent contributions of a candidate gene and an environmental factor, and the presence of gene x environment (G x E) interaction, in the etiology of a disease in the Genetic Analysis Workshop (GAW) 12 problem 2 simulated data using a two-stage approach utilizing both case-control and case-pare...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.2001.21.s1.s843

    authors: Norris JM,Selinger-Leneman H,Génin E

    update_date:2001-01-01 00:00:00

  • Lessons learned from Genetic Analysis Workshop 17: transitioning from genome-wide association studies to whole-genome statistical genetic analysis.

    abstract::Genetic Analysis Workshop 17 (GAW17) focused on the transition from genome-wide association study designs and methods to the study designs and statistical genetic methods that will be required for the analysis of next-generation sequence data including both common and rare sequence variants. In the 166 contributions t...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.20659

    authors: Wilson AF,Ziegler A

    update_date:2011-01-01 00:00:00

  • Testing for association in SLE families.

    abstract::Systemic lupus erythematosus (SLE) is a complex disease which is partly determined by genetic factors which influence susceptibility to the disease phenotype. In this association study we try to define the high risk haplotypes which are responsible for this disease, together with other environmental factors. In many o...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.1370080607

    authors: Seuchter SA,Knapp M,Hartung K,Coldewey R,Kalden JR,Lakomek HJ,Peter HH,Deicher H,Baur MP

    update_date:1991-01-01 00:00:00

  • Improving power in genome-wide association studies: weights tip the scale.

    abstract::The potential of genome-wide association analysis can only be realized when they have power to detect signals despite the detrimental effect of multiple testing on power. We develop a weighted multiple testing procedure that facilitates the input of prior information in the form of groupings of tests. For each group a...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.20237

    authors: Roeder K,Devlin B,Wasserman L

    update_date:2007-11-01 00:00:00

  • Use of variable marker density, principal components, and neural networks in the dissection of disease etiology.

    abstract::Several approaches were taken to identify the loci contributing to the quantitative and qualitative phenotypes in the Genetic Analysis Workshop 12 simulated data set. To identify possible quantitative trait loci (QTL), the quantitative traits were analyzed using SOLAR. The four replicates identified as the "best repli...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.2001.21.s1.s732

    authors: Pankratz N,Kirkwood SC,Flury L,Koller DL,Foroud T

    update_date:2001-01-01 00:00:00

  • How can maximum likelihood methods reveal candidate gene effects on a quantitative trait?

    abstract::Different maximum likelihood approaches were used to explore the role of candidate genes in the variability of quantitative trait Q1 while accounting for the effects of age, Q2, and Q3. Segregation analysis, under the class D regressive model, provides evidence for a Mendelian gene effect on the adjusted trait Q1. Res...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.1370120643

    authors: Martinez M,Abel L,Demenais F

    update_date:1995-01-01 00:00:00

  • Sib-pair linkage tests for disease susceptibility loci: common tests vs. the asymptotically most powerful test.

    abstract::Several statistical tests for linkage between a disease susceptibility locus and a marker locus for sib-pair data are examined analytically. Two common statistics, a test based on the mean number of marker alleles shared identical by descent by sib-pairs, and a test based on the proportion of sib-pairs sharing exactly...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.1370070506

    authors: Schaid DJ,Nick TG

    update_date:1990-01-01 00:00:00

  • Entropy-supported marker selection and Mantel statistics for haplotype sharing analysis.

    abstract::Haplotype sharing analysis is a well-established option for the investigation of the etiology of complex diseases. The statistical power of haplotype association methods depends strongly on how the information of unobserved haplotypes can be captured by multilocus genotypes. In this study we combine an entropy-based m...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.20491

    authors: Schulz A,Fischer C,Chang-Claude J,Beckmann L

    update_date:2010-05-01 00:00:00

  • Constructing meiotic maps with known error probability.

    abstract::We propose methods to construct meiotic gene maps while controlling the probability of a decision-error. First, a single step gene ordering procedure is presented whose decision-error probability is bounded above by a prespecified threshold. The bound for the error probability is valid under quite general circumstance...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/(SICI)1098-2272(1999)16:3<274::AID-GEPI4>3

    authors: Rogatko A,Babb J,Jordan H,Zacks S

    update_date:1999-01-01 00:00:00

  • Fitting ACE structural equation models to case-control family data.

    abstract::Investigators interested in whether a disease aggregates in families often collect case-control family data, which consist of disease status and covariate information for members of families selected via case or control probands. Here, we focus on the use of case-control family data to investigate the relative contrib...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.20454

    authors: Javaras KN,Hudson JI,Laird NM

    update_date:2010-04-01 00:00:00

  • Heritability analysis of nontraditional glycemic biomarkers in the Atherosclerosis Risk in Communities Study.

    abstract::Nontraditional glycemic biomarkers, including fructosamine, glycated albumin, and 1,5-anhydroglucitol (1,5-AG) are potential alternatives or complement to traditional measures of hyperglycemia. Genetic variants are associated with these biomarkers, but the heritability, or extent to which genetics control their variat...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.22243

    authors: Loomis SJ,Tin A,Coresh J,Boerwinkle E,Pankow JS,Köttgen A,Selvin E,Duggal P

    update_date:2019-10-01 00:00:00

  • Replication of genetic associations as pseudoreplication due to shared genealogy.

    abstract::The genotypes of individuals in replicate genetic association studies have some level of correlation due to shared descent in the complete pedigree of all living humans. As a result of this genealogical sharing, replicate studies that search for genotype-phenotype associations using linkage disequilibrium between mark...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.20400

    authors: Rosenberg NA,Vanliere JM

    update_date:2009-09-01 00:00:00

  • Risk factors for atherosclerosis in twins.

    abstract::We performed multivariate genetic analyses of cardiovascular risk factors from two sets of data on US and Australian female twins. Similar models for body mass index (BMI), serum low density (LDL) and high density (HDL) lipoproteins, including age as a covariate, were fitted successfully to both groups. These suggeste...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.1370100638

    authors: Duffy DL,O'Connell DL,Heller RF,Martin NG

    update_date:1993-01-01 00:00:00

  • Pedigree disequilibrium tests for multilocus haplotypes.

    abstract::Association tests of multilocus haplotypes are of interest both in linkage disequilibrium mapping and in candidate gene studies. For case-parent trios, I discuss the extension of existing multilocus methods to include ambiguous haplotypes in tests of models which distinguish between the cis and trans phase. A likeliho...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.10252

    authors: Dudbridge F

    update_date:2003-09-01 00:00:00

  • Testing the utility of mod scores and sib-pair analysis to detect presence of disease susceptibility loci.

    abstract::Linkage analyses and association studies were employed to detect disease susceptibility loci leading to elevated Q1 levels in Problem 2B. Phenotypes were defined to be the dichotomous affection status, the quantitative value for Q1, and Q1 adjusted for covariates. The method of mod-scores (for the dichotomous phenotyp...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/(SICI)1098-2272(1997)14:6<1035::AID-GEPI79

    authors: Neuman RJ,Xian H

    update_date:1997-01-01 00:00:00

  • Two common polymorphisms in the APO A-IV coding gene: their evolution and linkage disequilibrium.

    abstract::Human apolipoprotein A-IV (APO A-IV) exhibits a common protein polymorphism detectable by isoelectric focusing (IEF) due to a single base substitution at codon 360 which replaces the frequently occurring glutamine residue (allele 1) with histidine (allele 2). Recently, sequence analysis of the APO A-IV coding region h...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.1370090503

    authors: Kamboh MI,Hamman RF,Ferrell RE

    update_date:1992-01-01 00:00:00

  • Tag SNPs chosen from HapMap perform well in several population isolates.

    abstract::Population isolates may be particularly useful for association studies of complex traits. This utility, however, largely depends on the transferability of tag SNPs chosen from reference samples, such as HapMap, to samples from such populations. Factors that characterize population isolates, such as widespread genetic ...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.20201

    authors: Service S,International Collaborative Group on Isolated Populations.,Sabatti C,Freimer N

    update_date:2007-04-01 00:00:00

  • Estimating the power of variance component linkage analysis in large pedigrees.

    abstract::Variance component linkage analysis is commonly used to map quantitative trait loci (QTLs) in general pedigrees. Large pedigrees are especially attractive for these studies because they provide greater power per genotyped individual than small pedigrees. We propose accurate and computationally efficient methods to cal...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.20160

    authors: Chen WM,Abecasis GR

    update_date:2006-09-01 00:00:00

  • Linkage disequilibrium between DNA markers at the low-density lipoprotein receptor gene.

    abstract::We determined pairwise linkage disequilibria between 12 restriction fragment length polymorphism (RFLP) markers at or near the low-density lipoprotein receptor (LDLR) locus on chromosome 19p13.2-13.1 in 92 unrelated individuals. Of these 12 RFLPs, two were newly identified under a cosmid-based strategy designed to scr...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.1370070114

    authors: Hegele RA,Plaetke R,Lalouel JM

    update_date:1990-01-01 00:00:00

  • Parental genotype reconstruction: applications of haplotype relative risk to incomplete parental data.

    abstract::Intended to resolve the problem of constructing a matched population-based control sample, haplotype relative risk techniques frequently suffer from loss of power for late-onset diseases due to unavailability of parental genotypes that are required to form parent-offspring pairs. However, much of this missing informat...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/(SICI)1098-2272(1998)15:5<471::AID-GEPI3>3

    authors: Martin RB,Alda M,MacLean CJ

    update_date:1998-01-01 00:00:00

  • Effect of linkage disequilibrium between markers in linkage and association analyses.

    abstract::Contributions to Group 17 of the Genetic Analysis Workshop 15 considered dense markers in linkage disequilibrium (LD) in the context of either linkage or association analysis. Three contributions reported on methods for modeling LD or selecting a subset of markers in linkage equilibrium to perform linkage analysis. Wh...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.20291

    authors: Dupuis J,Albers K,Allen-Brady K,Cho K,Elston RC,Kappen HJ,Tang H,Thomas A,Thomson G,Tsung E,Yang Q,Zhang W,Zhao K,Zheng G,Ziegler JT

    update_date:2007-01-01 00:00:00

  • The power of iterated generalized least squares (GLS) method to detect direct relationships in the analysis of correlated quantitative traits.

    abstract::We examined the power of the stepwise iterated generalized least squares (GLS) method by modeling the relationship between quantitative traits and other variables using the simulated data for Problem 2A. The comparison between the generating model provided by the workshop and the results of the stepwise iterated GLS m...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/(SICI)1098-2272(1997)14:6<797::AID-GEPI39>

    authors: He Q,Nemesure BB,Mendell NR

    update_date:1997-01-01 00:00:00