Identification of gene-gene interactions in the presence of missing data using the multifactor dimensionality reduction method.

Abstract:

Gene-gene interaction is believed to play an important role in understanding complex traits. Multifactor dimensionality reduction (MDR) was proposed by Ritchie et al. [2001. Am J Hum Genet 69:138-147] to identify multiple loci that simultaneously affect disease susceptibility. Although the MDR method has been widely used to detect gene-gene interactions, few studies have examined MDR analysis in the presence of missing data. Currently, four approaches are available in MDR analysis to handle missing data. The first approach uses only complete observations that have no missing data, which can cause a severe loss of data. The second approach treats missing values as an additional genotype category, but the results may then be hard to interpret and the conclusions may be misleading. Furthermore, it performs poorly when the missing rates are unbalanced between the case and control groups. The third approach is a simple imputation method that replaces each missing genotype with the most frequent genotype, which may also produce biased results. The fourth approach, Available, uses all data available for the given loci to increase power. In any real data analysis, it is not clear which MDR approach one should use when there are missing data. In this article, we consider a new EM Impute approach to handle missing data more appropriately. Through simulation studies, we compared the performance of the proposed EM Impute approach with the current approaches. Our results showed that the Available and EM Impute approaches perform better than the three other current approaches in terms of power and precision.
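The four existing missing-data strategies named in the abstract can be illustrated on a toy genotype table. The sketch below is not the authors' implementation: the function names, the `MISSING` marker, the 0/1/2 genotype coding, and the extra category code `3` are all assumptions made for illustration, case/control labels are omitted, and the EM Impute approach is not sketched because the abstract does not describe its details.

```python
from collections import Counter

MISSING = None  # hypothetical marker for an untyped genotype

# Toy data (assumed): rows are individuals, columns are loci, genotypes coded 0/1/2.
genotypes = [
    [0, 1, 2],
    [1, MISSING, 0],
    [0, 1, MISSING],
    [2, 1, 1],
    [MISSING, 0, 1],
]

def complete_case(data):
    """Approach 1: keep only individuals with no missing genotype at any locus."""
    return [row for row in data if all(g is not MISSING for g in row)]

def missing_as_category(data, extra_code=3):
    """Approach 2: recode every missing genotype as an additional category."""
    return [[extra_code if g is MISSING else g for g in row] for row in data]

def mode_impute(data):
    """Approach 3: replace missing genotypes with the most frequent genotype per locus."""
    n_loci = len(data[0])
    modes = []
    for j in range(n_loci):
        observed = [row[j] for row in data if row[j] is not MISSING]
        modes.append(Counter(observed).most_common(1)[0][0])
    return [[modes[j] if g is MISSING else g for j, g in enumerate(row)]
            for row in data]

def available(data, loci):
    """Approach 4: for one locus combination, keep individuals typed at all of its loci."""
    return [[row[j] for j in loci]
            for row in data
            if all(row[j] is not MISSING for j in loci)]

print(len(complete_case(genotypes)))      # individuals retained by approach 1
print(len(available(genotypes, (0, 1))))  # individuals retained for loci 0 and 1
```

On this toy table the complete-case filter keeps fewer individuals than the Available filter for the locus pair (0, 1), which is the data-loss trade-off the abstract describes.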

journal_name

Genet Epidemiol

journal_title

Genetic epidemiology

authors

Namkung J,Elston RC,Yang JM,Park T

doi

10.1002/gepi.20416

pub_date

2009-11-01 00:00:00

pages

646-56

issue

7

eissn

1098-2272

issn

0741-0395

journal_volume

33

pub_type

Journal Article

  • Genetic association tests based on ranks (GATOR) for quantitative traits with and without censoring.

    abstract::Linkage disequilibrium mapping of quantitative traits is a powerful method for dissecting the genetic etiology of complex phenotypes. Quantitative traits, however, often exhibit characteristics that make their use problematic. For example, the distribution of the trait may be censored, highly skewed, or contaminated w...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.20141

    authors: Allen AS,Martin ER,Qin X,Li YJ

    update_date:2006-04-01 00:00:00

  • Integrative sparse principal component analysis of gene expression data.

    abstract::In the analysis of gene expression data, dimension reduction techniques have been extensively adopted. The most popular one is perhaps the PCA (principal component analysis). To generate more reliable and more interpretable results, the SPCA (sparse PCA) technique has been developed. With the "small sample size, high ...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.22089

    authors: Liu M,Fan X,Fang K,Zhang Q,Ma S

    update_date:2017-12-01 00:00:00

  • Identifying genetic interactions in genome-wide data using Bayesian networks.

    abstract::It is believed that interactions among genes (epistasis) may play an important role in susceptibility to common diseases (Moore and Williams [2002]. Ann Med 34:88-95; Ritchie et al. [2001]. Am J Hum Genet 69:138-147). To study the underlying genetic variants of diseases, genome-wide association studies (GWAS) that sim...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.20514

    authors: Jiang X,Barmada MM,Visweswaran S

    update_date:2010-09-01 00:00:00

  • Genetic epidemiology of breast cancer: segregation analysis of 389 Icelandic pedigrees.

    abstract::A genetic epidemiologic investigation of breast cancer involving 389 breast cancer pedigrees including information on 14,721 individuals from the Icelandic population-based cancer registry is presented. Probands were women born in or after 1920 and reported to have breast cancer in the cancer registry. The average age...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/(SICI)1098-2272(200001)18:1<81::AID-GEPI6>

    authors: Baffoe-Bonnie AB,Beaty TH,Bailey-Wilson JE,Kiemeney LA,Sigvaldason H,Olafsdóttir G,Tryggvadóttir L,Tulinius H

    update_date:2000-01-01 00:00:00

  • Rank-based robust tests for quantitative-trait genetic association studies.

    abstract::Standard linear regression is commonly used for genetic association studies of quantitative traits. This approach may not be appropriate if the trait, on its original or transformed scales, does not follow a normal distribution. A rank-based nonparametric approach that does not rely on any distributional assumptions c...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.21723

    authors: Li Q,Li Z,Zheng G,Gao G,Yu K

    update_date:2013-05-01 00:00:00

  • Meta-analysis by combining p-values: simulated linkage studies.

    abstract::Meta-analysis has been little explored to make an overall assessment of linkage from different studies. In practice, it is likely that published linkage studies will only report p-values. We compared the performance of the widely used Fisher method for combining p-values with that of pooling raw data. More loci were c...

    journal_title:Genetic epidemiology

    pub_type: Journal Article, Meta-Analysis

    doi:10.1002/gepi.1370170798

    authors: Guerra R,Etzel CJ,Goldstein DR,Sain SR

    update_date:1999-01-01 00:00:00

  • Affected relative pairs and simultaneous search for two-locus linkage in the presence of epistasis.

    abstract::It is commonly believed that multiple interacting genes increase the susceptibility of genetically complex diseases, yet few linkage analyses of human diseases scan for more than one locus at a time. To overcome some of the statistical and computational limitations of a simultaneous search for two disease susceptibili...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.20223

    authors: Schaid DJ,McDonnell SK,Carlson EE,Thibodeau SN,Ostrander EA,Stanford JL

    update_date:2007-07-01 00:00:00

  • Commentary: the affected sib-pair method in the context of an epidemiologic study design.

    abstract::The purpose of this commentary is to provide a framework for using the well-known sib-pair methodology in the context of epidemiologic study designs. Using examples from the Pittsburgh family studies of insulin-dependent diabetes mellitus, we illustrate that the sib-pair method can be used in family-based epidemiologi...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.1370080408

    authors: Khoury MJ,Flanders WD,Lipton RB,Dorman JS

    update_date:1991-01-01 00:00:00

  • TDT with covariates and genomic screens with mod scores: their behavior on simulated data.

    abstract::We describe an extension to the TDT (transmission/disequilibrium test) which allows for more than two marker alleles and for covariates measured on the parent or offspring. We also describe a systematic genomic search where the mod score (maximized lod score) is computed for each marker under constraints on the popula...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.1370120623

    authors: Rice JP,Neuman RJ,Hoshaw SL,Daw EW,Gu C

    update_date:1995-01-01 00:00:00

  • Adaptive testing for association between two random vectors in moderate to high dimensions.

    abstract::Testing for association between two random vectors is a common and important task in many fields, however, existing tests, such as Escoufier's RV test, are suitable only for low-dimensional data, not for high-dimensional data. In moderate to high dimensions, it is necessary to consider sparse signals, which are often ...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.22059

    authors: Xu Z,Xu G,Pan W,Alzheimer's Disease Neuroimaging Initiative.

    update_date:2017-11-01 00:00:00

  • Multipoint linkage mapping using sibpairs: non-parametric estimation of trait effects with quantitative covariates.

    abstract::Multipoint linkage analysis using sibpair designs remains a common approach to help investigators to narrow chromosomal regions for traits (either qualitative or quantitative) of interest. Despite its popularity, the success of this approach depends heavily on how issues such as genetic heterogeneity, gene-gene, and g...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.20036

    authors: Chiou JM,Liang KY,Chiu YF

    update_date:2005-01-01 00:00:00

  • Gene-dropping vs. empirical variance estimation for allele-sharing linkage statistics.

    abstract::In this study, we compare the statistical properties of a number of methods for estimating P-values for allele-sharing statistics in non-parametric linkage analysis. Some of the methods are based on the normality assumption, using different variance estimation methods, and others use simulation (gene-dropping) to find...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.20177

    authors: Jung J,Weeks DE,Feingold E

    update_date:2006-12-01 00:00:00

  • Generalization of the extended transmission disequilibrium test to two unlinked disease loci.

    abstract::The extended transmission disequilibrium test (ETDT) of Sham and Curtis [1995] is a powerful test of the null hypothesis of no linkage between a multi-allelic marker locus and a disease susceptibility locus of unknown location in the presence of association between alleles at the two loci. We propose a generalization ...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.13701707108

    authors: Morris A,Whittaker J

    update_date:1999-01-01 00:00:00

  • Genome-wide association studies for discrete traits.

    abstract::Genome-wide association studies of discrete traits generally use simple methods of analysis based on chi(2) tests for contingency tables or logistic regression, at least for an initial scan of the entire genome. Nevertheless, more power might be obtained by using various methods that analyze multiple markers in combin...

    journal_title:Genetic epidemiology

    pub_type:

    doi:10.1002/gepi.20465

    authors: Thomas DC

    update_date:2009-01-01 00:00:00

  • Equivalence of the mixed and regressive models for genetic analysis. I. Continuous traits.

    abstract::The mixed model of segregation analysis specifies major gene effects and partitions the residual variance into polygenic and environmental components. The model explains familial correlations essentially in terms of genetic causation. The regressive model, on the other hand, is constructed by successively conditioning...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.1370060505

    authors: Demenais FM,Bonney GE

    update_date:1989-01-01 00:00:00

  • Parental transmission and D18S37 allele sharing in bipolar affective disorder.

    abstract::We combined the five chromosome 18 bipolar affective disorder data sets provided by GAW10, totaling 185 families with 3,394 individuals, and performed analysis of differential parental transmission and chromosome 18 marker allele sharing in families with transmission through fathers vs those through mothers. Results i...

    journal_title:Genetic epidemiology

    pub_type: Clinical Trial, Journal Article

    doi:10.1002/(SICI)1098-2272(1997)14:6<665::AID-GEPI19>

    authors: Lin JP,Bale SJ

    update_date:1997-01-01 00:00:00

  • Mapping alcoholism genes using linkage/linkage disequilibrium analysis.

    abstract::Using a recently developed semiparametric method for combined linkage/linkage-disequilibrium analysis, we analyzed the Collaborative Study on the Genetics of Alcoholism data subset developed for Genetic Analysis Workshop 11 (GAW11). This semiparametric approach estimates recombination fractions for linkage, marker log...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.1370170708

    authors: Aragaki C,Quiaoit F,Hsu L,Zhao LP

    update_date:1999-01-01 00:00:00

  • Investigation of a candidate gene, environment, and G x E interaction using case-control and case-parent study designs.

    abstract::We investigated the independent contributions of a candidate gene and an environmental factor, and the presence of gene x environment (G x E) interaction, in the etiology of a disease in the Genetic Analysis Workshop (GAW) 12 problem 2 simulated data using a two-stage approach utilizing both case-control and case-pare...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.2001.21.s1.s843

    authors: Norris JM,Selinger-Leneman H,Génin E

    update_date:2001-01-01 00:00:00

  • Detecting interactions between gene, site, and environmental variables using GAP.

    abstract::Regressive models that incorporate measured variables and assumed genetic parameters were used to detect interactions between gene, research site, and environmental variables in GAW11 Problem 2. Replicates 1 to 5 were used in the analyses. Significant three-way gene x environment x site interactions were seen for all ...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.13701707118

    authors: Shin J,Corey M

    update_date:1999-01-01 00:00:00

  • Linkage analysis of Alzheimer's disease with methods using relative pairs.

    abstract::Four relative-pair methods for detecting genetic linkage were applied to familial Alzheimer's disease data. Results obtained using an extended Haseman-Elston test and a weighted rank pairwise correlation test, which both use information from all relative pairs, were consistent with previously published likelihood resu...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.1370100608

    authors: Blossey H,Commenges D,Olson JM

    update_date:1993-01-01 00:00:00

  • Data mining and computationally intensive methods: summary of Group 7 contributions to Genetic Analysis Workshop 13.

    abstract::The Framingham Heart Study data, as well as a related simulated data set, were generously provided to the participants of the Genetic Analysis Workshop 13 in order that newly developed and emerging statistical methodologies could be tested on that well-characterized data set. The impetus driving the development of nov...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.10285

    authors: Costello TJ,Falk CT,Ye KQ

    update_date:2003-01-01 00:00:00

  • Estimating gene penetrance from family data.

    abstract::Family data are useful for estimating disease risk in carriers of specific genotypes of a given gene (penetrance). Penetrance is frequently estimated assuming that relatives' phenotypes are independent, given their genotypes for the gene of interest. This assumption is unrealistic when multiple shared risk factors con...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.20493

    authors: Gong G,Hannon N,Whittemore AS

    update_date:2010-05-01 00:00:00

  • Trends in prenatal diagnosis of Down syndrome and other autosomal trisomies in Scotland 1990 to 1994, with associated cytogenetic and epidemiological findings.

    abstract::The present report summarizes findings on 670 cases of autosomal trisomy diagnosed in Scotland, with actual or expected dates of delivery in 1990 to 1994 inclusive. Cases were notified by cytogenetic service laboratories. There were 277 prenatal and 369 postnatal diagnoses and 24 spontaneous losses. Excluding the latt...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/(SICI)1098-2272(1999)16:2<179::AID-GEPI5>3

    authors: Carothers AD,Boyd E,Lowther G,Ellis PM,Couzin DA,Faed MJ,Robb A

    update_date:1999-01-01 00:00:00

  • Variance component models for X-linked QTLs.

    abstract::This paper discusses the theory and implementation of a model for mapping X-linked quantitative trait loci (QTL). As a result of X inactivation, a female's body is subdivided into a number of patches. In each patch one of her two X chromosomes is randomly switched off. This smooths the allelic contributions in a heter...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.20158

    authors: Lange K,Sobel E

    update_date:2006-07-01 00:00:00

  • A multipoint method for meta-analysis of genetic association studies.

    abstract::Meta-analyses of genetic association studies are usually performed using a single polymorphism at a time, even though in many cases the individual studies report results from partially overlapping sets of polymorphisms. We present here a multipoint (or multilocus) method for multivariate meta-analysis of published pop...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.20531

    authors: Bagos PG,Liakopoulos TD

    update_date:2010-11-01 00:00:00

  • Joint analysis of multiple phenotypes using a clustering linear combination method based on hierarchical clustering.

    abstract::Emerging evidence suggests that a genetic variant can affect multiple phenotypes, especially in complex human diseases. Therefore, joint analysis of multiple phenotypes may offer new insights into disease etiology. Recently, many statistical methods have been developed for joint analysis of multiple phenotypes, includ...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.22263

    authors: Li X,Zhang S,Sha Q

    update_date:2020-01-01 00:00:00

  • Power of non-parametric linkage analysis in mapping genes contributing to human longevity in long-lived sib-pairs.

    abstract::This report investigates the power issue in applying the non-parametric linkage analysis of affected sib-pairs (ASP) [Kruglyak and Lander, 1995: Am J Hum Genet 57:439-454] to localize genes that contribute to human longevity using long-lived sib-pairs. Data were simulated by introducing a recently developed statistica...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.10304

    authors: Tan Q,Zhao JH,Iachine I,Hjelmborg J,Vach W,Vaupel JW,Christensen K,Kruse TA

    update_date:2004-04-01 00:00:00

  • Multiethnic polygenic risk scores improve risk prediction in diverse populations.

    abstract::Methods for genetic risk prediction have been widely investigated in recent years. However, most available training data involves European samples, and it is currently unclear how to accurately predict disease risk in other populations. Previous studies have used either training data from European samples in large sam...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.22083

    authors: Márquez-Luna C,Loh PR,South Asian Type 2 Diabetes (SAT2D) Consortium.,SIGMA Type 2 Diabetes Consortium.,Price AL

    update_date:2017-12-01 00:00:00

  • Gene-environment interaction tests for dichotomous traits in trios and sibships.

    abstract::When testing for genetic effects, failure to account for a gene-environment interaction can mask the true association effects of a genetic marker with disease. Family-based association tests are popular because they are completely robust to population substructure and model misspecification. However, when testing for ...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.20421

    authors: Hoffmann TJ,Lange C,Vansteelandt S,Laird NM

    update_date:2009-12-01 00:00:00

  • Genome-wide linkage analysis using genetic variance components of alcohol dependency-associated censored and continuous traits.

    abstract::We used variance-components analysis to investigate the additive genetic effects regulating some of the phenotypes included in the GAW11 data set. Variance-components models were fitted using Gibbs sampling methods in BUGS v 0.6. Linkage analyses for both multivariate normal (MvN) traits and right censored survival ti...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.1370170748

    authors: Palmer LJ,Tiller KJ,Burton PR

    update_date:1999-01-01 00:00:00