Abstract:
Multifactor dimensionality reduction (MDR) was developed as a nonparametric and model-free data mining method for detecting, characterizing, and interpreting epistasis in the absence of significant main effects in genetic and epidemiologic studies of complex traits such as disease susceptibility. The goal of MDR is to change the representation of the data using a constructive induction algorithm so that nonadditive interactions are easier to detect with any classification method, such as naïve Bayes or logistic regression. Traditionally, MDR-constructed variables have been evaluated with a naïve Bayes classifier combined with 10-fold cross-validation to estimate the predictive accuracy, or generalizability, of epistasis models, and we have used permutation testing to evaluate the statistical significance of the models obtained through MDR. The advantage of permutation testing is that it controls for false positives due to multiple testing. The disadvantage is that it is computationally expensive, which becomes an important issue when detecting epistasis on a genome-wide scale. The goal of the present study was to develop and evaluate several alternatives to large-scale permutation testing for assessing the statistical significance of MDR models. Using data simulated from 70 different epistasis models, we compared the power and type I error rate of MDR using a 1,000-fold permutation test with those of hypothesis testing using an extreme value distribution (EVD). We find that this new hypothesis testing method provides a reasonable alternative to the computationally expensive 1,000-fold permutation test and is 50 times faster. We then demonstrate the new method by applying it to a genetic epidemiology study of bladder cancer susceptibility that was previously analyzed using MDR and assessed using a 1,000-fold permutation test.
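The contrast the abstract draws between a full permutation test and an EVD-based test can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy "model search" (best balanced accuracy over binary features) stands in for MDR, the null is fitted with a Gumbel (type I extreme value) distribution by the method of moments, and the small permutation count of 20 is an arbitrary choice. All function names are hypothetical.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def balanced_accuracy(feature, labels):
    """Balanced accuracy of a binary feature used as a classifier (either orientation)."""
    pos = sum(labels)
    neg = len(labels) - pos
    tp = sum(1 for f, y in zip(feature, labels) if f == 1 and y == 1)
    tn = sum(1 for f, y in zip(feature, labels) if f == 0 and y == 0)
    acc = 0.5 * (tp / pos + tn / neg)
    return max(acc, 1.0 - acc)


def best_model_score(features, labels):
    """Toy stand-in for a model search: the best score over all candidate features."""
    return max(balanced_accuracy(f, labels) for f in features)


def null_scores(features, labels, n_perm, rng):
    """Best score under the null, obtained by shuffling the case/control labels."""
    shuffled = list(labels)
    scores = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        scores.append(best_model_score(features, shuffled))
    return scores


def empirical_p(observed, null):
    """Permutation p-value with the usual add-one correction."""
    return (1 + sum(s >= observed for s in null)) / (1 + len(null))


def gumbel_fit(null):
    """Method-of-moments fit of a Gumbel (type I extreme value) distribution."""
    beta = statistics.stdev(null) * math.sqrt(6) / math.pi
    mu = statistics.mean(null) - EULER_GAMMA * beta
    return mu, beta


def gumbel_p(observed, mu, beta):
    """Upper-tail probability P(X >= observed) under Gumbel(mu, beta)."""
    if beta == 0:  # degenerate null: all permuted scores identical
        return 1.0 if observed <= mu else 0.0
    z = (observed - mu) / beta
    return -math.expm1(-math.exp(-z))  # equals 1 - exp(-exp(-z))


def demo(seed=0, n=200, n_noise=30, n_perm_small=20):
    """Simulate one informative feature among noise and compare the two p-values."""
    rng = random.Random(seed)
    labels = [1] * (n // 2) + [0] * (n // 2)
    # Informative feature: the label with roughly 10% of entries flipped.
    signal = [y if rng.random() > 0.1 else 1 - y for y in labels]
    noise = [[rng.randint(0, 1) for _ in range(n)] for _ in range(n_noise)]
    features = [signal] + noise
    observed = best_model_score(features, labels)
    null = null_scores(features, labels, n_perm_small, rng)
    mu, beta = gumbel_fit(null)
    return observed, empirical_p(observed, null), gumbel_p(observed, mu, beta)
```

Calling `demo()` returns the observed best score, the empirical permutation p-value, and the fitted-tail p-value. With only 20 permutations the empirical p-value cannot fall below 1/21, whereas the fitted distribution can extrapolate into the tail; this is the kind of saving a parametric null can offer, under the assumption that the fitted family describes the null well.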
journal_name: Genet Epidemiol
journal_title: Genetic epidemiology
authors: Pattin KA, White BC, Barney N, Gui J, Nelson HH, Kelsey KT, Andrew AS, Karagas MR, Moore JH
doi: 10.1002/gepi.20360
subject: Has Abstract
pub_date: 2009-01-01 00:00:00
pages: 87-94
issue: 1
eissn: 0741-0395
issn: 1098-2272
journal_volume: 33
pub_type: Journal Article
abstract::A method is described for estimating excess relative risks of a disease from familial factors. Beginning with population-based series of cases and controls, a cohort of each subject's relatives is formed and checked for disease against a population-based registry. The disease experience of the cohort formed from each ...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.1370120306
update_date: 1995-01-01 00:00:00
abstract::Twin pairs are sometimes included in studies because at least one of them is a proband, and conventionally the analysis of the data is based on the conditional distribution of the co-twin given the proband. In the case of more than one proband in each pair, an often-used "ad hoc" method of analysis is to allow each tw...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.10253
update_date: 2003-11-01 00:00:00
abstract::Sub-Saharan Africa has been identified as the part of the world with the greatest human genetic diversity. This high level of diversity causes difficulties for genome-wide association (GWA) studies in African populations, for example by reducing the accuracy of genotype imputation in African populations compared to no...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.20626
update_date: 2011-12-01 00:00:00
abstract::Testing for association between two random vectors is a common and important task in many fields; however, existing tests, such as Escoufier's RV test, are suitable only for low-dimensional data, not for high-dimensional data. In moderate to high dimensions, it is necessary to consider sparse signals, which are often ...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.22059
update_date: 2017-11-01 00:00:00
abstract::Advances in high throughput technology have enabled the generation of unprecedented amounts of genomic data (e.g., next-generation sequence data, transcriptomics, metabolomics, and proteomics), which promises to unravel the genetic architecture of complex traits. These discoveries may lead to novel therapeutic targets...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.21768
update_date: 2013-12-01 00:00:00
abstract::Logistic regression is the primary analysis tool for binary traits in genome-wide association studies (GWAS). Multinomial regression extends logistic regression to multiple categories. However, many phenotypes more naturally take ordered, discrete values. Examples include (a) subtypes defined from multiple sources of ...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.22276
update_date: 2020-04-01 00:00:00
abstract::This paper summarizes the analyses by participants in the insulin-dependent diabetes mellitus (IDDM) component of Genetic Analysis Workshop 5 (GAW5). The data were obtained from 94 families with two or more IDDM sibs. Topics treated in the Workshop analysis included the following: methods for detecting associations an...
journal_title:Genetic epidemiology
pub_type: Journal Article, Review
doi:10.1002/gepi.1370060111
update_date: 1989-01-01 00:00:00
abstract::We determined pairwise linkage disequilibria between 12 restriction fragment length polymorphism (RFLP) markers at or near the low-density lipoprotein receptor (LDLR) locus on chromosome 19p13.2-13.1 in 92 unrelated individuals. Of these 12 RFLPs, two were newly identified under a cosmid-based strategy designed to scr...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.1370070114
update_date: 1990-01-01 00:00:00
abstract::Jewish women have been reported to have a higher risk for familial breast cancer than non-Jewish women and to be more likely to carry mutations in breast cancer genes such as BRCA1. Because BRCA1 mutations also increase women's risk for ovarian cancer, we asked whether Jewish women are at higher risk for familial ovar...
journal_title:Genetic epidemiology
pub_type: Clinical Trial, Journal Article, Randomized Controlled Trial
doi:10.1002/(SICI)1098-2272(1998)15:1<51::AID-GEPI4>3.
update_date: 1998-01-01 00:00:00
abstract::Intended to resolve the problem of constructing a matched population-based control sample, haplotype relative risk techniques frequently suffer from loss of power for late-onset diseases due to unavailability of parental genotypes that are required to form parent-offspring pairs. However, much of this missing informat...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/(SICI)1098-2272(1998)15:5<471::AID-GEPI3>3
update_date: 1998-01-01 00:00:00
abstract::Methods to account for population structure (PS) in genome-wide association studies have been well developed in samples of unrelated individuals, but when a sample is composed of families, the task of finding and accounting for PS is not as straightforward. Family-based tests that condition on parental genotypes or t...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.20590
update_date: 2011-09-01 00:00:00
abstract::The potential of genome-wide association analysis can only be realized when it has power to detect signals despite the detrimental effect of multiple testing on power. We develop a weighted multiple testing procedure that facilitates the input of prior information in the form of groupings of tests. For each group a...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.20237
update_date: 2007-11-01 00:00:00
abstract::Genetic epidemiology is a relatively new discipline that seeks to unravel the role of genetic factors and their interactions with environmental factors in the etiology of diseases, using population and family study approaches. To characterize the overall direction and emphasis of research strategies used in this field...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.1370100505
update_date: 1993-01-01 00:00:00
abstract::We propose methods to construct meiotic gene maps while controlling the probability of a decision-error. First, a single step gene ordering procedure is presented whose decision-error probability is bounded above by a prespecified threshold. The bound for the error probability is valid under quite general circumstance...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/(SICI)1098-2272(1999)16:3<274::AID-GEPI4>3
update_date: 1999-01-01 00:00:00
abstract::The etiology of complex traits likely involves the effects of genetic and environmental factors, along with complicated interaction effects between them. Consequently, there has been interest in applying genetic association tests of complex traits that account for potential modification of the genetic effect in the pr...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.21901
update_date: 2015-07-01 00:00:00
abstract::Meta-analyses of genetic association studies are usually performed using a single polymorphism at a time, even though in many cases the individual studies report results from partially overlapping sets of polymorphisms. We present here a multipoint (or multilocus) method for multivariate meta-analysis of published pop...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.20531
update_date: 2010-11-01 00:00:00
abstract::In this study, we compare the statistical properties of a number of methods for estimating P-values for allele-sharing statistics in non-parametric linkage analysis. Some of the methods are based on the normality assumption, using different variance estimation methods, and others use simulation (gene-dropping) to find...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.20177
update_date: 2006-12-01 00:00:00
abstract::We have conducted a study of renal sodium and potassium reabsorption in 205 pairs of twins on freely chosen diets; 89 of the subjects were studied on more than one occasion. Renal tubular sodium and potassium handling, as measured by the fractional excretions FENa and FEK, show repeatable differences between individua...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.1370020103
update_date: 1985-01-01 00:00:00
abstract::To explain the association between HLA-DRB1 gene and rheumatoid arthritis (RA), two main hypotheses have been proposed. The first, the shared epitope hypothesis, assumes a direct role of DRB1 in RA susceptibility. The second hypothesis assumes a recessive disease susceptibility gene in linkage disequilibrium with DRB1...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/(SICI)1098-2272(1998)15:4<419::AID-GEPI7>3
update_date: 1998-01-01 00:00:00
abstract::This study is an investigation of the relationship between apolipoprotein E (apoE) phenotype, arterial disease, and mortality in a group of women (n = 1,751) aged 65 years and older enrolled in the Study of Osteoporotic Fractures. Crude mortality rates were highest among women with the 4-3 and 4-4 phenotypes but age-a...
journal_title:Genetic epidemiology
pub_type: Clinical Trial, Journal Article, Multicenter Study
doi:10.1002/(SICI)1098-2272(1997)14:2<147::AID-GEPI4>3
update_date: 1997-01-01 00:00:00
abstract::In genetic association studies, a single marker is often associated with multiple, correlated phenotypes (e.g., obesity and cardiovascular disease, or nicotine dependence and lung cancer). A pervasive question is then whether that marker exerts independent effects on all phenotypes. In this paper, we address this ques...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.21660
update_date: 2012-09-01 00:00:00
abstract::Human apolipoprotein A-IV (APO A-IV) exhibits a common protein polymorphism detectable by isoelectric focusing (IEF) due to a single base substitution at codon 360 which replaces the frequently occurring glutamine residue (allele 1) with histidine (allele 2). Recently, sequence analysis of the APO A-IV coding region h...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.1370090503
update_date: 1992-01-01 00:00:00
abstract::Copper incorporation studies were performed on individuals from 58 pedigrees, comprising 140 sibships. As previously reported, there is considerable overlap between heterozygotes and normal homozygotes. Segregation analysis supports recessive inheritance of disease, with residual heritability for 64Cu uptake in cultur...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.1370030403
update_date: 1986-01-01 00:00:00
abstract::The aim of this paper is to generalize permutation methods for multiple testing adjustment of significant partial regression coefficients in a linear regression model used for microarray data. Using a permutation method outlined by Anderson and Legendre [1999] and the permutation P-value adjustment from Simon et al. [...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.20255
update_date: 2008-01-01 00:00:00
abstract::Linkage disequilibrium mapping of quantitative traits is a powerful method for dissecting the genetic etiology of complex phenotypes. Quantitative traits, however, often exhibit characteristics that make their use problematic. For example, the distribution of the trait may be censored, highly skewed, or contaminated w...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.20141
update_date: 2006-04-01 00:00:00
abstract::We have conducted a simulation study in small pedigrees to investigate the power to detect linkage and heterogeneity for a disorder due to either one of two independent disease loci. We have considered a highly polymorphic marker locus (PIC = 70%) linked to one disease locus and unlinked to the second. The power to de...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.1370070306
update_date: 1990-01-01 00:00:00
abstract::Recently, Liang et al. ([2001b] Genet. Epidemiol. 21:105-122) proposed a conditional approach to assess linkage evidence on the target region by incorporating linkage information from an unlinked (reference) region using allele shared IBD (identity-by-descent) from affected sib pairs. This is carried out by conditionin...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.10305
update_date: 2004-02-01 00:00:00
abstract::Contributions to Group 17 of the Genetic Analysis Workshop 15 considered dense markers in linkage disequilibrium (LD) in the context of either linkage or association analysis. Three contributions reported on methods for modeling LD or selecting a subset of markers in linkage equilibrium to perform linkage analysis. Wh...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.20291
update_date: 2007-01-01 00:00:00
abstract::We investigated the independent contributions of a candidate gene and an environmental factor, and the presence of gene x environment (G x E) interaction, in the etiology of a disease in the Genetic Analysis Workshop (GAW) 12 problem 2 simulated data using a two-stage approach utilizing both case-control and case-pare...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.2001.21.s1.s843
update_date: 2001-01-01 00:00:00
abstract::Complex segregation analysis and linkage methods are mathematical techniques for the genetic dissection of complex diseases. They are used to delineate complex modes of familial transmission and to localize putative disease susceptibility loci to specific chromosomal locations. The computational problem of Bayesian li...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/1098-2272(2000)19:1+<::AID-GEPI8>3.0.CO;2-
update_date: 2000-01-01 00:00:00