Abstract:
Complex diseases are presumed to result from interactions among several genes and environmental factors, with each gene having only a small effect on the disease. Methods that account for gene-gene interactions, search for a set of marker loci in different genes or across the genome, and analyze these loci jointly are therefore critical. In this article, we propose an ensemble learning approach (ELA) to detect a set of loci whose main and interaction effects jointly have a significant association with the trait. In the ELA, we first search for "base learners" and then combine the effects of the base learners by a linear model. Each base learner represents a main effect or an interaction effect. The result of the ELA is easy to interpret. Applying the ELA to a data set yields a final model, an overall P-value of the association test between the set of loci involved in the final model and the trait, and an importance measure for each base learner and each marker involved in the final model. The final model is a linear combination of some base learners, and it is known which base learners represent main effects and which represent interaction effects. The importance measure gives the relative importance of each base learner or marker in the final model. We used extensive simulation studies as well as a real data set to evaluate the performance of the ELA. Our simulation studies demonstrated that the ELA is more powerful than the single-marker test in all the simulation scenarios. The ELA also outperformed three other existing multi-locus methods in almost all cases.
In an application to a large-scale case-control study for Type 2 diabetes, the ELA identified 11 single nucleotide polymorphisms that have a significant multi-locus effect (P-value=0.01), while none of the single nucleotide polymorphisms showed significant marginal effects and none of the two-locus combinations showed significant two-locus interaction effects.
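The two-stage idea described in the abstract (search for base learners, then combine them by a linear model) can be illustrated with a minimal sketch. This is not the authors' algorithm: the greedy forward search, the R^2 score, the drop-in-R^2 importance measure, the permutation test, the simulated genotypes, and all function names below are assumptions made for illustration only.

```python
import itertools
import numpy as np

def candidate_learners(G):
    """Candidate base learners: one main effect per marker, one product per marker pair."""
    m = G.shape[1]
    learners = {(j,): G[:, j].astype(float) for j in range(m)}
    for j, k in itertools.combinations(range(m), 2):
        learners[(j, k)] = (G[:, j] * G[:, k]).astype(float)
    return learners

def r_squared(columns, y):
    """R^2 of an ordinary least-squares fit of y on the given columns plus an intercept."""
    A = np.column_stack([np.ones(len(y))] + list(columns))
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    centered = y - y.mean()
    return 1.0 - (resid @ resid) / (centered @ centered)

def fit_ensemble(G, y, n_learners=3):
    """Greedy forward search (an assumed strategy) followed by a linear combination."""
    learners = candidate_learners(G)
    chosen = []
    for _ in range(n_learners):
        def score(key):
            return r_squared([learners[c] for c in chosen + [key]], y)
        chosen.append(max((k for k in learners if k not in chosen), key=score))
    return chosen, r_squared([learners[c] for c in chosen], y)

def importance(G, y, chosen):
    """Assumed importance measure: drop in R^2 when one base learner is removed."""
    learners = candidate_learners(G)
    full = r_squared([learners[c] for c in chosen], y)
    return {c: full - r_squared([learners[d] for d in chosen if d != c], y)
            for c in chosen}

def permutation_p_value(G, y, n_learners=3, n_perm=99, seed=0):
    """Overall P-value: the whole search is repeated on each permuted trait vector."""
    rng = np.random.default_rng(seed)
    observed = fit_ensemble(G, y, n_learners)[1]
    null = [fit_ensemble(G, rng.permutation(y), n_learners)[1] for _ in range(n_perm)]
    return (1 + sum(s >= observed for s in null)) / (1 + n_perm)

# Simulated data (hypothetical): 300 subjects, 6 markers coded 0/1/2,
# with a binary trait driven by an interaction between markers 0 and 1.
rng = np.random.default_rng(1)
G = rng.integers(0, 3, size=(300, 6))
logit = -1.5 + 0.8 * G[:, 0] * G[:, 1]
y = (rng.random(300) < 1.0 / (1.0 + np.exp(-logit))).astype(float)

chosen, r2 = fit_ensemble(G, y)
imp = importance(G, y, chosen)
p = permutation_p_value(G, y)
print("base learners:", chosen)
print("importance:", imp)
print("overall P-value:", p)
```

In this sketch a key such as `(0,)` denotes a main effect and `(0, 1)` an interaction effect, so the final model can be read off directly. Repeating the search inside every permutation keeps the overall P-value honest about the selection step.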
journal_name: Genet Epidemiol
journal_title: Genetic epidemiology
authors: Zhang Z, Zhang S, Wong MY, Wareham NJ, Sha Q
doi: 10.1002/gepi.20304
subject: Has Abstract
pub_date: 2008-05-01 00:00:00
pages: 285-300
issue: 4
eissn: 1098-2272
issn: 0741-0395
journal_volume: 32
pub_type: Journal article
abstract::The purpose of this work is the development of linear trend tests that allow for error (LTT ae), specifically incorporating double-sampling information on phenotypes and/or genotypes. We use a likelihood framework. Misclassification errors are estimated via double sampling. Unbiased estimates of penetrances and genoty...
journal_title:Genetic epidemiology
pub_type: Journal article
doi:10.1002/gepi.20246
update_date:2007-12-01 00:00:00
abstract::The study of the genetic component of early-onset diseases requires investigation into parental genetic effects, particularly those mediated by the mother who can influence the offspring's risk of disease through the effects of her genes acting directly on the intrauterine milieu or indirectly through maternal-gene ch...
journal_title:Genetic epidemiology
pub_type: Journal article
doi:10.1002/gepi.20602
update_date:2011-09-01 00:00:00
abstract::Smoking has been observed to affect plasma sex hormones and body mass index. The relationship between smoking, body mass index, and plasma concentration of sex hormones was studied in normal adult male twins. The analyses were performed for between 150 and 159 twin pairs for whom hormonal data were available on both t...
journal_title:Genetic epidemiology
pub_type: Journal article
doi:10.1002/gepi.1370060303
update_date:1989-01-01 00:00:00
abstract::Epigenome-wide association studies (EWAS) are designed to characterise population-level epigenetic differences across the genome and link them to disease. Most commonly, they assess DNA-methylation status at cytosine-guanine dinucleotide (CpG) sites, using platforms such as the Illumina 450k array that profile a subse...
journal_title:Genetic epidemiology
pub_type: Journal article
doi:10.1002/gepi.22086
update_date:2018-02-01 00:00:00
abstract::This report investigates the power issue in applying the non-parametric linkage analysis of affected sib-pairs (ASP) [Kruglyak and Lander, 1995: Am J Hum Genet 57:439-454] to localize genes that contribute to human longevity using long-lived sib-pairs. Data were simulated by introducing a recently developed statistica...
journal_title:Genetic epidemiology
pub_type: Journal article
doi:10.1002/gepi.10304
update_date:2004-04-01 00:00:00
abstract::Understanding the genetic background of complex diseases and disorders plays an essential role in the promising precision medicine. The evaluation of candidate genes, however, requires time-consuming and expensive experiments given a large number of possibilities. Thus, computational methods have seen increasing appli...
journal_title:Genetic epidemiology
pub_type: Journal article
doi:10.1002/gepi.22282
update_date:2020-06-01 00:00:00
abstract::We examined familial resemblance and performed segregation analysis for the maximal expiratory flow rate at 50% of vital capacity (Vmax50) and the ratio of Vmax50 to forced vital capacity (FVC), based on data from 309 nuclear families with 1,045 individuals in the town of Humboldt, Saskatchewan, in 1993. Vmax50 is con...
journal_title:Genetic epidemiology
pub_type: Journal article
doi:10.1002/(SICI)1098-2272(1999)16:1<95::AID-GEPI8>3.
update_date:1999-01-01 00:00:00
abstract::Meta-analyses of genetic association studies are usually performed using a single polymorphism at a time, even though in many cases the individual studies report results from partially overlapping sets of polymorphisms. We present here a multipoint (or multilocus) method for multivariate meta-analysis of published pop...
journal_title:Genetic epidemiology
pub_type: Journal article
doi:10.1002/gepi.20531
update_date:2010-11-01 00:00:00
abstract::Monte Carlo methods for linkage and segregation analysis are applied to the HGAR1 pedigree. To address these data, the methods are extended in several ways. The results are compared with those provided by PAP. ...
journal_title:Genetic epidemiology
pub_type: Journal article
doi:10.1002/gepi.1370100658
update_date:1993-01-01 00:00:00
abstract::Testing for association between two random vectors is a common and important task in many fields, however, existing tests, such as Escoufier's RV test, are suitable only for low-dimensional data, not for high-dimensional data. In moderate to high dimensions, it is necessary to consider sparse signals, which are often ...
journal_title:Genetic epidemiology
pub_type: Journal article
doi:10.1002/gepi.22059
update_date:2017-11-01 00:00:00
abstract::A genetic epidemiologic investigation of breast cancer involving 389 breast cancer pedigrees including information on 14,721 individuals from the Icelandic population-based cancer registry is presented. Probands were women born in or after 1920 and reported to have breast cancer in the cancer registry. The average age...
journal_title:Genetic epidemiology
pub_type: Journal article
doi:10.1002/(SICI)1098-2272(200001)18:1<81::AID-GEPI6>
update_date:2000-01-01 00:00:00
abstract::We develop a Bayesian multi-SNP Markov chain Monte Carlo approach that allows published functional significance scores to objectively inform single nucleotide polymorphism (SNP) prior effect sizes in expression quantitative trait locus (eQTL) studies. We developed the Normal Gamma prior to allow the inclusion of funct...
journal_title:Genetic epidemiology
pub_type: Journal article
doi:10.1002/gepi.21961
update_date:2016-05-01 00:00:00
abstract::Variance component linkage analysis is commonly used to map quantitative trait loci (QTLs) in general pedigrees. Large pedigrees are especially attractive for these studies because they provide greater power per genotyped individual than small pedigrees. We propose accurate and computationally efficient methods to cal...
journal_title:Genetic epidemiology
pub_type: Journal article
doi:10.1002/gepi.20160
update_date:2006-09-01 00:00:00
abstract::Rheumatoid arthritis is an inflammatory disease for which positive associations have been described with some HLA-DRB1 alleles. The associated alleles share a similar amino acid sequence in the third hypervariable region, the shared epitope, but differ at position 71 and 86. It has been suggested that HLA susceptibili...
journal_title:Genetic epidemiology
pub_type: Journal article
doi:10.1002/1098-2272(200012)19:4<422::AID-GEPI12>3.0.
update_date:2000-12-01 00:00:00
abstract::In the analysis of gene expression data, dimension reduction techniques have been extensively adopted. The most popular one is perhaps the PCA (principal component analysis). To generate more reliable and more interpretable results, the SPCA (sparse PCA) technique has been developed. With the "small sample size, high ...
journal_title:Genetic epidemiology
pub_type: Journal article
doi:10.1002/gepi.22089
update_date:2017-12-01 00:00:00
abstract::Asthma and atopy are two closely related, common complex traits in which a number of genetic and environmental factors are suspected to play a role. We have performed parametric and nonparametric multi-marker linkage analysis for the Busselton data set, which is part of problem 1 of Genetic Analysis Workshop 12. In pa...
journal_title:Genetic epidemiology
pub_type: Journal article
doi:10.1002/gepi.2001.21.s1.s204
update_date:2001-01-01 00:00:00
abstract::Copper incorporation studies were performed on individuals from 58 pedigrees, comprising 140 sibships. As previously reported, there is considerable overlap between heterozygotes and normal homozygotes. Segregation analysis supports recessive inheritance of disease, with residual heritability for 64Cu uptake in cultur...
journal_title:Genetic epidemiology
pub_type: Journal article
doi:10.1002/gepi.1370030403
update_date:1986-01-01 00:00:00
abstract::The present report summarizes findings on 670 cases of autosomal trisomy diagnosed in Scotland, with actual or expected dates of delivery in 1990 to 1994 inclusive. Cases were notified by cytogenetic service laboratories. There were 277 prenatal and 369 postnatal diagnoses and 24 spontaneous losses. Excluding the latt...
journal_title:Genetic epidemiology
pub_type: Journal article
doi:10.1002/(SICI)1098-2272(1999)16:2<179::AID-GEPI5>3
update_date:1999-01-01 00:00:00
abstract::Over the past few years at least 13 transmission/disequilibrium test (TDT)-based tests have been developed for quantitative (Q) traits for the assessment of association or linkage in the presence of the other. A total of six of these QTDT methods were used to analyze log10IgE in the Collaborative Study on the Genetics...
journal_title:Genetic epidemiology
pub_type: Journal article
doi:10.1002/gepi.2001.21.s1.s312
update_date:2001-01-01 00:00:00
abstract::Power estimations are important for optimizing genotype-phenotype association study designs. However, existing frameworks are designed for common disorders, and thus ill-suited for the inherent challenges of studies for low-prevalence conditions such as rare diseases and infrequent adverse drug reactions. These challe...
journal_title:Genetic epidemiology
pub_type: Journal article
doi:10.1002/gepi.22129
update_date:2018-07-01 00:00:00
abstract::Gene-gene interaction is believed to play an important role in understanding complex traits. Multifactor dimensionality reduction (MDR) was proposed by Ritchie et al. [2001. Am J Hum Genet 69:138-147] to identify multiple loci that simultaneously affect disease susceptibility. Although the MDR method has been widely u...
journal_title:Genetic epidemiology
pub_type: Journal article
doi:10.1002/gepi.20416
update_date:2009-11-01 00:00:00
abstract::Several statistical tests for linkage between a disease susceptibility locus and a marker locus for sib-pair data are examined analytically. Two common statistics, a test based on the mean number of marker alleles shared identical by descent by sib-pairs, and a test based on the proportion of sib-pairs sharing exactly...
journal_title:Genetic epidemiology
pub_type: Journal article
doi:10.1002/gepi.1370070506
update_date:1990-01-01 00:00:00
abstract::Systemic lupus erythematosus (SLE) is a complex disease which is partly determined by genetic factors which influence susceptibility to the disease phenotype. In this association study we try to define the high risk haplotypes which are responsible for this disease, together with other environmental factors. In many o...
journal_title:Genetic epidemiology
pub_type: Journal article
doi:10.1002/gepi.1370080607
update_date:1991-01-01 00:00:00
abstract::In this study, we compare the statistical properties of a number of methods for estimating P-values for allele-sharing statistics in non-parametric linkage analysis. Some of the methods are based on the normality assumption, using different variance estimation methods, and others use simulation (gene-dropping) to find...
journal_title:Genetic epidemiology
pub_type: Journal article
doi:10.1002/gepi.20177
update_date:2006-12-01 00:00:00
abstract::Path analysis of family data has been widely applied to resolve genetic and environmental patterns of familial resemblance. A prevalent statistical approach in path analysis has been, first, to estimate the familial correlations and, second, by assuming these estimates to be independently distributed, define a likelih...
journal_title:Genetic epidemiology
pub_type: Journal article
doi:10.1002/gepi.1370010305
update_date:1984-01-01 00:00:00
abstract::Our aim was to develop a simple method for testing gene-environment interaction in twin data ascertained through affected twins (probands), with known exposure status of both cotwins. To this end we derived formulae for two epidemiologic measures, as a function of prevalence of an exposure and genotype, and disease ri...
journal_title:Genetic epidemiology
pub_type: Journal article
doi:10.1002/gepi.1370110108
update_date:1994-01-01 00:00:00
abstract::Several approaches were taken to identify the loci contributing to the quantitative and qualitative phenotypes in the Genetic Analysis Workshop 12 simulated data set. To identify possible quantitative trait loci (QTL), the quantitative traits were analyzed using SOLAR. The four replicates identified as the "best repli...
journal_title:Genetic epidemiology
pub_type: Journal article
doi:10.1002/gepi.2001.21.s1.s732
update_date:2001-01-01 00:00:00
abstract::Using a recently developed semiparametric method for combined linkage/linkage-disequilibrium analysis, we analyzed the Collaborative Study on the Genetics of Alcoholism data subset developed for Genetic Analysis Workshop 11 (GAW11). This semiparametric approach estimates recombination fractions for linkage, marker log...
journal_title:Genetic epidemiology
pub_type: Journal article
doi:10.1002/gepi.1370170708
update_date:1999-01-01 00:00:00
abstract::Autosomal recessive spastic ataxia of Charlevoix-Saguenay (ARSACS) is a disorder that has an elevated frequency in Saguenay-Lac-St-Jean (SLSJ) and Charlevoix, two geographically isolated regions in the past of northeastern Quebec. The incidence at birth and the carrier rate in SLSJ were estimated at 1/1,932 liveborn i...
journal_title:Genetic epidemiology
pub_type: Journal article
doi:10.1002/gepi.1370100103
update_date:1993-01-01 00:00:00
abstract::The linkage between electronic health records (EHRs) and genotype data makes it plausible to study the genetic susceptibility of a wide range of disease phenotypes. Despite that EHR-derived phenotype data are subjected to misclassification, it has been shown useful for discovering susceptible genes, particularly in th...
journal_title:Genetic epidemiology
pub_type: Journal article
doi:10.1002/gepi.22080
update_date:2017-12-01 00:00:00