Abstract:
We propose a fast and efficient spatial-clustering algorithm for the association analysis of whole-genome sequencing (WGS) studies. Existing analysis approaches for WGS data define the tested regions by sliding or consecutive windows of fixed size along the variants. Grouping nearby variants into meaningful consecutive regions has two advantages: compared to sliding-window approaches, the number of tested regions is likely to be smaller, and compared to consecutive fixed-window approaches, nearby variants are more likely to be grouped together. Given existing biological evidence that disease-associated mutations tend to cluster physically in specific regions along the chromosome, identifying meaningful groups of nearby variants could lead to a power gain in association analysis. Our algorithm defines consecutive genomic regions from the physical positions of the variants, assuming an inhomogeneous Poisson process, and groups nearby variants together. Because parameters are estimated locally, the algorithm accounts for the varying variant density along the chromosome and provides a locally optimal partitioning of variants into consecutive regions. We discuss the theoretical advantages of our algorithm over existing window-based approaches and demonstrate its performance in a simulation study and in an application to Alzheimer's disease WGS data. Our analysis identifies a region in the ITGB3 gene that potentially harbors disease susceptibility loci for Alzheimer's disease. The region-based association signal of ITGB3 replicates in an independent data set and formally achieves genome-wide significance. Software Implementation: An implementation of the algorithm in R is available at: https://github.com/heidefier/cluster_wgs_data.
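The idea of partitioning variants by physical position with locally estimated parameters can be illustrated with a minimal sketch. This is not the authors' published algorithm or their R code. It assumes, purely for illustration, that gaps between neighbouring variants are locally exponential (as they would be under a locally constant Poisson rate) and that a gap which is improbably large relative to its neighbours marks a region boundary. The function name `partition_variants` and the parameters `neighborhood` and `alpha` are hypothetical.

```python
import math


def partition_variants(positions, neighborhood=5, alpha=0.05):
    """Split variant positions into consecutive regions (illustrative only).

    A gap opens a new region when, under an exponential model whose mean is
    estimated from up to `neighborhood` gaps on each side, a gap at least
    this large has probability below `alpha`.
    """
    if not positions:
        return []
    pos = sorted(positions)
    gaps = [b - a for a, b in zip(pos, pos[1:])]
    regions = [[pos[0]]]
    for i, gap in enumerate(gaps):
        lo = max(0, i - neighborhood)
        hi = min(len(gaps), i + neighborhood + 1)
        # Local estimate: neighbouring gaps, excluding the gap under test.
        local = gaps[lo:i] + gaps[i + 1:hi]
        mean_gap = sum(local) / len(local) if local else 0.0
        # Tail probability P(G >= gap) for an exponential with this mean.
        tail = math.exp(-gap / mean_gap) if mean_gap > 0 else 1.0
        if tail < alpha:
            regions.append([])  # improbably large gap: start a new region
        regions[-1].append(pos[i + 1])
    return regions


if __name__ == "__main__":
    # Two dense groups of variants separated by a large gap.
    example = [100, 110, 120, 130, 5000, 5010, 5020, 5030]
    for region in partition_variants(example):
        print(region[0], region[-1], len(region))
```

Because the mean gap is re-estimated around every variant, a gap that would be ordinary in a sparse stretch of the chromosome can still split a dense one, which is the practical meaning of accounting for varying variant density. A plain local mean is sensitive to neighbouring outliers, so this sketch should be read as a starting point only.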
journal_name: Genet Epidemiol
journal_title: Genetic epidemiology
authors: Loehlein Fier H, Prokopenko D, Hecker J, Cho MH, Silverman EK, Weiss ST, Tanzi RE, Lange C
doi: 10.1002/gepi.22040
subject: Has Abstract
pub_date: 2017-05-01 00:00:00
pages: 332-340
issue: 4
eissn: 0741-0395
issn: 1098-2272
journal_volume: 41
pub_type: Journal Article
abstract::A major locus influencing apolipoprotein AI (apo AI) serum levels was detected using data from the Donner Laboratory Family Study. This locus accounts for 46% of the phenotypic variability in apo AI levels. Multivariate segregation analysis revealed that this major locus also has significant pleiotropic effects on the...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.1370100648
update_date:1993-01-01 00:00:00
abstract::We have analyzed allele frequency distribution at the hypervariable locus 3' to the apolipoprotein B gene in a healthy population sample (241 women and 246 men) from the Belgrade area. The bimodal distribution of sixteen different hypervariable region (HVR) alleles and the heterozygosity index (average 0.76) in both s...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/(SICI)1098-2272(1998)15:2<113::AID-GEPI1>3
update_date:1998-01-01 00:00:00
abstract::Four relative-pair methods for detecting genetic linkage were applied to familial Alzheimer's disease data. Results obtained using an extended Haseman-Elston test and a weighted rank pairwise correlation test, which both use information from all relative pairs, were consistent with previously published likelihood resu...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.1370100608
update_date:1993-01-01 00:00:00
abstract::The linkage between electronic health records (EHRs) and genotype data makes it plausible to study the genetic susceptibility of a wide range of disease phenotypes. Despite that EHR-derived phenotype data are subjected to misclassification, it has been shown useful for discovering susceptible genes, particularly in th...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.22080
update_date:2017-12-01 00:00:00
abstract::Different maximum likelihood approaches were used to explore the role of candidate genes in the variability of quantitative trait Q1 while accounting for the effects of age, Q2, and Q3. Segregation analysis, under the class D regressive model, provides evidence for a Mendelian gene effect on the adjusted trait Q1. Res...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.1370120643
update_date:1995-01-01 00:00:00
abstract::In diseases with a complex mode of inheritance, families with multiple affected individuals are difficult to ascertain. The haplotype sharing statistic (HSS) uses (hidden) co-ancestry between affected individuals from a founder population. These affected individuals will likely not only share the same mutation(s), but...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/(SICI)1098-2272(1997)14:6<915::AID-GEPI59>
update_date:1997-01-01 00:00:00
abstract::It is generally accepted that cancer is caused by environmental and inherited factors but these are only partially identified. Family studies can be informative but they do not separate shared lifestyles and genes. We estimate familial risks for concordant cancers between spouses in common cancers of both sexes in ord...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/1098-2272(200102)20:2<247::AID-GEPI7>3.0.C
update_date:2001-02-01 00:00:00
abstract::Variance component linkage analysis is commonly used to map quantitative trait loci (QTLs) in general pedigrees. Large pedigrees are especially attractive for these studies because they provide greater power per genotyped individual than small pedigrees. We propose accurate and computationally efficient methods to cal...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.20160
update_date:2006-09-01 00:00:00
abstract::Several approaches were taken to identify the loci contributing to the quantitative and qualitative phenotypes in the Genetic Analysis Workshop 12 simulated data set. To identify possible quantitative trait loci (QTL), the quantitative traits were analyzed using SOLAR. The four replicates identified as the "best repli...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.2001.21.s1.s732
update_date:2001-01-01 00:00:00
abstract::Large-scale meta-analyses of genome-wide association scans (GWAS) have been successful in discovering common risk variants with modest and small effects. The detection of lower frequency signals will undoubtedly require concerted efforts of at least similar scale. We investigate the sample size-dictated power limits o...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.20627
update_date:2011-12-01 00:00:00
abstract::In this paper we investigate the power to identify gene x gene interactions in genome-wide association studies. In our analysis we focus on two-stage analyses: analyses in which we only test for interactions between single nucleotide polymorphisms that show some marginal effect. We give two algorithms to compute signi...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.20300
update_date:2008-04-01 00:00:00
abstract::The etiology of complex traits likely involves the effects of genetic and environmental factors, along with complicated interaction effects between them. Consequently, there has been interest in applying genetic association tests of complex traits that account for potential modification of the genetic effect in the pr...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.21901
update_date:2015-07-01 00:00:00
abstract::Increased adiposity has repeatedly been identified as a major risk factor for a variety of chronic diseases. However, the question still remains whether the amount of adipose tissue itself is genetically mediated. To address this question, a segregation analysis, using maximum likelihood techniques as implemented in t...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.1370120505
update_date:1995-01-01 00:00:00
abstract::In an earlier paper, positive but nonsignificant lod scores were found in pair-wise linkage tests between multiple endocrine neoplasia type 2A (MEN-2A) and both the haptoglobin (HP) locus on chromosome 16 and group-specific component (GC) locus on chromosome 4. Recently discovered restriction fragment length polymorph...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.1370030306
update_date:1986-01-01 00:00:00
abstract::Using a recently developed semiparametric method for combined linkage/linkage-disequilibrium analysis, we analyzed the Collaborative Study on the Genetics of Alcoholism data subset developed for Genetic Analysis Workshop 11 (GAW11). This semiparametric approach estimates recombination fractions for linkage, marker log...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.1370170708
update_date:1999-01-01 00:00:00
abstract::The transmission disequilibrium test (TDT), originally developed for mapping disease genes, has recently been extended to identify quantitative trait loci (QTL). For quantitative traits important for human health, generally multiple QTLs are involved. In the investigation of the statistical properties of the TDT, back...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.1032
update_date:2001-11-01 00:00:00
abstract::Simulation studies were conducted to assess to what extent the conclusions of segregation analysis, performed under the unified model, can be affected by the presence of unmeasured environmental factors shared by family members. Dichotomous data were generated on six-member nuclear families under two variants of the m...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.1370060140
update_date:1989-01-01 00:00:00
abstract::Grade of membership analysis (GoM) may have particular relevance for genetic epidemiology. The method can flexibly relate genetic markers, clinical features, and environmental exposures to possible subtypes of disease termed pure types even when population allele frequencies and penetrance functions are not known. Hen...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.1370100628
update_date:1993-01-01 00:00:00
abstract::The asymptotic distribution of [MOD] scores under the null hypothesis of no linkage is only known for affected sib pairs and other types of affected relative pairs. We have extended the GENEHUNTER-MODSCORE program to allow for simulations under the null hypothesis of no linkage to determine the empirical significance ...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.20264
update_date:2008-01-01 00:00:00
abstract::Power estimations are important for optimizing genotype-phenotype association study designs. However, existing frameworks are designed for common disorders, and thus ill-suited for the inherent challenges of studies for low-prevalence conditions such as rare diseases and infrequent adverse drug reactions. These challe...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.22129
update_date:2018-07-01 00:00:00
abstract::Contributions to Group 17 of the Genetic Analysis Workshop 15 considered dense markers in linkage disequilibrium (LD) in the context of either linkage or association analysis. Three contributions reported on methods for modeling LD or selecting a subset of markers in linkage equilibrium to perform linkage analysis. Wh...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.20291
update_date:2007-01-01 00:00:00
abstract::There has been a great interest and a few successes in the identification of complex disease susceptibility genes in recent years. Association studies, where a large number of single-nucleotide polymorphisms (SNPs) are typed in a sample of cases and controls to determine which genes are associated with a specific dise...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.20041
update_date:2005-02-01 00:00:00
abstract::The potential of genome-wide association analysis can only be realized when they have power to detect signals despite the detrimental effect of multiple testing on power. We develop a weighted multiple testing procedure that facilitates the input of prior information in the form of groupings of tests. For each group a...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.20237
update_date:2007-11-01 00:00:00
abstract::We have analyzed the GAW10 data from several studies of bipolar affective disorder (BPAD) using the software packages SimIBD and SIMWALK2. SimIBD implements a simulation-based affected-pedigree-member (APM) statistic, called SimAPM, as well as an APM-like statistic, also called SimIBD, that measures identical-by-desce...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/(SICI)1098-2272(1997)14:6<605::AID-GEPI9>3
update_date:1997-01-01 00:00:00
abstract::Mantel statistics provide an additional step to standard approaches in the analysis of gene expression and covariate data, allow the calculation of standard statistics such as correlation, partial correlation, and regression coefficients, and, with permutation tests, provide P values for these statistics to relate the...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.1115
update_date:2002-06-01 00:00:00
abstract::Several statistical tests for linkage between a disease susceptibility locus and a marker locus for sib-pair data are examined analytically. Two common statistics, a test based on the mean number of marker alleles shared identical by descent by sib-pairs, and a test based on the proportion of sib-pairs sharing exactly...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.1370070506
update_date:1990-01-01 00:00:00
abstract::Association mapping for complex diseases using unrelated individuals can be more powerful than family-based analysis in many settings. In addition, this approach has major practical advantages, including greater efficiency in sample recruitment. Association mapping may lead to false-positive findings, however, if popu...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.210
update_date:2002-08-01 00:00:00
abstract::With new technologies, multiple types of genomic data are commonly collected on a single set of samples. However, standard analysis methods concentrate on a single data type at a time and ignore the relationships between genes, proteins, and biochemical reactions that give rise to complex phenotypes. In this paper, we...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.21628
update_date:2012-05-01 00:00:00
abstract::Jewish women have been reported to have a higher risk for familial breast cancer than non-Jewish women and to be more likely to carry mutations in breast cancer genes such as BRCA1. Because BRCA1 mutations also increase women's risk for ovarian cancer, we asked whether Jewish women are at higher risk for familial ovar...
journal_title:Genetic epidemiology
pub_type: Clinical Trial, Journal Article, Randomized Controlled Trial
doi:10.1002/(SICI)1098-2272(1998)15:1<51::AID-GEPI4>3.
update_date:1998-01-01 00:00:00
abstract::A genome-wide correlation analysis and cluster analysis were utilized to determine chromosomal regions that had similar nonparametric linkage scores across families in order to locate interacting susceptibility loci for asthma. Conditional analysis was performed to detect any increase in lod score over baseline. Eight...
journal_title:Genetic epidemiology
pub_type: Journal Article
doi:10.1002/gepi.2001.21.s1.s266
update_date:2001-01-01 00:00:00