Identification of grouped rare and common variants via penalized logistic regression.

Abstract:

In spite of the success of genome-wide association studies in finding many common variants associated with disease, these variants seem to explain only a small proportion of the estimated heritability. Data collection has turned toward exome and whole genome sequencing, but it is well known that single marker methods frequently used for common variants have low power to detect rare variants associated with disease, even with very large sample sizes. In response, a variety of methods have been developed that attempt to cluster rare variants so that they may gather strength from one another under the premise that there may be multiple causal variants within a gene. Most of these methods group variants by gene or proximity, and test one gene or marker window at a time. We propose a penalized regression method (PeRC) that analyzes all genes at once, allowing grouping of all (rare and common) variants within a gene, along with subgrouping of the rare variants, thus borrowing strength from both rare and common variants within the same gene. The method can incorporate either a burden-based weighting of the rare variants or one in which the weights are data driven. In simulations, our method performs favorably when compared to many previously proposed approaches, including its predecessor, the sparse group lasso [Friedman et al., 2010].

journal_name

Genet Epidemiol

journal_title

Genetic epidemiology

authors

Ayers KL,Cordell HJ

doi

10.1002/gepi.21746

subject

Has Abstract

pub_date

2013-09-01 00:00:00

pages

592-602

issue

6

eissn

1098-2272

issn

0741-0395

journal_volume

37

pub_type

Journal Article
  • Direct genetic effects and their estimation from matched case-control data.

    abstract::In genetic association studies, a single marker is often associated with multiple, correlated phenotypes (e.g., obesity and cardiovascular disease, or nicotine dependence and lung cancer). A pervasive question is then whether that marker exerts independent effects on all phenotypes. In this paper, we address this ques...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.21660

    authors: Berzuini C,Vansteelandt S,Foco L,Pastorino R,Bernardinelli L

    updated:2012-09-01 00:00:00

  • Hierarchical Bayesian model for rare variant association analysis integrating genotype uncertainty in human sequence data.

    abstract::Next-generation sequencing (NGS) has led to the study of rare genetic variants, which possibly explain the missing heritability for complex diseases. Most existing methods for rare variant (RV) association detection do not account for the common presence of sequencing errors in NGS data. The errors can largely affect ...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.21871

    authors: He L,Pitkäniemi J,Sarin AP,Salomaa V,Sillanpää MJ,Ripatti S

    updated:2015-02-01 00:00:00

  • Tag SNPs chosen from HapMap perform well in several population isolates.

    abstract::Population isolates may be particularly useful for association studies of complex traits. This utility, however, largely depends on the transferability of tag SNPs chosen from reference samples, such as HapMap, to samples from such populations. Factors that characterize population isolates, such as widespread genetic ...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.20201

    authors: Service S,International Collaborative Group on Isolated Populations.,Sabatti C,Freimer N

    updated:2007-04-01 00:00:00

  • Pooling data and linkage analysis in the chromosome 5q candidate region for asthma.

    abstract::We investigated a variety of methods for pooling data from eight data sets (n = 5,424 subjects) to validate evidence for linkage of markers in the cytokine cluster on chromosome 5q31-33 to asthma and asthma-associated phenotypes. Chromosome 5 markers were integrated into current genetic linkage and physical maps, and ...

    journal_title:Genetic epidemiology

    pub_type: Journal Article, Meta-Analysis

    doi:10.1002/gepi.2001.21.s1.s103

    authors: Jacobs KB,Burton PR,Iyengar SK,Elston RC,Palmer LJ

    updated:2001-01-01 00:00:00

  • To type or not to type: the use of unaffected siblings in nonparametric linkage analysis.

    abstract::Unaffected individuals are often disregarded in nonparametric linkage analysis. Because of the presumed high complexity of genetic interactions and the resulting low penetrance of any single genetic effect, the statistical contribution of unaffected sib pairs is thought to be considerably lower than that of the affect...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.2001.21.s1.s522

    authors: Majewski J

    updated:2001-01-01 00:00:00

  • Optimized selection of unrelated subjects for whole-genome sequencing studies of rare high-penetrance alleles.

    abstract::Sequencing studies using whole-genome or exome scans are still more expensive than genome-wide association studies on a per-subject basis. As a result, only a subset of subjects from a larger study will be selected for sequencing. To perform an agnostic investigation of the entire genome, subjects may be selected that...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.21641

    authors: Edwards TL,Li C

    updated:2012-07-01 00:00:00

  • Ordered multinomial regression for genetic association analysis of ordinal phenotypes at Biobank scale.

    abstract::Logistic regression is the primary analysis tool for binary traits in genome-wide association studies (GWAS). Multinomial regression extends logistic regression to multiple categories. However, many phenotypes more naturally take ordered, discrete values. Examples include (a) subtypes defined from multiple sources of ...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.22276

    authors: German CA,Sinsheimer JS,Klimentidis YC,Zhou H,Zhou JJ

    updated:2020-04-01 00:00:00

  • Linkage analysis of candidate obesity genes among the Mexican-American population of Starr County, Texas.

    abstract::Recent advances in the molecular basis of body fat regulation have identified several genes in which genetic variation may influence obesity and related measures in human populations. Genes that have been shown to have a regulatory function in the control of body fat utilization, eating behavior, and/or metabolic rate...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/(SICI)1098-2272(1999)16:4<397::AID-GEPI6>3

    authors: Bray MS,Boerwinkle E,Hanis CL

    updated:1999-01-01 00:00:00

  • Genome-wide linkage analysis using genetic variance components of alcohol dependency-associated censored and continuous traits.

    abstract::We used variance-components analysis to investigate the additive genetic effects regulating some of the phenotypes included in the GAW11 data set. Variance-components models were fitted using Gibbs sampling methods in BUGS v 0.6. Linkage analyses for both multivariate normal (MvN) traits and right censored survival ti...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.1370170748

    authors: Palmer LJ,Tiller KJ,Burton PR

    updated:1999-01-01 00:00:00

  • Optimizing the power of genome-wide association studies by using publicly available reference samples to expand the control group.

    abstract::Genome-wide association (GWA) studies have proved extremely successful in identifying novel genetic loci contributing effects to complex human diseases. In doing so, they have highlighted the fact that many potential loci of modest effect remain undetected, partly due to the need for samples consisting of many thousan...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.20482

    authors: Zhuang JJ,Zondervan K,Nyberg F,Harbron C,Jawaid A,Cardon LR,Barratt BJ,Morris AP

    updated:2010-05-01 00:00:00

  • Power of non-parametric linkage analysis in mapping genes contributing to human longevity in long-lived sib-pairs.

    abstract::This report investigates the power issue in applying the non-parametric linkage analysis of affected sib-pairs (ASP) [Kruglyak and Lander, 1995: Am J Hum Genet 57:439-454] to localize genes that contribute to human longevity using long-lived sib-pairs. Data were simulated by introducing a recently developed statistica...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.10304

    authors: Tan Q,Zhao JH,Iachine I,Hjelmborg J,Vach W,Vaupel JW,Christensen K,Kruse TA

    updated:2004-04-01 00:00:00

  • SimPEL: Simulation-based power estimation for sequencing studies of low-prevalence conditions.

    abstract::Power estimations are important for optimizing genotype-phenotype association study designs. However, existing frameworks are designed for common disorders, and thus ill-suited for the inherent challenges of studies for low-prevalence conditions such as rare diseases and infrequent adverse drug reactions. These challe...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.22129

    authors: Mak L,Li M,Cao C,Gordon P,Tarailo-Graovac M,Bousman C,Wang P,Long Q

    updated:2018-07-01 00:00:00

  • Major genetic effects on airway-parenchymal dysanapsis of the lung: the Humboldt family study.

    abstract::We examined familial resemblance and performed segregation analysis for the maximal expiratory flow rate at 50% of vital capacity (Vmax50) and the ratio of Vmax50 to forced vital capacity (FVC), based on data from 309 nuclear families with 1,045 individuals in the town of Humboldt, Saskatchewan, in 1993. Vmax50 is con...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/(SICI)1098-2272(1999)16:1<95::AID-GEPI8>3.

    authors: Chen Y,Dosman JA,Rennie DC,Lockinger LA

    updated:1999-01-01 00:00:00

  • Affected relative pairs and simultaneous search for two-locus linkage in the presence of epistasis.

    abstract::It is commonly believed that multiple interacting genes increase the susceptibility of genetically complex diseases, yet few linkage analyses of human diseases scan for more than one locus at a time. To overcome some of the statistical and computational limitations of a simultaneous search for two disease susceptibili...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.20223

    authors: Schaid DJ,McDonnell SK,Carlson EE,Thibodeau SN,Ostrander EA,Stanford JL

    updated:2007-07-01 00:00:00

  • Detecting interactions between gene, site, and environmental variables using GAP.

    abstract::Regressive models that incorporate measured variables and assumed genetic parameters were used to detect interactions between gene, research site, and environmental variables in GAW11 Problem 2. Replicates 1 to 5 were used in the analyses. Significant three-way gene x environment x site interactions were seen for all ...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.13701707118

    authors: Shin J,Corey M

    updated:1999-01-01 00:00:00

  • Allelic association patterns for a dense SNP map.

    abstract::A dense set of 5,000 SNPs on a 10-Mb region of human chromosome 20 has been typed on samples of African Americans, East Asians, and United Kingdom Caucasians. There are departures from Hardy-Weinberg equilibrium beyond the level at which markers are often discarded because of possible genotyping errors. The observatio...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.20038

    authors: Weir BS,Hill WG,Cardon LR,SNP Consortium.

    updated:2004-12-01 00:00:00

  • Efficient strategy for detecting gene × gene joint action and its application in schizophrenia.

    abstract::We propose a new approach to detect gene × gene joint action in genome-wide association studies (GWASs) for case-control designs. This approach offers an exhaustive search for all two-way joint action (including, as a special case, single gene action) that is computationally feasible at the genome-wide level and has r...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.21779

    authors: Won S,Kwon MS,Mattheisen M,Park S,Park C,Kihara D,Cichon S,Ophoff R,Nöthen MM,Rietschel M,Baur M,Uitterlinden AG,Hofmann A,GROUP Investigators.,Lange C

    updated:2014-01-01 00:00:00

  • Pleiotropy and principal components of heritability combine to increase power for association analysis.

    abstract::When many correlated traits are measured the potential exists to discover the coordinated control of these traits via genotyped polymorphisms. A common statistical approach to this problem involves assessing the relationship between each phenotype and each single nucleotide polymorphism (SNP) individually (PHN); and t...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.20257

    authors: Klei L,Luca D,Devlin B,Roeder K

    updated:2008-01-01 00:00:00

  • Identifying genetic interactions in genome-wide data using Bayesian networks.

    abstract::It is believed that interactions among genes (epistasis) may play an important role in susceptibility to common diseases (Moore and Williams [2002]. Ann Med 34:88-95; Ritchie et al. [2001]. Am J Hum Genet 69:138-147). To study the underlying genetic variants of diseases, genome-wide association studies (GWAS) that sim...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.20514

    authors: Jiang X,Barmada MM,Visweswaran S

    updated:2010-09-01 00:00:00

  • POLARIS: Polygenic LD-adjusted risk score approach for set-based analysis of GWAS data.

    abstract::Polygenic risk scores (PRSs) are a method to summarize the additive trait variance captured by a set of SNPs, and can increase the power of set-based analyses by leveraging public genome-wide association study (GWAS) datasets. PRS aims to assess the genetic liability to some phenotype on the basis of polygenic risk fo...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.22117

    authors: Baker E,Schmidt KM,Sims R,O'Donovan MC,Williams J,Holmans P,Escott-Price V,Consortium WTG

    updated:2018-06-01 00:00:00

  • Replication of genetic associations as pseudoreplication due to shared genealogy.

    abstract::The genotypes of individuals in replicate genetic association studies have some level of correlation due to shared descent in the complete pedigree of all living humans. As a result of this genealogical sharing, replicate studies that search for genotype-phenotype associations using linkage disequilibrium between mark...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.20400

    authors: Rosenberg NA,Vanliere JM

    updated:2009-09-01 00:00:00

  • Commingling analysis of memory performance in elderly men.

    abstract::Smalley et al. [(1992) Genet Epidemiol 9:333-345] found evidence of a mixture of two distributions in memory performance among offspring of patients with dementia of the Alzheimer type (DAT), suggesting that these groups reflect genotypic subgroups of carriers and non-carriers of a putative DAT gene. One prediction of...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.1370110506

    authors: Palmer CG,Wolkenstein BH,La Rue A,Swan GE,Smalley SL

    updated:1994-01-01 00:00:00

  • A general autosomal/X-linked model.

    abstract::This paper describes a general genetic model which encompasses both autosomal and X-linked inheritance as submodels. It allows one to test for X-linked inheritance of a trait by comparing the likelihood of X-linked inheritance to the likelihood of the general genetic model. The general model is formulated as two loci,...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.1370010105

    authors: Hasstedt SJ,Skolnick M

    updated:1984-01-01 00:00:00

  • Rare-variant association tests in longitudinal studies, with an application to the Multi-Ethnic Study of Atherosclerosis (MESA).

    abstract::Over the past few years, an increasing number of studies have identified rare variants that contribute to trait heritability. Due to the extreme rarity of some individual variants, gene-based association tests have been proposed to aggregate the genetic variants within a gene, pathway, or specific genomic region as op...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.22081

    authors: He Z,Lee S,Zhang M,Smith JA,Guo X,Palmas W,Kardia SLR,Ionita-Laza I,Mukherjee B

    updated:2017-12-01 00:00:00

  • Logistic transmission modeling for the simulated data of GAW10 problem 2.

    abstract::A recently developed nonparametric method is a generalization of the transmission disequilibrium test across all alleles of a locus. This approach has been applied to Problem 2 of GAW10 and has been extended to explore the combined contribution of neighboring loci for chromosomes 1, 5, and 8. When applied to the chrom...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/(SICI)1098-2272(1997)14:6<857::AID-GEPI49>

    authors: Neas BR,Moser KL,Harley JB

    updated:1997-01-01 00:00:00

  • A sliding-window weighted linkage disequilibrium test.

    abstract::Multilocus linkage disequilibrium (LD) tests that consider inter-marker (LD) are more powerful than single-locus tests when disease etiology is contributed simultaneously by several linked and correlated loci. However, inclusion of redundant non-informative markers may result in reduced testing power and/or inflated f...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.20165

    authors: Yang HC,Lin CY,Fann CS

    updated:2006-09-01 00:00:00

  • Immunoglobulin allotyping (Gm, Km) of GAW5 families.

    abstract::The following Gm and Km immunoglobulin allotypes were determined on the Genetic Analysis Workshop 5 insulin-dependent diabetes mellitus (GAW5 IDDM) families: G1m (1,2,3,17), G2m (23), G3m (5,10,11,13,14,21,28) and Km (1,3). Since the allotype G2m (23) has been rarely studied, due to paucity of typing reagents, it was ...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.1370060108

    authors: Field LL,Dugoujon JM

    updated:1989-01-01 00:00:00

  • BRCA1 polymorphisms and breast cancer epidemiology in the Western New York exposures and breast cancer (WEB) study.

    abstract::Results of studies for the association of BRCA1 genotypes and haplotypes with sporadic breast cancer have been inconsistent. Therefore, a candidate single nucleotide polymorphism (SNP) approach was used in a breast cancer case-control study to explore genotypes and haplotypes that have the potential to affect protein ...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.21730

    authors: Ricks-Santi LJ,Nie J,Marian C,Ochs-Balcom HM,Trevisan M,Edge SB,Kanaan Y,Freudenheim JL,Shields PG

    updated:2013-07-01 00:00:00

  • The inheritance of pyloric stenosis explained by a multifactorial threshold model with sex dimorphism for liability.

    abstract::The inheritance of pyloric stenosis is explained by a multifactorial threshold model with an underlying assumption that the liability for the disease is distributed in males and females showing a sex dimorphism. From the available data on familial occurrences of pyloric stenosis, it is shown, that an extra maternal ef...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/gepi.1370030102

    authors: Chakraborty R

    updated:1986-01-01 00:00:00

  • Bayesian linkage and segregation analysis: factoring the problem.

    abstract::Complex segregation analysis and linkage methods are mathematical techniques for the genetic dissection of complex diseases. They are used to delineate complex modes of familial transmission and to localize putative disease susceptibility loci to specific chromosomal locations. The computational problem of Bayesian li...

    journal_title:Genetic epidemiology

    pub_type: Journal Article

    doi:10.1002/1098-2272(2000)19:1+<::AID-GEPI8>3.0.CO;2-

    authors: Matthysse S

    updated:2000-01-01 00:00:00