Flexible variable selection for recovering sparsity in nonadditive nonparametric models.

Abstract:

Variable selection for recovering sparsity in nonadditive, nonparametric models with high-dimensional variables is challenging, and the difficulty is compounded by the need to model unknown interaction terms among those variables. There is currently no variable selection method that overcomes these limitations. In this article we propose a variable selection approach that connects a kernel machine with the nonparametric regression model. Our approach can: (i) recover sparsity; (ii) automatically model unknown and complicated interactions; (iii) connect with several existing approaches, including the linear nonnegative garrote and multiple kernel learning; and (iv) accommodate both additive and nonadditive nonparametric models. It can be viewed as a nonlinear version of the nonnegative garrote. We model the smoothing function with a least squares kernel machine (LSKM) and construct the nonnegative garrote objective function as a function of the sparse scale parameters of the kernel machine; these scale parameters measure the relevance of each input variable to the response, so their sparsity recovers the sparsity of the input variables. We also provide the asymptotic properties of our approach and show that sparsistency holds under certain conditions when the initial kernel function coefficients are consistent. An efficient coordinate descent/backfitting algorithm is developed. A resampling procedure is also proposed to improve the power of the variable selection.
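
The role of the scale parameters can be illustrated with a small sketch. It is not the authors' implementation: it assumes a Gaussian kernel with one nonnegative scale parameter per input variable, and the names `scaled_gaussian_kernel`, `gram_matrix`, and `scales` are hypothetical. It only shows why a scale parameter shrunk to zero removes the corresponding variable from the kernel, which is the mechanism the abstract describes for recovering sparsity.

```python
import math


def scaled_gaussian_kernel(x, z, scales):
    """Assumed kernel form: K(x, z) = exp(-sum_j scales[j] * (x[j] - z[j])**2)."""
    if any(s < 0 for s in scales):
        raise ValueError("scale parameters must be nonnegative")
    return math.exp(-sum(s * (a - b) ** 2 for s, a, b in zip(scales, x, z)))


def gram_matrix(X, scales):
    """Kernel (Gram) matrix over the rows of X."""
    return [[scaled_gaussian_kernel(xi, xk, scales) for xk in X] for xi in X]


# Toy data: three observations of three input variables.
X = [[0.0, 1.0, 5.0], [1.0, 1.0, -3.0], [0.5, 2.0, 0.0]]

# The third scale parameter is zero, so the third variable is deselected.
scales = [1.0, 0.5, 0.0]

# Replace the third variable with unrelated values in every row.
X_changed = [[row[0], row[1], 100.0 * i] for i, row in enumerate(X)]

K = gram_matrix(X, scales)
K_changed = gram_matrix(X_changed, scales)

# With a zero scale, the kernel matrix ignores the third variable entirely.
print(K == K_changed)

# With a positive scale on the third variable, the kernel matrix does change.
K_dense = gram_matrix(X, [1.0, 0.5, 1.0])
K_dense_changed = gram_matrix(X_changed, [1.0, 0.5, 1.0])
print(K_dense == K_dense_changed)
```

Because any fitted function built from this kernel depends on the inputs only through the kernel matrix, a zero scale parameter means the fitted function cannot depend on that variable. Penalizing the scale parameters toward zero therefore acts as variable selection without specifying interactions explicitly.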

journal_name

Biometrics

journal_title

Biometrics

authors

Fang Z,Kim I,Schaumont P

doi

10.1111/biom.12518

subject

Has Abstract

pub_date

2016-12-01 00:00:00

pages

1155-1163

issue

4

eissn

1541-0420

issn

0006-341X

journal_volume

72

pub_type

Journal Article
  • Inference for reaction networks using the linear noise approximation.

    abstract::We consider inference for the reaction rates in discretely observed networks such as those found in models for systems biology, population ecology, and epidemics. Most such networks are neither slow enough nor small enough for inference via the true state-dependent Markov jump process to be feasible. Typically, infere...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/biom.12152

    authors: Fearnhead P,Giagos V,Sherlock C

    update_date:2014-06-01 00:00:00

  • Impact of time to start treatment following infection with application to initiating HAART in HIV-positive patients.

    abstract::We estimate how the effect of antiretroviral treatment depends on the time from HIV-infection to initiation of treatment, using observational data. A major challenge in making inferences from such observational data arises from biases associated with the nonrandom assignment of treatment, for example bias induced by d...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/j.1541-0420.2011.01738.x

    authors: Lok JJ,DeGruttola V

    update_date:2012-09-01 00:00:00

  • Bayesian models for multivariate current status data with informative censoring.

    abstract::Multivariate current status data consist of indicators of whether each of several events occurs by the time of a single examination. Our interest focuses on inferences about the joint distribution of the event times. Conventional methods for analysis of multiple event-time data cannot be used because all of the event ...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/j.0006-341x.2002.00079.x

    authors: Dunson DB,Dinse GE

    update_date:2002-03-01 00:00:00

  • Nonparametric analysis of covariance by matching.

    abstract::The basic problem under consideration is the comparison of treatments with respect to a response Y when a covariable X is taken into account. Various methods involving matching may be regarded as compromises between the standard analysis of covariance and the standard analysis of independent matched pairs. First, ther...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:

    authors: Quade D

    update_date:1982-09-01 00:00:00

  • Functional multiple indicators, multiple causes measurement error models.

    abstract::Objective measures of oxygen consumption and carbon dioxide production by mammals are used to predict their energy expenditure. Since energy expenditure is not directly observable, it can be viewed as a latent construct with multiple physical indirect measures such as respiratory quotient, volumetric oxygen consumptio...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/biom.12706

    authors: Tekwe CD,Zoh RS,Bazer FW,Wu G,Carroll RJ

    update_date:2018-03-01 00:00:00

  • Optimal matching with minimal deviation from fine balance in a study of obesity and surgical outcomes.

    abstract::In multivariate matching, fine balance constrains the marginal distributions of a nominal variable in treated and matched control groups to be identical without constraining who is matched to whom. In this way, a fine balance constraint can balance a nominal variable with many levels while focusing efforts on other mo...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/j.1541-0420.2011.01691.x

    authors: Yang D,Small DS,Silber JH,Rosenbaum PR

    update_date:2012-06-01 00:00:00

  • Efficient analysis of Weibull survival data from experiments on heterogeneous patient populations.

    abstract::An efficient method is presented for analyses of death rates in one-way or cross-classified experiments where expected survival time for a patient at time of entry on trial is a function of observable covariates. The survival-time distribution used is a Weibull form of Cox's (1972) model. The analysis proceeds in two ...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:

    authors: Williams JS

    update_date:1978-06-01 00:00:00

  • On the treatment of grouped observations in life studies.

    abstract::Assuming a model of proportional failure rates, Cox (1972) presents a systematic study of the use of covariates in the analysis of life time. The treatment of tied observations is a particularly troublesome point in both theory and application. It appears that grouping rather than discrete time is the right way to han...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:

    authors: Thompson WA Jr

    update_date:1977-09-01 00:00:00

  • Valid inference in random effects meta-analysis.

    abstract::The standard approach to inference for random effects meta-analysis relies on approximating the null distribution of a test statistic by a standard normal distribution. This approximation is asymptotic on k, the number of studies, and can be substantially in error in medical meta-analyses, which often have only a few ...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/j.0006-341x.1999.00732.x

    authors: Follmann DA,Proschan MA

    update_date:1999-09-01 00:00:00

  • Case-control studies of gene-environment interaction: Bayesian design and analysis.

    abstract::With increasing frequency, epidemiologic studies are addressing hypotheses regarding gene-environment interaction. In many well-studied candidate genes and for standard dietary and behavioral epidemiologic exposures, there is often substantial prior information available that may be used to analyze current data as wel...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/j.1541-0420.2009.01357.x

    authors: Mukherjee B,Ahn J,Gruber SB,Ghosh M,Chatterjee N

    update_date:2010-09-01 00:00:00

  • Incorporating correlation for multivariate failure time data when cluster size is large.

    abstract::We propose a new estimation method for multivariate failure time data using the quadratic inference function (QIF) approach. The proposed method efficiently incorporates within-cluster correlations. Therefore, it is more efficient than those that ignore within-cluster correlation. Furthermore, the proposed method is e...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/j.1541-0420.2009.01307.x

    authors: Xue L,Wang L,Qu A

    update_date:2010-06-01 00:00:00

  • Estimating treatment effect in a proportional hazards model in randomized clinical trials with all-or-nothing compliance.

    abstract::We consider methods for estimating the treatment effect and/or the covariate by treatment interaction effect in a randomized clinical trial under noncompliance with time-to-event outcome. As in Cuzick et al. (2007), assuming that the patient population consists of three (possibly latent) subgroups based on treatment p...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/biom.12472

    authors: Li S,Gray RJ

    update_date:2016-09-01 00:00:00

  • Evaluating multiple diagnostic tests with partial verification.

    abstract::To evaluate diagnostic tests, one would ideally like to verify, for example, with a biopsy, the disease state of all subjects in a study. Often, however, not all subjects are verified. Previous methods for evaluation assume that the decision to verify depends only on recorded variables. Sometimes, particularly if the d...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:

    authors: Baker SG

    update_date:1995-03-01 00:00:00

  • A note on estimating crude odds ratios in case-control studies with differentially misclassified exposure.

    abstract::Morrissey and Spiegelman (1999, Biometrics 55, 338-344) provided a comparative study of adjustment methods for exposure misclassification in case-control studies equipped with an internal validation sample. In addition to the maximum likelihood (ML) approach, they considered two intuitive procedures based on proposals...

    journal_title:Biometrics

    pub_type: Comment, Journal Article

    doi:10.1111/j.0006-341x.2002.1034_1.x

    authors: Lyles RH

    update_date:2002-12-01 00:00:00

  • A Bayesian approach to modeling associations between pulsatile hormones.

    abstract::Many hormones are secreted in pulses. The pulsatile relationship between hormones regulates many biological processes. To understand endocrine system regulation, time series of hormone concentrations are collected. The goal is to characterize pulsatile patterns and associations between hormones. Currently each ...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/j.1541-0420.2008.01117.x

    authors: Carlson NE,Johnson TD,Brown MB

    update_date:2009-06-01 00:00:00

  • Regression dilution in the proportional hazards model.

    abstract::The problem of regression dilution arising from covariate measurement error is investigated for survival data using the proportional hazards model. The naive approach to parameter estimation is considered whereby observed covariate values are used, inappropriately, in the usual analysis instead of the underlying covar...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:

    authors: Hughes MD

    update_date:1993-12-01 00:00:00

  • Adjusted regression trend test for a multicenter clinical trial.

    abstract::Studies using a series of increasing doses of a compound, including a zero dose control, are often conducted to study the effect of the compound on the response of interest. For a one-way design, Tukey et al. (1985, Biometrics 41, 295-301) suggested assessing trend by examining the slopes of regression lines under ari...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/j.0006-341x.1999.00460.x

    authors: Quan H,Capizzi T

    update_date:1999-06-01 00:00:00

  • A comparison of several point estimators of the odds ratio in a single 2 x 2 contingency table.

    abstract::The relative performance of the unconditioned maximum likelihood estimators (UMLEs), conditional MLEs (CMLEs), and Jewell-type estimators of the odds ratio (OR) and its logarithm were investigated in sets of single 2 x 2 contingency tables. The tables were generated by complete enumeration of all possible cell frequen...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:

    authors: Walter SD,Cook RJ

    update_date:1991-09-01 00:00:00

  • Hypothesis testing under mixture models: application to genetic linkage analysis.

    abstract::In this paper we propose a new class of statistics to test a simple hypothesis against a family of alternatives characterized by a mixture model. Unlike the likelihood ratio statistic, whose large sample distribution is still unknown in this situation, these new statistics have a simple asymptotic distribution to whic...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/j.0006-341x.1999.00065.x

    authors: Liang KY,Rathouz PJ

    update_date:1999-03-01 00:00:00

  • Time-varying functional regression for predicting remaining lifetime distributions from longitudinal trajectories.

    abstract::A recurring objective in longitudinal studies on aging and longevity has been the investigation of the relationship between age-at-death and current values of a longitudinal covariate trajectory that quantifies reproductive or other behavioral activity. We propose a novel technique for predicting age-at-death distribu...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/j.1541-0420.2005.00378.x

    authors: Müller HG,Zhang Y

    update_date:2005-12-01 00:00:00

  • Statistical methods in ophthalmology: an adjusted chi-square approach.

    abstract::Ophthalmologic studies often compare several groups of subjects for the presence or absence of some ocular finding, where each subject may contribute two eyes to the analysis, the values from the two eyes being highly correlated. Rosner (1982, Biometrics 38, 105-114) and Dallal (1988, Biometrics 44, 253-257) proposed ...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:

    authors: Donner A

    update_date:1989-06-01 00:00:00

  • Modeling of time trends and interactions in vital rates using restricted regression splines.

    abstract::For the analysis of time trends in incidence and mortality rates, the age-period-cohort (apc) model has become a widely accepted method. The considered data are arranged in a two-way table by age group and calendar period, which are mostly subdivided into 5- or 10-year intervals. The disadvantage of this approach is t...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:

    authors: Heuer C

    update_date:1997-03-01 00:00:00

  • Covariate adjustment of event histories estimated from Markov chains: the additive approach.

    abstract::Markov chain models are frequently used for studying event histories that include transitions between several states. An empirical transition matrix for nonhomogeneous Markov chains has previously been developed, including a detailed statistical theory based on counting processes and martingales. In this article, we s...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/j.0006-341x.2001.00993.x

    authors: Aalen OO,Borgan O,Fekjaer H

    update_date:2001-12-01 00:00:00

  • Comparing the performances of Diggle's tests of spatial randomness for small samples with and without edge-effect correction: application to ecological data.

    abstract::Diggle's tests of spatial randomness based on empirical distributions of interpoint distances can be performed with and without edge-effect correction. We present here numerical results illustrating that tests without the edge-effect correction proposed by Diggle (1979, Biometrics 35, 87-101) have a higher power for s...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/j.0006-341x.1999.00156.x

    authors: Gignoux J,Duby C,Barot S

    update_date:1999-03-01 00:00:00

  • Testing for Hardy-Weinberg equilibrium.

    abstract::The class of admissible tests for Hardy-Weinberg equilibrium in a multi-allelic system is characterized. The standard goodness-of-fit chi-square test is shown to be admissible for systems of two or more alleles. The conditional probability distribution required to determine the exact significance level of this test i...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:

    authors: Ledwina T,Gnot S

    update_date:1980-03-01 00:00:00

  • Dynamic analysis of multivariate failure time data.

    abstract::We present an approach for analyzing internal dependencies in counting processes. This covers the case with repeated events on each of a number of individuals, and more generally, the situation where several processes are observed for each individual. We define dynamic covariates, i.e., covariates depending on the pas...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/j.0006-341X.2004.00227.x

    authors: Aalen OO,Fosen J,Weedon-Fekjaer H,Borgan O,Husebye E

    update_date:2004-09-01 00:00:00

  • Joint modeling of progression of HIV resistance mutations measured with uncertainty and failure time data.

    abstract::Development of HIV resistance mutations is a major cause for failure of antiretroviral treatment. This article proposes a method for jointly modeling the processes of viral genetic changes and treatment failure. Because the viral genome is measured with uncertainty, a hidden Markov model is used to fit the viral genet...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/j.1541-0420.2006.00635.x

    authors: Hu C,De Gruttola V

    update_date:2007-03-01 00:00:00

  • Semiparametric estimation of proportional mean residual life model in presence of censoring.

    abstract::A mean residual life function is the average remaining life of a surviving subject, as it varies with time. The proportional mean residual life model was proposed by Oakes and Dasu (1990, Biometrika 77, 409-410) in regression analysis to study its association with related covariates in absence of censoring. In this art...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/j.0006-341X.2005.030224.x

    authors: Chen YQ,Jewell NP,Lei X,Cheng SC

    update_date:2005-03-01 00:00:00

  • Ranked set sampling with unequal samples.

    abstract::A ranked set sampling procedure with unequal samples (RSSU) is proposed and used to estimate the population mean. This estimator is then compared with the estimators based on the ranked set sampling (RSS) and median ranked set sampling (MRSS) procedures. It is shown that the relative precisions of the estimator based ...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/j.0006-341x.2001.00957.x

    authors: Bhoj DS

    update_date:2001-09-01 00:00:00

  • Testing for cubic smoothing splines under dependent data.

    abstract::In most research on smoothing splines the focus has been on estimation, while inference, especially hypothesis testing, has received less attention. By defining design matrices for fixed and random effects and the structure of the covariance matrices of random errors in an appropriate way, the cubic smoothing spline a...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/j.1541-0420.2010.01537.x

    authors: Nummi T,Pan J,Siren T,Liu K

    update_date:2011-09-01 00:00:00