Variable selection and prediction using a nested, matched case-control study: Application to hospital acquired pneumonia in stroke patients.

Abstract:

Matched case-control designs are commonly used in epidemiologic studies for increased efficiency. These designs have recently been introduced to the setting of modern imaging and genomic studies, which are characterized by high-dimensional covariates. However, appropriate statistical analyses that adjust for the matching have not been widely adopted. A matched case-control study of 430 acute ischemic stroke patients was conducted at Massachusetts General Hospital (MGH) in order to identify specific brain regions of acute infarction that are associated with hospital acquired pneumonia (HAP) in these patients. There are 138 brain regions in which infarction was measured, which introduce nearly 10,000 two-way interactions, and challenge the statistical analysis. We investigate penalized conditional and unconditional logistic regression approaches to this variable selection problem that properly differentiate between selection of main effects and of interactions, and that acknowledge the matching. This neuroimaging study was nested within a larger prospective study of HAP in 1915 stroke patients at MGH, which recorded clinical variables, but did not include neuroimaging. We demonstrate how the larger study, in conjunction with the nested, matched study, affords us the capability to derive a score for prediction of HAP in future stroke patients based on imaging and clinical features. We evaluate the proposed methods in simulation studies and we apply them to the MGH HAP study.
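The "nearly 10,000 two-way interactions" figure in the abstract can be checked with a short count. The sketch below assumes that every unordered pair of the 138 brain regions contributes one interaction term; the function name `count_model_terms` is hypothetical and is not taken from the paper.

```python
from math import comb


def count_model_terms(n_regions: int) -> dict:
    """Count candidate terms when each region is a covariate.

    Assumption (not stated in the abstract): one interaction term
    per unordered pair of distinct regions.
    """
    main_effects = n_regions
    two_way = comb(n_regions, 2)  # n * (n - 1) / 2 unordered pairs
    return {
        "main_effects": main_effects,
        "two_way_interactions": two_way,
        "total": main_effects + two_way,
    }


terms = count_model_terms(138)
print(terms)
```

Under that assumption the count is 9,453 interactions, which matches the abstract's "nearly 10,000" and far exceeds the 430 patients in the matched study.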

journal_name

Biometrics

journal_title

Biometrics

authors

Qian J,Payabvash S,Kemmling A,Lev MH,Schwamm LH,Betensky RA

doi

10.1111/biom.12113

subject

Has Abstract

pub_date

2014-03-01 00:00:00

pages

153-63

issue

1

eissn

1541-0420

issn

0006-341X

journal_volume

70

pub_type

Journal Article
  • Growth curve models of repeated binary response.

    abstract: Experimental designs that include repeated measures of binary response variables over time and under different conditions are common in biology. In such settings, it is often desirable to characterize the response pattern over time. When response variables are continuous, this characterization can be made in terms of ...

    journal_title: Biometrics

    pub_type: Journal Article

    doi:

    authors: Stanek EJ 3rd,Diehl SR

    update_date: 1988-12-01 00:00:00

  • Semiparametric estimation of proportional mean residual life model in presence of censoring.

    abstract: A mean residual life function is the average remaining life of a surviving subject, as it varies with time. The proportional mean residual life model was proposed by Oakes and Dasu (1990, Biometrika 77, 409-410) in regression analysis to study its association with related covariates in absence of censoring. In this art...

    journal_title: Biometrics

    pub_type: Journal Article

    doi: 10.1111/j.0006-341X.2005.030224.x

    authors: Chen YQ,Jewell NP,Lei X,Cheng SC

    update_date: 2005-03-01 00:00:00

  • A comparison of methods for estimating the causal effect of a treatment in randomized clinical trials subject to noncompliance.

    abstract: We consider the analysis of clinical trials that involve randomization to an active treatment (T = 1) or a control treatment (T = 0), when the active treatment is subject to all-or-nothing compliance. We compare three approaches to estimating treatment efficacy in this situation: as-treated analysis, per-protoc...

    journal_title: Biometrics

    pub_type: Journal Article

    doi: 10.1111/j.1541-0420.2008.01066.x

    authors: Little RJ,Long Q,Lin X

    update_date: 2009-06-01 00:00:00

  • Multiclass linear discriminant analysis with ultrahigh-dimensional features.

    abstract: Within the framework of Fisher's discriminant analysis, we propose a multiclass classification method which embeds variable screening for ultrahigh-dimensional predictors. Leveraging interfeature correlations, we show that the proposed linear classifier recovers informative features with probability tending to one and...

    journal_title: Biometrics

    pub_type: Journal Article

    doi: 10.1111/biom.13065

    authors: Li Y,Hong HG,Li Y

    update_date: 2019-12-01 00:00:00

  • Nonparametric discrete survival function estimation with uncertain endpoints using an internal validation subsample.

    abstract: When a true survival endpoint cannot be assessed for some subjects, an alternative endpoint that measures the true endpoint with error may be collected, which often occurs when obtaining the true endpoint is too invasive or costly. We develop an estimated likelihood function for the situation where we have both uncert...

    journal_title: Biometrics

    pub_type: Journal Article

    doi: 10.1111/biom.12316

    authors: Zee J,Xie SX

    update_date: 2015-09-01 00:00:00

  • Sequential model selection-based segmentation to detect DNA copy number variation.

    abstract: Array-based CGH experiments are designed to detect genomic aberrations or regions of DNA copy-number variation that are associated with an outcome, typically a state of disease. Most of the existing statistical methods target on detecting DNA copy number variations in a single sample or array. We focus on the detectio...

    journal_title: Biometrics

    pub_type: Journal Article

    doi: 10.1111/biom.12478

    authors: Hu J,Zhang L,Wang HJ

    update_date: 2016-09-01 00:00:00

  • A two-stage experimental design for dilution assays.

    abstract: Dilution assays to determine solute concentration have found wide use in biomedical research. Many dilution assays return imprecise concentration estimates because they are only done to orders of magnitude. Previous statistical work has focused on how to design efficient experiments that can return more precise estima...

    journal_title: Biometrics

    pub_type: Journal Article

    doi: 10.1111/biom.13032

    authors: Ferguson JM,Miura TA,Miller CR

    update_date: 2019-09-01 00:00:00

  • Dynamic models for estimating the effect of HAART on CD4 in observational studies: Application to the Aquitaine Cohort and the Swiss HIV Cohort Study.

    abstract: Highly active antiretroviral therapy (HAART) has proved efficient in increasing CD4 counts in many randomized clinical trials. Because randomized trials have some limitations (e.g., short duration, highly selected subjects), it is interesting to assess the effect of treatments using observational studies. This is chal...

    journal_title: Biometrics

    pub_type: Journal Article

    doi: 10.1111/biom.12564

    authors: Prague M,Commenges D,Gran JM,Ledergerber B,Young J,Furrer H,Thiébaut R

    update_date: 2017-03-01 00:00:00

  • Capture-recapture when time and behavioral response affect capture probabilities.

    abstract: We consider a capture-recapture model in which capture probabilities vary with time and with behavioral response. Two inference procedures are developed under the assumption that recapture probabilities bear a constant relationship to initial capture probabilities. These two procedures are the maximum likelihood metho...

    journal_title: Biometrics

    pub_type: Journal Article

    doi: 10.1111/j.0006-341x.2000.00427.x

    authors: Chao A,Chu W,Hsu CH

    update_date: 2000-06-01 00:00:00

  • Bayesian model-averaged benchmark dose analysis via reparameterized quantal-response models.

    abstract: An important objective in biomedical and environmental risk assessment is estimation of minimum exposure levels that induce a pre-specified adverse response in a target population. The exposure points in such settings are typically referred to as benchmark doses (BMDs). Parametric Bayesian estimation for finding BMDs ...

    journal_title: Biometrics

    pub_type: Journal Article

    doi: 10.1111/biom.12340

    authors: Fang Q,Piegorsch WW,Simmons SJ,Li X,Chen C,Wang Y

    update_date: 2015-12-01 00:00:00

  • Bayesian inference for two-phase studies with categorical covariates.

    abstract: In this article, we consider two-phase sampling in the situation in which all covariates are categorical. Two-phase designs are appealing from an efficiency perspective since they allow sampling to be concentrated in informative cells. A number of likelihood-based methods have been developed for the analysis of two-ph...

    journal_title: Biometrics

    pub_type: Journal Article

    doi: 10.1111/biom.12019

    authors: Ross M,Wakefield J

    update_date: 2013-06-01 00:00:00

  • A general model for the analysis of mark-resight, mark-recapture, and band-recovery data under tag loss.

    abstract: Estimates of waterfowl demographic parameters often come from resighting studies where birds fit with individually identifiable neck collars are resighted at a distance. Concerns have been raised about the effects of collar loss on parameter estimates, and the reliability of extrapolating from collared individuals to ...

    journal_title: Biometrics

    pub_type: Journal Article

    doi: 10.1111/j.0006-341X.2004.00245.x

    authors: Conn PB,Kendall WL,Samuel MD

    update_date: 2004-12-01 00:00:00

  • Latent Ornstein-Uhlenbeck models for Bayesian analysis of multivariate longitudinal categorical responses.

    abstract: We propose a Bayesian latent Ornstein-Uhlenbeck (OU) model to analyze unbalanced longitudinal data of binary and ordinal variables, which are manifestations of fewer continuous latent variables. We focus on the evolution of such latent variables when they continuously change over time. Existing approaches are limited ...

    journal_title: Biometrics

    pub_type: Journal Article

    doi: 10.1111/biom.13292

    authors: Tran TD,Lesaffre E,Verbeke G,Duyck J

    update_date: 2020-05-11 00:00:00

  • Statistical analysis of unlabeled point sets: comparing molecules in chemoinformatics.

    abstract: We consider Bayesian methodology for comparing two or more unlabeled point sets. Application of the technique to a set of steroid molecules illustrates its potential utility involving the comparison of molecules in chemoinformatics and bioinformatics. We initially match a pair of molecules, where one molecule is regar...

    journal_title: Biometrics

    pub_type: Journal Article

    doi: 10.1111/j.1541-0420.2006.00622.x

    authors: Dryden IL,Hirst JD,Melville JL

    update_date: 2007-03-01 00:00:00

  • Accurate critical constants for the one-sided approximate likelihood ratio test of a normal mean vector when the covariance matrix is estimated.

    abstract: Tang, Gnecco, and Geller (1989, Biometrika 76, 577-583) proposed an approximate likelihood ratio (ALR) test of the null hypothesis that a normal mean vector equals a null vector against the alternative that all of its components are nonnegative with at least one strictly positive. This test is useful for comparing a t...

    journal_title: Biometrics

    pub_type: Journal Article

    doi: 10.1111/j.0006-341x.2002.00650.x

    authors: Tamhane AC,Logan BR

    update_date: 2002-09-01 00:00:00

  • Comments about Joint Modeling of Cluster Size and Binary and Continuous Subunit-Specific Outcomes.

    abstract: In longitudinal studies and in clustered situations often binary and continuous response variables are observed and need to be modeled together. In a recent publication Dunson, Chen, and Harry (2003, Biometrics 59, 521-530) (DCH) propose a Bayesian approach for joint modeling of cluster size and binary and continuous ...

    journal_title: Biometrics

    pub_type: Comment, Journal Article

    doi: 10.1111/j.1541-020X.2005.00409_1.x

    authors: Gueorguieva RV

    update_date: 2005-09-01 00:00:00

  • Order-preserving dimension reduction procedure for the dominance of two mean curves with application to tidal volume curves.

    abstract: The paper here presented was motivated by a case study involving high-dimensional and high-frequency tidal volume traces measured during induced panic attacks. The focus was to develop a procedure to determine the significance of whether a mean curve dominates another one. The key idea of the suggested method relies o...

    journal_title: Biometrics

    pub_type: Journal Article

    doi: 10.1111/j.1541-0420.2007.00959.x

    authors: Lee SH,Lim J,Vannucci M,Petkova E,Preter M,Klein DF

    update_date: 2008-09-01 00:00:00

  • Estimating treatment effect in a proportional hazards model in randomized clinical trials with all-or-nothing compliance.

    abstract: We consider methods for estimating the treatment effect and/or the covariate by treatment interaction effect in a randomized clinical trial under noncompliance with time-to-event outcome. As in Cuzick et al. (2007), assuming that the patient population consists of three (possibly latent) subgroups based on treatment p...

    journal_title: Biometrics

    pub_type: Journal Article

    doi: 10.1111/biom.12472

    authors: Li S,Gray RJ

    update_date: 2016-09-01 00:00:00

  • Optimally weighted L(2) distance for functional data.

    abstract: Many techniques of functional data analysis require choosing a measure of distance between functions, with the most common choice being L2 distance. In this article we show that using a weighted L2 distance, with a judiciously chosen weight function, can improve the performance of various statistical methods for funct...

    journal_title: Biometrics

    pub_type: Journal Article

    doi: 10.1111/biom.12161

    authors: Chen H,Reiss PT,Tarpey T

    update_date: 2014-09-01 00:00:00

  • Aberrant crypt foci and semiparametric modeling of correlated binary data.

    abstract: Motivated by the spatial modeling of aberrant crypt foci (ACF) in colon carcinogenesis, we consider binary data with probabilities modeled as the sum of a nonparametric mean plus a latent Gaussian spatial process that accounts for short-range dependencies. The mean is modeled in a general way using regression splines....

    journal_title: Biometrics

    pub_type: Journal Article

    doi: 10.1111/j.1541-0420.2007.00892.x

    authors: Apanasovich TV,Ruppert D,Lupton JR,Popovic N,Turner ND,Chapkin RS,Carroll RJ

    update_date: 2008-06-01 00:00:00

  • Response-adaptive regression for longitudinal data.

    abstract: We propose a response-adaptive model for functional linear regression, which is adapted to sparsely sampled longitudinal responses. Our method aims at predicting response trajectories and models the regression relationship by directly conditioning the sparse and irregular observations of the response on the predictor,...

    journal_title: Biometrics

    pub_type: Journal Article

    doi: 10.1111/j.1541-0420.2010.01518.x

    authors: Wu S,Müller HG

    update_date: 2011-09-01 00:00:00

  • Additive gamma frailty models with applications to competing risks in related individuals.

    abstract: Epidemiological studies of related individuals are often complicated by the fact that follow-up on the event type of interest is incomplete due to the occurrence of other events. We suggest a class of frailty models with cause-specific hazards for correlated competing events in related individuals. The frailties are b...

    journal_title: Biometrics

    pub_type: Journal Article

    doi: 10.1111/biom.12326

    authors: Eriksson F,Scheike T

    update_date: 2015-09-01 00:00:00

  • Fitting nonlinear and constrained generalized estimating equations with optimization software.

    abstract: In this article, we present an estimation approach for solving nonlinear constrained generalized estimating equations that can be implemented using object-oriented software for nonlinear programming, such as nlminb in Splus or fmincon and lsqnonlin in Matlab. We show how standard estimating equation theory includes th...

    journal_title: Biometrics

    pub_type: Journal Article

    doi: 10.1111/j.0006-341x.2000.01268.x

    authors: Contreras M,Ryan LM

    update_date: 2000-12-01 00:00:00

  • Small-sample inference for the comparison of means of log-normal distributions.

    abstract: We propose a likelihood-based test for comparing the means of two or more log-normal distributions, with possibly unequal variances. A modification to the likelihood ratio test is needed when sample sizes are small. The performance of the proposed procedures is compared with the F-ratio test using Monte Carlo simulati...

    journal_title: Biometrics

    pub_type: Journal Article

    doi: 10.1111/j.0006-341X.2004.00199.x

    authors: Gill PS

    update_date: 2004-06-01 00:00:00

  • Comparison of different methods for decision-making in bioequivalence assessment.

    abstract: If the regulatory requirements are symmetrical, the use of symmetrical confidence intervals as a decision rule for bioequivalence assessment leads, as shown by simulations, to better level properties and an inferior power compared to a rule based on shortest confidence intervals. A choice between these two approaches ...

    journal_title: Biometrics

    pub_type: Journal Article

    doi:

    authors: Mandallaz D,Mau J

    update_date: 1981-06-01 00:00:00

  • Modification of the Greenwood formula for correlated response times.

    abstract: Life-table methodology for interval-censored survival times is used to estimate marginal survival probabilities from data consisting of independent cohorts of correlated responses. We restrict our attention to situations where response times within cohorts are exchangeable and the marginal survival distributions are t...

    journal_title: Biometrics

    pub_type: Journal Article

    doi:

    authors: Kang SS,Koehler KJ

    update_date: 1997-09-01 00:00:00

  • Memory in coal tits: an alternative model.

    abstract: Jolliffe and Jolliffe (1997, Biometrics 53, 1136-1142) proposed various models for data from an experiment on memory in coal tits. This article describes an alternative model, which fits equally well and which may be simpler to interpret. ...

    journal_title: Biometrics

    pub_type: Journal Article

    doi: 10.1111/j.0006-341x.1999.00660.x

    authors: Ridout MS

    update_date: 1999-06-01 00:00:00

  • Exact inference on the random-effects model for meta-analyses with few studies.

    abstract: We describe an exact, unconditional, non-randomized procedure for producing confidence intervals for the grand mean in a normal-normal random effects meta-analysis. The procedure targets meta-analyses based on too few primary studies, ≤ 7, say, to ...

    journal_title: Biometrics

    pub_type: Journal Article

    doi: 10.1111/biom.12998

    authors: Michael H,Thornton S,Xie M,Tian L

    update_date: 2019-06-01 00:00:00

  • On the Colton model for clinical trials with delayed observations -- normally-distributed responses.

    abstract: The Colton model for the choice between two medical treatments is studied, with the additional assumption that there is a time lag between the administration of the treatments and the availability of the responses. Two simple procedures are suggested for dealing with patients who arrive during the waiting period, caus...

    journal_title: Biometrics

    pub_type: Journal Article

    doi:

    authors: Langenberg P,Srinivasan R

    update_date: 1981-03-01 00:00:00

  • Ranked set sampling with unequal samples.

    abstract: A ranked set sampling procedure with unequal samples (RSSU) is proposed and used to estimate the population mean. This estimator is then compared with the estimators based on the ranked set sampling (RSS) and median ranked set sampling (MRSS) procedures. It is shown that the relative precisions of the estimator based ...

    journal_title: Biometrics

    pub_type: Journal Article

    doi: 10.1111/j.0006-341x.2001.00957.x

    authors: Bhoj DS

    update_date: 2001-09-01 00:00:00