The relationship between hot-deck multiple imputation and weighted likelihood.

Abstract:

Hot-deck imputation is an intuitively simple and popular method of accommodating incomplete data. Users of the method will often use the usual multiple imputation variance estimator, which is not appropriate in this case. However, no variance expression has yet been derived for this easily implemented method applied to missing covariates in regression models. The simple hot-deck method is in fact asymptotically equivalent to the mean-score method for the estimation of a regression model parameter, so that hot-deck can be understood in the context of likelihood methods. Both of these methods accommodate data where missingness may depend on the observed variables but not on the unobserved value of the incomplete covariate, that is, missing at random (MAR). The asymptotic properties of hot-deck are derived here for the case where the fully observed variables are categorical, though the incomplete covariate(s) may be continuous. Simulation studies indicate that the two methods compare well in small samples and for small numbers of imputations. Current users of hot-deck may now conduct their analysis using mean-score, which is a weighted likelihood method and can thus be implemented by a single pass through the data using any standard package which accommodates weighted regression models. Valid inference is now straightforward using the variance expression provided here. The equivalence of mean-score and hot-deck is illustrated using three clinical data sets where an important covariate is missing for a large number of study subjects.
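
As a rough illustration of the "weighted likelihood" reading of mean-score, the sketch below computes one weight per complete case. It assumes the form commonly given for mean-score with categorical fully observed variables: each complete case is weighted by the ratio of all subjects to complete cases within its stratum of outcome and observed covariates. That weighting, the function name `mean_score_weights`, and the toy records are assumptions for illustration and are not taken from the abstract.

```python
from collections import Counter

def mean_score_weights(records):
    """Return (y, z, x, weight) for each complete case.

    records: list of (y, z, x) tuples, where y is the outcome, z is a
    fully observed categorical covariate, and x is the incomplete
    covariate (None when missing).
    Assumed weight: n(y, z) / n_complete(y, z), i.e. all subjects in the
    stratum divided by the complete cases in that stratum.
    Strata with no complete cases contribute nothing to the output.
    """
    n_all = Counter((y, z) for y, z, _ in records)
    n_obs = Counter((y, z) for y, z, x in records if x is not None)
    return [
        (y, z, x, n_all[(y, z)] / n_obs[(y, z)])
        for y, z, x in records
        if x is not None
    ]

# Hypothetical toy data: x is missing for some subjects.
records = [
    (1, "a", 0.5), (1, "a", None), (1, "a", 1.2), (1, "a", None),
    (0, "a", 0.1), (0, "a", 0.3),
    (0, "b", None), (0, "b", 2.0),
]
weighted = mean_score_weights(records)
```

The resulting weights could then be passed to the weight argument of a weighted regression routine. The point estimate would come from that single weighted fit, but the standard errors the routine reports would not be the variance expression the abstract refers to.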

journal_name

Stat Med

journal_title

Statistics in medicine

authors

Reilly M,Pepe M

doi

10.1002/(sici)1097-0258(19970115)16:1<5::aid-sim46

subject

Has Abstract

pub_date

1997-01-15 00:00:00

pages

5-19

issue

1-3

eissn

1097-0258

issn

0277-6715

pii

10.1002/(SICI)1097-0258(19970115)16:1<5::AID-SIM46

journal_volume

16

pub_type

Journal Article
  • Causal conclusions are most sensitive to unobserved binary covariates.

    abstract::There is a rich literature that considers whether an observed relation between treatment and response is due to an unobserved covariate. In order to quantify this unmeasured bias, an assumption is made about the distribution of this unobserved covariate; typically that it is either binary or at least confined to the u...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.2344

    authors: Wang L,Krieger AM

    update_date:2006-07-15 00:00:00

  • Parametric multistate survival models: Flexible modelling allowing transition-specific distributions with application to estimating clinically useful measures of effect differences.

    abstract::Multistate models are increasingly being used to model complex disease profiles. By modelling transitions between disease states, accounting for competing events at each transition, we can gain a much richer understanding of patient trajectories and how risk factors impact over the entire disease pathway. In this arti...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.7448

    authors: Crowther MJ,Lambert PC

    update_date:2017-12-20 00:00:00

  • A random effects model for ordinal responses from a crossover trial.

    abstract::Crossover studies have been successfully conducted in the case of continuous responses. Existing procedures of analysis for ordinal responses, on the other hand, are rarely satisfactory unless strict, usually unrealistic, assumptions are made. In this paper we investigate a random effects model and show that the model...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780100611

    authors: Ezzet F,Whitehead J

    update_date:1991-06-01 00:00:00

  • A selection model for longitudinal binary responses subject to non-ignorable attrition.

    abstract::Longitudinal studies collect information on a sample of individuals which is followed over time to analyze the effects of individual and time-dependent characteristics on the observed response. These studies often suffer from attrition: individuals drop out of the study before its completion time and thus present inco...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.3604

    authors: Alfò M,Maruotti A

    update_date:2009-08-30 00:00:00

  • A spatial scan statistic for ordinal data.

    abstract::Spatial scan statistics are widely used for count data to detect geographical disease clusters of high or low incidence, mortality or prevalence and to evaluate their statistical significance. Some data are ordinal or continuous in nature, however, so that it is necessary to dichotomize the data to use a traditional s...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.2607

    authors: Jung I,Kulldorff M,Klassen AC

    update_date:2007-03-30 00:00:00

  • An adjustment for a post-randomization variable in the comparison of two treatments for survival.

    abstract::A method is proposed to infer the randomized treatment effect on survival after an adjustment for a post-randomization variable. The post-randomization variable is made independent of the treatment assignment and is considered a surrogate for baseline prognostic factors. The relationship between the post-randomization...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.968

    authors: Heller G

    update_date:2001-11-30 00:00:00

  • Reducing cost in sequential testing: a limit of indifference approach.

    abstract::In noninferiority studies, a limit of indifference is used to express a tolerance in results such that the clinician would regard such results as being acceptable or 'not worse'. We applied this concept to a measure of accuracy, the Receiver Operating Characteristic (ROC) curve, for a sequence of tests. We expressed a...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.5741

    authors: Ahmed AE,Schubert CM,McClish DK

    update_date:2013-07-20 00:00:00

  • A comparison of group sequential methods for binary longitudinal data.

    abstract::Interim analyses are conducted to allow for early termination of the trial, for ethical as well as economical reasons. Here we consider interim analyses in repeated measurements studies where the measurements are binary. Two methods for analysing this kind of data are compared according to their operating characterist...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.1361

    authors: Spiessens B,Lesaffre E,Verbeke G

    update_date:2003-02-28 00:00:00

  • Posterior predictive model checks for disease mapping models.

    abstract::Disease incidence or disease mortality rates for small areas are often displayed on maps. Maps of raw rates, disease counts divided by the total population at risk, have been criticized as unreliable due to non-constant variance associated with heterogeneity in base population size. This has led to the use of model-ba...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/1097-0258(20000915/30)19:17/18<2377::aid-s

    authors: Stern HS,Cressie N

    update_date:2000-09-15 00:00:00

  • Correcting for regression in assessing the response to treatment in a selected population.

    abstract::Previous work on the consequences of regression to the mean for the interpretation of responses to treatment is extended to the situation where the response measured is the proportional change in some variable. Methods for correcting for the bias are discussed. ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780060203

    authors: Curnow RN

    update_date:1987-03-01 00:00:00

  • Linear regression for bivariate censored data via multiple imputation.

    abstract::Bivariate survival data arise, for example, in twin studies and studies of both eyes or ears of the same individual. Often it is of interest to regress the survival times on a set of predictors. In this paper we extend Wei and Tanner's multiple imputation approach for linear regression with univariate censored data to...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(sici)1097-0258(19991130)18:22<3111::aid-s

    authors: Pan W,Kooperberg C

    update_date:1999-11-30 00:00:00

  • A joint modeling and estimation method for multivariate longitudinal data with mixed types of responses to analyze physical activity data generated by accelerometers.

    abstract::A mixed effect model is proposed to jointly analyze multivariate longitudinal data with continuous, proportion, count, and binary responses. The association of the variables is modeled through the correlation of random effects. We use a quasi-likelihood type approximation for nonlinear variables and transform the prop...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.7401

    authors: Li H,Zhang Y,Carroll RJ,Keadle SK,Sampson JN,Matthews CE

    update_date:2017-11-10 00:00:00

  • A maximally selected test of symmetry about zero.

    abstract::The problem of testing symmetry about zero has a long and rich history in the statistical literature. We introduce a new test that sequentially discards observations whose absolute value is below increasing thresholds defined by the data. McNemar's statistic is obtained at each threshold and the largest is used as the...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.5384

    authors: Laska E,Meisner M,Wanderling J

    update_date:2012-11-20 00:00:00

  • Sample size planning for survival prediction with focus on high-dimensional data.

    abstract::Sample size planning should reflect the primary objective of a trial. If the primary objective is prediction, the sample size determination should focus on prediction accuracy instead of power. We present formulas for the determination of training set sample size for survival prediction. Sample size is chosen to contr...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.5550

    authors: Götte H,Zwiener I

    update_date:2013-02-28 00:00:00

  • Regression analysis applied to PVC histories: a statistical procedure for evaluating antiarrhythmic drug efficacy.

    abstract::Suppression of premature ventricular contractions (PVCs) is one of the goals of antiarrhythmic therapy. In a clinical trial, however, it may be difficult to distinguish antiarrhythmic drug effect from spontaneous variation in PVCs. We propose the application of linear regression to PVC histories to ascertain drug effe...

    journal_title:Statistics in medicine

    pub_type: Clinical Trial, Journal Article

    doi:10.1002/sim.4780020305

    authors: Berry DA,Fox TL

    update_date:1983-07-01 00:00:00

  • Choice of test for association in small sample unordered r x c tables.

    abstract::Pearson's chi-squared, the likelihood-ratio, and Fisher-Freeman-Halton's test statistics are often used to test the association of unordered r x c tables. Asymptotical, exact conditional, or exact conditional with mid-p adjustment methods are commonly used to compute the p-value. We have compared test power and signif...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.2839

    authors: Lydersen S,Pradhan V,Senchaudhuri P,Laake P

    update_date:2007-10-15 00:00:00

  • The analysis of continuous outcomes in multi-centre trials with small centre sizes.

    abstract::The standard analysis of clinical trials stratified by centre is to include centres as fixed effects, but if many centres contribute small numbers of patients, this approach results in a loss of power. Assuming no treatment by centre interaction, we used simulation to examine power and coverage of confidence intervals...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.3068

    authors: Pickering RM,Weatherall M

    update_date:2007-12-30 00:00:00

  • Explaining community-level variance in group randomized trials.

    abstract::Between-community variance or community-by-time variance is one of the key factors driving the cost of conducting group randomized trials, which are often very expensive. We investigated empirically whether between-community variance could be reduced by controlling individual- and/or community-level covariates and ide...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(sici)1097-0258(19990315)18:5<539::aid-sim

    authors: Feng Z,Diehr P,Yasui Y,Evans B,Beresford S,Koepsell TD

    update_date:1999-03-15 00:00:00

  • A penalized robust semiparametric approach for gene-environment interactions.

    abstract::In genetic and genomic studies, gene-environment (G×E) interactions have important implications. Some of the existing G×E interaction methods are limited by analyzing a small number of G factors at a time, by assuming linear effects of E factors, by assuming no data contamination, and by adopting ineffective selection...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.6609

    authors: Wu C,Shi X,Cui Y,Ma S

    update_date:2015-12-30 00:00:00

  • Sample size calculation for stepped wedge and other longitudinal cluster randomised trials.

    abstract::The sample size required for a cluster randomised trial is inflated compared with an individually randomised trial because outcomes of participants from the same cluster are correlated. Sample size calculations for longitudinal cluster randomised trials (including stepped wedge trials) need to take account of at least...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.7028

    authors: Hooper R,Teerenstra S,de Hoop E,Eldridge S

    update_date:2016-11-20 00:00:00

  • Estimation of mean sojourn time in breast cancer screening using a Markov chain model of both entry to and exit from the preclinical detectable phase.

    abstract::The sojourn time, time spent in the preclinical detectable phase (PCDP) for chronic diseases, for example, breast cancer, plays an important role in the design and assessment of screening programmes. Traditional methods to estimate it usually assume a uniform incidence rate of preclinical disease from a randomized con...

    journal_title:Statistics in medicine

    pub_type: Clinical Trial, Journal Article, Randomized Controlled Trial

    doi:10.1002/sim.4780141404

    authors: Duffy SW,Chen HH,Tabar L,Day NE

    update_date:1995-07-30 00:00:00

  • Efficient evaluation of treatment effects in the presence of missing covariate values.

    abstract::In clinical trials, treatment comparisons are often performed by models that incorporate important prognostic factors. Since these models require complete covariate information on all patients, statisticians frequently resort to complete case analysis or to omission of an important covariate. A probability imputation ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780090707

    authors: Schemper M,Smith TL

    update_date:1990-07-01 00:00:00

  • Methods for assessing reliability and validity for a measurement tool: a case study and critique using the WHO haemoglobin colour scale.

    abstract::Before introducing a new measurement tool it is necessary to evaluate its performance. Several statistical methods have been developed, or used, to evaluate the reliability and validity of a new assessment method in such circumstances. In this paper we review some commonly used methods. Data from a study that was cond...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.1804

    authors: White SA,van den Broek NR

    update_date:2004-05-30 00:00:00

  • Infant growth modelling using a shape invariant model with random effects.

    abstract::Models for infant growth have usually been based on parametric forms, commonly an exponential or similar model, which have been shown to fit poorly especially during the first year of life. An alternative approach is to use a non-parametric model, based on a shape invariant model (SIM), where a single function is tran...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.2718

    authors: Beath KJ

    update_date:2007-05-30 00:00:00

  • Second-stage least squares versus penalized quasi-likelihood for fitting hierarchical models in epidemiologic analyses.

    abstract::Hierarchical regression analysis holds much promise for epidemiologic analysis, but has as yet seen limited application because of lack of easily used software and the relatively lengthy run times of preferred fitting methods (such as true maximum likelihood and Bayesian approaches). This paper compares three relative...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(sici)1097-0258(19970315)16:5<515::aid-sim

    authors: Greenland S

    update_date:1997-03-15 00:00:00

  • A refined method for the meta-analysis of controlled clinical trials with binary outcome.

    abstract::For the meta-analysis of controlled clinical trials with binary outcome a test statistic for testing an overall treatment effect is proposed, which is based on a refined estimator for the variance of the treatment effect estimator usually used in the random-effects model of meta-analysis. In simulation studies it is s...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.1009

    authors: Hartung J,Knapp G

    update_date:2001-12-30 00:00:00

  • Variance estimators for attributable fraction estimates consistent in both large strata and sparse data.

    abstract::A number of variance formulae for the attributable fraction have been presented, but none is consistent in sparse data, such as found in individually matched case-control studies. This paper employs Mantel-Haenszel estimation to derive variance estimators for attributable fractions that are dually consistent, that is,...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780060607

    authors: Greenland S

    update_date:1987-09-01 00:00:00

  • Estimating the completeness of prevalence based on cancer registry data.

    abstract::Prevalence data provided by cancer registries are generally biased, since the patients that were diagnosed before the starting of the registry's activity cannot be included in the statistics. The relevance of this incompleteness bias is estimated in this paper. Incidence and relative survival are modelled as parametri...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(sici)1097-0258(19970228)16:4<425::aid-sim

    authors: Capocaccia R,De Angelis R

    update_date:1997-02-28 00:00:00

  • Application of a two-stage random effects model to longitudinal pulmonary function data from sarcoidosis patients.

    abstract::We applied a two-stage random effects model to pulmonary function data from 31 sarcoidosis patients to illustrate its usefulness in analysing unbalanced longitudinal data. For the first stage, repeated measurements of percentage of predicted forced vital capacity (FVC%) from an individual were modelled as a function o...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780080206

    authors: Vacek PM,Mickey RM,Bell DY

    update_date:1989-02-01 00:00:00

  • Flexible design clinical trial methodology in regulatory applications.

    abstract::Adaptive designs or flexible designs in a broader sense have increasingly been considered in planning pivotal registration clinical trials. Sample size reassessment design and adaptive selection design are two of such designs that appear in regulatory applications. At the design stage, consideration of sample size rea...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4021

    authors: Hung HM,Wang SJ,O'Neill R

    update_date:2011-06-15 00:00:00