Abstract:
Hot-deck imputation is an intuitively simple and popular method of accommodating incomplete data. Users of the method often apply the usual multiple imputation variance estimator, which is not appropriate in this case. However, no variance expression has yet been derived for this easily implemented method applied to missing covariates in regression models. The simple hot-deck method is in fact asymptotically equivalent to the mean-score method for the estimation of a regression model parameter, so that hot-deck can be understood in the context of likelihood methods. Both of these methods accommodate data where missingness may depend on the observed variables but not on the unobserved value of the incomplete covariate, that is, data that are missing at random (MAR). The asymptotic properties of hot-deck are derived here for the case where the fully observed variables are categorical, though the incomplete covariate(s) may be continuous. Simulation studies indicate that the two methods compare well in small samples and for small numbers of imputations. Current users of hot-deck may now conduct their analysis using mean-score, which is a weighted likelihood method and can thus be implemented by a single pass through the data using any standard package that accommodates weighted regression models. Valid inference is now straightforward using the variance expression provided here. The equivalence of mean-score and hot-deck is illustrated using three clinical data sets where an important covariate is missing for a large number of study subjects.
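The two procedures named in the abstract can be sketched as follows. This is an illustrative sketch only, not the authors' code: the function names, the toy data, and the stratum labels are hypothetical. It assumes each subject falls in a stratum defined by the joint levels of the outcome and the fully observed categorical covariates, that hot-deck draws a donor value at random from complete cases in the same stratum, and that the mean-score weight for a complete case is the stratum total divided by the number of complete cases in that stratum.

```python
import random
from collections import defaultdict


def mean_score_weights(strata, observed):
    """Return one weight per subject: n(s) / n_complete(s) for complete
    cases in stratum s, and 0.0 for incomplete cases (assumed weighting)."""
    total = defaultdict(int)
    complete = defaultdict(int)
    for s, obs in zip(strata, observed):
        total[s] += 1
        if obs:
            complete[s] += 1
    for s in total:
        if complete[s] == 0:
            raise ValueError(f"stratum {s!r} has no complete cases")
    return [total[s] / complete[s] if obs else 0.0
            for s, obs in zip(strata, observed)]


def hot_deck_impute(x, strata, observed, rng):
    """Return a copy of x in which each missing value is replaced by a
    random draw from the observed values in the same stratum."""
    donors = defaultdict(list)
    for xi, s, obs in zip(x, strata, observed):
        if obs:
            donors[s].append(xi)
    # rng.choice raises IndexError if a stratum has no donors.
    return [xi if obs else rng.choice(donors[s])
            for xi, s, obs in zip(x, strata, observed)]


# Hypothetical toy data: x is the incomplete covariate (None = missing),
# strata labels combine the outcome and a fully observed covariate.
x = [1.2, 0.7, None, None, 2.5, None, None]
strata = ["y0z0", "y0z0", "y0z0", "y0z0", "y1z0", "y1z0", "y1z0"]
observed = [xi is not None for xi in x]

weights = mean_score_weights(strata, observed)
imputed = hot_deck_impute(x, strata, observed, random.Random(0))
```

The weights could then be supplied as case weights to a weighted regression routine fitted to the complete cases only. The sketch does not compute the variance expression derived in the paper, and standard errors reported by a weighted fit should not be assumed valid without it.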
journal_name: Stat Med
journal_title: Statistics in medicine
authors: Reilly M, Pepe M
doi: 10.1002/(sici)1097-0258(19970115)16:1<5::aid-sim46
subject: Has Abstract
pub_date: 1997-01-15 00:00:00
pages: 5-19
issue: 1-3
eissn: 0277-6715
issn: 1097-0258
pii: 10.1002/(SICI)1097-0258(19970115)16:1<5::AID-SIM46
journal_volume: 16
pub_type: Journal Article
abstract::There is a rich literature that considers whether an observed relation between treatment and response is due to an unobserved covariate. In order to quantify this unmeasured bias, an assumption is made about the distribution of this unobserved covariate; typically that it is either binary or at least confined to the u...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.2344
update_date:2006-07-15 00:00:00
abstract::Multistate models are increasingly being used to model complex disease profiles. By modelling transitions between disease states, accounting for competing events at each transition, we can gain a much richer understanding of patient trajectories and how risk factors impact over the entire disease pathway. In this arti...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.7448
update_date:2017-12-20 00:00:00
abstract::Crossover studies have been successfully conducted in the case of continuous responses. Existing procedures of analysis for ordinal responses, on the other hand, are rarely satisfactory unless strict, usually unrealistic, assumptions are made. In this paper we investigate a random effects model and show that the model...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.4780100611
update_date:1991-06-01 00:00:00
abstract::Longitudinal studies collect information on a sample of individuals which is followed over time to analyze the effects of individual and time-dependent characteristics on the observed response. These studies often suffer from attrition: individuals drop out of the study before its completion time and thus present inco...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.3604
update_date:2009-08-30 00:00:00
abstract::Spatial scan statistics are widely used for count data to detect geographical disease clusters of high or low incidence, mortality or prevalence and to evaluate their statistical significance. Some data are ordinal or continuous in nature, however, so that it is necessary to dichotomize the data to use a traditional s...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.2607
update_date:2007-03-30 00:00:00
abstract::A method is proposed to infer the randomized treatment effect on survival after an adjustment for a post-randomization variable. The post-randomization variable is made independent of the treatment assignment and is considered a surrogate for baseline prognostic factors. The relationship between the post-randomization...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.968
update_date:2001-11-30 00:00:00
abstract::In noninferiority studies, a limit of indifference is used to express a tolerance in results such that the clinician would regard such results as being acceptable or 'not worse'. We applied this concept to a measure of accuracy, the Receiver Operating Characteristic (ROC) curve, for a sequence of tests. We expressed a...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.5741
update_date:2013-07-20 00:00:00
abstract::Interim analyses are conducted to allow for early termination of the trial, for ethical as well as economical reasons. Here we consider interim analyses in repeated measurements studies where the measurements are binary. Two methods for analysing this kind of data are compared according to their operating characterist...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.1361
update_date:2003-02-28 00:00:00
abstract::Disease incidence or disease mortality rates for small areas are often displayed on maps. Maps of raw rates, disease counts divided by the total population at risk, have been criticized as unreliable due to non-constant variance associated with heterogeneity in base population size. This has led to the use of model-ba...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/1097-0258(20000915/30)19:17/18<2377::aid-s
update_date:2000-09-15 00:00:00
abstract::Previous work on the consequences of regression to the mean for the interpretation of responses to treatment is extended to the situation where the response measured is the proportional change in some variable. Methods for correcting for the bias are discussed. ...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.4780060203
update_date:1987-03-01 00:00:00
abstract::Bivariate survival data arise, for example, in twin studies and studies of both eyes or ears of the same individual. Often it is of interest to regress the survival times on a set of predictors. In this paper we extend Wei and Tanner's multiple imputation approach for linear regression with univariate censored data to...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/(sici)1097-0258(19991130)18:22<3111::aid-s
update_date:1999-11-30 00:00:00
abstract::A mixed effect model is proposed to jointly analyze multivariate longitudinal data with continuous, proportion, count, and binary responses. The association of the variables is modeled through the correlation of random effects. We use a quasi-likelihood type approximation for nonlinear variables and transform the prop...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.7401
update_date:2017-11-10 00:00:00
abstract::The problem of testing symmetry about zero has a long and rich history in the statistical literature. We introduce a new test that sequentially discards observations whose absolute value is below increasing thresholds defined by the data. McNemar's statistic is obtained at each threshold and the largest is used as the...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.5384
update_date:2012-11-20 00:00:00
abstract::Sample size planning should reflect the primary objective of a trial. If the primary objective is prediction, the sample size determination should focus on prediction accuracy instead of power. We present formulas for the determination of training set sample size for survival prediction. Sample size is chosen to contr...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.5550
update_date:2013-02-28 00:00:00
abstract::Suppression of premature ventricular contractions (PVCs) is one of the goals of antiarrhythmic therapy. In a clinical trial, however, it may be difficult to distinguish antiarrhythmic drug effect from spontaneous variation in PVCs. We propose the application of linear regression to PVC histories to ascertain drug effe...
journal_title:Statistics in medicine
pub_type: Clinical Trial, Journal Article
doi:10.1002/sim.4780020305
update_date:1983-07-01 00:00:00
abstract::Pearson's chi-squared, the likelihood-ratio, and Fisher-Freeman-Halton's test statistics are often used to test the association of unordered r x c tables. Asymptotical, exact conditional, or exact conditional with mid-p adjustment methods are commonly used to compute the p-value. We have compared test power and signif...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.2839
update_date:2007-10-15 00:00:00
abstract::The standard analysis of clinical trials stratified by centre is to include centres as fixed effects, but if many centres contribute small numbers of patients, this approach results in a loss of power. Assuming no treatment by centre interaction, we used simulation to examine power and coverage of confidence intervals...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.3068
update_date:2007-12-30 00:00:00
abstract::Between-community variance or community-by-time variance is one of the key factors driving the cost of conducting group randomized trials, which are often very expensive. We investigated empirically whether between-community variance could be reduced by controlling individual- and/or community-level covariates and ide...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/(sici)1097-0258(19990315)18:5<539::aid-sim
update_date:1999-03-15 00:00:00
abstract::In genetic and genomic studies, gene-environment (G×E) interactions have important implications. Some of the existing G×E interaction methods are limited by analyzing a small number of G factors at a time, by assuming linear effects of E factors, by assuming no data contamination, and by adopting ineffective selection...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.6609
update_date:2015-12-30 00:00:00
abstract::The sample size required for a cluster randomised trial is inflated compared with an individually randomised trial because outcomes of participants from the same cluster are correlated. Sample size calculations for longitudinal cluster randomised trials (including stepped wedge trials) need to take account of at least...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.7028
update_date:2016-11-20 00:00:00
abstract::The sojourn time, time spent in the preclinical detectable phase (PCDP) for chronic diseases, for example, breast cancer, plays an important role in the design and assessment of screening programmes. Traditional methods to estimate it usually assume a uniform incidence rate of preclinical disease from a randomized con...
journal_title:Statistics in medicine
pub_type: Clinical Trial, Journal Article, Randomized Controlled Trial
doi:10.1002/sim.4780141404
update_date:1995-07-30 00:00:00
abstract::In clinical trials, treatment comparisons are often performed by models that incorporate important prognostic factors. Since these models require complete covariate information on all patients, statisticians frequently resort to complete case analysis or to omission of an important covariate. A probability imputation ...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.4780090707
update_date:1990-07-01 00:00:00
abstract::Before introducing a new measurement tool it is necessary to evaluate its performance. Several statistical methods have been developed, or used, to evaluate the reliability and validity of a new assessment method in such circumstances. In this paper we review some commonly used methods. Data from a study that was cond...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.1804
update_date:2004-05-30 00:00:00
abstract::Models for infant growth have usually been based on parametric forms, commonly an exponential or similar model, which have been shown to fit poorly especially during the first year of life. An alternative approach is to use a non-parametric model, based on a shape invariant model (SIM), where a single function is tran...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.2718
update_date:2007-05-30 00:00:00
abstract::Hierarchical regression analysis holds much promise for epidemiologic analysis, but has as yet seen limited application because of lack of easily used software and the relatively lengthy run times of preferred fitting methods (such as true maximum likelihood and Bayesian approaches). This paper compares three relative...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/(sici)1097-0258(19970315)16:5<515::aid-sim
update_date:1997-03-15 00:00:00
abstract::For the meta-analysis of controlled clinical trials with binary outcome a test statistic for testing an overall treatment effect is proposed, which is based on a refined estimator for the variance of the treatment effect estimator usually used in the random-effects model of meta-analysis. In simulation studies it is s...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.1009
update_date:2001-12-30 00:00:00
abstract::A number of variance formulae for the attributable fraction have been presented, but none is consistent in sparse data, such as found in individually matched case-control studies. This paper employs Mantel-Haenszel estimation to derive variance estimators for attributable fractions that are dually consistent, that is,...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.4780060607
update_date:1987-09-01 00:00:00
abstract::Prevalence data provided by cancer registries are generally biased, since the patients that were diagnosed before the starting of the registry's activity cannot be included in the statistics. The relevance of this incompleteness bias is estimated in this paper. Incidence and relative survival are modelled as parametri...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/(sici)1097-0258(19970228)16:4<425::aid-sim
update_date:1997-02-28 00:00:00
abstract::We applied a two-stage random effects model to pulmonary function data from 31 sarcoidosis patients to illustrate its usefulness in analysing unbalanced longitudinal data. For the first stage, repeated measurements of percentage of predicted forced vital capacity (FVC%) from an individual were modelled as a function o...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.4780080206
update_date:1989-02-01 00:00:00
abstract::Adaptive designs or flexible designs in a broader sense have increasingly been considered in planning pivotal registration clinical trials. Sample size reassessment design and adaptive selection design are two of such designs that appear in regulatory applications. At the design stage, consideration of sample size rea...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.4021
update_date:2011-06-15 00:00:00