Sample size planning for survival prediction with focus on high-dimensional data.

Abstract:

Sample size planning should reflect the primary objective of a trial. If the primary objective is prediction, the sample size determination should focus on prediction accuracy instead of power. We present formulas for the determination of training set sample size for survival prediction. Sample size is chosen to control the difference between optimal and expected prediction error. Prediction is carried out by Cox proportional hazards models. The general approach considers censoring as well as low-dimensional and high-dimensional explanatory variables. For dimension reduction in the high-dimensional setting, a variable selection step is inserted. If not all informative variables are included in the final model, the effect estimates are biased towards zero. The bias affects the prediction error, and its magnitude is influenced by the sample size. For variable selection, we consider two approaches: least absolute shrinkage and selection operator (LASSO) and univariable selection. For univariable selection, we can calculate input parameters for the sample size formula. For the LASSO, supportive simulations are necessary to appropriately choose the input parameters. We investigate the performance of the proposed formulas with the use of simulations. Simulation results support the validity of the sample size formulas. An application to a real data example illustrates the practical implementation of the method.

journal_name

Stat Med

journal_title

Statistics in medicine

authors

Götte H,Zwiener I

doi

10.1002/sim.5550

subject

Has Abstract

pub_date

2013-02-28 00:00:00

pages

787-807

issue

5

eissn

1097-0258

issn

0277-6715

journal_volume

32

pub_type

Journal Article
  • Development and applications of a city-level alcohol availability and alcohol problems database.

    abstract::Data on alcohol availability and problems in all cities in Los Angeles County were collected from several different sources and linked together to form a Local Alcohol Availability Database (LAAD). The two major purposes of the project are to provide a city-level alcohol availability and alcohol-related problems datab...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780140517

    authors: MacKinnon DP,Scribner R,Taft KA

    update_date:1995-03-15 00:00:00

  • Deficiencies in clinical reports for registration of drugs.

    abstract::A considerable number of the clinical reports which are presented to the Dutch Board for the Evaluation of Drugs, have deficiencies and/or shortcomings. A number of these, including loose description of the target population and sampling method, methodological flaws, incorrect treatment of withdrawals, confounding of ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780020209

    authors: De Jonge H

    update_date:1983-04-01 00:00:00

  • Second-stage least squares versus penalized quasi-likelihood for fitting hierarchical models in epidemiologic analyses.

    abstract::Hierarchical regression analysis holds much promise for epidemiologic analysis, but has as yet seen limited application because of lack of easily used software and the relatively lengthy run times of preferred fitting methods (such as true maximum likelihood and Bayesian approaches). This paper compares three relative...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(sici)1097-0258(19970315)16:5<515::aid-sim

    authors: Greenland S

    update_date:1997-03-15 00:00:00

  • The relationship between hot-deck multiple imputation and weighted likelihood.

    abstract::Hot-deck imputation is an intuitively simple and popular method of accommodating incomplete data. Users of the method will often use the usual multiple imputation variance estimator which is not appropriate in this case. However, no variance expression has yet been derived for this easily implemented method applied to...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(sici)1097-0258(19970115)16:1<5::aid-sim46

    authors: Reilly M,Pepe M

    update_date:1997-01-15 00:00:00

  • A new and improved confidence interval for the Mantel-Haenszel risk difference.

    abstract::Writing the variance of the Mantel-Haenszel estimator under the null of homogeneity and inverting the corresponding test, we arrive at an improved confidence interval for the common risk difference in stratified 2 × 2 tables. This interval outperforms a variety of other intervals currently recommended in the literatur...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.6122

    authors: Klingenberg B

    update_date:2014-07-30 00:00:00

  • Optimal three-stage designs for phase II cancer clinical trials.

    abstract::The objective of a phase II cancer clinical trial is to screen a treatment that can produce a similar or better response rate compared to the current treatment results. This screening is usually carried out in two stages as proposed by Simon. For ineffective treatment, the trial should terminate at the first stage. En...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(sici)1097-0258(19971215)16:23<2701::aid-s

    authors: Chen TT

    update_date:1997-12-15 00:00:00

  • Comparing onset of antidepressant action using a repeated measures approach and a traditional assessment schedule.

    abstract:BACKGROUND:It has been recommended that onset of antidepressant action be assessed using survival analyses with assessments taken at least twice per week. However, such an assessment schedule is problematic to implement. The present study assessed the feasibility of comparing onset of action between treatments using a ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.2309

    authors: Mallinckrodt CH,Detke MJ,Kaiser CJ,Watkin JG,Molenberghs G,Carroll RJ

    update_date:2006-07-30 00:00:00

  • A special case of reduced rank models for identification and modelling of time varying effects in survival analysis.

    abstract::Flexible survival models are in need when modelling data from long term follow-up studies. In many cases, the assumption of proportionality imposed by a Cox model will not be valid. Instead, a model that can identify time varying effects of fixed covariates can be used. Although there are several approaches that deal ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.7088

    authors: Perperoglou A

    update_date:2016-12-10 00:00:00

  • A linear exponent AR(1) family of correlation structures.

    abstract::In repeated measures settings, modeling the correlation pattern of the data can be immensely important for proper analyses. Accurate inference requires proper choice of the correlation model. Optimal efficiency of the estimation procedure demands a parsimonious parameterization of the correlation structure, with suffi...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.3928

    authors: Simpson SL,Edwards LJ,Muller KE,Sen PK,Styner MA

    update_date:2010-07-30 00:00:00

  • Bayesian modelling of imperfect ascertainment methods in cancer studies.

    abstract::Tumour registry linkage, chart review and patient self-report are all commonly used ascertainment methods in cancer epidemiology. These methods are used for estimating the incidence or prevalence of different cancer types in a population, and for investigating the effects of possible risk factors for cancer. Tumour re...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.2116

    authors: Bernatsky S,Joseph L,Bélisle P,Boivin JF,Rajan R,Moore A,Clarke A

    update_date:2005-08-15 00:00:00

  • On design considerations and randomization-based inference for community intervention trials.

    abstract::This paper discusses design considerations and the role of randomization-based inference in randomized community intervention trials. We stress that longitudinal follow-up of cohorts within communities often yields useful information on the effects of intervention on individuals, whereas cross-sectional surveys can us...

    journal_title:Statistics in medicine

    pub_type: Journal Article, Review

    doi:10.1002/(SICI)1097-0258(19960615)15:11<1069::AID-S

    authors: Gail MH,Mark SD,Carroll RJ,Green SB,Pee D

    update_date:1996-06-15 00:00:00

  • An application of Harrell's C-index to PH frailty models.

    abstract::Frailty models are encountered in many medical applications, yet little research has been devoted to develop measures that quantify the predictive ability of these models. In this paper, we elaborate on the concept of the concordance probability to clustered data, resulting in an 'Overall Conditional C-index' or bfC(O...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4058

    authors: Van Oirbeek R,Lesaffre E

    update_date:2010-12-30 00:00:00

  • One-stage parametric meta-analysis of time-to-event outcomes.

    abstract::Methodology for the meta-analysis of individual patient data with survival end-points is proposed. Motivated by questions about the reliance on hazard ratios as summary measures of treatment effects, a parametric approach is considered and percentile ratios are introduced as an alternative to hazard ratios. The genera...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4086

    authors: Siannis F,Barrett JK,Farewell VT,Tierney JF

    update_date:2010-12-20 00:00:00

  • Variable length testing using the ordinal regression model.

    abstract::Health questionnaires are often built up from sets of questions that are totaled to obtain a sum score. An important consideration in designing questionnaires is to minimize respondent burden. An increasingly popular method for efficient measurement is computerized adaptive testing; unfortunately, many health question...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.5936

    authors: Smits N,Finkelman MD

    update_date:2014-02-10 00:00:00

  • Analysis of ectopic pregnancy data using marginal and conditional models.

    abstract::This work is motivated by a longitudinal study of women and their ectopic pregnancy outcomes in Lund, Sweden. In this article, we review and apply the Liang-Zeger methodology to the Lund ectopic pregnancy data set. We further analyse the ectopic pregnancy data using conditional modelling approaches suggested by Rosner...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(sici)1097-0258(19971115)16:21<2403::aid-s

    authors: Hadgu A,Koch G,Westrom L

    update_date:1997-11-15 00:00:00

  • Optimal designs for Michaelis-Menten kinetic studies.

    abstract::Many reactions in enzymology are governed by the Michaelis-Menten equation. Characterising these reactions requires the estimation of the parameters K(M) and V(max) which determine the Michaelis-Menten equation and this is done by observing rates of reactions at a set of substrate concentrations. The choice of substra...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.1612

    authors: Matthews JN,Allcock GC

    update_date:2004-02-15 00:00:00

  • Smooth bootstrap methods for analysis of longitudinal data.

    abstract::In analysis of longitudinal data, the variance matrix of the parameter estimates is usually estimated by the 'sandwich' method, in which the variance for each subject is estimated by its residual products. We propose smooth bootstrap methods by perturbing the estimating functions to obtain 'bootstrapped' realizations ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.3027

    authors: Li Y,Wang YG

    update_date:2008-03-30 00:00:00

  • Comparison of methods for the analysis of longitudinal interval count data.

    abstract::Longitudinal studies are often concerned with estimating the recurrence rate of a non-fatal event. In many cases, only the total number of events occurring during successive time intervals is known. We compared a mixed Poisson-gamma regression method proposed by Thall and a quasi-likelihood method proposed by Zeger an...

    journal_title:Statistics in medicine

    pub_type: Clinical Trial, Journal Article, Randomized Controlled Trial

    doi:10.1002/sim.4780121406

    authors: Stukel TA

    update_date:1993-07-30 00:00:00

  • Causal inference in paired two-arm experimental studies under noncompliance with application to prognosis of myocardial infarction.

    abstract::Motivated by a study about prompt coronary angiography in myocardial infarction, we propose a method to estimate the causal effect of a treatment in two-arm experimental studies with possible noncompliance in both treatment and control arms. We base the method on a causal model for repeated binary outcomes (before and...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.5856

    authors: Bartolucci F,Farcomeni A

    update_date:2013-11-10 00:00:00

  • Discriminant analysis using a multivariate linear mixed model with a normal mixture in the random effects distribution.

    abstract::We have developed a method to longitudinally classify subjects into two or more prognostic groups using longitudinally observed values of markers related to the prognosis. We assume the availability of a training data set where the subjects' allocation into the prognostic group is known. The proposed method proceeds i...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.3849

    authors: Komárek A,Hansen BE,Kuiper EM,van Buuren HR,Lesaffre E

    update_date:2010-12-30 00:00:00

  • An analysis of eight 95 per cent confidence intervals for a ratio of Poisson parameters when events are rare.

    abstract::We compared eight nominal 95 per cent confidence intervals for the ratio of two Poisson parameters, both assumed small, on their true coverage (the probability that the interval includes the ratio of Poisson parameters) and median width. The commonly used log-linear interval, justified by asymptotic considerations, pr...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.3234

    authors: Barker L,Cadwell BL

    update_date:2008-09-10 00:00:00

  • Bayesian random effects meta-analysis of trials with binary outcomes: methods for the absolute risk difference and relative risk scales.

    abstract::In a recent Statistics in Medicine paper, Warn, Thompson and Spiegelhalter (WTS) made a comparison between the Bayesian approach to the meta-analysis of binary outcomes and a popular Classical approach that uses summary (two-stage) techniques. They included approximate summary (two-stage) Bayesian techniques in their ...

    journal_title:Statistics in medicine

    pub_type: Comment, Letter

    doi:10.1002/sim.2115

    authors: O'Rourke K,Altman DG

    update_date:2005-09-15 00:00:00

  • A mediation analysis for a nonrare dichotomous outcome with sequentially ordered multiple mediators.

    abstract::Mediation analyses can help us to understand the biological mechanism in which an exposure or treatment affects an outcome. Single mediator analyses have been used in various applications, but may not be appropriate for analyzing intricate mechanisms involving multiple mediators that affect each other. Thus, in this a...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.8485

    authors: Lai EY,Shih S,Huang YT,Wang S

    update_date:2020-05-15 00:00:00

  • Finding common task-related regions in fMRI data from multiple subjects by periodogram clustering and clustering ensemble.

    abstract::We propose an innovative and practically relevant clustering method to find common task-related brain regions among different subjects who respond to the same set of stimuli. Using functional magnetic resonance imaging (fMRI) time series data, we first cluster the voxels within each subject on a voxel by voxel basis. ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.6906

    authors: Ye J,Li Y,Lazar NA,Schaeffer DJ,McDowell JE

    update_date:2016-07-10 00:00:00

  • Detection rates and false positive rates for Down's syndrome screening: how precisely can they be estimated and what factors influence their value?

    abstract::Down's syndrome screening is currently carried out using a combination of biochemical markers measured in maternal serum samples; these include MSAFP, Total hCG, uE3 and Free beta-hCG. Recently a number of papers have compared the effectiveness of different combinations of these markers. Some recommend MSAFP, Total hC...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(sici)1097-0258(19970715)16:13<1481::aid-s

    authors: Dunstan FD,Gray JC,Nix AB,Reynolds T

    update_date:1997-07-15 00:00:00

  • A Bayesian methodology for detecting targeted genes under two related experiments.

    abstract::Many gene expression data are based on two experiments where the gene expressions of the targeted genes under both experiments are correlated. We consider problems in which objectives are to find genes that are simultaneously upregulated/downregulated under both experiments. A Bayesian methodology is proposed based on...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.6555

    authors: Bansal NK,Jiang H,Pradeep P

    update_date:2015-11-10 00:00:00

  • Regression analysis of clustered interval-censored data with informative cluster size.

    abstract::Interval-censored data are commonly found in studies of diseases that progress without symptoms, which require clinical evaluation for detection. Several techniques have been suggested with independent assumption. However, the assumption will not be valid if observations come from clusters. Furthermore, when the clust...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4042

    authors: Kim YJ

    update_date:2010-12-10 00:00:00

  • Combining mortality and longitudinal measures in clinical trials.

    abstract::Clinical trials often assess therapeutic benefit on the basis of an event such as death or the diagnosis of disease. Usually, there are several additional longitudinal measures of clinical status which are collected to be used in the treatment comparison. This paper proposes a simple non-parametric test which combines...

    journal_title:Statistics in medicine

    pub_type: Clinical Trial, Journal Article, Randomized Controlled Trial

    doi:10.1002/(sici)1097-0258(19990615)18:11<1341::aid-s

    authors: Finkelstein DM,Schoenfeld DA

    update_date:1999-06-15 00:00:00

  • A recycling framework for the construction of Bonferroni-based multiple tests.

    abstract::In this paper we describe Bonferroni-based multiple testing procedures (MTPs) as strategies to split and recycle test mass. Here, 'test mass' refers to (parts of) the nominal level alpha at which the family-wise error rate is controlled. Briefly, test mass is split between different null hypotheses, and whenever a nul...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.3513

    authors: Burman CF,Sonesson C,Guilbaud O

    update_date:2009-02-28 00:00:00

  • Bounding the bias of unmeasured factors with confounding and effect-modifying potentials.

    abstract::Confounding is a major concern in observational studies. To adjust for confounding bias, the potential confounder(s) for a study must first be identified and measured. But this is not always possible. The unmeasured factors may also exhibit effect modification, and this further complicates the situation. In this paper...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4151

    authors: Lee WC

    update_date:2011-04-30 00:00:00