Abstract:
In observational studies, many continuous or categorical covariates may be related to an outcome. Various spline-based procedures or the multivariable fractional polynomial (MFP) procedure can be used to identify important variables and functional forms for continuous covariates. This is the main aim of an explanatory model, as opposed to a model only for prediction. The type of analysis often guides the complexity of the final model. Spline-based procedures and MFP have tuning parameters for choosing the required complexity. To compare model selection approaches, we perform a simulation study in the linear regression context based on a data structure intended to reflect realistic biomedical data. We vary the sample size, variance explained and complexity parameters for model selection. We consider 15 variables. A sample size of 200 (1000) and R² = 0.2 (0.8) is the scenario with the smallest (largest) amount of information. For assessing performance, we consider prediction error, correct and incorrect inclusion of covariates, qualitative measures for judging selected functional forms and further novel criteria. From limited information, a suitable explanatory model cannot be obtained. Prediction performance from all types of models is similar. With a medium amount of information, MFP performs better than splines on several criteria. MFP better recovers simpler functions, whereas splines better recover more complex functions. For a large amount of information and no local structure, MFP and the spline procedures often select similar explanatory models.
journal_name:Stat Med
journal_title:Statistics in medicine
authors:Binder H, Sauerbrei W, Royston P
doi:10.1002/sim.5639
subject:Has Abstract
pub_date:2013-06-15 00:00:00
pages:2262-77
issue:13
eissn:0277-6715
issn:1097-0258
journal_volume:32
pub_type: Journal Article
abstract::Two features commonly exhibited by randomized trials of health promotion interventions are cluster randomization and stratification. Ignoring correlations between individuals within clusters can lead to an inflated type I error rate and hence a P-value which overstates the significance of the result. This paper compar...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.1256
update_date:2002-12-30 00:00:00
abstract::Current methods for statistical analysis of twin studies focus on continuous and dichotomous data, while only limited methodology exists for analysing multinomial data. As a consequence, investigators are often tempted to collapse multinomial data into two categories simply to facilitate the analysis. We address this ...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/1097-0258(20010130)20:2<249::aid-sim641>3.
update_date:2001-01-30 00:00:00
abstract::We give up-to-date methods for estimating the age-specific incidence of a disease and for estimating the effect of risk factors. We recommend taking age as the basic time scale of the analysis; then, the hazard function can be interpreted as the age-specific incidence of the disease. This choice raises a delayed entry...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/(sici)1097-0258(19980915)17:17<1973::aid-s
update_date:1998-09-15 00:00:00
abstract::Shared random effects models have been increasingly common in the joint analyses of repeated measures (e.g. CD4 counts, hemoglobin levels) and a correlated failure time such as death. In this paper we study several shared random effects models in the multi-level repeated measures data setting with dependent failure ti...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.3392
update_date:2008-11-29 00:00:00
abstract::Prevention studies, as distinguished from studies investigating treatments for established disease, present some distinct challenges. Perhaps the most extensive experience with preventive agents is in the area of infectious diseases; vaccines have been extremely effective in preventing many such diseases. Vaccines hav...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.1717
update_date:2004-01-30 00:00:00
abstract::In vitro fertilization (IVF) is an increasingly common method of assisted reproductive technology. Because of the careful observation and follow-up required as part of the procedure, IVF studies provide an ideal opportunity to identify and assess clinical and demographic factors along with environmental exposures that...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.6050
update_date:2014-05-10 00:00:00
abstract::Mediation analysis is a standard approach to understanding how and why an intervention works in social and medical sciences. However, the presence of missing data, especially missing not at random data, poses a great challenge for the applicability of this approach in practice. Current methods for handling such missin...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.7413
update_date:2017-11-10 00:00:00
abstract::I discuss alternatives to the one compartment model, delta Yt = alpha + beta exp(- gamma t). Instead of comparing the one and two compartment models, I derive statistics for testing mixtures of the parameters (beta, gamma) in the one compartment model. I apply the proposed methods to the problem of hydrogen clearance ...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.4780080811
update_date:1989-08-01 00:00:00
abstract::In medical research, risk difference (RD) and number needed to treat (NNT) measures for survival times have been mainly proposed without consideration of covariates. In this paper, we develop adjusted RD and NNT measures for use in observational studies with survival time outcomes within the framework of the Cox propo...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.3793
update_date:2010-03-30 00:00:00
abstract::The use of data from multiple studies or centers for the validation of a clinical test or a multivariable prediction model allows researchers to investigate the test's/model's performance in multiple settings and populations. Recently, meta-analytic techniques have been proposed to summarize discrimination and calibra...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.7653
update_date:2018-05-30 00:00:00
abstract::Recently, there has been much work on early phase cancer designs that incorporate both toxicity and efficacy data, called phase I-II designs because they combine elements of both phases. However, they do not explicitly address the phase II hypothesis test of H0: p ≤ p0, where p is the probability of efficacy at the ...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.6124
update_date:2014-07-20 00:00:00
abstract::We propose a joint modeling approach to investigating the observed and latent risk factors of mixed types of outcomes. The proposed model comprises three parts. The first part is an exploratory factor analysis model that summarizes latent factors through multiple observed variables. The second part is a proportional h...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.8840
update_date:2020-12-09 00:00:00
abstract::A preference trial is a special form of cross-over trial where clinical conditions determine when patients change treatment, in a prescribed order. This can be modelled using a geometric distribution. The model can be simply fitted using standard logistic regression methodology. The procedure is applied to a trial stu...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/(SICI)1097-0258(19960229)15:4<443::AID-SIM
update_date:1996-02-28 00:00:00
abstract::The semi-Markov assumption emphasizes the importance of time spent in a state. In order to compute this type of multistate model, most transition times are always considered to be exactly identified or right censored. However, in the longitudinal analysis of chronic diseases, investigators are often confronted with in...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.3100
update_date:2007-12-30 00:00:00
abstract::Graphical methods are often used to check goodness-of-fit of models to data. It is common to plot residuals against a reference distribution so that when the model fits the data, the configuration should be close to a straight line. Since the resemblance to a straight line is often unclear, it has been suggested to ad...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.4780141607
update_date:1995-08-30 00:00:00
abstract::It is often assumed that randomisation will prevent bias in estimation of treatment effects from clinical trials, but this is not true of the semiparametric Proportional Hazards model for survival data when there is underlying risk heterogeneity. Here, a new formula is proposed for estimation of this bias, improving o...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.7343
update_date:2017-09-20 00:00:00
abstract::Meta-analyses of data from epidemiological studies are often based on odds ratios (ORs) or relative risks (RRs) and their 95 per cent confidence intervals (CIs) as reported by the authors. Where possible ORs, RRs and CIs should be checked against the source data. Some simple methods are presented for checking the vali...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/(sici)1097-0258(19990815)18:15<1973::aid-s
update_date:1999-08-15 00:00:00
abstract::The continual reassessment method (CRM) is an adaptive design for Phase I trials whose operating characteristics, including appropriate sample size, probability of correctly identifying the maximum tolerated dose, and the expected proportion of participants assigned to each dose, can only be determined via simulation....
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.8746
update_date:2020-09-16 00:00:00
abstract::In epidemiological studies we often want to learn about the direct effect of an exposure on an outcome, i.e. the effect that is not relayed by a specific intermediate variable. In the literature, there are two common definitions of direct effects; controlled and natural. When the intermediate variable and the outcome ...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.3493
update_date:2009-02-15 00:00:00
abstract::During a course of human immunodeficiency virus (HIV-1) infection, the viral load usually increases sharply to a peak following infection and then drops rapidly to a steady state, where it remains until progression to AIDS. This steady state is often referred to as the viral set point. It is believed that the HIV vira...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.3038
update_date:2008-01-15 00:00:00
abstract::We propose a goodness-of-fit test statistic for linear regression with heterogeneous variance, which is asymptotically chi-square if the given model is correct. The test statistic is computed as a quadratic form of observed minus predicted responses. We apply the method to a linear regression for an ordinal categorica...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.4780130205
update_date:1994-01-30 00:00:00
abstract::The 'landmark' and 'Simon and Makuch' non-parametric estimators of the survival function are commonly used to contrast the survival experience of time-dependent treatment groups in applications such as stem cell transplant versus chemotherapy in leukemia. However, the theoretical survival functions corresponding to th...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.6765
update_date:2016-03-30 00:00:00
abstract::Results from an analysis of traffic accidents from a study of the police records of four police stations in the Bangkok metropolis are presented. The main emphasis in this study was put on the development of a measure for traffic accident density. The traffic flow was estimated at the various study locations by traine...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.4780142113
update_date:1995-11-15 00:00:00
abstract::Rao proposed and compared several approaches for predicting future observations in a growth curve model. The prediction efficiency of the different methods was assessed by Cross-Validation Assessment Error (CVAE). He used three data sets, each with a limited number of subjects (13-27...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.4780132103
update_date:1994-11-15 00:00:00
abstract::Sam Greenhouse joined the Census Bureau as a clerk at an interesting time period for the agency. The first use of sampling in the decennial census occurred in 1940. There was a major expansion of the amount of data collected. The organization of the Census Bureau underwent radical changes, including the growth of the ...
journal_title:Statistics in medicine
pub_type: Biography, Historical Article, Journal Article
doi:10.1002/sim.1627
update_date:2003-11-15 00:00:00
abstract::The "some invalid, some valid instrumental variable estimator" (sisVIVE) is a lasso-based method for instrumental variables (IVs) regression of outcome on an exposure. In principle, sisVIVE is robust to some of the IVs in the analysis being invalid, in the sense of being related to the outcome variable through pathway...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.8066
update_date:2019-04-30 00:00:00
abstract::Conditional power and predictive power provide estimates of the probability of success at the end of the trial based on the information from the interim analysis. The observed value of the time to event endpoint at the interim analysis could be biased for the true treatment effect due to early censoring, leading to a ...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.7673
update_date:2018-08-15 00:00:00
abstract::Huntington's disease (HD) is a neurodegenerative disorder with a dominant genetic mode of inheritance caused by an expansion of CAG repeats on chromosome 4. Typically, a longer sequence of CAG repeat length is associated with increased risk of experiencing earlier onset of HD. Previous studies of the association betwe...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.5971
update_date:2014-04-15 00:00:00
abstract::Multilevel mixed effects survival models are used in the analysis of clustered survival data, such as repeated events, multicenter clinical trials, and individual participant data (IPD) meta-analyses, to investigate heterogeneity in baseline risk and covariate effects. In this paper, we extend parametric frailty model...
journal_title:Statistics in medicine
pub_type: Journal Article
doi:10.1002/sim.6191
update_date:2014-09-28 00:00:00
abstract::Much has been published on various aspects of data analysis and reporting from clinical trials within the biopharmaceutical environment. This ranges from regulatory guidelines on the format and content of registration dossiers to recommendations on data presentation and the statistical methodologies that are appropria...
journal_title:Statistics in medicine
pub_type: Journal Article, Review
doi:10.1002/(sici)1097-0258(19980815/30)17:15/16<1829:
update_date:1998-08-15 00:00:00