Abstract:
The classical and most commonly used approach to building prediction intervals is the parametric approach. However, its main drawback is that its validity and performance depend heavily on the assumed functional link between the covariates and the response. This research investigates new methods that improve the performance of prediction intervals with random forests. Two aspects are explored: the method used to build the forest and the method used to build the prediction interval. Four methods to build the forest are investigated: three from the classification and regression tree (CART) paradigm, and the transformation forest method. For CART forests, in addition to the default least-squares splitting rule, two alternative splitting criteria are investigated. We also present and evaluate the performance of five flexible methods for constructing prediction intervals. This yields 20 distinct method variations. To reliably attain the desired confidence level, we include a calibration procedure performed on the out-of-bag information provided by the forest. The 20 method variations are thoroughly investigated and compared to five alternative methods through simulation studies and in real data settings. The results show that the proposed methods are very competitive. They outperform commonly used methods both in simulation settings and with real data.
journal_name: Stat Methods Med Res
journal_title: Statistical methods in medical research
authors: Roy MH, Larocque D
doi: 10.1177/0962280219829885
subject: Has Abstract
pub_date: 2020-01-01 00:00:00
pages: 205-229
issue: 1
eissn: 0962-2802
issn: 1477-0334
journal_volume: 29
pub_type: Journal Article
abstract::This paper reviews the application of statistical models to outbreaks of two common respiratory viral diseases, measles and influenza. For each disease, we look first at its epidemiological characteristics and assess the extent to which these either aid or hinder modelling. We then turn to the models that have been de...
journal_title:Statistical methods in medical research
pub_type: Journal Article, Review
doi:10.1177/096228029300200104
update_date:1993-01-01 00:00:00
abstract::Monte Carlo evaluation of resampling-based tests is often conducted in statistical analysis. However, this procedure is generally computationally intensive. The pooling resampling-based method has been developed to reduce the computational burden but the validity of the method has not been studied before. In this arti...
journal_title:Statistical methods in medical research
pub_type: Journal Article
doi:10.1177/0962280216661876
update_date:2018-05-01 00:00:00
abstract::Researchers and clinicians often need to know whether a new method of measurement is equivalent to an established one that is already in use. For this problem, the estimation of limits of agreement advocated by Bland and Altman is a widely used solution. However, this approach ignores two vital issues in method compar...
journal_title:Statistical methods in medical research
pub_type: Journal Article
doi:10.1177/0962280210379365
update_date:2012-08-01 00:00:00
abstract::The ability to evaluate effects of factors on outcomes is increasingly important for studies that control some but not all of the factors. Although important advances have been made in methods of analysis for such partially controlled studies, work on designs has been limited. To help understand why, we review the mai...
journal_title:Statistical methods in medical research
pub_type: Journal Article
doi:10.1191/0962280205sm405oa
update_date:2005-08-01 00:00:00
abstract::Metagenomics enables the study of gene abundances in complex mixtures of microorganisms and has become a standard methodology for the analysis of the human microbiome. However, gene abundance data is inherently noisy and contains high levels of biological and technical variability as well as an excess of zeros due to ...
journal_title:Statistical methods in medical research
pub_type: Journal Article
doi:10.1177/0962280218811354
update_date:2019-12-01 00:00:00
abstract::Couples with diseases associated with the sexual chromosomes, as well as families in countries where the desire for a male is extreme, are interested in influencing the sex of the baby. We propose an original composite likelihood approach to analyse the relation between sex of the newborn and timing of the intercourse...
journal_title:Statistical methods in medical research
pub_type: Journal Article
doi:10.1177/0962280217702415
update_date:2018-11-01 00:00:00
abstract::Measurement error is a serious problem in the analysis of epidemiological data. In the past 20 years, a large number of methods for the correction of measurement error have been developed. While at the beginning mostly methods for cohort studies were considered, recently more attention has been paid to case-control st...
journal_title:Statistical methods in medical research
pub_type: Journal Article, Review
doi:10.1177/096228020000900504
update_date:2000-10-01 00:00:00
abstract::We review recent work on the application of pseudo-observations in survival and event history analysis. This includes regression models for parameters like the survival function in a single point, the restricted mean survival time and transition or state occupation probabilities in multi-state models, e.g. the competi...
journal_title:Statistical methods in medical research
pub_type: Journal Article, Review
doi:10.1177/0962280209105020
update_date:2010-02-01 00:00:00
abstract::A growing body of evidence suggests that genetic factors have an important influence on the onset and course of smoking. Here we review some of the statistical methods that have been used to test for genetic influences on smoking behaviour, with a particular focus on studies of large national twin samples. We show how...
journal_title:Statistical methods in medical research
pub_type: Journal Article, Review
doi:10.1177/096228029800700205
update_date:1998-06-01 00:00:00
abstract::The Cochran-Armitage (CA) test is commonly used in both epidemiology and genetics to test for linear trend in two-way tables with a binary outcome. There has been increasing interest in the power and size of the test and in determination of sample size, especially when there is potential misclassification in the 'expo...
journal_title:Statistical methods in medical research
pub_type: Journal Article
doi:10.1177/0962280211406424
update_date:2014-06-01 00:00:00
abstract::Covariate-adaptive designs are widely used to balance covariates and maintain randomization in clinical trials. Adaptive designs for discrete covariates and their asymptotic properties have been well studied in the literature. However, important continuous covariates are often involved in clinical studies. Simply disc...
journal_title:Statistical methods in medical research
pub_type: Journal Article
doi:10.1177/0962280218770231
update_date:2019-06-01 00:00:00
abstract::This paper presents a new model-based generalized functional clustering method for discrete longitudinal data, such as multivariate binomial and Poisson distributed data. For this purpose, we propose a multivariate functional principal component analysis (MFPCA)-based clustering procedure for a latent multivariate Gau...
journal_title:Statistical methods in medical research
pub_type: Journal Article
doi:10.1177/0962280220921912
update_date:2020-11-01 00:00:00
abstract::To project national hepatitis C virus (HCV) burden, unbiased estimation of HCV progression to liver cirrhosis is required for the whole community of HCV-infected individuals. However, widely varying estimates of progression rates to cirrhosis have been produced. This disparity is partly associated with the statistical...
journal_title:Statistical methods in medical research
pub_type: Journal Article
doi:10.1177/0962280208094688
update_date:2009-06-01 00:00:00
abstract::In many applications of zero-inflated models, score tests are often used to evaluate whether the population heterogeneity as implied by these models is consistent with the data. The most frequently cited justification for using score tests is that they only require estimation under the null hypothesis. Because this es...
journal_title:Statistical methods in medical research
pub_type: Journal Article
doi:10.1177/0962280220937324
update_date:2020-12-01 00:00:00
abstract::The random effects model in meta-analysis is a standard statistical tool often used to analyze the effect sizes of the quantity of interest if there is heterogeneity between studies. In the special case considered here, meta-analytic data contain only the sample means in two treatment arms and the sample sizes, but no...
journal_title:Statistical methods in medical research
pub_type: Journal Article
doi:10.1177/0962280217718867
update_date:2019-01-01 00:00:00
abstract::In the clinical development of some new infectious disease drugs, early clinical pharmacology trials may predict with high confidence that the efficacious doses are well below the range of the safety margin. In this case, a dose-ranging study may be unnecessary after a proof-of-concept (PoC) study testing the highest ...
journal_title:Statistical methods in medical research
pub_type: Journal Article
doi:10.1177/0962280218807950
update_date:2019-12-01 00:00:00
abstract::Publication bias frequently appears in meta-analyses when the included studies' results (e.g., p-values) influence the studies' publication processes. Some unfavorable studies may be suppressed from publication, so the meta-analytic results may be biased toward an artificially favorable direction. Many statistical tes...
journal_title:Statistical methods in medical research
pub_type: Journal Article
doi:10.1177/0962280220910172
update_date:2020-10-01 00:00:00
abstract::One of the main advantages of Bayesian analyses of clinical trials is their ability to formally incorporate skepticism about large treatment effects through the use of informative priors. We conducted a simulation study to assess the performance of informative normal, Student-t, and beta distributions in estimating r...
journal_title:Statistical methods in medical research
pub_type: Journal Article
doi:10.1177/0962280215620828
update_date:2018-01-01 00:00:00
abstract::Censored data make survival analysis more complicated because exact event times are not observed. Statistical methodology developed to account for censored observations assumes that patients' withdrawal from a study is independent of the event of interest. However, in practice, some covariates might be associated to b...
journal_title:Statistical methods in medical research
pub_type: Journal Article
doi:10.1177/0962280216628900
update_date:2018-02-01 00:00:00
abstract::In many health studies, researchers are interested in estimating the treatment effects on the outcome around and through an intermediate variable. Such causal mediation analyses aim to understand the mechanisms that explain the treatment effect. Although multiple mediators are often involved in real studies, most of t...
journal_title:Statistical methods in medical research
pub_type: Journal Article
doi:10.1177/0962280215615899
update_date:2018-01-01 00:00:00
abstract::In genomic analysis, it is significant though challenging to identify markers associated with cancer outcomes or phenotypes. Based on the biological mechanisms of cancers and the characteristics of datasets, we propose a novel integrative interaction approach under a semiparametric model, in which genetic and environm...
journal_title:Statistical methods in medical research
pub_type: Journal Article
doi:10.1177/0962280220909969
update_date:2020-10-01 00:00:00
abstract::Statistical models of breast cancer tumour progression have been used to further our knowledge of the natural history of breast cancer, to evaluate mammography screening in terms of mortality, to estimate overdiagnosis, and to estimate the impact of lead-time bias when comparing survival times between screen detected ...
journal_title:Statistical methods in medical research
pub_type: Journal Article
doi:10.1177/0962280217734583
update_date:2019-03-01 00:00:00
abstract::Clustered data is not simply correlated data, but has its own unique aspects. In this paper, various methods for correlated receiver operating characteristic (ROC) curve data that have been extended specifically to clustered data are reviewed. For those methods that have not yet been extended, suggestions for their ap...
journal_title:Statistical methods in medical research
pub_type: Journal Article, Review
doi:10.1177/096228029800700402
update_date:1998-12-01 00:00:00
abstract::I comment here on a recent paper in this journal, on the fitting of truncated normal distributions by the EM algorithm. I show that the fitting of such distributions by direct numerical maximization of likelihood (rather than EM) is straightforward, contrary to an assertion made by the authors of that paper. ...
journal_title:Statistical methods in medical research
pub_type: Comment, Letter
doi:10.1177/0962280217712089
update_date:2018-12-01 00:00:00
abstract::In this article, we present an overview and tutorial of statistical methods for meta-analysis of diagnostic tests under two scenarios: (1) when the reference test can be considered a gold standard and (2) when the reference test cannot be considered a gold standard. In the first scenario, we first review the conventio...
journal_title:Statistical methods in medical research
pub_type: Journal Article, Review
doi:10.1177/0962280213492588
update_date:2016-08-01 00:00:00
abstract::Pattern-mixture model (PMM)-based controlled imputations have become a popular tool to assess the sensitivity of primary analysis inference to different post-dropout assumptions or to estimate treatment effectiveness. The methodology is well established for continuous responses but less well established for binary res...
journal_title:Statistical methods in medical research
pub_type: Journal Article
doi:10.1177/0962280220941880
update_date:2020-12-01 00:00:00
abstract::We propose a fully parametric model for the analysis of competing risks data where the types of failure may not be independent. We show how the dependence between the cause-specific survival times can be modelled with a copula function. Features include: identifiability of the problem; accessible understanding of the ...
journal_title:Statistical methods in medical research
pub_type: Journal Article
doi:10.1191/0962280203sm335ra
update_date:2003-08-01 00:00:00
abstract::The analysis of health care costs is complicated by the skewed and heteroscedastic nature of their distribution with possibly additional zero values. Statistical methods that do not adjust for these features can lead to incorrect conclusions. This paper reviews recent developments in statistical methods for the analys...
journal_title:Statistical methods in medical research
pub_type: Journal Article, Review
doi:10.1191/0962280202sm290ra
update_date:2002-08-01 00:00:00
abstract::Cluster analysis via a finite mixture model approach is considered. With this approach to clustering, the data can be partitioned into a specified number of clusters g by first fitting a mixture model with g components. An outright clustering of the data is then obtained by assigning an observation to the component to...
journal_title:Statistical methods in medical research
pub_type: Journal Article
doi:10.1191/0962280204sm372ra
update_date:2004-10-01 00:00:00
abstract::Colorectal cancer is the second leading cause of death from cancer in the United States. To facilitate the efficiency of colorectal cancer screening, there is a need to stratify risk for colorectal cancer among the 90% of US residents who are considered "average risk." In this article, we investigate such risk stratif...
journal_title:Statistical methods in medical research
pub_type: Journal Article
doi:10.1177/0962280213497432
update_date:2016-08-01 00:00:00