R2: a useful measure of model performance when predicting a dichotomous outcome.

Abstract:

R2 has been criticized as a measure of model performance when predicting a dichotomous outcome, both because its value is often low and because it is sensitive to the prevalence of the event of interest. The C statistic is more widely used to measure model performance in a 0/1 setting. We use a simple parametric family of models to illustrate the potential usefulness of models with low R2 values, to clarify the effect of prevalence on both C and R2, and to demonstrate how R2 captures information not picked up by C. We also show that C is subject to a 'random mixing' problem that does not affect R2. Finally, we report both R2 and C values for different risk-adjustment models in situations with different prevalences and show the relationship between the measures and decile death rates, thereby providing a context for interpreting R2 values in a 0/1 setting.
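As a rough illustration of the two measures the abstract compares, the following Python sketch computes R2 and C for a 0/1 outcome. It is not taken from the paper: the function names `r_squared` and `c_statistic` are hypothetical, the toy data are invented, and R2 is assumed here to be 1 - SSE/SST, which may differ from the authors' exact definition. C is computed as the share of (event, non-event) pairs in which the event received the higher predicted probability, with ties counted as one half.

```python
def r_squared(y, p):
    """R2 = 1 - SSE/SST for 0/1 outcomes y and predicted probabilities p (assumed definition)."""
    mean_y = sum(y) / len(y)
    sse = sum((yi - pi) ** 2 for yi, pi in zip(y, p))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - sse / sst


def c_statistic(y, p):
    """Share of (event, non-event) pairs ranked correctly; ties count as one half."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    non_events = [pi for yi, pi in zip(y, p) if yi == 0]
    score = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                score += 1.0
            elif e == n:
                score += 0.5
    return score / (len(events) * len(non_events))


# Invented toy data: eight cases, prevalence 0.5.
y = [0, 0, 0, 1, 0, 1, 1, 1]
p = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]

# Halving every prediction keeps the ranking but makes the
# predictions systematically too low.
p_half = [pi / 2 for pi in p]

print("C  (original):", c_statistic(y, p))
print("C  (halved):  ", c_statistic(y, p_half))
print("R2 (original):", r_squared(y, p))
print("R2 (halved):  ", r_squared(y, p_half))
```

Because C depends only on the ordering of the predictions, it is identical for `p` and `p_half`, while R2 drops when the predictions are halved. This is one simple sense in which R2 can reflect information that C does not.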

journal_name

Stat Med

journal_title

Statistics in medicine

authors

Ash A,Shwartz M

doi

10.1002/(sici)1097-0258(19990228)18:4<375::aid-sim

subject

Has Abstract

pub_date

1999-02-28 00:00:00

pages

375-84

issue

4

eissn

1097-0258

issn

0277-6715

pii

10.1002/(SICI)1097-0258(19990228)18:4<375::AID-SIM

journal_volume

18

pub_type

Journal Article
  • The Integrated Calibration Index (ICI) and related metrics for quantifying the calibration of logistic regression models.

    abstract::Assessing the calibration of methods for estimating the probability of the occurrence of a binary outcome is an important aspect of validating the performance of risk-prediction algorithms. Calibration commonly refers to the agreement between predicted and observed probabilities of the outcome. Graphical methods are a...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.8281

    authors: Austin PC,Steyerberg EW

    updated:2019-09-20 00:00:00

  • REML and ML estimation for clustered grouped survival data.

    abstract::Clustered grouped survival data arise naturally in clinical medicine and biological research. For example, in a randomized clinical trial, the variable of interest is the time to occurrence of a certain event with or without a new treatment and the data are collected from possibly correlated subjects from independent ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.1323

    authors: Lam KF,Ip D

    updated:2003-06-30 00:00:00

  • A Bayesian semiparametric Markov regression model for juvenile dermatomyositis.

    abstract::Juvenile dermatomyositis (JDM) is a rare autoimmune disease that may lead to serious complications, even to death. We develop a 2-state Markov regression model in a Bayesian framework to characterise disease progression in JDM over time and gain a better understanding of the factors influencing disease risk. The trans...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.7613

    authors: De Iorio M,Gallot N,Valcarcel B,Wedderburn L

    updated:2018-05-10 00:00:00

  • Combining biomarkers for classification with covariate adjustment.

    abstract::Combining multiple markers can improve classification accuracy compared with using a single marker. In practice, covariates associated with markers or disease outcome can affect the performance of a biomarker or biomarker combination in the population. The covariate-adjusted receiver operating characteristic (ROC) cur...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.7274

    authors: Kim S,Huang Y

    updated:2017-07-10 00:00:00

  • On the use of the generalized t and generalized rank-sum statistics in medical research.

    abstract::We have used Monte Carlo methods to compare the type I error properties of the conditional and unconditional versions of the generalized t and the generalized rank-sum tests to those of the independent samples t and Wilcoxon rank-sum tests. Results showed inflated type I errors for the conditional generalized tests bu...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780110410

    authors: Blair RC,Morel JG

    updated:1992-02-28 00:00:00

  • Model selection techniques for the covariance matrix for incomplete longitudinal data.

    abstract::In longitudinal studies with incomplete data, where the number of time points can become numerous, it is often advantageous to model the covariance matrix. We describe several covariance models (for example, mixed models, compound symmetry, AR(1)-type models, and combination models) that offer parsimonious alternative...

    journal_title:Statistics in medicine

    pub_type: Journal Article, Review

    doi:10.1002/sim.4780141302

    authors: Grady JJ,Helms RW

    updated:1995-07-15 00:00:00

  • A joint modeling and estimation method for multivariate longitudinal data with mixed types of responses to analyze physical activity data generated by accelerometers.

    abstract::A mixed effect model is proposed to jointly analyze multivariate longitudinal data with continuous, proportion, count, and binary responses. The association of the variables is modeled through the correlation of random effects. We use a quasi-likelihood type approximation for nonlinear variables and transform the prop...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.7401

    authors: Li H,Zhang Y,Carroll RJ,Keadle SK,Sampson JN,Matthews CE

    updated:2017-11-10 00:00:00

  • Tests for individual and population bioequivalence based on generalized p-values.

    abstract::The U.S. Food and Drug Administration (FDA) has proposed new regulations that address the 'prescribability' and 'switchability' of new formulations of already-approved drugs. These new criteria are known, respectively, as population and individual bioequivalence. Two methods have been proposed in the bioequivalence li...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.1346

    authors: McNally RJ,Iyer H,Mathew T

    updated:2003-01-15 00:00:00

  • Some extensions and applications of a Bayesian strategy for monitoring multiple outcomes in clinical trials.

    abstract::We present some practical extensions and applications of a strategy proposed by Thall, Simon and Estey for designing and monitoring single-arm clinical trials with multiple outcomes. We show by application how the strategy may be applied to construct designs for phase IIA activity trials and phase II equivalence trial...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(sici)1097-0258(19980730)17:14<1563::aid-s

    authors: Thall PF,Sung HG

    updated:1998-07-30 00:00:00

  • Application of the parallel line assay to assessment of biosimilar products based on binary endpoints.

    abstract::Biological drug products are therapeutic moieties manufactured by a living system or organisms. These are important life-saving drug products for patients with unmet medical needs. Because of expensive cost, only a few patients have access to life-saving biological products. Most of the early biological products will ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.5565

    authors: Lin JR,Chow SC,Chang CH,Lin YC,Liu JP

    updated:2013-02-10 00:00:00

  • Nowcasting influenza epidemics using non-homogeneous hidden Markov models.

    abstract::Timeliness of a public health surveillance system is one of its most important characteristics. The process of predicting the present situation using available incomplete information from surveillance systems has received the term nowcasting and has high public health interest. Generally in Europe, general practitione...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.5670

    authors: Nunes B,Natário I,Lucília Carvalho M

    updated:2013-07-10 00:00:00

  • A flexible, interpretable framework for assessing sensitivity to unmeasured confounding.

    abstract::When estimating causal effects, unmeasured confounding and model misspecification are both potential sources of bias. We propose a method to simultaneously address both issues in the form of a semi-parametric sensitivity analysis. In particular, our approach incorporates Bayesian Additive Regression Trees into a two-p...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.6973

    authors: Dorie V,Harada M,Carnegie NB,Hill J

    updated:2016-09-10 00:00:00

  • On the propensity score weighting analysis with survival outcome: Estimands, estimation, and inference.

    abstract::Propensity score analysis is widely used in observational studies to adjust for confounding and estimate the causal effect of a treatment on the outcome. When the outcome is survival time, there are special considerations on the definition of the causal estimand, point, and variance estimation that have not been thoro...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.7839

    authors: Mao H,Li L,Yang W,Shen Y

    updated:2018-11-20 00:00:00

  • Internal pilot studies I: type I error rate of the naive t-test.

    abstract::When sample size is recalculated using unblinded interim data, use of the usual t-test at the end of a study may lead to an elevated type I error rate. This paper describes a numerical quadrature investigation to calculate the true probability of rejection as a function of the time of the recalculation, the magnitude ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(sici)1097-0258(19991230)18:24<3481::aid-s

    authors: Wittes J,Schabenberger O,Zucker D,Brittain E,Proschan M

    updated:1999-12-30 00:00:00

  • Stratified analysis of multivariate clinical data: application of a Mantel-Haenszel approach.

    abstract::Laboratory determinations on children aged 6 to 10 years obtained over a 5-year period are analysed by a method described in detail for differentiating between children from exposed and control areas of Seveso, Italy. In the analysis, stratification is employed to distinguish the separate days of laboratory measuremen...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780020221

    authors: Mantel N,Mocarelli P,Marocchi A,Brambilla P,Baretta R

    updated:1983-04-01 00:00:00

  • Nonparametric covariate hypothesis tests for the cure rate in mixture cure models.

    abstract::In lifetime data, like cancer studies, there may be long term survivors, which lead to heavy censoring at the end of the follow-up period. Since a standard survival model is not appropriate to handle these data, a cure model is needed. In the literature, covariate hypothesis tests for cure models are limited to parame...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.8530

    authors: López-Cheda A,Jácome MA,Van Keilegom I,Cao R

    updated:2020-07-30 00:00:00

  • Network analytic methods for epidemiological risk assessment.

    abstract::The authors measure the efficacy of three methods for predicting the time to infection for susceptible individuals in a population undergoing an HIV epidemic. The methods differ in whether they require detailed information of the contact network and whether they require knowledge of the initial source of infection. Ef...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780130107

    authors: Altmann M,Wee BC,Willard K,Peterson D,Gatewood LC

    updated:1994-01-15 00:00:00

  • Design evaluation and optimisation in crossover pharmacokinetic studies analysed by nonlinear mixed effects models.

    abstract::Bioequivalence or interaction trials are commonly studied in crossover design and can be analysed by nonlinear mixed effects models as an alternative to noncompartmental approach. We propose an extension of the population Fisher information matrix in nonlinear mixed effects models to design crossover pharmacokinetic t...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4390

    authors: Nguyen TT,Bazzoli C,Mentré F

    updated:2012-05-20 00:00:00

  • Measuring vaccine efficacy from epidemics of acute infectious agents.

    abstract::A good measure of field vaccine efficacy should evaluate the direct protective effect of vaccination on the person who receives the vaccine. The conventional estimator for vaccine efficacy depends on population level factors that are either unrelated or indirectly related to the direct biological action of the vaccine...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780120309

    authors: Longini IM Jr,Halloran ME,Haber M,Chen RT

    updated:1993-02-01 00:00:00

  • Parametric randomization-based methods for correcting for treatment changes in the assessment of the causal effect of treatment.

    abstract::We develop parametric maximum likelihood methods to adjust for treatment changes during follow-up in order to assess the causal effect of treatment in clinical trials with time-to-event outcomes. The accelerated failure time model of Robins and Tsiatis relates each observed event time to the underlying event time that...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.1618

    authors: Walker AS,White IR,Babiker AG

    updated:2004-02-28 00:00:00

  • Last observation carry-forward and last observation analysis.

    abstract::Drop-out often occurs in clinical trials with multiple visits and drop-out is often informative in the sense that the population of patients who dropped out is different from the population of patients who completed the study. To handle data with informative drop-out, an intention-to-treat analysis, which evaluates tr...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.1519

    authors: Shao J,Zhong B

    updated:2003-08-15 00:00:00

  • Combining individual and aggregated data to investigate the role of socioeconomic disparities on cancer burden in Italy.

    abstract::Quantifying socioeconomic disparities and understanding the roots of inequalities are growing topics in cancer research. However, socioeconomic differences are challenging to investigate mainly due to the lack of accurate data at individual-level, while aggregate indicators are only partially informative. We implement...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.8392

    authors: Mezzetti M,Palli D,Dominici F

    updated:2020-01-15 00:00:00

  • Non-parametric methods for recurrent event data with informative and non-informative censorings.

    abstract::Recurrent event data are commonly encountered in health-related longitudinal studies. In this paper time-to-events models for recurrent event data are studied with non-informative and informative censorings. In statistical literature, the risk set methods have been confirmed to serve as an appropriate and efficient ap...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.1029

    authors: Wang MC,Chiang CT

    updated:2002-02-15 00:00:00

  • Group sequential designs for cure rate models with early stopping in favour of the null hypothesis.

    abstract::Ewell and Ibrahim derived the large sample distribution of the logrank statistic under general local alternatives. Their asymptotic results enable us to extend several group sequential designs which allow for early stopping in favour of the null hypothesis to the setting in which the cure rate model is appropriate. In...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/1097-0258(20001130)19:22<3023::aid-sim638>

    authors: Patricia Bernardo MV,Ibrahim JG

    updated:2000-11-30 00:00:00

  • On prediction of future observation in growth curve model.

    abstract::Rao proposed and compared several approaches for predicting future observations in a growth curve model. The assessment of associated prediction efficiency for different prediction methods were evaluated by Cross-Validation Assessment Error (CVAE). He used three data sets, each with a limited number of subjects (13-27...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780132103

    authors: Tian JJ,Shukla R,Buncher CR

    updated:1994-11-15 00:00:00

  • True verification probabilities should not be used in estimating the area under receiver operating characteristic curve.

    abstract::In medical research, a two-phase study is often used for the estimation of the area under the receiver operating characteristic curve (AUC) of a diagnostic test. However, such a design introduces verification bias. One of the methods to correct verification bias is inverse probability weighting (IPW). Since the probab...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.8700

    authors: Wu Y

    updated:2020-11-30 00:00:00

  • Model mis-specification and overestimation of the intraclass correlation coefficient in cluster randomized trials.

    abstract::Intraclass correlation coefficient (ICC) estimates must be provided when reporting the results of a cluster randomized trial. This study demonstrates that estimating this parameter with one-way ANOVA and an underlying mixed-effects statistical model leads to biased estimates. The bias depends on the effect size of the...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.2260

    authors: Giraudeau B

    updated:2006-03-30 00:00:00

  • Combining classification trees using MLE.

    abstract::We propose a probability distribution for an equivalence class of classification trees (that is, those that ignore the value of the cutpoints but retain tree structure). This distribution is parameterized by a central tree structure representing the true model, and a precision or concentration coefficient representing...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(sici)1097-0258(19990330)18:6<727::aid-sim

    authors: Shannon WD,Banks D

    updated:1999-03-30 00:00:00

  • Quantifying degrees of necessity and of sufficiency in cause-effect relationships with dichotomous and survival outcomes.

    abstract::We suggest measures to quantify the degrees of necessity and of sufficiency of prognostic factors for dichotomous and for survival outcomes. A cause, represented by certain values of prognostic factors, is considered necessary for an event if, without the cause, the event cannot develop. It is considered sufficient fo...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.8331

    authors: Gleiss A,Schemper M

    updated:2019-10-15 00:00:00

  • Beta-binomial/Poisson regression models for repeated bivariate counts.

    abstract::We analyze data obtained from a study designed to evaluate training effects on the performance of certain motor activities of Parkinson's disease patients. Maximum likelihood methods were used to fit beta-binomial/Poisson regression models tailored to evaluate the effects of training on the numbers of attempted and su...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.3303

    authors: Lora MI,Singer JM

    updated:2008-07-30 00:00:00