Extending the c-statistic to nominal polytomous outcomes: the Polytomous Discrimination Index.

Abstract:

Diagnostic problems in medicine are sometimes polytomous, meaning that the outcome has more than two distinct categories. For example, ovarian tumors can be benign, borderline, primary invasive, or metastatic. Extending the main measure of binary discrimination, the c-statistic or area under the ROC curve, to nominal polytomous settings is not straightforward. This paper reviews existing measures and presents the polytomous discrimination index (PDI) as an alternative. The PDI assesses all sets of k cases consisting of one case from each outcome category. For each category i (i = 1, …, k), it is assessed whether the risk of category i is highest for the case from category i. A score of 1/k is given per category for which this holds, yielding a set score between 0 and 1 to indicate the level of discrimination. The PDI is the average set score and is interpreted as the probability of correctly identifying a case from a randomly selected category within a set of k cases. This probability can be split up by outcome category, yielding k category-specific values that result in the PDI when averaged. We demonstrate the measures on two diagnostic problems (residual mass histology after chemotherapy for testicular cancer; diagnosis of ovarian tumors). We compare the behavior of the measures on theoretical data, showing that PDI is more strongly influenced by simultaneous discrimination between all categories than by partial discrimination between pairs of categories. In conclusion, the PDI is attractive because it better matches the requirements of a measure to summarize polytomous discrimination.
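The set-based definition in the abstract can be sketched as a brute-force computation in Python. This is a minimal illustration, not the authors' implementation: the function name `pdi`, the input layout (one row of k predicted risks per case, integer labels 0..k-1), and the tie rule (credit for a category is split equally among the cases tied for the highest risk) are all assumptions made for this example. The number of sets is the product of the category sizes, so full enumeration is only practical for small samples.

```python
from itertools import product


def pdi(probs, labels, k):
    """Brute-force polytomous discrimination index (illustrative sketch).

    probs[n][i] is the predicted risk of category i for case n;
    labels[n] is the true category of case n, coded 0..k-1.
    Returns (overall PDI, list of k category-specific values).
    """
    # Group the risk vectors by true outcome category.
    groups = [[p for p, y in zip(probs, labels) if y == i] for i in range(k)]
    if any(len(g) == 0 for g in groups):
        raise ValueError("each category needs at least one case")

    cat_scores = [0.0] * k
    n_sets = 0
    # Every set holds exactly one case from each category.
    for case_set in product(*groups):
        n_sets += 1
        for i in range(k):
            # Risk of category i for each of the k cases in the set.
            risks_i = [case[i] for case in case_set]
            top = max(risks_i)
            # Credit category i if its own case has the highest risk.
            # Assumption: ties share the credit equally.
            if risks_i[i] == top:
                cat_scores[i] += 1.0 / risks_i.count(top)

    cat_pdi = [s / n_sets for s in cat_scores]
    # The overall PDI is the average of the category-specific values.
    return sum(cat_pdi) / k, cat_pdi


# Toy data with perfect separation between three categories.
toy_probs = [
    [0.8, 0.1, 0.1], [0.7, 0.2, 0.1],   # true category 0
    [0.1, 0.8, 0.1], [0.2, 0.6, 0.2],   # true category 1
    [0.1, 0.1, 0.8], [0.2, 0.2, 0.6],   # true category 2
]
toy_labels = [0, 0, 1, 1, 2, 2]
overall, per_category = pdi(toy_probs, toy_labels, 3)
print(overall, per_category)
```

Under the assumed tie rule, a model that assigns identical risks to every case scores 1/k, which matches the chance level implied by the 1/k-per-category scoring.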

journal_name

Stat Med

journal_title

Statistics in medicine

authors

Van Calster B,Van Belle V,Vergouwe Y,Timmerman D,Van Huffel S,Steyerberg EW

doi

10.1002/sim.5321

subject

Has Abstract

pub_date

2012-10-15 00:00:00

pages

2610-26

issue

23

eissn

1097-0258

issn

0277-6715

journal_volume

31

pub_type

Journal Article
  • On the statistical analysis of allelic-loss data.

    abstract::This paper concerns the statistical analysis of certain binary data arising in molecular studies of cancer. In allelic-loss experiments, tumour cell genomes are analysed at informative molecular marker loci to identify deleted chromosomal regions. The resulting binary data are used to infer properties of putative supp...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(sici)1097-0258(19980715)17:13<1425::aid-s

    authors: Newton MA,Gould MN,Reznikoff CA,Haag JD

    update_date: 1998-07-15 00:00:00

  • Accumulating evidence from independent studies: what we can win and what we can lose.

    abstract::When asking 'what is known' about a drug or therapy or program at any time, both researchers and practitioners often confront more than a single study. Facing a variety of findings, where conflicts may outweigh agreement, how can a reviewer constructively approach the task? In this discussion, I will outline some ques...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780060304

    authors: Light RJ

    update_date: 1987-04-01 00:00:00

  • Predictive accuracy of risk factors and markers: a simulation study of the effect of novel markers on different performance measures for logistic regression models.

    abstract::The change in c-statistic is frequently used to summarize the change in predictive accuracy when a novel risk factor is added to an existing logistic regression model. We explored the relationship between the absolute change in the c-statistic, Brier score, generalized R(2) , and the discrimination slope when a risk f...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.5598

    authors: Austin PC,Steyerberg EW

    update_date: 2013-02-20 00:00:00

  • Efficient interval estimation of a ratio of marginal probabilities in matched-pair data: non-iterative method.

    abstract::Matched-pair designs have been commonly employed in diagnostic, epidemiologic and laboratory studies. For estimation of a ratio of two marginal probabilities in matched-pair data, a Wald-type logarithmic method is computationally simple, but an actual coverage rate is known to be smaller than a nominal one and a lengt...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.3685

    authors: Nam JM

    update_date: 2009-10-15 00:00:00

  • Comparison of trends in HIV infection for two risk categories.

    abstract::Sensible plans for health-care needs and determination of priorities for expenditure require regular assessment of trends in HIV incidences. In particular, trends in the relative HIV incidences of different risk categories are useful when assessing whether current control strategies are working equally well for all ri...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(SICI)1097-0258(19960830)15:16<1779::AID-S

    authors: Becker NG,Cui JS

    update_date: 1996-08-30 00:00:00

  • A regression model for multivariate random length data.

    abstract::Multivariate random length data occur when we observe multiple measurements of a quantitative variable and the variable number of these measurements is also an observed outcome for each experimental unit. For example, for a patient with coronary artery disease, we may observe a number of lesions in that patient's coro...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(sici)1097-0258(19990130)18:2<199::aid-sim

    authors: Barnhart HX,Kosinski AS,Sampson AR

    update_date: 1999-01-30 00:00:00

  • Estimation of the mediation effect with a binary mediator.

    abstract::A mediator acts as a third variable in the causal pathway between a risk factor and an outcome. In this paper, we consider the estimation of the mediation effect when the mediator is a binary variable. We give a precise definition of the mediation effect and examine asymptotic properties of five different estimators o...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.2730

    authors: Li Y,Schneider JA,Bennett DA

    update_date: 2007-08-15 00:00:00

  • Complete imputation of missing repeated categorical data: one-sample applications.

    abstract::Longitudinal studies with repeated measures are often subject to non-response. Methods currently employed to alleviate the difficulties caused by missing data are typically unsatisfactory, especially when the cause of the missingness is related to the outcomes. We present an approach for incomplete categorical data in...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.982

    authors: West CP,Dawson JD

    update_date: 2002-01-30 00:00:00

  • Subgroup identification using covariate-adjusted interaction trees.

    abstract::We consider the problem of identifying subgroups of participants in a clinical trial that have enhanced treatment effect. Recursive partitioning methods that recursively partition the covariate space based on some measure of between groups treatment effect difference are popular for such subgroup identification. The m...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.8214

    authors: Steingrimsson JA,Yang J

    update_date: 2019-09-20 00:00:00

  • Estimating the stenosis probabilities in arteriosclerosis obliterans using generalized estimating equations.

    abstract::For each of 211 arteriosclerosis obliterans patients, the degree of stenosis of arteries at four sites were examined at Hiroshima University Hospital to analyse the relationship between the degree of stenosis and age, sex and site. The generalized estimating equations using a proportional odds model for the stenosis p...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.1363

    authors: Nakashima E,Tsuji S,Fukuoka H,Ohtaki M,Ito K

    update_date: 2003-07-15 00:00:00

  • Comparing onset of antidepressant action using a repeated measures approach and a traditional assessment schedule.

    abstract:BACKGROUND:It has been recommended that onset of antidepressant action be assessed using survival analyses with assessments taken at least twice per week. However, such an assessment schedule is problematic to implement. The present study assessed the feasibility of comparing onset of action between treatments using a ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.2309

    authors: Mallinckrodt CH,Detke MJ,Kaiser CJ,Watkin JG,Molenberghs G,Carroll RJ

    update_date: 2006-07-30 00:00:00

  • A note on sample size calculations for cluster randomised crossover trials with a fixed number of clusters.

    abstract::Girardeau, Ravaud and Donner in 2008 presented a formula for sample size calculations for cluster randomised crossover trials, when the intracluster correlation coefficient, interperiod correlation coefficient and mean cluster size are specified in advance. However, in many randomised trials, the number of clusters is...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.8191

    authors: Kelly TL,Pratt N

    update_date: 2019-08-15 00:00:00

  • rhDNase as an example of recurrent event analysis.

    abstract::We consider counting process methods for analysing time-to-event data with multiple or recurrent outcomes, using the models developed by Anderson and Gill, Wei, Lin and Weissfeld and Prentice, Williams and Peterson. We compare the methods, and show how to implement them using popular statistical software programs. By ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(sici)1097-0258(19970930)16:18<2029::aid-s

    authors: Therneau TM,Hamilton SA

    update_date: 1997-09-30 00:00:00

  • Estimating the sample size for a t-test using an internal pilot.

    abstract::If the sample size for a t-test is calculated on the basis of a prior estimate of the variance then the power of the test at the treatment difference of interest is not robust to misspecification of the variance. We propose a t-test for a two-treatment comparison based on Stein's two-stage test which involves the use ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(sici)1097-0258(19990715)18:13<1575::aid-s

    authors: Denne JS,Jennison C

    update_date: 1999-07-15 00:00:00

  • Development and applications of a city-level alcohol availability and alcohol problems database.

    abstract::Data on alcohol availability and problems in all cities in Los Angeles County were collected from several different sources and linked together to form a Local Alcohol Availability Database (LAAD). The two major purposes of the project are to provide a city-level alcohol availability and alcohol-related problems datab...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780140517

    authors: MacKinnon DP,Scribner R,Taft KA

    update_date: 1995-03-15 00:00:00

  • Reinforcement learning design for cancer clinical trials.

    abstract::We develop reinforcement learning trials for discovering individualized treatment regimens for life-threatening diseases such as cancer. A temporal-difference learning method called Q-learning is utilized that involves learning an optimal policy from a single training set of finite longitudinal patient trajectories. A...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.3720

    authors: Zhao Y,Kosorok MR,Zeng D

    update_date: 2009-11-20 00:00:00

  • Reducing false alarms in syndromic surveillance.

    abstract::Algorithms for identifying public health threats or disease outbreaks are vulnerable to false alarms arising from sudden shifts in health-care utilization or data participation. This paper describes a method of reducing false alerts in automated public health surveillance algorithms, and in particular, automated syndr...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4204

    authors: Peter W,Najmi AH,Burkom HS

    update_date: 2011-06-30 00:00:00

  • Power analyses for longitudinal study designs with missing data.

    abstract::Existing methods for power analysis for longitudinal study designs are limited in that they do not adequately address random missing data patterns. Although the pattern of missing data can be assessed during data analysis, it is unknown during the design phase of a study. The random nature of the missing data pattern ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.2773

    authors: Tu XM,Zhang J,Kowalski J,Shults J,Feng C,Sun W,Tang W

    update_date: 2007-07-10 00:00:00

  • A frailty model for recurrent events during alternating restraint and non-restraint time periods.

    abstract::We consider recurrent events of the same type that occur during alternating restraint and non-restraint time periods. This research is motivated by a study on juvenile recidivism, where the probationers were followed for re-offenses during alternating placement periods and free-time periods. During the placement perio...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.7150

    authors: Li X,Chen Y,Li R

    update_date: 2017-02-20 00:00:00

  • An easy-to-implement approach for analyzing case-control and case-only studies assuming gene-environment independence and Hardy-Weinberg equilibrium.

    abstract::The case-control study is a simple and useful method to characterize the effect of a gene, the effect of an exposure, as well as the interaction between the two. The control-free case-only study is yet an even simpler design, if interest is centered on gene-environment interaction only. It requires the sometimes pl...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4028

    authors: Lee WC,Wang LY,Cheng KF

    update_date: 2010-10-30 00:00:00

  • Differences in surrogate threshold effect estimates between original and simplified correlation-based validation approaches.

    abstract::Surrogate endpoint validation has been well established by the meta-analytical correlation-based approach as outlined in the seminal work of Buyse et al. (Biostatistics, 2000). Surrogacy can be assumed if strong associations on individual and study levels can be demonstrated. Alternatively, if an effect on a true endp...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.6778

    authors: Schürmann C,Sieben W

    update_date: 2016-03-30 00:00:00

  • Comparison of predictive values of two diagnostic tests from the same sample of subjects using weighted least squares.

    abstract::Screening and diagnostic tests are important in disease prevention or control. The predictive values of positive and negative (PPV and NPV) test results are two of four operational characteristics of a screening test. We review an existing method based on the generalized estimating equation (GEE) methodology for compa...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.2332

    authors: Wang W,Davis CS,Soong SJ

    update_date: 2006-07-15 00:00:00

  • Racial/ethnic disparities in vaccination coverage by 19 months of age: an evaluation of the impact of missing data resulting from record scattering.

    abstract::We describe how trends in the vaccination coverage at 19 months of age vary by race/ethnicity; explore the extent to which data required to evaluate a child's up-to-date vaccination status is missing as a result of the scattering of vaccination records among many vaccination providers; evaluate how the prevalence of t...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.3223

    authors: Smith PJ,Stevenson J

    update_date: 2008-09-10 00:00:00

  • Personalized dose selection in radiation therapy using statistical models for toxicity and efficacy with dose and biomarkers as covariates.

    abstract::Selection of dose for cancer patients treated with radiation therapy (RT) must balance the increased efficacy with the increased toxicity associated with higher dose. Historically, a single dose has been selected for a population of patients (e.g., all stage III non-small cell lung cancer). However, the availability o...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.6285

    authors: Schipper MJ,Taylor JM,TenHaken R,Matuzak MM,Kong FM,Lawrence TS

    update_date: 2014-12-30 00:00:00

  • Elasticity as a measure for online determination of remission points in ongoing epidemics.

    abstract::The correct identification of change-points during ongoing outbreak investigations of infectious diseases is a matter of paramount importance in epidemiology, with major implications for the management of health care resources, public health and, as the COVID-19 pandemic has shown, social life. Onsets, peaks, and infl...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.8807

    authors: Veres-Ferrer EJ,Pavía JM

    update_date: 2021-02-20 00:00:00

  • The impact of heterogeneity on the comparison of survival times.

    abstract::We consider several sources of heterogeneity in a clinical trial with patients' survival time as the main response criterion: differences in prognosis which can be attributed to a latent or ignored prognostic factor; differences in treatment efficacy in subgroups of patients, and differences in treatment combinations ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780060708

    authors: Schumacher M,Olschewski M,Schmoor C

    update_date: 1987-10-01 00:00:00

  • Modelling the geographical distribution of co-infection risk from single-disease surveys.

    abstract:BACKGROUND:The need to deliver interventions targeting multiple diseases in a cost-effective manner calls for integrated disease control efforts. Consequently, maps are required that show where the risk of co-infection is particularly high. Co-infection risk is preferably estimated via Bayesian geostatistical multinomi...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4243

    authors: Schur N,Gosoniu L,Raso G,Utzinger J,Vounatsou P

    update_date: 2011-06-30 00:00:00

  • Multinomial goodness-of-fit tests for logistic regression models.

    abstract::We examine the properties of several tests for goodness-of-fit for multinomial logistic regression. One test is based on a strategy of sorting the observations according to the complement of the estimated probability for the reference outcome category and then grouping the subjects into g equal-sized groups. A g x c c...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.3202

    authors: Fagerland MW,Hosmer DW,Bofin AM

    update_date: 2008-09-20 00:00:00

  • Ordinal invariant measures for individual and group changes in ordered categorical data.

    abstract::Subjective judgements of complex variables are commonly recorded as ordered categorical data. The rank-invariant properties of such data are well known, and there are various statistical approaches to the analysis and modelling of ordinal data. This paper focuses on the non-additive property of ordered categorical dat...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(sici)1097-0258(19981230)17:24<2923::aid-s

    authors: Svensson E

    update_date: 1998-12-30 00:00:00

  • Modelling of viral dynamics in hepatitis B and hepatitis C clinical trials.

    abstract::In the recent years, studies of hepatitis B (HBV) and hepatitis C virus (HCV) dynamics have drawn great attention as they provide insight into the process of virus elimination/production and of infected cells decay during antiviral treatment. Estimates of viral dynamic parameters may be used to determine the lifetime ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.3457

    authors: Sypsa V,Hatzakis A

    update_date: 2008-12-30 00:00:00