Easy and accurate variance estimation of the nonparametric estimator of the partial area under the ROC curve and its application.

Abstract:

The receiver operating characteristic (ROC) curve is a popular technique with many applications, for example, assessing the accuracy of a biomarker in discriminating between disease and non-disease groups. A common measure of the accuracy of a given diagnostic marker is the area under the ROC curve (AUC). In contrast with the AUC, the partial area under the ROC curve (pAUC) considers only the area over a restricted range of specificities (i.e., true negative rates), and it is often more clinically relevant than examining the entire ROC curve. The pAUC is commonly estimated with a U-statistic that uses a plug-in sample quantile, which makes the estimator a non-traditional U-statistic. In this article, we propose an accurate and easy method to obtain the variance of the nonparametric pAUC estimator. The proposed method is easy to implement both for a single biomarker test and for the comparison of two correlated biomarkers because it simply adapts the existing variance estimator of U-statistics. We show the accuracy and other advantages of the proposed variance estimation method by broadly comparing it with existing methods. Further, we develop an empirical likelihood inference method based on the proposed variance estimator through a simple implementation. In an application, we demonstrate that, depending on whether inference is based on the AUC or the pAUC, we can reach a different decision on the prognostic ability of the same set of biomarkers. Copyright © 2016 John Wiley & Sons, Ltd.

journal_name

Stat Med

journal_title

Statistics in medicine

authors

Yu J,Yang L,Vexler A,Hutson AD

doi

10.1002/sim.6863

pub_date

2016-06-15 00:00:00

pages

2251-2282

issue

13

eissn

1097-0258

issn

0277-6715

journal_volume

35

pub_type

Journal Article
  • Assessment of equivalence on multiple endpoints.

    abstract::Some clinical trials aim to demonstrate therapeutic equivalence on multiple primary endpoints. For example, therapeutic equivalence studies of agents for the treatment of osteoarthritis use several primary endpoints including investigator's global assessment of disease activity, patient's global assessment of response...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.985

    authors: Quan H,Bolognese J,Yuan W

    update_date:2001-11-15 00:00:00

  • An index of disease activity in rheumatoid arthritis.

    abstract::This paper describes the Stoke Index which has been designed to give a global measure of disease activity in rheumatoid arthritis. The index is based on two objective laboratory measurements, one subjective and two semi-objective clinical measurements, chosen from 13 measurements using clinical judgement. Variable sel...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780121206

    authors: Jones PW,Ziade MF,Davis MJ,Dawes PT

    update_date:1993-06-30 00:00:00

  • Statistical inferences for a twin correlation with multinomial outcomes.

    abstract::Current methods for statistical analysis of twin studies focus on continuous and dichotomous data, while only limited methodology exists for analysing multinomial data. As a consequence, investigators are often tempted to collapse multinomial data into two categories simply to facilitate the analysis. We address this ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/1097-0258(20010130)20:2<249::aid-sim641>3.

    authors: Bartfay E,Donner A

    update_date:2001-01-30 00:00:00

  • Investigating the prediction ability of survival models based on both clinical and omics data: two case studies.

    abstract::In biomedical literature, numerous prediction models for clinical outcomes have been developed based either on clinical data or, more recently, on high-throughput molecular data (omics data). Prediction models based on both types of data, however, are less common, although some recent studies suggest that a suitable c...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.6246

    authors: De Bin R,Sauerbrei W,Boulesteix AL

    update_date:2014-12-30 00:00:00

  • Assessing heterogeneity and correlation of paired failure times with the bivariate frailty model.

    abstract::We consider bivariate survival times for heterogeneous populations, where heterogeneity induces deviations in an individual's risk of an event as well as associations between survival times. The heterogeneity is characterized by a bivariate frailty model. We measure the heterogeneity effects through deviations associa...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(sici)1097-0258(19990430)18:8<907::aid-sim

    authors: Xue X,Ding Y

    update_date:1999-04-30 00:00:00

  • Methods for analysing county-level mortality rates.

    abstract::The identification of counties burdened by exceptionally high rates of mortality is a fundamental step in the development of state-based intervention and prevention strategies. However, the estimation of rates from small geographic areas presents special problems, especially for rare events. This paper compares the us...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780120320

    authors: Stevenson JM,Olson DR

    update_date:1993-02-01 00:00:00

  • Estimating the sample size for a t-test using an internal pilot.

    abstract::If the sample size for a t-test is calculated on the basis of a prior estimate of the variance then the power of the test at the treatment difference of interest is not robust to misspecification of the variance. We propose a t-test for a two-treatment comparison based on Stein's two-stage test which involves the use ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(sici)1097-0258(19990715)18:13<1575::aid-s

    authors: Denne JS,Jennison C

    update_date:1999-07-15 00:00:00

  • Modelling age-specific risk: application to dementia.

    abstract::We give up-to-date methods for estimating the age-specific incidence of a disease and for estimating the effect of risk factors. We recommend taking age as the basic time scale of the analysis; then, the hazard function can be interpreted as the age-specific incidence of the disease. This choice raises a delayed entry...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(sici)1097-0258(19980915)17:17<1973::aid-s

    authors: Commenges D,Letenneur L,Joly P,Alioum A,Dartigues JF

    update_date:1998-09-15 00:00:00

  • Circular-circular regression model with a spike at zero.

    abstract::With reference to a real data on cataract surgery, we discuss the problem of zero-inflated circular-circular regression when both covariate and response are circular random variables and a large proportion of the responses are zeros. The regression model is proposed, and the estimation procedure for the parameters is ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.7496

    authors: Jha J,Biswas A

    update_date:2018-01-15 00:00:00

  • Constructing binomial confidence intervals with near nominal coverage by adding a single imaginary failure or success.

    abstract::In this paper we present a simple method for constructing (1- alpha)100 per cent confidence intervals for binomial proportions with near nominal coverage for all underlying proportion parameters on the unit interval. This new method uses, with a slight modification, the standard normal approximation technique taught i...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.2469

    authors: Borkowf CB

    update_date:2006-11-15 00:00:00

  • Modelling heterogeneity in clustered count data with extra zeros using compound Poisson random effect.

    abstract::In medical and health studies, heterogeneities in clustered count data have been traditionally modeled by positive random effects in Poisson mixed models; however, excessive zeros often occur in clustered medical and health count data. In this paper, we consider a three-level random effects zero-inflated Poisson model...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.3619

    authors: Ma R,Hasan MT,Sneddon G

    update_date:2009-08-15 00:00:00

  • A new proposal to adjust Moran's I for population density.

    abstract::We analyse the effect of using prevalence rates based on populations with different sizes in the power of spatial independence tests. We compare the well known spatial correlation Moran's index to three indexes obtained after adjusting for population density, one proposed by Oden, another proposed by Waldhör, and a th...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(sici)1097-0258(19990830)18:16<2147::aid-s

    authors: Assunção RM,Reis EA

    update_date:1999-08-30 00:00:00

  • Finding common task-related regions in fMRI data from multiple subjects by periodogram clustering and clustering ensemble.

    abstract::We propose an innovative and practically relevant clustering method to find common task-related brain regions among different subjects who respond to the same set of stimuli. Using functional magnetic resonance imaging (fMRI) time series data, we first cluster the voxels within each subject on a voxel by voxel basis. ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.6906

    authors: Ye J,Li Y,Lazar NA,Schaeffer DJ,McDowell JE

    update_date:2016-07-10 00:00:00

  • Statistical issues related to dietary intake as the response variable in intervention trials.

    abstract::The focus of this paper is dietary intervention trials. We explore the statistical issues involved when the response variable, intake of a food or nutrient, is based on self-report data that are subject to inherent measurement error. There has been little work on handling error in this context. A particular feature of...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.7011

    authors: Keogh RH,Carroll RJ,Tooze JA,Kirkpatrick SI,Freedman LS

    update_date:2016-11-10 00:00:00

  • Long-term survivor mixture model with random effects: application to a multi-centre clinical trial of carcinoma.

    abstract::A mixture model incorporating long-term survivors has been adopted in the field of biostatistics where some individuals may never experience the failure event under study. The surviving fractions may be considered as cured. In most applications, the survival times are assumed to be independent. However, when the survi...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.932

    authors: Yau KK,Ng AS

    update_date:2001-06-15 00:00:00

  • Correcting covariate-dependent measurement error with non-zero mean.

    abstract::There are many settings in which the distribution of error in a mismeasured covariate varies with the value of another covariate. Take, for example, the case of HIV phylogenetic cluster size, large values of which are an indication of rapid HIV transmission. Researchers wish to find behavioral correlates of HIV phylog...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.7289

    authors: Parveen N,Moodie E,Brenner B

    update_date:2017-07-30 00:00:00

  • Sensitivity of Fisher's exact test to minor perturbations in 2 x 2 contingency tables.

    abstract::The two tailed Fisher's exact P value is extremely sensitive to small perturbations in 2 x 2 contingency tables. An example indicates that a 1 per cent increase in the denominator of one treatment group results in a 32 per cent drop in the exact P value, but a mere 0.1 per cent decrease in the treatment success rate. ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780050610

    authors: Dupont WD

    update_date:1986-11-01 00:00:00

  • Comparing performance of surgeons using risk-adjusted procedures.

    abstract::It is naive and incorrect to use the proportions of successful operations to compare the performance of surgeons because the patients' risk profiles are different. In this paper, we explore the use of risk-adjusted procedures to compare the performance of surgeons. One such risk-adjusted statistic is the standardized ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.7310

    authors: Tang X,Gan FF

    update_date:2017-07-20 00:00:00

  • Parameters of mortality in human populations with widely varying life spans.

    abstract::A three-component, competing-risk mortality model, developed for animal survival data, fits human life table data for all ages over a range of mean life spans from 16 to 74 years. The competing risks are a novel exponentially-decreasing hazard, dominant during immaturity; a constant hazard, dominant during adulthood; ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780020309

    authors: Siler W

    update_date:1983-07-01 00:00:00

  • Identifiability and estimation of causal mediation effects with missing data.

    abstract::Mediation analysis is a standard approach to understanding how and why an intervention works in social and medical sciences. However, the presence of missing data, especially missing not at random data, poses a great challenge for the applicability of this approach in practice. Current methods for handling such missin...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.7413

    authors: Li W,Zhou XH

    update_date:2017-11-10 00:00:00

  • Estimation methods for marginal and association parameters for longitudinal binary data with nonignorable missing observations.

    abstract::In longitudinal studies, missing observations occur commonly. It has been well known that biased results could be produced if missingness is not properly handled in the analysis. Authors have developed many methods with the focus on either incomplete response or missing covariate observations, but rarely on both. The ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.5536

    authors: Li H,Yi GY

    update_date:2013-02-28 00:00:00

  • Sharp nonparametric bounds and randomization inference for treatment effects on an ordinal outcome.

    abstract::In clinical research, investigators are interested in inferring the average causal effect of a treatment. However, the causal parameter that can be used to derive the average causal effect is not well defined for ordinal outcomes. Although some definitions have been proposed, they are limited in that they are not iden...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.7400

    authors: Chiba Y

    update_date:2017-11-10 00:00:00

  • Measurement error correction for nutritional exposures with correlated measurement error: use of the method of triads in a longitudinal setting.

    abstract::Nutritional exposures are often measured with considerable error in commonly used surrogate instruments such as the food frequency questionnaire (FFQ) (denoted by Q(i) for the ith subject). The error can be both systematic and random. The diet record (DR) denoted by R(i) for the ith subject is considered an alloyed go...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.3238

    authors: Rosner B,Michels KB,Chen YH,Day NE

    update_date:2008-08-15 00:00:00

  • Performance of analytical methods for overdispersed counts in cluster randomized trials: sample size, degree of clustering and imbalance.

    abstract::Many different methods have been proposed for the analysis of cluster randomized trials (CRTs) over the last 30 years. However, the evaluation of methods on overdispersed count data has been based mostly on the comparison of results using empiric data; i.e. when the true model parameters are not known. In this study, ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.3681

    authors: Durán Pacheco G,Hattendorf J,Colford JM Jr,Mäusezahl D,Smith T

    update_date:2009-10-30 00:00:00

  • A Bayesian hierarchical variable selection prior for pathway-based GWAS using summary statistics.

    abstract::While genome-wide association studies (GWASs) have been widely used to uncover associations between diseases and genetic variants, standard SNP-level GWASs often lack the power to identify SNPs that individually have a moderate effect size but jointly contribute to the disease. To overcome this problem, pathway-based ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.8442

    authors: Yang Y,Basu S,Zhang L

    update_date:2020-03-15 00:00:00

  • More powerful randomization-based p-values in double-blind trials with non-compliance.

    abstract::Standard randomization-based tests of sharp null hypotheses in randomized clinical trials, that is, intent-to-treat analyses, are valid without extraneous assumptions, but generally can be appropriately powerful only with alternative hypotheses that involve treatment assignment having an effect on outcome. In the cont...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(sici)1097-0258(19980215)17:3<371::aid-sim

    authors: Rubin DB

    update_date:1998-02-15 00:00:00

  • Assessing surrogacy from the joint modelling of multivariate longitudinal data and survival: application to clinical trial data on chronic lymphocytic leukaemia.

    abstract::In clinical research, we are often interested in assessing how a biomarker changes with time, and whether it could be used as a surrogate marker when evaluating the efficacy of a new drug. However, when the longitudinal marker is correlated with survival, linear mixed models for longitudinal data may be inappropriate....

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.3142

    authors: Deslandes E,Chevret S

    update_date:2007-12-30 00:00:00

  • A new approach to training back-propagation artificial neural networks: empirical evaluation on ten data sets from clinical studies.

    abstract::We present a new approach to training back-propagation artificial neural nets (BP-ANN) based on regularization and cross-validation and on initialization by a logistic regression (LR) model. The new approach is expected to produce a BP-ANN predictor at least as good as the LR-based one. We have applied the approach to...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.1107

    authors: Ciampi A,Zhang F

    update_date:2002-05-15 00:00:00

  • Classification using ensemble learning under weighted misclassification loss.

    abstract::Binary classification rules based on covariates typically depend on simple loss functions such as zero-one misclassification. Some cases may require more complex loss functions. For example, individual-level monitoring of HIV-infected individuals on antiretroviral therapy requires periodic assessment of treatment fail...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.8082

    authors: Xu Y,Liu T,Daniels MJ,Kantor R,Mwangi A,Hogan JW

    update_date:2019-05-20 00:00:00

  • Estimation of death rates in US states with small subpopulations.

    abstract::In US states with small subpopulations, the observed mortality rates are often zero, particularly among young ages. Because in life tables, death rates are reported mostly on a log scale, zero mortality rates are problematic. To overcome the observed zero death rates problem, appropriate probability models are used. U...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.6385

    authors: Voulgaraki A,Wei R,Kedem B

    update_date:2015-05-20 00:00:00