Performance assessment for radiologists interpreting screening mammography.

Abstract:

When interpreting screening mammograms, radiologists decide whether suspicious abnormalities exist that warrant the recall of the patient for further testing. Previous work has found significant differences in interpretation among radiologists; their false-positive and false-negative rates have been shown to vary widely. Performance assessments of individual radiologists have been mandated by the U.S. government, but concern exists about the adequacy of current assessment techniques. We use hierarchical modelling techniques to draw inferences about the interpretive performance of individual radiologists in screening mammography. In doing so, we account for differences due to patient mix and radiologist attributes (for instance, years of experience or interpretive volume). We model at the mammogram level, and then use these models to assess radiologist performance. Our approach is demonstrated with data from mammography registries and radiologist surveys. For each mammogram, the registries record whether or not the woman was found to have breast cancer within one year of the mammogram; this criterion is used to determine whether the recall decision was correct. We model the false-positive rate and the false-negative rate separately using logistic regression on patient risk factors and radiologist random effects. The radiologist random effects are, in turn, regressed on radiologist attributes such as the number of years in practice. Using these Bayesian hierarchical models we examine several radiologist performance metrics. The first is the difference between the false-positive or false-negative rate of a particular radiologist and that of a hypothetical 'standard' radiologist with the same attributes and the same patient mix. A second metric predicts the performance of each radiologist on hypothetical mammography exams with particular combinations of patient risk factors (which we characterize as 'typical', 'high-risk', or 'low-risk').
The second metric can be used to compare one radiologist to another, while the first metric addresses how the radiologist is performing compared to an appropriate standard. Interval estimates are given for the metrics, thereby addressing uncertainty. The particular novelty in our contribution is to estimate multiple performance rates (sensitivity and specificity). One can even estimate a continuum of performance rates, such as a performance curve or ROC curve, using our models, and we describe how this may be done. In addition to assessing radiologists in the original data set, we also show how to make inferences about the performance of a new radiologist with new case mix, new outcome data, and new attributes without having to refit the model.
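
The two-level structure and the first metric described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' Bayesian fitting procedure: the coefficient values, the single patient risk factor (`dense`), the single radiologist attribute (`years`), and the function name `standard_radiologist_gap` are all hypothetical, and the "standard" radiologist is computed from known simulation parameters rather than from a posterior distribution, so no interval estimates are produced.

```python
import math
import random


def inv_logit(x):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def standard_radiologist_gap(seed=0, n_radiologists=5, n_exams=2000):
    """Simulate false-positive recalls and compare each radiologist
    with a 'standard' radiologist who has the same attributes and
    the same patient mix. All numeric values are hypothetical."""
    rng = random.Random(seed)
    beta0 = -2.2         # intercept on the log-odds scale (assumed)
    beta_dense = 0.5     # effect of one patient risk factor (assumed)
    gamma_years = -0.03  # effect of years in practice (assumed)
    sigma = 0.4          # SD of the residual radiologist effect (assumed)

    rows = []
    for j in range(n_radiologists):
        years = rng.randint(1, 30)
        residual = rng.gauss(0.0, sigma)
        # Second level: the radiologist random effect is regressed
        # on the radiologist attribute.
        alpha = gamma_years * years + residual

        n_recalled = 0
        expected_sum = 0.0
        for _ in range(n_exams):
            # One cancer-free exam with a binary patient risk factor.
            dense = 1 if rng.random() < 0.4 else 0
            eta = beta0 + beta_dense * dense
            # First level: logistic model for a false-positive recall.
            if rng.random() < inv_logit(eta + alpha):
                n_recalled += 1
            # Standard radiologist: same attribute and the same exams,
            # but with the residual effect set to zero.
            expected_sum += inv_logit(eta + gamma_years * years)

        observed = n_recalled / n_exams
        expected = expected_sum / n_exams
        rows.append({
            "radiologist": j,
            "years": years,
            "observed_fpr": observed,
            "expected_fpr": expected,
            "gap": observed - expected,
        })
    return rows


for row in standard_radiologist_gap():
    print(row)
```

A positive `gap` means the simulated radiologist recalls more cancer-free women than the standard radiologist would on the same exams. In the paper's setting the same comparison would be made on posterior draws from the fitted model, which is what yields interval estimates for the metric.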

journal_name

Stat Med

journal_title

Statistics in medicine

authors

Woodard DB,Gelfand AE,Barlow WE,Elmore JG

doi

10.1002/sim.2633

subject

Has Abstract

pub_date

2007-03-30 00:00:00

pages

1532-51

issue

7

eissn

1097-0258

issn

0277-6715

journal_volume

26

pub_type

Journal Article
  • Statistical inferences for a twin correlation with multinomial outcomes.

    abstract::Current methods for statistical analysis of twin studies focus on continuous and dichotomous data, while only limited methodology exists for analysing multinomial data. As a consequence, investigators are often tempted to collapse multinomial data into two categories simply to facilitate the analysis. We address this ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/1097-0258(20010130)20:2<249::aid-sim641>3.

    authors: Bartfay E,Donner A

    update_date:2001-01-30 00:00:00

  • Non-parametric methods for comparing multiple treatment groups to a control group, based on incomplete non-decreasing repeated measurements.

    abstract::In the comparison of two or more treatment groups to a control group, consider a study with non-decreasing repeated measurements of the same characteristic taken over a common set of time points for each subject. Based on the vector of possibly incomplete responses from each subject, this paper considers asymptoticall...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(SICI)1097-0258(19961215)15:23<2509::AID-S

    authors: Davis CS

    update_date:1996-12-15 00:00:00

  • Maximum likelihood estimation of the kappa coefficient from models of matched binary responses.

    abstract::We present an estimate of the kappa-coefficient of agreement between two methods of rating based on matched pairs of binary responses and show that the estimate depends on the common intraclass correlation coefficient between the pairs. Via Monte Carlo simulation, we investigate power of the test of significance on ka...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780140109

    authors: Shoukri MM,Martin SW,Mian IU

    update_date:1995-01-15 00:00:00

  • A simulation study of finite-sample properties of marginal structural Cox proportional hazards models.

    abstract::Motivated by a previously published study of HIV treatment, we simulated data subject to time-varying confounding affected by prior treatment to examine some finite-sample properties of marginal structural Cox proportional hazards models. We compared (a) unadjusted, (b) regression-adjusted, (c) unstabilized, and (d) s...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.5317

    authors: Westreich D,Cole SR,Schisterman EF,Platt RW

    update_date:2012-08-30 00:00:00

  • Ignorability and bias in clinical trials.

    abstract::Patient non-compliance and drop-out can bias analyses of clinical trial data. I describe a parametric model for treatment cross-over and drop-out and demonstrate how the concept of ignorability, originally defined for incomplete-data problems, can elucidate sources of bias in clinical trials. I discuss some implicatio...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(sici)1097-0258(19990915/30)18:17/18<2421:

    authors: Heitjan DF

    update_date:1999-09-15 00:00:00

  • A robust method for proportional hazards regression.

    abstract::In this paper we give an informal introduction to a robust method for survival analysis which is based on a modification of the usual partial likelihood estimator (PLE). Large sample results lead us to expect reduced bias for this robust estimator compared with the PLE whenever there are even slight violations of the ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(SICI)1097-0258(19960530)15:10<1033::AID-S

    authors: Minder CE,Bednarski T

    update_date:1996-05-30 00:00:00

  • A sequential classification rule based on multiple quantitative tests in the absence of a gold standard.

    abstract::In many medical applications, combining information from multiple biomarkers could yield a better diagnosis than any single one on its own. When there is a lack of a gold standard, an algorithm of classifying subjects into the case and non-case status is necessary for combining multiple markers. The aim of this paper ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.6780

    authors: Zhang J,Zhang Y,Chaloner K,Stapleton JT

    update_date:2016-04-15 00:00:00

  • A Markov mixed effect regression model for drug compliance.

    abstract::Patient compliance (adherence) with prescribed medication is often erratic, while clinical outcomes are causally linked to actual, rather than nominal medication dosage. We propose here a hierarchical Markov model for patient compliance. At the first stage, conditional upon individual random effects and a set of indiv...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(sici)1097-0258(19981030)17:20<2313::aid-s

    authors: Girard P,Blaschke TF,Kastrissios H,Sheiner LB

    update_date:1998-10-30 00:00:00

  • A graphical approach to sequentially rejective multiple test procedures.

    abstract::For clinical trials with multiple treatment arms or endpoints a variety of sequentially rejective, weighted Bonferroni-type tests have been proposed, such as gatekeeping procedures, fixed sequence tests, and fallback procedures. They allow to map the difference in importance as well as the relationship between the var...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.3495

    authors: Bretz F,Maurer W,Brannath W,Posch M

    update_date:2009-02-15 00:00:00

  • Estimation of the mediation effect with a binary mediator.

    abstract::A mediator acts as a third variable in the causal pathway between a risk factor and an outcome. In this paper, we consider the estimation of the mediation effect when the mediator is a binary variable. We give a precise definition of the mediation effect and examine asymptotic properties of five different estimators o...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.2730

    authors: Li Y,Schneider JA,Bennett DA

    update_date:2007-08-15 00:00:00

  • Monitoring medical procedures by exponential smoothing.

    abstract::A new exponentially weighted moving average (EWMA) control chart well suited for 'online' routine surveillance of medical procedures is introduced. The chart is based on inter-event counts for failures recorded when the failures occur. The method can be used for many types of hospital procedures and activities, such a...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.2520

    authors: Spliid H

    update_date:2007-01-15 00:00:00

  • Inference on a collapsed margin in disease mapping.

    abstract::This paper describes a method for estimating the risk from a disease over a set of contiguous geographical regions, when data on a potentially important covariate, such as race, are not available. Conditions under which the extra margin can be recovered are suggested. An application to prostate cancer mortality among ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/1097-0258(20000915/30)19:17/18<2243::aid-s

    authors: Byers S,Besag J

    update_date:2000-09-15 00:00:00

  • Technical uncertainty in the back-calculation of occupational exposure to dioxins.

    abstract::Members of a cohort of workers in chemical industry (the so-called Boehringer cohort) exposed to 2, 3, 7, 8-tetrachlorodibenzo-para-dioxin (TCDD) from 1950 to 1984 were subject in the years 1985-1986 and 1992-1994 to an extensive biomonitoring programme on the TCDD levels of the individual workers. For establishing a ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.3074

    authors: Heinzl H,Mittlböck M,Edler L

    update_date:2008-05-30 00:00:00

  • A frailty model approach for regression analysis of multivariate current status data.

    abstract::This paper discusses regression analysis of multivariate current status failure time data (The Statistical Analysis of Interval-censoring Failure Time Data. Springer: New York, 2006), which occur quite often in, for example, tumorigenicity experiments and epidemiologic investigations of the natural history of a diseas...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.3715

    authors: Chen MH,Tong X,Sun J

    update_date:2009-11-30 00:00:00

  • Comparative calibration without a gold standard.

    abstract::Comparative calibration is the broad statistical methodology used to assess the calibration of a set of p instruments, each designed to measure the same characteristic, on a common group of individuals. Different from the usual calibration problem, the true underlying quantity measured is unobservable. Many authors ha...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(sici)1097-0258(19970830)16:16<1889::aid-s

    authors: Lu Y,Ye K,Mathur AK,Hui S,Fuerst TP,Genant HK

    update_date:1997-08-30 00:00:00

  • Dynamic Cox modelling based on fractional polynomials: time-variations in gastric cancer prognosis.

    abstract::The most popular model used for survival analysis is the proportional hazards regression model proposed by Cox. This is mainly due to its exceptional simplicity. Nevertheless the fundamental assumption of the Cox model is the proportionality of the hazards. For many applications, however, this assumption is doubtful. ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.1411

    authors: Berger U,Schäfer J,Ulm K

    update_date:2003-04-15 00:00:00

  • Play the winner for phase II/III clinical trials.

    abstract::In comparing two treatments under a typical sequential clinical trial setting, a 50-50 randomization design generates reliable data for making efficient inferences about the treatment difference for the benefit of patients in the general population. However, if the treatment difference is large and the endpoint of the...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(sici)1097-0258(19961130)15:22<2413::aid-s

    authors: Yao Q,Wei LJ

    update_date:1996-11-15 00:00:00

  • A linear exponent AR(1) family of correlation structures.

    abstract::In repeated measures settings, modeling the correlation pattern of the data can be immensely important for proper analyses. Accurate inference requires proper choice of the correlation model. Optimal efficiency of the estimation procedure demands a parsimonious parameterization of the correlation structure, with suffi...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.3928

    authors: Simpson SL,Edwards LJ,Muller KE,Sen PK,Styner MA

    update_date:2010-07-30 00:00:00

  • Sample sizes for constructing confidence intervals and testing hypotheses.

    abstract::Although estimation and confidence intervals have become popular alternatives to hypothesis testing and p-values, statisticians usually determine sample sizes for randomized clinical trials by controlling the power of a statistical test at an appropriate alternative, even those statisticians who recommend the use of c...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780080705

    authors: Bristol DR

    update_date:1989-07-01 00:00:00

  • The Peto odds ratio viewed as a new effect measure.

    abstract::Meta-analysis has generally been accepted as a fundamental tool for combining effect estimates from several studies. For binary studies with rare events, the Peto odds ratio (POR) method has become the relative effect estimator of choice. However, the POR leads to biased estimates for the OR when treatment effects are...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.6301

    authors: Brockhaus AC,Bender R,Skipka G

    update_date:2014-12-10 00:00:00

  • Assessing goodness-of-fit of parametric regression models for lifetime data-graphical methods.

    abstract::Graphical methods are often used to check goodness-of-fit of models to data. It is common to plot residuals against a reference distribution so that when the model fits the data, the configuration should be close to a straight line. Since the resemblance to a straight line is often unclear, it has been suggested to ad...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780141607

    authors: Cohen A,Barnett O

    update_date:1995-08-30 00:00:00

  • Comparison of tests for categorical data from a stratified cluster randomized trial.

    abstract::Two features commonly exhibited by randomized trials of health promotion interventions are cluster randomization and stratification. Ignoring correlations between individuals within clusters can lead to an inflated type I error rate and hence a P-value which overstates the significance of the result. This paper compar...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.1256

    authors: Dobbins TA,Simpson JM

    update_date:2002-12-30 00:00:00

  • Analyzing survival curves at a fixed point in time.

    abstract::A common problem encountered in many medical applications is the comparison of survival curves. Often, rather than comparison of the entire survival curves, interest is focused on the comparison at a fixed point in time. In most cases, the naive test based on a difference in the estimates of survival is used for this ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.2864

    authors: Klein JP,Logan B,Harhoff M,Andersen PK

    update_date:2007-10-30 00:00:00

  • Bayesian bivariate meta-analysis of diagnostic test studies using integrated nested Laplace approximations.

    abstract::For bivariate meta-analysis of diagnostic studies, likelihood approaches are very popular. However, they often run into numerical problems with possible non-convergence. In addition, the construction of confidence intervals is controversial. Bayesian methods based on Markov chain Monte Carlo (MCMC) sampling could be u...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.3858

    authors: Paul M,Riebler A,Bachmann LM,Rue H,Held L

    update_date:2010-05-30 00:00:00

  • Group sequential designs for cure rate models with early stopping in favour of the null hypothesis.

    abstract::Ewell and Ibrahim derived the large sample distribution of the logrank statistic under general local alternatives. Their asymptotic results enable us to extend several group sequential designs which allow for early stopping in favour of the null hypothesis to the setting in which the cure rate model is appropriate. In...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/1097-0258(20001130)19:22<3023::aid-sim638>

    authors: Patricia Bernardo MV,Ibrahim JG

    update_date:2000-11-30 00:00:00

  • A random effects model for ordinal responses from a crossover trial.

    abstract::Crossover studies have been successfully conducted in the case of continuous responses. Existing procedures of analysis for ordinal responses, on the other hand, are rarely satisfactory unless strict, usually unrealistic, assumptions are made. In this paper we investigate a random effects model and show that the model...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780100611

    authors: Ezzet F,Whitehead J

    update_date:1991-06-01 00:00:00

  • Joint modeling of repeated multivariate cognitive measures and competing risks of dementia and death: a latent process and latent class approach.

    abstract::Joint models initially dedicated to a single longitudinal marker and a single time-to-event need to be extended to account for the rich longitudinal data of cohort studies. Multiple causes of clinical progression are indeed usually observed, and multiple longitudinal markers are collected when the true latent trait of...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.6731

    authors: Proust-Lima C,Dartigues JF,Jacqmin-Gadda H

    update_date:2016-02-10 00:00:00

  • Generalized linear model for partially ordered data.

    abstract::Within the rich literature on generalized linear models, substantial efforts have been devoted to models for categorical responses that are either completely ordered or completely unordered. Few studies have focused on the analysis of partially ordered outcomes, which arise in practically every area of study, includin...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4318

    authors: Zhang Q,Ip EH

    update_date:2012-01-13 00:00:00

  • Robust versus consistent variance estimators in marginal structural Cox models.

    abstract::In survival analyses, inverse-probability-of-treatment (IPT) and inverse-probability-of-censoring (IPC) weighted estimators of parameters in marginal structural Cox models are often used to estimate treatment effects in the presence of time-dependent confounding and censoring. In most applications, a robust variance e...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.7823

    authors: Enders D,Engel S,Linder R,Pigeot I

    update_date:2018-10-30 00:00:00

  • Predictive diagnostics for logistic models.

    abstract::Novel methodology is implemented to assess the predictive power of covariate information associated with sequential binary events. Logistic models are first fitted on the basis of a subset of the observations and then evaluated sequentially on the rest. The probabilistic forecasts are compared to the outcomes via a sc...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(SICI)1097-0258(19961030)15:20<2149::AID-S

    authors: Seillier-Moiseiwitsch F

    update_date:1996-10-30 00:00:00