Kappa coefficients in medical research.

Abstract:

Kappa coefficients are measures of correlation between categorical variables often used as reliability or validity coefficients. We recapitulate development and definitions of the K (categories) by M (ratings) kappas (K x M), discuss what they are well- or ill-designed to do, and summarize where kappas now stand with regard to their application in medical research. The 2 x M (M ≥ 2) intraclass kappa seems the ideal measure of binary reliability; a 2 x 2 weighted kappa is an excellent choice, though not a unique one, as a validity measure. For both the intraclass and weighted kappas, we address continuing problems with kappas. There are serious problems with using the K x M intraclass (K>2) or the various K x M weighted kappas for K>2 or M>2 in any context, either because they convey incomplete and possibly misleading information, or because other approaches are preferable to their use. We illustrate the use of the recommended kappas with applications in medical research.
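To make the 2 x 2 case in the abstract concrete, here is a minimal sketch of two chance-corrected agreement coefficients for a binary rating made twice per subject. It assumes the standard textbook forms: Cohen's kappa, which uses each rating's own marginals, and an intraclass-style kappa that pools the marginals across the two ratings (treating them as interchangeable). The function names and the example counts are hypothetical and are not taken from the paper.

```python
def _check(denominator):
    # Kappa is undefined when chance agreement is already perfect,
    # i.e. when every rating falls in the same category.
    if denominator == 0:
        raise ValueError("kappa is undefined: all ratings are in one category")


def cohen_kappa_2x2(a, b, c, d):
    """Cohen's kappa from 2 x 2 counts.

    a: both ratings positive      b: first positive, second negative
    c: first negative, second positive      d: both negative
    """
    n = a + b + c + d
    p_obs = (a + d) / n                      # observed agreement
    p1 = (a + b) / n                         # first rating positive
    p2 = (a + c) / n                         # second rating positive
    p_exp = p1 * p2 + (1 - p1) * (1 - p2)    # agreement expected by chance
    _check(1 - p_exp)
    return (p_obs - p_exp) / (1 - p_exp)


def intraclass_kappa_2x2(a, b, c, d):
    """Intraclass-style kappa: same form, but with one pooled prevalence."""
    n = a + b + c + d
    p_obs = (a + d) / n
    p = (2 * a + b + c) / (2 * n)            # prevalence pooled over both ratings
    p_exp = p * p + (1 - p) * (1 - p)
    _check(1 - p_exp)
    return (p_obs - p_exp) / (1 - p_exp)


if __name__ == "__main__":
    # Made-up counts for 100 subjects, for illustration only.
    print(round(cohen_kappa_2x2(40, 10, 5, 45), 3))
    print(round(intraclass_kappa_2x2(40, 10, 5, 45), 3))
```

When the two ratings have identical marginals (b equals c), the two coefficients coincide; they drift apart as the marginals diverge.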

journal_name

Stat Med

journal_title

Statistics in medicine

authors

Chmura Kraemer H,Periyakoil VS,Noda A

doi

10.1002/sim.1180

subject

Has Abstract

pub_date

2002-07-30 00:00:00

pages

2109-29

issue

14

eissn

1097-0258

issn

0277-6715

journal_volume

21

pub_type

Journal Article, Review
  • A statistical assessment of clinical equivalence.

    abstract::An observed confidence distribution is proposed as a measure of strength of evidence for practically equivalent efficacies of two treatments. The concept is independent of prior opinions about relevant sizes of a difference in efficacy. It also avoids retrospective power calculations for trials with missed recruitment...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780071207

    authors: Mau J

    update_date:1988-12-01 00:00:00

  • Assessing the robustness of sisVIVE in a Mendelian randomization study to estimate the causal effect of body mass index on income using multiple SNPs from Understanding Society.

    abstract::The "some invalid, some valid instrumental variable estimator" (sisVIVE) is a lasso-based method for instrumental variables (IVs) regression of outcome on an exposure. In principle, sisVIVE is robust to some of the IVs in the analysis being invalid, in the sense of being related to the outcome variable through pathway...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.8066

    authors: Bao Y,Clarke PS,Smart M,Kumari M

    update_date:2019-04-30 00:00:00

  • Personalized dose selection in radiation therapy using statistical models for toxicity and efficacy with dose and biomarkers as covariates.

    abstract::Selection of dose for cancer patients treated with radiation therapy (RT) must balance the increased efficacy with the increased toxicity associated with higher dose. Historically, a single dose has been selected for a population of patients (e.g., all stage III non-small cell lung cancer). However, the availability o...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.6285

    authors: Schipper MJ,Taylor JM,TenHaken R,Matuzak MM,Kong FM,Lawrence TS

    update_date:2014-12-30 00:00:00

  • Monitoring potential adverse event rate differences using data from blinded trials: the canary in the coal mine.

    abstract::The development of drugs and biologicals whose mechanisms of action may extend beyond their target indications has led to a need to identify unexpected potential toxicities promptly even while blinded clinical trials are under way. One component of recently issued FDA rules regarding safety reporting requirements rais...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.7129

    authors: Gould AL,Wang WB

    update_date:2017-01-15 00:00:00

  • Defining and ranking effects of individual agents based on survival times of cancer patients treated with combination chemotherapies.

    abstract::An important problem in oncology is comparing chemotherapy (chemo) agents in terms of their effects on survival or progression-free survival time. When the goal is to evaluate individual agents, a difficulty commonly encountered with observational data is that many patients receive a chemo combination including two or...

    journal_title:Statistics in medicine

    pub_type: Journal Article, Review

    doi:10.1002/sim.4249

    authors: Thall PF,Liu DD,Berrak SG,Wolff JE

    update_date:2011-07-10 00:00:00

  • Conditional power and predictive power based on right censored data with supplementary auxiliary information.

    abstract::Conditional power and predictive power provide estimates of the probability of success at the end of the trial based on the information from the interim analysis. The observed value of the time to event endpoint at the interim analysis could be biased for the true treatment effect due to early censoring, leading to a ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.7673

    authors: Sun L,Wan Y

    update_date:2018-08-15 00:00:00

  • Bayesian clinical trials in action.

    abstract::Although the frequentist paradigm has been the predominant approach to clinical trial design since the 1940s, it has several notable limitations. Advancements in computational algorithms and computer hardware have greatly enhanced the alternative Bayesian paradigm. Compared with its frequentist counterpart, the Bayesi...

    journal_title:Statistics in medicine

    pub_type: Journal Article, Review

    doi:10.1002/sim.5404

    authors: Lee JJ,Chu CT

    update_date:2012-11-10 00:00:00

  • Accounting for informatively missing data in logistic regression by means of reassessment sampling.

    abstract::We explore the 'reassessment' design in a logistic regression setting, where a second wave of sampling is applied to recover a portion of the missing data on a binary exposure and/or outcome variable. We construct a joint likelihood function based on the original model of interest and a model for the missing data mech...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.6456

    authors: Lin J,Lyles RH

    update_date:2015-05-20 00:00:00

  • Differences in surrogate threshold effect estimates between original and simplified correlation-based validation approaches.

    abstract::Surrogate endpoint validation has been well established by the meta-analytical correlation-based approach as outlined in the seminal work of Buyse et al. (Biostatistics, 2000). Surrogacy can be assumed if strong associations on individual and study levels can be demonstrated. Alternatively, if an effect on a true endp...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.6778

    authors: Schürmann C,Sieben W

    update_date:2016-03-30 00:00:00

  • Maximum likelihood estimation of the kappa coefficient from models of matched binary responses.

    abstract::We present an estimate of the kappa-coefficient of agreement between two methods of rating based on matched pairs of binary responses and show that the estimate depends on the common intraclass correlation coefficient between the pairs. Via Monte Carlo simulation, we investigate power of the test of significance on ka...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780140109

    authors: Shoukri MM,Martin SW,Mian IU

    update_date:1995-01-15 00:00:00

  • Comparison of operational characteristics for binary tests with clustered data.

    abstract::Although statistical methodology is well-developed for comparing diagnostic tests in terms of their sensitivity and specificity, comparative inference about predictive values is not. In this paper, we consider the analysis of studies comparing operating characteristics of two diagnostic tests that are measured on all ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.6485

    authors: Kwak M,Um SW,Jung SH

    update_date:2015-07-10 00:00:00

  • Item response models for longitudinal quality of life data in clinical trials.

    abstract::Assessment of quality of life is becoming standard in clinical trials. A popular method for measuring quality of life is with instruments which utilize multiple-item subscales, in which each item is scored on a Likert scale. Most statistical methods for the analysis of quality of life data in clinical trials do not ex...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(sici)1097-0258(19991115)18:21<2917::aid-s

    authors: Douglas JA

    update_date:1999-11-15 00:00:00

  • Testing the equality of two survival functions with right truncated data.

    abstract::To compare the survival functions based on right-truncated data, Lagakos et al. proposed a weighted logrank test based on a reverse time scale. This is in contrast to Bilker and Wang, who suggested a semi-parametric version of the Mann-Whitney test by assuming that the distribution of truncation times is known or can ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.2556

    authors: Chi Y,Tsai WY,Chiang CL

    update_date:2007-02-20 00:00:00

  • A small sample study of the STEPP approach to assessing treatment-covariate interactions in survival data.

    abstract::A new, intuitive method has recently been proposed to explore treatment-covariate interactions in survival data arising from two treatment arms of a clinical trial. The method is based on constructing overlapping subpopulations of patients with respect to one (or more) covariates of interest and in observing the patte...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.3524

    authors: Bonetti M,Zahrieh D,Cole BF,Gelber RD

    update_date:2009-04-15 00:00:00

  • Correction of sampling bias in a cross-sectional study of post-surgical complications.

    abstract::Cross-sectional designs are often used to monitor the proportion of infections and other post-surgical complications acquired in hospitals. However, conventional methods for estimating incidence proportions when applied to cross-sectional data may provide estimators that are highly biased, as cross-sectional designs t...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.5608

    authors: Fluss R,Mandel M,Freedman LS,Weiss IS,Zohar AE,Haklai Z,Gordon ES,Simchen E

    update_date:2013-06-30 00:00:00

  • Multiple imputation using chained equations: Issues and guidance for practice.

    abstract::Multiple imputation by chained equations is a flexible and practical approach to handling missing data. We describe the principles of the method and show how to impute categorical and quantitative variables, including skewed variables. We give guidance on how to specify the imputation model and how many imputations ar...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4067

    authors: White IR,Royston P,Wood AM

    update_date:2011-02-20 00:00:00

  • Second-stage least squares versus penalized quasi-likelihood for fitting hierarchical models in epidemiologic analyses.

    abstract::Hierarchical regression analysis holds much promise for epidemiologic analysis, but has as yet seen limited application because of lack of easily used software and the relatively lengthy run times of preferred fitting methods (such as true maximum likelihood and Bayesian approaches). This paper compares three relative...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(sici)1097-0258(19970315)16:5<515::aid-sim

    authors: Greenland S

    update_date:1997-03-15 00:00:00

  • Non-parametric methods for recurrent event data with informative and non-informative censorings.

    abstract::Recurrent event data are commonly encountered in health-related longitudinal studies. In this paper time-to-events models for recurrent event data are studied with non-informative and informative censorings. In statistical literature, the risk set methods have been confirmed to serve as an appropriate and efficient ap...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.1029

    authors: Wang MC,Chiang CT

    update_date:2002-02-15 00:00:00

  • Joint analysis of mixed types of outcomes with latent variables.

    abstract::We propose a joint modeling approach to investigating the observed and latent risk factors of mixed types of outcomes. The proposed model comprises three parts. The first part is an exploratory factor analysis model that summarizes latent factors through multiple observed variables. The second part is a proportional h...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.8840

    authors: Pan D,Wei Y,Song X

    update_date:2020-12-09 00:00:00

  • Construction and validation of a prognostic model across several studies, with an application in superficial bladder cancer.

    abstract::Many models for clinical prediction (prognosis or diagnosis) are published in the medical literature every year but few such models find their way into clinical practice. The reason may be that since in most cases models have not been validated in independent data, they lack generality and/or credibility. In this pape...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.1691

    authors: Royston P,Parmar MK,Sylvester R

    update_date:2004-03-30 00:00:00

  • A review of methods for futility stopping based on conditional power.

    abstract::Conditional power (CP) is the probability that the final study result will be statistically significant, given the data observed thus far and a specific assumption about the pattern of the data to be observed in the remainder of the study, such as assuming the original design effect, or the effect estimated from the c...

    journal_title:Statistics in medicine

    pub_type: Journal Article, Review

    doi:10.1002/sim.2151

    authors: Lachin JM

    update_date:2005-09-30 00:00:00

  • Comparison of tests for categorical data from a stratified cluster randomized trial.

    abstract::Two features commonly exhibited by randomized trials of health promotion interventions are cluster randomization and stratification. Ignoring correlations between individuals within clusters can lead to an inflated type I error rate and hence a P-value which overstates the significance of the result. This paper compar...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.1256

    authors: Dobbins TA,Simpson JM

    update_date:2002-12-30 00:00:00

  • Flexible longitudinal linear mixed models for multiple censored responses data.

    abstract::In biomedical studies and clinical trials, repeated measures are often subject to some upper and/or lower limits of detection. Hence, the responses are either left or right censored. A complication arises when more than one series of responses is repeatedly collected on each subject at irregular intervals over a perio...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.8017

    authors: Lachos VH,A Matos L,Castro LM,Chen MH

    update_date:2019-03-15 00:00:00

  • Testing goodness-of-fit of the logistic regression model in case-control studies using sample reweighting.

    abstract::A new goodness-of-fit test for the logistic regression model is proposed. It exploits the property of this model that when it is correct, i.e. not misspecified, the parameter estimates are (asymptotically) invariant under reweighting the observations by weights wi that are a function of the binary (0/1) outcomes yi. M...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.1997

    authors: Nagelkerke N,Smits J,le Cessie S,van Houwelingen H

    update_date:2005-01-15 00:00:00

  • A unified inference procedure for a class of measures to assess improvement in risk prediction systems with survival data.

    abstract::Risk prediction procedures can be quite useful for the patient's treatment selection, prevention strategy, or disease management in evidence-based medicine. Often, potentially important new predictors are available in addition to the conventional markers. The question is how to quantify the improvement from the new ma...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.5647

    authors: Uno H,Tian L,Cai T,Kohane IS,Wei LJ

    update_date:2013-06-30 00:00:00

  • Construction, validation and updating of a prognostic model for kidney graft survival.

    abstract::The construction, validation and updating of a prognostic model for kidney graft survival is reported using data from the Eurotransplant database. First, a model is constructed for data from transplantations in the period 1984 to 1987. The model is later updated for the 1988 to 1990 data. The first data set was randomly ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780141806

    authors: Van Houwelingen HC,Thorogood J

    update_date:1995-09-30 00:00:00

  • Simultaneous estimation of intrarater and interrater agreement for multiple raters under order restrictions for a binary trait.

    abstract::It is valuable in many studies to assess both intrarater and interrater agreement. Most measures of intrarater agreement do not adjust for unequal estimates of prevalence between the separate rating occasions for a given rater and measures of interrater agreement typically ignore data from the second set of assessment...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.1138

    authors: Lester Kirchner H,Lemke JH

    update_date:2002-06-30 00:00:00

  • Likelihood-based analysis of outcome-dependent sampling designs with longitudinal data.

    abstract::The use of outcome-dependent sampling with longitudinal data analysis has previously been shown to improve efficiency in the estimation of regression parameters. The motivating scenario is when outcome data exist for all cohort members but key exposure variables will be gathered only on a subset. Inference with outcom...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.7633

    authors: Zelnick LR,Schildcrout JS,Heagerty PJ

    update_date:2018-06-15 00:00:00

  • Semiparametric Bayesian variable selection for gene-environment interactions.

    abstract::Many complex diseases are known to be affected by the interactions between genetic variants and environmental exposures beyond the main genetic and environmental effects. Study of gene-environment (G×E) interactions is important for elucidating the disease etiology. Existing Bayesian methods for G×E interaction studie...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.8434

    authors: Ren J,Zhou F,Li X,Chen Q,Zhang H,Ma S,Jiang Y,Wu C

    update_date:2020-02-28 00:00:00

  • Measurement error in dietary assessment: an investigation using covariance structure models. Part II.

    abstract::In Part I we presented a covariance structure model for analysing measurement error in the assessment of nitrogen intake. In this paper we include data on urine nitrogen excretion which allows a critical assessment of the model proposed. Inclusion of urine nitrogen data produces more pessimistic estimates of the quali...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780121005

    authors: Plummer M,Clayton D

    update_date:1993-05-30 00:00:00