Methods for assessing reliability and validity for a measurement tool: a case study and critique using the WHO haemoglobin colour scale.

Abstract:

Before introducing a new measurement tool it is necessary to evaluate its performance. Several statistical methods have been developed, or used, to evaluate the reliability and validity of a new assessment method in such circumstances. In this paper we review some commonly used methods. Data from a study that was conducted to evaluate the usefulness of a specific measurement tool (the WHO Colour Scale) are then used to illustrate the application of these methods. The WHO Colour Scale was developed under the auspices of the WHO to provide a simple, portable and reliable method of detecting anaemia. This Colour Scale is a discrete interval scale, whereas the actual haemoglobin values it is used to estimate are on a continuous interval scale and can be measured accurately using electrical laboratory equipment. The methods we consider are: linear regression; correlation coefficients; paired t-tests; plotting differences against mean values and deriving limits of agreement; kappa and weighted kappa statistics; sensitivity and specificity; an intraclass correlation coefficient; and the repeatability coefficient. We note that, although the definition and properties of each of these methods are well established, inappropriate methods continue to be used in the medical literature for assessing reliability and validity, as evidenced in the context of the evaluation of the WHO Colour Scale.
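As an illustration only, three of the methods named in the abstract (limits of agreement, the unweighted kappa statistic, and sensitivity and specificity) can be sketched in a few lines of Python. This is a minimal sketch, not the authors' analysis: the function names, the paired readings and the 11 g/dl anaemia cut-off are all hypothetical values chosen for the example.

```python
from statistics import mean, stdev


def limits_of_agreement(new, ref):
    """Return (bias, lower, upper): mean difference and 95% limits of agreement."""
    diffs = [n - r for n, r in zip(new, ref)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def cohen_kappa(r1, r2):
    """Unweighted Cohen's kappa for two lists of categorical ratings."""
    n = len(r1)
    categories = set(r1) | set(r2)
    p_obs = sum(a == b for a, b in zip(r1, r2)) / n
    # Chance agreement from the marginal proportions of each rating list.
    p_exp = sum((r1.count(c) / n) * (r2.count(c) / n) for c in categories)
    # Undefined (division by zero) if both lists use a single shared category.
    return (p_obs - p_exp) / (1 - p_exp)


def sensitivity_specificity(test_pos, truth_pos):
    """Sensitivity and specificity of a binary test against a reference."""
    pairs = list(zip(test_pos, truth_pos))
    tp = sum(t and g for t, g in pairs)
    fn = sum((not t) and g for t, g in pairs)
    tn = sum((not t) and (not g) for t, g in pairs)
    fp = sum(t and (not g) for t, g in pairs)
    # Assumes at least one reference-positive and one reference-negative case.
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Hypothetical paired readings in g/dl: discrete scale vs. laboratory value.
    scale = [8, 10, 12, 10, 6, 12, 8, 10]
    lab = [8.5, 9.6, 12.4, 11.1, 6.8, 11.5, 9.2, 10.3]
    cutoff = 11.0  # illustrative anaemia threshold, an assumption of this sketch

    scale_anaemic = [x < cutoff for x in scale]
    lab_anaemic = [x < cutoff for x in lab]

    print("bias and limits of agreement:", limits_of_agreement(scale, lab))
    print("kappa:", cohen_kappa(scale_anaemic, lab_anaemic))
    print("sensitivity, specificity:",
          sensitivity_specificity(scale_anaemic, lab_anaemic))
```

Limits of agreement describe how far individual scale readings may fall from the laboratory value, whereas kappa and sensitivity/specificity only assess agreement on the dichotomised anaemic/non-anaemic classification, so the two kinds of summary answer different questions.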

journal_name

Stat Med

journal_title

Statistics in medicine

authors

White SA,van den Broek NR

doi

10.1002/sim.1804

subject

Has Abstract

pub_date

2004-05-30 00:00:00

pages

1603-19

issue

10

eissn

1097-0258

issn

0277-6715

journal_volume

23

pub_type

Journal Article
  • A review of methods for futility stopping based on conditional power.

    abstract::Conditional power (CP) is the probability that the final study result will be statistically significant, given the data observed thus far and a specific assumption about the pattern of the data to be observed in the remainder of the study, such as assuming the original design effect, or the effect estimated from the c...

    journal_title:Statistics in medicine

    pub_type: Journal Article, Review

    doi:10.1002/sim.2151

    authors: Lachin JM

    update_date:2005-09-30 00:00:00

  • The relative importance of prognostic factors in studies of survival.

    abstract::The relative importance of prognostic factors in regression can be measured either by standardized regression coefficients or by percentages of explained variation in a dependent variable. One advantage of using explained variation is the direct comparability of qualitative prognostic factors with others, or of groups...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780122413

    authors: Schemper M

    update_date:1993-12-30 00:00:00

  • Estimating treated prevalence and service utilization rates: assessing disparities in mental health.

    abstract::There is considerable public concern about health disparities among different cultural/racial/ethnic groups. Important process measures that might reflect inequities are treated prevalence and the service utilization rate in a defined period of time. We have previously described a method for estimating N, the distinct...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.3904

    authors: Laska EM,Meisner M,Wanderling J,Siegel C

    update_date:2010-07-20 00:00:00

  • A method to estimate the variance of an endpoint from an on-going blinded trial.

    abstract::Blinded estimation of variance allows for changing the sample size without compromising the integrity of the trial. Some of the methods that estimate the variance in a blinded manner either make untenable assumptions or are only applicable to two-treatment trials. We propose a new method for continuous endpoints that ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.2070

    authors: Xing B,Ganju J

    update_date:2005-06-30 00:00:00

  • Generalizability of causal inference in observational studies under retrospective convenience sampling.

    abstract::Many observational studies adopt what we call retrospective convenience sampling (RCS). With the sample size in each arm prespecified, RCS randomly selects subjects from the treatment-inclined subpopulation into the treatment arm and those from the control-inclined into the control arm. Samples in each arm are represe...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.7808

    authors: Hu Z,Qin J

    update_date:2018-05-20 00:00:00

  • Multiple imputation analysis of case-cohort studies.

    abstract::The usual methods for analyzing case-cohort studies rely on sometimes not fully efficient weighted estimators. Multiple imputation might be a good alternative because it uses all the data available and approximates the maximum partial likelihood estimator. This method is based on the generation of several plausible co...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4130

    authors: Marti H,Chavance M

    update_date:2011-06-15 00:00:00

  • Application of the parallel line assay to assessment of biosimilar products based on binary endpoints.

    abstract::Biological drug products are therapeutic moieties manufactured by a living system or organisms. These are important life-saving drug products for patients with unmet medical needs. Because of expensive cost, only a few patients have access to life-saving biological products. Most of the early biological products will ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.5565

    authors: Lin JR,Chow SC,Chang CH,Lin YC,Liu JP

    update_date:2013-02-10 00:00:00

  • Estimating a survival curve with unlinked entry and failure times.

    abstract::In monitoring a clinical trial or other observational study with a survival endpoint, sometimes the numbers of patients entering and dying at each time point are presented, but the connections between them are kept confidential. Hence, the exact time to failure or censoring for each individual is missing. We refer to ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.2819

    authors: Wu Y,Shih WJ,Moore DF

    update_date:2007-08-30 00:00:00

  • Predictive diagnostics for logistic models.

    abstract::Novel methodology is implemented to assess the predictive power of covariate information associated with sequential binary events. Logistic models are first fitted on the basis of a subset of the observations and then evaluated sequentially on the rest. The probabilistic forecasts are compared to the outcomes via a sc...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(SICI)1097-0258(19961030)15:20<2149::AID-S

    authors: Seillier-Moiseiwitsch F

    update_date:1996-10-30 00:00:00

  • Designing a study to evaluate the benefit of a biomarker for selecting patient treatment.

    abstract::Biomarkers that predict the efficacy of treatment can potentially improve clinical outcomes and decrease medical costs by allowing treatment to be provided only to those most likely to benefit. We consider the design of a randomized clinical trial in which one objective is to evaluate a treatment selection marker. The...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.6564

    authors: Janes H,Brown MD,Pepe MS

    update_date:2015-11-30 00:00:00

  • Design and sample size considerations for valuation studies of multi-attribute utility instruments.

    abstract::The EQ-5D, a widely used multiattribute utility instrument, is commonly used in health economic evaluations where the goal is to decide on which treatments to reimburse. Like other instruments, value sets of the EQ-5D are constructed using valuation studies typically valuing a subset of the health states and using pre...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.8592

    authors: Shams S,Pullenayegum E

    update_date:2020-10-15 00:00:00

  • Risk-adjusted CUSUM charts under model error.

    abstract::In recent years, quality control charts have been increasingly applied in the healthcare environment, for example, to monitor surgical performance. Risk-adjusted cumulative sum (CUSUM) charts that utilize risk scores like the Parsonnet score to estimate the probability of death of a patient from an operation turn out to b...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.8104

    authors: Knoth S,Wittenberg P,Gan FF

    update_date:2019-05-30 00:00:00

  • Small clinical trials: are they all bad?

    abstract::Statisticians have long argued that randomized controlled trials should be sufficiently large to achieve their purpose, and for common diseases with major public health implications this has brought many benefits. However, there are many instances where it is unrealistic to expect clinicians to provide the information...

    journal_title:Statistics in medicine

    pub_type: Journal Article, Review

    doi:10.1002/sim.4780140204

    authors: Matthews JN

    update_date:1995-01-30 00:00:00

  • The use of dual or multiple reports in epidemiologic studies.

    abstract::Weak measurement of epidemiologic exposures is an impediment to appreciation of the effects of those exposures. This paper discusses two strategies to assess the true effects of weakly measured exposure. The first is to use external information about the extent of mismeasurement to adjust estimates of the effects of e...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780080904

    authors: Marshall JR

    update_date:1989-09-01 00:00:00

  • The basic science and mathematics of random mutation and natural selection.

    abstract::The mutation and natural selection phenomenon can and often does cause the failure of antimicrobial, herbicidal, pesticide and cancer treatments selection pressures. This phenomenon operates in a mathematically predictable behavior, which when understood leads to approaches to reduce and prevent the failure of the use...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.6307

    authors: Kleinman A

    update_date:2014-12-20 00:00:00

  • Concordance correlation coefficient applied to discrete data.

    abstract::In any field in which decisions are subject to measurements, interchangeability between the methods used to obtain these measurements is essential. To consider methods as interchangeable, a certain degree of agreement is needed between the measurements they provide. The concordance correlation coefficient is an index ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.2397

    authors: Carrasco JL,Jover L

    update_date:2005-12-30 00:00:00

  • The application of large Gaussian mixed models to the analysis of 24 hour ambulatory blood pressure monitoring data in clinical trials.

    abstract::We propose the use of Gaussian mixed models to analyse statistically 24 hour ambulatory blood pressure data from clinical trials. We develop specific models and apply them to data from a clinical study that compares two angiotensin-converting enzyme inhibitors. We investigate and discuss computing issues related to th...

    journal_title:Statistics in medicine

    pub_type: Clinical Trial, Journal Article, Randomized Controlled Trial

    doi:10.1002/sim.4780121803

    authors: Selwyn MR,Difranco DM

    update_date:1993-09-30 00:00:00

  • Using pilot data to size a two-arm randomized trial to find a nearly optimal personalized treatment strategy.

    abstract::A personalized treatment strategy formalizes evidence-based treatment selection by mapping patient information to a recommended treatment. Personalized treatment strategies can produce better patient outcomes while reducing cost and treatment burden. Thus, among clinical and intervention scientists, there is a growing...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.6783

    authors: Laber EB,Zhao YQ,Regh T,Davidian M,Tsiatis A,Stanford JB,Zeng D,Song R,Kosorok MR

    update_date:2016-04-15 00:00:00

  • A numerical strategy to evaluate performance of predictive scores via a copula-based approach.

    abstract::Assessing and comparing the performance of correlated predictive scores are of current interest in precision medicine. Given the limitations of available theoretical approaches for assessing and comparing the predictive accuracy, numerical methods are highly desired which, however, have not been systematically develop...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.8566

    authors: Zhang Y,Shao Y

    update_date:2020-09-10 00:00:00

  • Subgroup identification using covariate-adjusted interaction trees.

    abstract::We consider the problem of identifying subgroups of participants in a clinical trial that have enhanced treatment effect. Recursive partitioning methods that recursively partition the covariate space based on some measure of between groups treatment effect difference are popular for such subgroup identification. The m...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.8214

    authors: Steingrimsson JA,Yang J

    update_date:2019-09-20 00:00:00

  • Hierarchical multiple informants models: examining food environment contributions to the childhood obesity epidemic.

    abstract::Methods for multiple informants help to estimate the marginal effect of each multiple source predictor and formally compare the strength of their association with an outcome. We extend multiple informant methods to the case of hierarchical data structures to account for within cluster correlation. We apply the propose...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.5967

    authors: Baek J,Sánchez BN,Sanchez-Vaznaugh EV

    update_date:2014-02-20 00:00:00

  • A sexually transmitted infection screening algorithm based on semiparametric regression models.

    abstract::Sexually transmitted infections (STIs) with Chlamydia trachomatis, Neisseria gonorrhoeae, and Trichomonas vaginalis are among the most common infectious diseases in the United States, disproportionately affecting young women. Because a significant portion of the infections present no symptoms, infection control relies...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.6515

    authors: Li Z,Liu H,Tu W

    update_date:2015-09-10 00:00:00

  • Multilocus linkage disequilibrium mapping of epistatic quantitative trait loci that regulate HIV dynamics: a simulation approach.

    abstract::The time-dependent change of HIV particle load, i.e. HIV dynamics, is likely to be controlled by a multitude of quantitative trait loci (QTL) that interact with each other as well as with various developmental and environmental factors in a coordinated manner. In this article, we have derived a new statistical model f...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.2489

    authors: Wu S,Yang J,Wu R

    update_date:2006-11-30 00:00:00

  • Estimation of sojourn time distributions and false negative rates in screening programmes which use two modalities.

    abstract::Day and Walter derived methods of joint maximum likelihood estimation for the sojourn time distribution and the false negative rate for a screening programme. Their methods are not directly applicable to a programme which uses alternate screening by two modalities whose sojourn times and false negative rates will diff...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780080611

    authors: Alexander FE

    update_date:1989-06-01 00:00:00

  • Adaptive increase in sample size when interim results are promising: a practical guide with examples.

    abstract::This paper discusses the benefits and limitations of adaptive sample size re-estimation for phase 3 confirmatory clinical trials. Comparisons are made with more traditional fixed sample and group sequential designs. It is seen that the real benefit of the adaptive approach arises through the ability to invest sample s...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4102

    authors: Mehta CR,Pocock SJ

    update_date:2011-12-10 00:00:00

  • Data-adaptive additive modeling.

    abstract::In this paper, we consider fitting a flexible and interpretable additive regression model in a data-rich setting. We wish to avoid pre-specifying the functional form of the conditional association between each covariate and the response, while still retaining interpretability of the fitted functions. A number of recen...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.7859

    authors: Petersen A,Witten D

    update_date:2019-02-20 00:00:00

  • Complete imputation of missing repeated categorical data: one-sample applications.

    abstract::Longitudinal studies with repeated measures are often subject to non-response. Methods currently employed to alleviate the difficulties caused by missing data are typically unsatisfactory, especially when the cause of the missingness is related to the outcomes. We present an approach for incomplete categorical data in...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.982

    authors: West CP,Dawson JD

    update_date:2002-01-30 00:00:00

  • Infant growth modelling using a shape invariant model with random effects.

    abstract::Models for infant growth have usually been based on parametric forms, commonly an exponential or similar model, which have been shown to fit poorly especially during the first year of life. An alternative approach is to use a non-parametric model, based on a shape invariant model (SIM), where a single function is tran...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.2718

    authors: Beath KJ

    update_date:2007-05-30 00:00:00

  • Modelling the geographical distribution of co-infection risk from single-disease surveys.

    abstract::BACKGROUND: The need to deliver interventions targeting multiple diseases in a cost-effective manner calls for integrated disease control efforts. Consequently, maps are required that show where the risk of co-infection is particularly high. Co-infection risk is preferably estimated via Bayesian geostatistical multinomi...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4243

    authors: Schur N,Gosoniu L,Raso G,Utzinger J,Vounatsou P

    update_date:2011-06-30 00:00:00

  • Curtailed two-stage designs in Phase II clinical trials.

    abstract::When the accrual rate is low and the treatment period is long, a long observational period is required before information concerning the primary end point, such as binary response, becomes available in the study. Simon's two-stage designs are often employed in Phase II clinical trials to avoid giving patients an ineffe...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.3424

    authors: Chi Y,Chen CM

    update_date:2008-12-20 00:00:00