Performance of analytical methods for overdispersed counts in cluster randomized trials: sample size, degree of clustering and imbalance.

Abstract:

Many different methods have been proposed for the analysis of cluster randomized trials (CRTs) over the last 30 years. However, the evaluation of methods on overdispersed count data has been based mostly on the comparison of results using empirical data, i.e. when the true model parameters are not known. In this study, we assess via simulation the performance of five methods for the analysis of counts in situations similar to real community-intervention trials. We used the negative binomial distribution to simulate overdispersed counts of CRTs with two study arms, allowing the period of time under observation to vary among individuals. We assessed different sample sizes, degrees of clustering and degrees of cluster-size imbalance. The compared methods are: (i) the two-sample t-test of cluster-level rates, (ii) generalized estimating equations (GEE) with empirical covariance estimators, (iii) GEE with model-based covariance estimators, (iv) generalized linear mixed models (GLMM) and (v) Bayesian hierarchical models (Bayes-HM). Variation in sample size and clustering led to differences between the methods in terms of coverage, significance, power and random-effects estimation. GLMM and Bayes-HM performed better in general, with Bayes-HM producing less dispersed random-effects estimates, although these were biased upward when clustering was low. GEE showed higher power but anticonservative coverage and elevated type I error rates. Imbalance affected the overall performance of the cluster-level t-test and the coverage of GEE in small samples. Important effects arising from accounting for overdispersion are illustrated through the analysis of a community-intervention trial on Solar Water Disinfection in rural Bolivia.
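The simulation setup described in the abstract can be illustrated with a minimal sketch, which is not the authors' code. It draws negative binomial counts as a gamma-Poisson mixture for a two-arm CRT with varying follow-up time, then computes the pooled two-sample t statistic on cluster-level rates (method (i)). Every numeric setting here is a hypothetical choice for illustration: the number and size of clusters, the baseline rate of 2.0, the rate ratio of 0.7, the log-normal cluster effect with SD 0.3, the dispersion of 0.5 and the uniform follow-up between 0.5 and 1.0. The helper names are also invented.

```python
import math
import random
import statistics


def poisson(rng, lam):
    """Draw a Poisson variate by Knuth's multiplication method (adequate for small means)."""
    if lam <= 0.0:
        return 0
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def neg_binomial(rng, mean, dispersion):
    """Gamma-Poisson mixture, so that variance = mean + dispersion * mean**2."""
    if mean <= 0.0:
        return 0
    # gammavariate(shape, scale): the mixing rate has expectation `mean`
    lam = rng.gammavariate(1.0 / dispersion, dispersion * mean)
    return poisson(rng, lam)


def simulate_arm(rng, n_clusters, cluster_size, log_rate, sigma_cluster, dispersion):
    """Return one observed rate (events per unit of time at risk) per cluster."""
    rates = []
    for _ in range(n_clusters):
        # Assumed log-normal cluster effect; a larger sigma_cluster means stronger clustering
        cluster_rate = math.exp(log_rate + rng.gauss(0.0, sigma_cluster))
        events, time_at_risk = 0, 0.0
        for _ in range(cluster_size):
            t = rng.uniform(0.5, 1.0)  # observation time varies among individuals
            events += neg_binomial(rng, cluster_rate * t, dispersion)
            time_at_risk += t
        rates.append(events / time_at_risk)
    return rates


def pooled_t(x, y):
    """Two-sample t statistic with pooled variance, plus its degrees of freedom."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * statistics.variance(x)
                  + (ny - 1) * statistics.variance(y)) / (nx + ny - 2)
    t = (statistics.mean(x) - statistics.mean(y)) / math.sqrt(pooled_var * (1 / nx + 1 / ny))
    return t, nx + ny - 2


rng = random.Random(2009)  # fixed seed so the run is reproducible
control = simulate_arm(rng, 10, 20, math.log(2.0), 0.3, 0.5)
treated = simulate_arm(rng, 10, 20, math.log(2.0) + math.log(0.7), 0.3, 0.5)
t_stat, df = pooled_t(treated, control)
print(f"t = {t_stat:.2f} on {df} df")
```

Repeating such a run many times under a null rate ratio of 1 would give an empirical type I error rate, and under a non-null ratio an empirical power, which is the kind of comparison the abstract reports across methods.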

journal_name

Stat Med

journal_title

Statistics in medicine

authors

Durán Pacheco G,Hattendorf J,Colford JM Jr,Mäusezahl D,Smith T

doi

10.1002/sim.3681

pub_date

2009-10-30 00:00:00

pages

2989-3011

issue

24

eissn

1097-0258

issn

0277-6715

journal_volume

28

pub_type

Journal Article
  • Analysis of the ratio of marginal probabilities in a matched-pair setting.

    abstract::Statistical methods for testing and interval estimation of the ratio of marginal probabilities in the matched-pair setting are considered in this paper. We are especially interested in the situation where the null value is not one, as in one-sided equivalence trials. We propose a Fieller-type statistic based on constr...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.1017

    authors: Nam JM,Blackwelder WC

    updated:2002-03-15 00:00:00

  • Outcome-adaptive randomization for a delayed outcome with a short-term predictor: imputation-based designs.

    abstract::Delay in the outcome variable is challenging for outcome-adaptive randomization, as it creates a lag between the number of subjects accrued and the information known at the time of the analysis. Motivated by a real-life pediatric ulcerative colitis trial, we consider a case where a short-term predictor is available fo...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.6222

    authors: Kim MO,Liu C,Hu F,Lee JJ

    updated:2014-10-15 00:00:00

  • Model selection in logistic joinpoint regression with applications to analyzing cohort mortality patterns.

    abstract::We consider a general model for anomaly detection in a longitudinal cohort mortality pattern based on logistic joinpoint regression with unknown joinpoints. We discuss backward and forward sequential procedures for selecting both the locations and the number of joinpoints. Estimation of the model parameters and the se...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.3017

    authors: Czajkowski M,Gill R,Rempala G

    updated:2008-04-30 00:00:00

  • CoPlot: a tool for visualizing multivariate data in medicine.

    abstract::Many critical questions in medicine require the analysis of complex multivariate data, often from large data sets describing numerous variables for numerous subjects. In this paper, we describe CoPlot, a tool for visualizing multivariate data in medicine. CoPlot is an adaptation of multidimensional scaling (MDS) that ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.3078

    authors: Bravata DM,Shojania KG,Olkin I,Raveh A

    updated:2008-05-30 00:00:00

  • A pathway analysis method for genome-wide association studies.

    abstract::For genome-wide association studies, we propose a new method for identifying significant biological pathways. In this approach, we aggregate data across single-nucleotide polymorphisms to obtain summary measures at the gene level. We then use a hierarchical Bayesian model, which takes the gene-level summary measures a...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4477

    authors: Shahbaba B,Shachaf CM,Yu Z

    updated:2012-05-10 00:00:00

  • Conflicts of interest in data monitoring of industry versus publicly financed clinical trials.

    abstract::The FDA Guidance, while highly appropriate for industry sponsored trials, need not be imposed on publicly (e.g. NIH) financed clinical trials. While the potential for conflicts of interest exist in the latter, they are in general manageable and pose an acceptable low risk of threatening the integrity of a study. Howev...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.1787

    authors: Lachin JM

    updated:2004-05-30 00:00:00

  • An analysis of disease surveillance data that uses the geographic locations of the reporting units.

    abstract::The primary purpose of a disease surveillance system is to provide data for the detection of changes in the incidence of the disease. Methods for the analysis of data from surveillance systems are reviewed. A new procedure is proposed for use when the system includes geographically dispersed reporting units, such as h...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780080306

    authors: Raubertas RF

    updated:1989-03-01 00:00:00

  • Comparisons of risk prediction methods using nested case-control data.

    abstract::Using both simulated and real datasets, we compared two approaches for estimating absolute risk from nested case-control (NCC) data and demonstrated the feasibility of using the NCC design for estimating absolute risk. In contrast to previously published results, we successfully demonstrated not only that data from a ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.7143

    authors: Salim A,Delcoigne B,Villaflores K,Koh WP,Yuan JM,van Dam RM,Reilly M

    updated:2017-02-10 00:00:00

  • A Markov mixed effect regression model for drug compliance.

    abstract::Patient compliance (adherence) with prescribed medication is often erratic, while clinical outcomes are causally linked to actual, rather than nominal medication dosage. We propose here a hierarchical Markov model for patient compliance. At the first stage, conditional upon individual random effects and a set of indiv...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(sici)1097-0258(19981030)17:20<2313::aid-s

    authors: Girard P,Blaschke TF,Kastrissios H,Sheiner LB

    updated:1998-10-30 00:00:00

  • Issues in applied statistics for public health bioterrorism surveillance using multiple data streams: research needs.

    abstract::The objective of this report is to provide a basis to inform decisions about priorities for developing statistical research initiatives in the field of public health surveillance for emerging threats. Rapid information system advances have created a vast opportunity of secondary data sources for information to enhance...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.2793

    authors: Rolka H,Burkom H,Cooper GF,Kulldorff M,Madigan D,Wong WK

    updated:2007-04-15 00:00:00

  • Integrating multiple-domain rules for disease classification.

    abstract::In psychiatry, clinicians use criteria sets from the Diagnostic and Statistical Manual of Mental Disorders to diagnose mental disorders. Most criteria sets have several symptom domains, and in order to be diagnosed, an individual must meet the minimum number of symptoms required by each domain. Some efforts are now fo...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.8173

    authors: Mauro C,Shear MK,Wang Y

    updated:2019-07-20 00:00:00

  • Cutoff designs for community-based intervention studies.

    abstract::Public health interventions are often designed to target communities defined either geographically (e.g. cities, counties) or socially (e.g. schools or workplaces). The group randomized trial (GRT) is regarded as the gold standard for evaluating these interventions. However, community leaders may object to randomizati...

    journal_title:Statistics in medicine

    pub_type: Journal Article, Randomized Controlled Trial

    doi:10.1002/sim.4237

    authors: Pennell ML,Hade EM,Murray DM,Rhoda DA

    updated:2011-07-10 00:00:00

  • Reflecting on "A Statistician in Medicine" in 2020.

    abstract::In this commentary, we revisit Sir Austin Bradford Hill's seminal Alfred Watson Memorial Lecture in 1962 through the eyes of two practicing biostatisticians of the current era. We summarize some eternal takeaway messages from Hill's lecture regarding observations and experiments translated through the modern lexicon o...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.8830

    authors: Dempsey W,Mukherjee B

    updated:2021-01-15 00:00:00

  • Human disease cost network analysis.

    abstract::Diseases can be interconnected. In the recent years, there has been a surge of multidisease studies. Among them, HDN (human disease network) analysis takes a system perspective, examines the interconnections among diseases along with their individual properties, and has demonstrated great potential. Most of the existi...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.8472

    authors: Ma C,Li Y,Shia B,Ma S

    updated:2020-04-30 00:00:00

  • Family-based association tests for survival and times-to-onset analysis.

    abstract::In this paper, we discuss family-based association test (FBATs) relating genetic data to survival and time-to-onset data. We show how the standard logrank and Wilcoxon statistics can be used with family data to develop tests of association. We prove that the FBAT-logrank approach can be identical to the proportional h...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.1707

    authors: Lange C,Blacker D,Laird NM

    updated:2004-01-30 00:00:00

  • Multilevel latent variable models for global health-related quality of life assessment.

    abstract::Quality of life (QOL) assessment is a key component of many clinical studies and frequently requires the use of single global summary measures that capture the overall balance of findings from a potentially wide-ranging assessment of QOL issues. We propose and evaluate an irregular multilevel latent variable model sui...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4455

    authors: Kifley A,Heller GZ,Beath KJ,Bulger D,Ma J,Gebski V

    updated:2012-05-20 00:00:00

  • Complete imputation of missing repeated categorical data: one-sample applications.

    abstract::Longitudinal studies with repeated measures are often subject to non-response. Methods currently employed to alleviate the difficulties caused by missing data are typically unsatisfactory, especially when the cause of the missingness is related to the outcomes. We present an approach for incomplete categorical data in...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.982

    authors: West CP,Dawson JD

    updated:2002-01-30 00:00:00

  • Analyzing survival curves at a fixed point in time.

    abstract::A common problem encountered in many medical applications is the comparison of survival curves. Often, rather than comparison of the entire survival curves, interest is focused on the comparison at a fixed point in time. In most cases, the naive test based on a difference in the estimates of survival is used for this ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.2864

    authors: Klein JP,Logan B,Harhoff M,Andersen PK

    updated:2007-10-30 00:00:00

  • Regression analysis applied to PVC histories: a statistical procedure for evaluating antiarrhythmic drug efficacy.

    abstract::Suppression of premature ventricular contractions (PVCs) is one of the goals of antiarrhythmic therapy. In a clinical trial, however, it may be difficult to distinguish antiarrhythmic drug effect from spontaneous variation in PVCs. We propose the application of linear regression to PVC histories to ascertain drug effe...

    journal_title:Statistics in medicine

    pub_type: Clinical Trial, Journal Article

    doi:10.1002/sim.4780020305

    authors: Berry DA,Fox TL

    updated:1983-07-01 00:00:00

  • Infant growth modelling using a shape invariant model with random effects.

    abstract::Models for infant growth have usually been based on parametric forms, commonly an exponential or similar model, which have been shown to fit poorly especially during the first year of life. An alternative approach is to use a non-parametric model, based on a shape invariant model (SIM), where a single function is tran...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.2718

    authors: Beath KJ

    updated:2007-05-30 00:00:00

  • A review of methods for futility stopping based on conditional power.

    abstract::Conditional power (CP) is the probability that the final study result will be statistically significant, given the data observed thus far and a specific assumption about the pattern of the data to be observed in the remainder of the study, such as assuming the original design effect, or the effect estimated from the c...

    journal_title:Statistics in medicine

    pub_type: Journal Article, Review

    doi:10.1002/sim.2151

    authors: Lachin JM

    updated:2005-09-30 00:00:00

  • Incorporating data from various trial designs into a mixed treatment comparison model.

    abstract::Estimates of relative efficacy between alternative treatments are crucial for decision making in health care. Bayesian mixed treatment comparison models provide a powerful methodology to obtain such estimates when head-to-head evidence is not available or insufficient. In recent years, this methodology has become wide...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.5764

    authors: Schmitz S,Adams R,Walsh C

    updated:2013-07-30 00:00:00

  • Joint analysis of multi-level repeated measures data and survival: an application to the end stage renal disease (ESRD) data.

    abstract::Shared random effects models have been increasingly common in the joint analyses of repeated measures (e.g. CD4 counts, hemoglobin levels) and a correlated failure time such as death. In this paper we study several shared random effects models in the multi-level repeated measures data setting with dependent failure ti...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.3392

    authors: Liu L,Ma JZ,O'Quigley J

    updated:2008-11-29 00:00:00

  • Predictive accuracy of risk factors and markers: a simulation study of the effect of novel markers on different performance measures for logistic regression models.

    abstract::The change in c-statistic is frequently used to summarize the change in predictive accuracy when a novel risk factor is added to an existing logistic regression model. We explored the relationship between the absolute change in the c-statistic, Brier score, generalized R(2) , and the discrimination slope when a risk f...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.5598

    authors: Austin PC,Steyerberg EW

    updated:2013-02-20 00:00:00

  • A meta-analysis of clinical trials involving different classifications of response into ordered categories.

    abstract::Statistical methods are available for performing a meta-analysis when the response variable of interest is the same in each study. Problems arise when studies exploring a common therapeutic question use different patient response types. This article presents statistical methods for combining studies which involve diff...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780132313

    authors: Whitehead A,Jones NM

    updated:1994-12-15 00:00:00

  • Spatial clustering of the failure to geocode and its implications for the detection of disease clustering.

    abstract::Geocoding a study population as completely as possible is an important data assimilation component of many spatial epidemiologic studies. Unfortunately, complete geocoding is rare in practice. The failure of a substantial proportion of study subjects' addresses to geocode has consequences for spatial analyses, some of...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.3288

    authors: Zimmerman DL,Fang X,Mazumdar S

    updated:2008-09-20 00:00:00

  • Semi-parametric modelling for costs of health care technologies.

    abstract::Cost data that arise in the evaluation of health care technologies usually exhibit highly skew, heavy-tailed and, possibly, multi-modal distributions. Distribution-free methods for analysing these data, such as the bootstrap, or those based on the asymptotic normality of sample means, may often lead to inefficient or ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.2012

    authors: Conigliani C,Tancredi A

    updated:2005-10-30 00:00:00

  • Proportion cured and mean log survival time as functions of tumour size.

    abstract::We obtained maximum likelihood estimates (MLEs) of the proportion cured pi c and mean log survival time mu t for a sample of 4355 patients with intraocular melanoma whose survival times subsequent to treatment were assumed to follow a log-normal distribution. Following stratification by tumour size, MLEs of pi c and m...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780090814

    authors: Gamel JW,McLean IW,Rosenberg SH

    updated:1990-08-01 00:00:00

  • Software for tabular data protection.

    abstract::In order for national statistical offices to maintain the trust of the public to collect data and publish statistics of importance to society and decision-making, it is imperative that respondents (persons or establishments) be guaranteed privacy and confidentiality in return for providing requested confidential data....

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.2043

    authors: Gonzalez JF Jr,Cox LH

    updated:2005-02-28 00:00:00

  • Concordance correlation coefficient applied to discrete data.

    abstract::In any field in which decisions are subject to measurements, interchangeability between the methods used to obtain these measurements is essential. To consider methods as interchangeable, a certain degree of agreement is needed between the measurements they provide. The concordance correlation coefficient is an index ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.2397

    authors: Carrasco JL,Jover L

    updated:2005-12-30 00:00:00