Estimating overdispersion in sparse multinomial data.

Abstract:

Multinomial data arise in many areas of the life sciences, such as mark-recapture studies and phylogenetics, and will often be overdispersed, with the variance being higher than predicted by a multinomial model. The quasi-likelihood approach to modeling this overdispersion involves the assumption that the variance is proportional to that specified by the multinomial model. As this approach does not require specification of the full distribution of the response variable, it can be more robust than fitting a Dirichlet-multinomial model or adding a random effect to the linear predictor. Estimation of the amount of overdispersion is often based on Pearson's statistic X² or the deviance D. For many types of study, such as mark-recapture, the data will be sparse. The estimator based on X² can then be highly variable, and that based on D can have a large negative bias. We derive a new estimator, which has a smaller asymptotic variance than that based on X², the difference being most marked for sparse data. We illustrate the numerical difference between the three estimators using a mark-recapture study of swifts and compare their performance via a simulation study. The new estimator has the lowest root mean squared error across a range of scenarios, especially when the data are very sparse.
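As a rough illustration of the two conventional estimators named in the abstract, the sketch below divides Pearson's X² and the deviance D by the residual degrees of freedom. It is not the authors' code and does not reproduce the new estimator, whose form is not given here. The function name `overdispersion_estimates`, the array layout (one multinomial observation per row) and the degrees-of-freedom formula N(k − 1) − p are assumptions made for this example.

```python
import numpy as np


def overdispersion_estimates(y, mu, n_params):
    """Return (phi_pearson, phi_deviance) for multinomial counts.

    y        : (N, k) observed counts, one multinomial observation per row
    mu       : (N, k) fitted expected counts, all strictly positive
    n_params : number of parameters estimated by the fitted model

    Assumption: residual degrees of freedom are N * (k - 1) - n_params.
    """
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    n_rows, n_cats = y.shape
    df = n_rows * (n_cats - 1) - n_params
    if df <= 0:
        raise ValueError("residual degrees of freedom must be positive")

    # Pearson's statistic: sum of squared residuals scaled by fitted counts.
    x2 = np.sum((y - mu) ** 2 / mu)

    # Deviance: 2 * sum(y * log(y / mu)), with the convention 0 * log(0) = 0.
    ratio = np.where(y > 0, y / mu, 1.0)
    dev = 2.0 * np.sum(y * np.log(ratio))

    return x2 / df, dev / df


# Toy data (made up for illustration): two observations, three categories.
observed = [[3, 1, 0], [0, 2, 2]]
fitted = [[2, 1, 1], [1, 2, 1]]
phi_p, phi_d = overdispersion_estimates(observed, fitted, n_params=2)
print(phi_p, phi_d)
```

With sparse data, many fitted counts are small, so individual terms of X² can be very large; this is one way to see why the X²-based estimate can be highly variable in that setting.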

journal_name

Biometrics

journal_title

Biometrics

authors

Afroz F,Parry M,Fletcher D

doi

10.1111/biom.13194

pub_date

2020-09-01 00:00:00

pages

834-842

issue

3

eissn

1541-0420

issn

0006-341X

journal_volume

76

pub_type

journal article

  • Extension of the rank sum test for clustered data: two-group comparisons with group membership defined at the subunit level.

    abstract::The Wilcoxon rank sum test is widely used for two-group comparisons for nonnormal data. An assumption of this test is independence of sampling units both between and within groups. In ophthalmology, data are often collected on two eyes of an individual, which are highly correlated. In ophthalmological clinical trials,...

    journal_title:Biometrics

    pub_type: journal article

    doi:10.1111/j.1541-0420.2006.00582.x

    authors: Rosner B,Glynn RJ,Lee ML

    update_date:2006-12-01 00:00:00

  • Fitting nonlinear and constrained generalized estimating equations with optimization software.

    abstract::In this article, we present an estimation approach for solving nonlinear constrained generalized estimating equations that can be implemented using object-oriented software for nonlinear programming, such as nlminb in Splus or fmincon and lsqnonlin in Matlab. We show how standard estimating equation theory includes th...

    journal_title:Biometrics

    pub_type: journal article

    doi:10.1111/j.0006-341x.2000.01268.x

    authors: Contreras M,Ryan LM

    update_date:2000-12-01 00:00:00

  • A permutation approach for selecting the penalty parameter in penalized model selection.

    abstract::We describe a simple, computationally efficient, permutation-based procedure for selecting the penalty parameter in LASSO-penalized regression. The procedure, permutation selection, is intended for applications where variable selection is the primary focus, and can be applied in a variety of structural settings, inclu...

    journal_title:Biometrics

    pub_type: journal article

    doi:10.1111/biom.12359

    authors: Sabourin JA,Valdar W,Nobel AB

    update_date:2015-12-01 00:00:00

  • Line-segment confidence bands for repeated measures.

    abstract::For the case of repeated measures on Y with mean values linear in a concomitant variable Z in [a, b], a straight-line confidence band over [a, b] is given with width linear in Z. Graphical presentation of such line-segment confidence bands can help emphasize that appropriate inferences are limited to the range of the ...

    journal_title:Biometrics

    pub_type: journal article

    doi:

    authors: Stewart PW

    update_date:1987-09-01 00:00:00

  • Bayesian prediction of spatial count data using generalized linear mixed models.

    abstract::Spatial weed count data are modeled and predicted using a generalized linear mixed model combined with a Bayesian approach and Markov chain Monte Carlo. Informative priors for a data set with sparse sampling are elicited using a previously collected data set with extensive sampling. Furthermore, we demonstrate that so...

    journal_title:Biometrics

    pub_type: journal article

    doi:10.1111/j.0006-341x.2002.00280.x

    authors: Christensen OF,Waagepetersen R

    update_date:2002-06-01 00:00:00

  • On constrained balance randomization for clinical trials.

    abstract::A method is proposed for calculating the probabilities of assignment of a patient to treatments; it involves minimizing a quadratic criterion subject to a balance constraint. The optimal probabilities are very easy to compute. Numerical illustration is given and comparisons are drawn with the entropy-based methods of ...

    journal_title:Biometrics

    pub_type: journal article

    doi:

    authors: Titterington DM

    update_date:1983-12-01 00:00:00

  • Procedures for comparing samples with multiple endpoints.

    abstract::Five procedures are considered for the comparison of two or more multivariate samples. These procedures include a newly proposed nonparametric rank-sum test and a generalized least squares test. Also considered are the following tests: ordinary least squares, Hotelling's T2, and a Bonferroni per-experiment error-rate ...

    journal_title:Biometrics

    pub_type: journal article

    doi:

    authors: O'Brien PC

    update_date:1984-12-01 00:00:00

  • Model selection and inference for censored lifetime medical expenditures.

    abstract::Identifying factors associated with increased medical cost is important for many micro- and macro-institutions, including the national economy and public health, insurers and the insured. However, assembling comprehensive national databases that include both the cost and individual-level predictors can prove challengi...

    journal_title:Biometrics

    pub_type: journal article

    doi:10.1111/biom.12464

    authors: Johnson BA,Long Q,Huang Y,Chansky K,Redman M

    update_date:2016-09-01 00:00:00

  • An adaptive weighted log-rank test with application to cancer prevention and screening trials.

    abstract::A class of adaptive weighted log-rank statistics is described where the vector of weights is chosen in a data-dependent way from a family of "smooth" weight vectors. A parametric family of weight vectors is identified which includes most shapes of weighting vectors that will be near optimal in many cancer prevention a...

    journal_title:Biometrics

    pub_type: clinical trial, journal article, randomized controlled trial

    doi:

    authors: Self SG

    update_date:1991-09-01 00:00:00

  • Spatial regression with covariate measurement error: A semiparametric approach.

    abstract::Spatial data have become increasingly common in epidemiology and public health research thanks to advances in GIS (Geographic Information Systems) technology. In health research, for example, it is common for epidemiologists to incorporate geographically indexed data into their studies. In practice, however, the spati...

    journal_title:Biometrics

    pub_type: journal article

    doi:10.1111/biom.12474

    authors: Huque MH,Bondell HD,Carroll RJ,Ryan LM

    update_date:2016-09-01 00:00:00

  • Logarithmic transformations in ANOVA.

    abstract::A method is presented for choosing an additive constant c when transforming data x to y = log(x + c). The method preserves Type I error probability and power in ANOVA under the assumption that the x + c for some c are log-normally distributed. The method has advantages similar to those of rank transformations--namely,...

    journal_title:Biometrics

    pub_type: clinical trial, journal article

    doi:

    authors: Berry DA

    update_date:1987-06-01 00:00:00

  • Bayesian model selection for incomplete data using the posterior predictive distribution.

    abstract::We explore the use of a posterior predictive loss criterion for model selection for incomplete longitudinal data. We begin by identifying a property that most model selection criteria for incomplete data should consider. We then show that a straightforward extension of the Gelfand and Ghosh (1998, Biometrika, 85, 1-11...

    journal_title:Biometrics

    pub_type: journal article

    doi:10.1111/j.1541-0420.2012.01766.x

    authors: Daniels MJ,Chatterjee AS,Wang C

    update_date:2012-12-01 00:00:00

  • Quantifying the predictive performance of prognostic models for censored survival data with time-dependent covariates.

    abstract::Prognostic models in survival analysis typically aim to describe the association between patient covariates and future outcomes. More recently, efforts have been made to include covariate information that is updated over time. However, there exists as yet no standard approach to assess the predictive accuracy of such ...

    journal_title:Biometrics

    pub_type: journal article

    doi:10.1111/j.1541-0420.2007.00889.x

    authors: Schoop R,Graf E,Schumacher M

    update_date:2008-06-01 00:00:00

  • Optimal Bayesian design for patient selection in a clinical study.

    abstract::Bayesian experimental design for a clinical trial involves specifying a utility function that models the purpose of the trial, in this case the selection of patients for a diagnostic test. The best sample of patients is selected by maximizing expected utility. This optimization task poses difficulties due to a high-di...

    journal_title:Biometrics

    pub_type: journal article

    doi:10.1111/j.1541-0420.2008.01156.x

    authors: Buzoianu M,Kadane JB

    update_date:2009-09-01 00:00:00

  • A hypothesis test for the end of a common source outbreak.

    abstract::The objective of this article is to develop a hypothesis-testing procedure to determine whether a common source outbreak has ended. We consider the case when neither the calendar date of exposure to the pathogen nor the exact incubation period distribution is known. The hypothesis-testing procedure is based on the spa...

    journal_title:Biometrics

    pub_type: journal article

    doi:10.1111/j.1541-0420.2005.00421.x

    authors: Brookmeyer R,You X

    update_date:2006-03-01 00:00:00

  • Hypothesis testing of matrix graph model with application to brain connectivity analysis.

    abstract::Brain connectivity analysis is now at the foreground of neuroscience research. A connectivity network is characterized by a graph, where nodes represent neural elements such as neurons and brain regions, and links represent statistical dependence that is often encoded in terms of partial correlation. Such a graph is i...

    journal_title:Biometrics

    pub_type: journal article

    doi:10.1111/biom.12633

    authors: Xia Y,Li L

    update_date:2017-09-01 00:00:00

  • Analysis of ordered categorical data: two score-independent approaches.

    abstract:SUMMARY:A trend test is often employed to analyze ordered categorical data, in which a set of increasing scores is assigned a priori. There is a drawback in this approach, because how to choose a set of scores is not clear. There have been debates on which scores should be used (e.g., Graubard and Korn, 1987, Biometric...

    journal_title:Biometrics

    pub_type: journal article

    doi:10.1111/j.1541-0420.2008.00992.x

    authors: Zheng G

    update_date:2008-12-01 00:00:00

  • Sample size determination for establishing equivalence/noninferiority via ratio of two proportions in matched-pair design.

    abstract::In this article, we propose approximate sample size formulas for establishing equivalence or noninferiority of two treatments in a matched-pair design. Using the ratio of two proportions as the equivalence measure, we derive sample size formulas based on a score statistic for two types of analyses: hypothesis testing and...

    journal_title:Biometrics

    pub_type: journal article

    doi:10.1111/j.0006-341x.2002.00957.x

    authors: Tang ML,Tang NS,Chan IS,Chan BP

    update_date:2002-12-01 00:00:00

  • Robust inference for the stepped wedge design.

    abstract::Stepped wedge designed trials are a type of cluster-randomized study in which the intervention is introduced to each cluster in a random order over time. This design is often used to assess the effect of a new intervention as it is rolled out across a series of clinics or communities. Based on a permutation argument, ...

    journal_title:Biometrics

    pub_type: journal article

    doi:10.1111/biom.13106

    authors: Hughes JP,Heagerty PJ,Xia F,Ren Y

    update_date:2020-03-01 00:00:00

  • A spatial Bayesian latent factor model for image-on-image regression.

    abstract::Image-on-image regression analysis, using images to predict images, is a challenging task, due to (1) the high dimensionality and (2) the complex spatial dependence structures in image predictors and image outcomes. In this work, we propose a novel image-on-image regression model, by extending a spatial Bayesian laten...

    journal_title:Biometrics

    pub_type: journal article

    doi:10.1111/biom.13420

    authors: Guo C,Kang J,Johnson TD

    update_date:2020-12-27 00:00:00

  • Estimating treatment effect in a proportional hazards model in randomized clinical trials with all-or-nothing compliance.

    abstract::We consider methods for estimating the treatment effect and/or the covariate by treatment interaction effect in a randomized clinical trial under noncompliance with time-to-event outcome. As in Cuzick et al. (2007), assuming that the patient population consists of three (possibly latent) subgroups based on treatment p...

    journal_title:Biometrics

    pub_type: journal article

    doi:10.1111/biom.12472

    authors: Li S,Gray RJ

    update_date:2016-09-01 00:00:00

  • Accelerated hazards model based on parametric families generalized with Bernstein polynomials.

    abstract::A transformed Bernstein polynomial that is centered at standard parametric families, such as Weibull or log-logistic, is proposed for use in the accelerated hazards model. This class provides a convenient way towards creating a Bayesian nonparametric prior for smooth densities, blending the merits of parametric and no...

    journal_title:Biometrics

    pub_type: journal article

    doi:10.1111/biom.12104

    authors: Chen Y,Hanson T,Zhang J

    update_date:2014-03-01 00:00:00

  • Small sample inference for fixed effects from restricted maximum likelihood.

    abstract::Restricted maximum likelihood (REML) is now well established as a method for estimating the parameters of the general Gaussian linear model with a structured covariance matrix, in particular for mixed linear models. Conventionally, estimates of precision and inference for fixed effects are based on their asymptotic di...

    journal_title:Biometrics

    pub_type: journal article

    doi:

    authors: Kenward MG,Roger JH

    update_date:1997-09-01 00:00:00

  • Reader reaction: A note on the evaluation of group testing algorithms in the presence of misclassification.

    abstract::In the context of group testing screening, McMahan, Tebbs, and Bilder (2012, Biometrics 68, 287-296) proposed a two-stage procedure in a heterogenous population in the presence of misclassification. In earlier work published in Biometrics, Kim, Hudgens, Dreyfuss, Westreich, and Pilcher (2007, Biometrics 63, 1152-1162)...

    journal_title:Biometrics

    pub_type: comment, journal article

    doi:10.1111/biom.12385

    authors: Malinovsky Y,Albert PS,Roy A

    update_date:2016-03-01 00:00:00

  • Bayesian lasso for semiparametric structural equation models.

    abstract::There has been great interest in developing nonlinear structural equation models and associated statistical inference procedures, including estimation and model selection methods. In this paper a general semiparametric structural equation model (SSEM) is developed in which the structural equation is composed of nonpar...

    journal_title:Biometrics

    pub_type: journal article

    doi:10.1111/j.1541-0420.2012.01751.x

    authors: Guo R,Zhu H,Chow SM,Ibrahim JG

    update_date:2012-06-01 00:00:00

  • Bayesian models for multivariate current status data with informative censoring.

    abstract::Multivariate current status data consist of indicators of whether each of several events occur by the time of a single examination. Our interest focuses on inferences about the joint distribution of the event times. Conventional methods for analysis of multiple event-time data cannot be used because all of the event ...

    journal_title:Biometrics

    pub_type: journal article

    doi:10.1111/j.0006-341x.2002.00079.x

    authors: Dunson DB,Dinse GE

    update_date:2002-03-01 00:00:00

  • Response-adaptive regression for longitudinal data.

    abstract::We propose a response-adaptive model for functional linear regression, which is adapted to sparsely sampled longitudinal responses. Our method aims at predicting response trajectories and models the regression relationship by directly conditioning the sparse and irregular observations of the response on the predictor,...

    journal_title:Biometrics

    pub_type: journal article

    doi:10.1111/j.1541-0420.2010.01518.x

    authors: Wu S,Müller HG

    update_date:2011-09-01 00:00:00

  • Validity of tests under covariate-adaptive biased coin randomization and generalized linear models.

    abstract::Some covariate-adaptive randomization methods have been used in clinical trials for a long time, but little theoretical work has been done about testing hypotheses under covariate-adaptive randomization until Shao et al. (2010) who provided a theory with detailed discussion for responses under linear models. In this a...

    journal_title:Biometrics

    pub_type: journal article

    doi:10.1111/biom.12062

    authors: Shao J,Yu X

    update_date:2013-12-01 00:00:00

  • Selecting factors predictive of heterogeneity in multivariate event time data.

    abstract::In multivariate survival analysis, investigators are often interested in testing for heterogeneity among clusters, both overall and within specific classes. We represent different hypotheses about the heterogeneity structure using a sequence of gamma frailty models, ranging from a null model with no random effects to ...

    journal_title:Biometrics

    pub_type: journal article

    doi:10.1111/j.0006-341X.2004.00179.x

    authors: Dunson DB,Chen Z

    update_date:2004-06-01 00:00:00

  • Semiparametric modeling of longitudinal measurements and time-to-event data--a two-stage regression calibration approach.

    abstract:SUMMARY:In this article we investigate regression calibration methods to jointly model longitudinal and survival data using a semiparametric longitudinal model and a proportional hazards model. In the longitudinal model, a biomarker is assumed to follow a semiparametric mixed model where covariate effects are modeled p...

    journal_title:Biometrics

    pub_type: journal article

    doi:10.1111/j.1541-0420.2007.00983.x

    authors: Ye W,Lin X,Taylor JM

    update_date:2008-12-01 00:00:00