A Bayesian goodness of fit test and semiparametric generalization of logistic regression with measurement data.

Abstract:

Logistic regression is a popular tool for risk analysis in medical and population health science. With continuous response data, it is common to create a dichotomous outcome for logistic regression analysis by specifying a threshold for positivity. Fitting a linear regression to the nondichotomized response variable assuming a logistic sampling model for the data has been empirically shown to yield more efficient estimates of odds ratios than ordinary logistic regression of the dichotomized endpoint. We illustrate that risk inference is not robust to departures from the parametric logistic distribution. Moreover, the model assumption of proportional odds is generally not satisfied when the condition of a logistic distribution for the data is violated, leading to biased inference from a parametric logistic analysis. We develop novel Bayesian semiparametric methodology for testing goodness of fit of parametric logistic regression with continuous measurement data. The testing procedures hold for any cutoff threshold and our approach simultaneously provides the ability to perform semiparametric risk estimation. Bayes factors are calculated using the Savage-Dickey ratio for testing the null hypothesis of logistic regression versus a semiparametric generalization. We propose a fully Bayesian and a computationally efficient empirical Bayesian approach to testing, and we present methods for semiparametric estimation of risks, relative risks, and odds ratios when parametric logistic regression fails. Theoretical results establish the consistency of the empirical Bayes test. Results from simulated data show that the proposed approach provides accurate inference irrespective of whether parametric assumptions hold or not. Evaluation of risk factors for obesity shows that different inferences are derived from an analysis of a real data set when deviations from a logistic distribution are permissible in a flexible semiparametric framework.
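
The link the abstract draws between a linear model with logistic errors and logistic regression of a dichotomized outcome can be sketched numerically. The snippet below is only an illustration under stated assumptions, not the authors' method or code: it assumes a single covariate x and the model Y = beta0 + beta1*x + sigma*eps with eps following a standard logistic distribution, and the parameter values and function names are hypothetical. Under that assumption, P(Y > c | x) = expit((beta0 + beta1*x - c) / sigma), so the odds ratio for a one-unit change in x equals exp(beta1 / sigma) at every cutoff c, which is the proportional odds property the abstract says breaks down when the data are not logistic.

```python
import math


def expit(z):
    """Standard logistic CDF."""
    return 1.0 / (1.0 + math.exp(-z))


def exceedance_prob(x, cutoff, beta0, beta1, sigma):
    """P(Y > cutoff | x) for Y = beta0 + beta1*x + sigma*eps, eps ~ standard logistic."""
    return expit((beta0 + beta1 * x - cutoff) / sigma)


def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)


def odds_ratio(cutoff, beta0=25.0, beta1=1.5, sigma=2.0):
    """Odds ratio of exceeding the cutoff for x = 1 versus x = 0.

    The default parameter values are hypothetical and chosen only for illustration.
    """
    p1 = exceedance_prob(1.0, cutoff, beta0, beta1, sigma)
    p0 = exceedance_prob(0.0, cutoff, beta0, beta1, sigma)
    return odds(p1) / odds(p0)


if __name__ == "__main__":
    # The odds ratio should not depend on the cutoff used to dichotomize Y.
    for cutoff in (25.0, 30.0, 35.0):
        print(cutoff, odds_ratio(cutoff), math.exp(1.5 / 2.0))
```

If the error distribution were replaced by a non-logistic one (for example, a normal CDF in place of expit), the computed odds ratio would generally change with the cutoff, which is one informal way to see why the goodness of fit question matters.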

journal_name

Biometrics

authors

Schörgendorfer A,Branscum AJ,Hanson TE

doi

10.1111/biom.12007

pub_date

2013-06-01 00:00:00

pages

508-19

issue

2

eissn

1541-0420

issn

0006-341X

journal_volume

69

pub_type

Journal Article
  • A score regression approach to assess calibration of continuous probabilistic predictions.

    abstract::Calibration, the statistical consistency of forecast distributions and the observations, is a central requirement for probabilistic predictions. Calibration of continuous forecasts is typically assessed using the probability integral transform histogram. In this article, we propose significance tests based on scoring ...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/j.1541-0420.2010.01406.x

    authors: Held L,Rufibach K,Balabdaoui F

    update_date:2010-12-01 00:00:00

  • Tests for monotone mean residual life, using randomly censored data.

    abstract::At any age the mean residual life function gives the expected remaining life at that age. Reliabilists and biometricians have found it useful to categorize failure distributions by the monotonicity properties of the mean residual life function. Hollander and Proschan (1975, Biometrika 62, 585-593) have derived tests o...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:

    authors: Chen YY,Hollander M,Langberg NA

    update_date:1983-03-01 00:00:00

  • Performance of generalized estimating equations in practical situations.

    abstract::Moment methods for analyzing repeated binary responses have been proposed by Liang and Zeger (1986, Biometrika 73, 13-22), and extended by Prentice (1988, Biometrics 44, 1033-1048). In their generalized estimating equations (GEE), both Liang and Zeger (1986) and Prentice (1988) estimate the parameters associated with ...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:

    authors: Lipsitz SR,Fitzmaurice GM,Orav EJ,Laird NM

    update_date:1994-03-01 00:00:00

  • Sample size determination for testing whether an identified treatment is best.

    abstract::Laska and Meisner (1989, Biometrics 45, 1139-1151) dealt with the problem of testing whether an identified treatment belonging to a set of k + 1 treatments is better than each of the other k treatments. They calculated sample size tables for k = 2 when using multiple t-tests or Wilcoxon-Mann-Whitney tests, both under ...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/j.0006-341x.2000.00879.x

    authors: Horn M,Vollandt R,Dunnett CW

    update_date:2000-09-01 00:00:00

  • Sequential model selection-based segmentation to detect DNA copy number variation.

    abstract::Array-based CGH experiments are designed to detect genomic aberrations or regions of DNA copy-number variation that are associated with an outcome, typically a state of disease. Most of the existing statistical methods target on detecting DNA copy number variations in a single sample or array. We focus on the detectio...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/biom.12478

    authors: Hu J,Zhang L,Wang HJ

    update_date:2016-09-01 00:00:00

  • Combining band recovery data and Pollock's robust design to model temporary and permanent emigration.

    abstract::Capture-recapture models are widely used to estimate demographic parameters of marked populations. Recently, this statistical theory has been extended to modeling dispersal of open populations. Multistate models can be used to estimate movement probabilities among subdivided populations if multiple sites are sampled. ...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/j.0006-341x.2001.00273.x

    authors: Lindberg MS,Kendall WL,Hines JE,Anderson MG

    update_date:2001-03-01 00:00:00

  • Bayesian estimation of the probability of asbestos exposure from lung fiber counts.

    abstract::Asbestos exposure is a well-known risk factor for various lung diseases, and when they occur, workmen's compensation boards need to make decisions concerning the probability the cause is work related. In the absence of a definitive work history, measures of short and long asbestos fibers as well as counts of asbestos ...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/j.1541-0420.2009.01279.x

    authors: Weichenthal S,Joseph L,Bélisle P,Dufresne A

    update_date:2010-06-01 00:00:00

  • The probability of causation under a stochastic model for individual risk.

    abstract::In this paper we offer a mathematical definition for the probability of causation that formalizes the legal and ordinary-language meaning of the term. We show that, under this definition, even the average probability of causation among exposed cases is not identifiable from epidemiologic data. This is because the prob...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:

    authors: Robins J,Greenland S

    update_date:1989-12-01 00:00:00

  • Drawing inferences for high-dimensional linear models: A selection-assisted partial regression and smoothing approach.

    abstract::Drawing inferences for high-dimensional models is challenging as regular asymptotic theories are not applicable. This article proposes a new framework of simultaneous estimation and inferences for high-dimensional linear models. By smoothing over partial regression estimates based on a given variable selection scheme,...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/biom.13013

    authors: Fei Z,Zhu J,Banerjee M,Li Y

    update_date:2019-06-01 00:00:00

  • Propensity score matching and subclassification in observational studies with multi-level treatments.

    abstract::In this article, we develop new methods for estimating average treatment effects in observational studies, in settings with more than two treatment levels, assuming unconfoundedness given pretreatment variables. We emphasize propensity score subclassification and matching methods which have been among the most popular...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/biom.12505

    authors: Yang S,Imbens GW,Cui Z,Faries DE,Kadziola Z

    update_date:2016-12-01 00:00:00

  • Valid inference in random effects meta-analysis.

    abstract::The standard approach to inference for random effects meta-analysis relies on approximating the null distribution of a test statistic by a standard normal distribution. This approximation is asymptotic on k, the number of studies, and can be substantially in error in medical meta-analyses, which often have only a few ...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/j.0006-341x.1999.00732.x

    authors: Follmann DA,Proschan MA

    update_date:1999-09-01 00:00:00

  • Determining the number of clusters using the weighted gap statistic.

    abstract::Estimating the number of clusters in a data set is a crucial step in cluster analysis. In this article, motivated by the gap method (Tibshirani, Walther, and Hastie, 2001, Journal of the Royal Statistical Society B63, 411-423), we propose the weighted gap and the difference of difference-weighted (DD-weighted) gap met...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/j.1541-0420.2007.00784.x

    authors: Yan M,Ye K

    update_date:2007-12-01 00:00:00

  • Hypothesis testing of matrix graph model with application to brain connectivity analysis.

    abstract::Brain connectivity analysis is now at the foreground of neuroscience research. A connectivity network is characterized by a graph, where nodes represent neural elements such as neurons and brain regions, and links represent statistical dependence that is often encoded in terms of partial correlation. Such a graph is i...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/biom.12633

    authors: Xia Y,Li L

    update_date:2017-09-01 00:00:00

  • Estimating overdispersion in sparse multinomial data.

    abstract::Multinomial data arise in many areas of the life sciences, such as mark-recapture studies and phylogenetics, and will often be overdispersed, with the variance being higher than predicted by a multinomial model. The quasi-likelihood approach to modeling this overdispersion involves the assumption that the variance is ...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/biom.13194

    authors: Afroz F,Parry M,Fletcher D

    update_date:2020-09-01 00:00:00

  • Estimating the average treatment effect on survival based on observational data and using partly conditional modeling.

    abstract::Treatments are frequently evaluated in terms of their effect on patient survival. In settings where randomization of treatment is not feasible, observational data are employed, necessitating correction for covariate imbalances. Treatments are usually compared using a hazard ratio. Most existing methods which quantify ...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/biom.12542

    authors: Gong Q,Schaubel DE

    update_date:2017-03-01 00:00:00

  • On pooling across strata when frequency matching has been followed in a cohort study.

    abstract::In a study designed to assess the relationship between a dichotomous exposure and the eventual occurrence of a dichotomous outcome, frequency matching has been proposed as a way to balance the exposure cohorts with respect to the sampling distribution of potential confounding factors. This paper discusses the pooled e...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:

    authors: Weinberg CR

    update_date:1985-03-01 00:00:00

  • Regression analysis of case K interval-censored failure time data in the presence of informative censoring.

    abstract::Interval-censored failure time data occur in many fields such as demography, economics, medical research, and reliability and many inference procedures on them have been developed (Sun, 2006; Chen, Sun, and Peace, 2012). However, most of the existing approaches assume that the mechanism that yields interval censoring ...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/biom.12527

    authors: Wang P,Zhao H,Sun J

    update_date:2016-12-01 00:00:00

  • Ultra high-dimensional semiparametric longitudinal data analysis.

    abstract::As ultra high-dimensional longitudinal data are becoming ever more apparent in fields such as public health and bioinformatics, developing flexible methods with a sparse model is of high interest. In this setting, the dimension of the covariates can potentially grow exponentially as ...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/biom.13348

    authors: Green B,Lian H,Yu Y,Zu T

    update_date:2020-08-04 00:00:00

  • Exact two-sample inference with missing data.

    abstract::When comparing follow-up measurements from two independent populations, missing records may arise due to censoring by events whose occurrence is associated with baseline covariates. In these situations, inferences based only on the completely followed observations may be biased if the follow-up measurements and the co...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/j.1541-0420.2005.00332.x

    authors: Cheung YK

    update_date:2005-06-01 00:00:00

  • Partially supervised learning using an EM-boosting algorithm.

    abstract::Training data in a supervised learning problem consist of the class label and its potential predictors for a set of observations. Constructing effective classifiers from training data is the goal of supervised learning. In biomedical sciences and other scientific applications, class labels may be subject to errors. We...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/j.0006-341X.2004.00156.x

    authors: Yasui Y,Pepe M,Hsu L,Adam BL,Feng Z

    update_date:2004-03-01 00:00:00

  • Estimation and interpretation of heterogeneous vaccine efficacy against recurrent infections.

    abstract::Vaccine-induced protection may not be homogeneous across individuals. It is possible that a vaccine gives complete protection for a portion of individuals, while the rest acquire only incomplete (leaky) protection of varying magnitude. If vaccine efficacy is estimated under wrong assumptions about such individual leve...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/biom.12473

    authors: Mehtälä J,Dagan R,Auranen K

    update_date:2016-09-01 00:00:00

  • Correcting for the effect of misclassification bias in a case-control study using data from two different questionnaires.

    abstract::In an epidemiological study of risk factors in breast cancer, data are available on confirmed cases from a diagnostic clinic and on controls from a screening clinic that sampled the general population. Relative risk estimation is complicated by differences in the interviewing environment and in the wording and order o...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:

    authors: Elton RA,Duffy SW

    update_date:1983-09-01 00:00:00

  • Estimating the size of closed populations using inverse multiple-recapture sampling.

    abstract::A log-linear model for estimating the size of a closed population is defined for inverse multiple-recapture sampling with dependent samples. Efficient estimators of the log-linear model parameters and the population size are obtained by the method of minimum chi-square. A chi-square test of the general linear hypothes...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:

    authors: Bonett DG,Woodward JA,Bentler PM

    update_date:1987-12-01 00:00:00

  • Robustness of group testing in the estimation of proportions.

    abstract::In binomial group testing, unlike one-at-a-time testing, the test unit consists of a group of individuals, and each group is declared to be defective or nondefective. A defective group is one that is presumed to include one or more defective (e.g., infected, positive) individuals and a nondefective group to contain on...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/j.0006-341x.1999.00231.x

    authors: Hung M,Swallow WH

    update_date:1999-03-01 00:00:00

  • On the treatment of grouped observations in life studies.

    abstract::Assuming a model of proportional failure rates, Cox (1972) presents a systematic study of the use of covariates in the analysis of life time. The treatment of tied observations is a particularly troublesome point in both theory and application. It appears that grouping rather than discrete time is the right way to han...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:

    authors: Thompson WA Jr

    update_date:1977-09-01 00:00:00

  • A new exact and more powerful unconditional test of no treatment effect from binary matched pairs.

    abstract::We consider the problem of testing for a difference in the probability of success from matched binary pairs. Starting with three standard inexact tests, the nuisance parameter is first estimated and then the residual dependence is eliminated by maximization, producing what I call an E+M P-value. The E+M P-value based ...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/j.1541-0420.2007.00936.x

    authors: Lloyd CJ

    update_date:2008-09-01 00:00:00

  • Coregionalized single- and multiresolution spatially varying growth curve modeling with application to weed growth.

    abstract::Modeling of longitudinal data from agricultural experiments using growth curves helps understand conditions conducive or unconducive to crop growth. Recent advances in Geographical Information Systems (GIS) now allow geocoding of agricultural data that help understand spatial patterns. A particularly common problem is...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/j.1541-0420.2006.00535.x

    authors: Banerjee S,Johnson GA

    update_date:2006-09-01 00:00:00

  • Cox regression model with doubly truncated data.

    abstract::Truncation is a well-known phenomenon that may be present in observational studies of time-to-event data. While many methods exist to adjust for either left or right truncation, there are very few methods that adjust for simultaneous left and right truncation, also known as double truncation. We propose a Cox regressi...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/biom.12809

    authors: Rennert L,Xie SX

    update_date:2018-06-01 00:00:00

  • Bayesian model-averaged benchmark dose analysis via reparameterized quantal-response models.

    abstract::An important objective in biomedical and environmental risk assessment is estimation of minimum exposure levels that induce a pre-specified adverse response in a target population. The exposure points in such settings are typically referred to as benchmark doses (BMDs). Parametric Bayesian estimation for finding BMDs ...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/biom.12340

    authors: Fang Q,Piegorsch WW,Simmons SJ,Li X,Chen C,Wang Y

    update_date:2015-12-01 00:00:00

  • Randomization inference with general interference and censoring.

    abstract::Interference occurs between individuals when the treatment (or exposure) of one individual affects the outcome of another individual. Previous work on causal inference methods in the presence of interference has focused on the setting where it is a priori assumed that there is "partial interference," in the sense that...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/biom.13125

    authors: Loh WW,Hudgens MG,Clemens JD,Ali M,Emch ME

    update_date:2020-03-01 00:00:00