Advanced colorectal neoplasia risk stratification by penalized logistic regression.

Abstract:

Colorectal cancer is the second leading cause of death from cancer in the United States. To improve the efficiency of colorectal cancer screening, there is a need to stratify risk for colorectal cancer among the 90% of US residents who are considered "average risk." In this article, we investigate such risk stratification rules for advanced colorectal neoplasia (colorectal cancer and advanced, precancerous polyps). We use a recently completed large cohort study of subjects who underwent a first screening colonoscopy. Logistic regression models have been used in the literature to estimate the risk of advanced colorectal neoplasia based on quantifiable risk factors. However, logistic regression may be prone to overfitting and instability in variable selection. Since most of the risk factors in our study have several categories, it was tempting to collapse these categories into fewer risk groups. We propose a penalized logistic regression method that automatically and simultaneously selects variables, groups categories, and estimates their coefficients by penalizing the $L_1$-norm of both the coefficients and their differences. Hence, it encourages sparsity in the categories, i.e. grouping of the categories, and sparsity in the variables, i.e. variable selection. We apply the penalized logistic regression method to our data. The important variables are selected, with close categories simultaneously grouped, by penalized regression models with and without the interaction terms. The models are validated with 10-fold cross-validation. The receiver operating characteristic curves of the penalized regression models dominate the receiver operating characteristic curve of naive logistic regressions, indicating a superior discriminative performance.
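The penalty described in the abstract can be illustrated with a minimal sketch of the objective function. This is not the authors' implementation: the function name `penalized_nll`, the tuning parameters `lam1` and `lam2`, the `groups` argument, and the choice to penalize all pairwise differences between dummy-coded categories of the same variable are assumptions made here for illustration. The abstract states only that a sparsity-inducing norm is applied to both the coefficients and their differences.

```python
import numpy as np
from itertools import combinations


def penalized_nll(beta0, beta, X, y, groups, lam1, lam2):
    """Logistic negative log-likelihood plus two absolute-value penalties.

    groups: list of index lists; each inner list holds the columns of X that
            are dummy-coded categories of one risk factor (an assumption of
            this sketch, not a detail taken from the paper).
    lam1:   weight on sum |beta_j|, which pushes coefficients to zero
            (variable selection).
    lam2:   weight on sum |beta_j - beta_k| within a variable, which pushes
            categories toward a shared coefficient (category grouping).
    """
    eta = beta0 + X @ beta
    # Bernoulli negative log-likelihood; logaddexp(0, eta) = log(1 + exp(eta))
    nll = np.sum(np.logaddexp(0.0, eta) - y * eta)
    sparsity = np.sum(np.abs(beta))
    fusion = sum(
        abs(beta[j] - beta[k])
        for idx in groups
        for j, k in combinations(idx, 2)
    )
    return nll + lam1 * sparsity + lam2 * fusion
```

When two categories of the same variable have equal coefficients, their difference term is zero, which is why minimizing this kind of objective tends to merge close categories. Minimizing it requires a solver that handles non-smooth penalties; the sketch only evaluates the objective.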

journal_name

Stat Methods Med Res

authors

Lin Y,Yu M,Wang S,Chappell R,Imperiale TF

doi

10.1177/0962280213497432

subject

Has Abstract

pub_date

2016-08-01 00:00:00

pages

1677-91

issue

4

eissn

1477-0334

issn

0962-2802

pii

0962280213497432

journal_volume

25

pub_type

Journal Article
  • Controlling for localised spatio-temporal autocorrelation in long-term air pollution and health studies.

    abstract::Estimating the long-term health impact of air pollution using an ecological spatio-temporal study design is a challenging task, due to the presence of residual spatio-temporal autocorrelation in the health counts after adjusting for the covariate effects. This autocorrelation is commonly modelled by a set of random ef...

    journal_title:Statistical methods in medical research

    pub_type: Journal Article

    doi:10.1177/0962280214527384

    authors: Lee D,Mitchell R

    updated:2014-12-01 00:00:00

  • The EM algorithm in medical imaging.

    abstract::This article outlines the statistical developments that have taken place in the use of the EM algorithm in emission and transmission tomography during the past decade or so. We discuss the statistical aspects of the modelling of the projection data for both the emission and transmission cases and define the relevant p...

    journal_title:Statistical methods in medical research

    pub_type: Journal Article

    doi:10.1177/096228029700600105

    authors: Kay J

    updated:1997-03-01 00:00:00

  • Relative efficiency of unequal cluster sizes for variance component estimation in cluster randomized and multicentre trials.

    abstract::Cluster randomized and multicentre trials evaluate the effect of a treatment on persons nested within clusters, for instance patients within clinics or pupils within schools. Although equal sample sizes per cluster are generally optimal for parameter estimation, they are rarely feasible. This paper addresses the relat...

    journal_title:Statistical methods in medical research

    pub_type: Journal Article

    doi:10.1177/0962280206079018

    authors: van Breukelen GJ,Candel MJ,Berger MP

    updated:2008-08-01 00:00:00

  • Prediction intervals with random forests.

    abstract::The classical and most commonly used approach to building prediction intervals is the parametric approach. However, its main drawback is that its validity and performance highly depend on the assumed functional link between the covariates and the response. This research investigates new methods that improve the perfor...

    journal_title:Statistical methods in medical research

    pub_type: Journal Article

    doi:10.1177/0962280219829885

    authors: Roy MH,Larocque D

    updated:2020-01-01 00:00:00

  • Shared parameter models for joint analysis of longitudinal and survival data with left truncation due to delayed entry - Applications to cystic fibrosis.

    abstract::Many longitudinal studies observe time to occurrence of a clinical event such as death, while also collecting serial measurements of one or more biomarkers that are predictive of the event, or are surrogate outcomes of interest. Joint modeling can be used to examine the relationship between the biomarker and the event...

    journal_title:Statistical methods in medical research

    pub_type: Journal Article

    doi:10.1177/0962280218764193

    authors: Schluchter MD,Piccorelli AV

    updated:2019-05-01 00:00:00

  • Underestimation of treatment effects in sequentially monitored clinical trials that did not stop early for benefit.

    abstract::In recent years, there has been a prominent discussion in the literature about the potential for overestimation of the treatment effect when a clinical trial stops at an interim analysis due to the experimental treatment showing a benefit over the control. However, there has been much less attention paid to the conver...

    journal_title:Statistical methods in medical research

    pub_type: Journal Article

    doi:10.1177/0962280218795320

    authors: Marschner IC,Schou IM

    updated:2019-10-01 00:00:00

  • Estimating the dependence of mixed sensitive response types in randomized response technique.

    abstract::Sensitive questions are often involved in healthcare or medical survey research. Much empirical evidence has shown that the randomized response technique is useful for the collection of truthful responses. However, few studies have discussed methods to estimate the dependence of sensitive responses of multiple types. ...

    journal_title:Statistical methods in medical research

    pub_type: Journal Article

    doi:10.1177/0962280219847492

    authors: Chu AM,So MK,Chan TW,Tiwari A

    updated:2020-03-01 00:00:00

  • Promoting structural effects of covariates in the cure rate model with penalization.

    abstract::Cure rate models have been widely adopted for characterizing survival data that have long-term survivors. Under a mixture cure rate model where the population is a mixture of cured and susceptible subjects, a primary goal is to study covariate effects on the cure probability and survival function of the susceptible su...

    journal_title:Statistical methods in medical research

    pub_type: Journal Article

    doi:10.1177/0962280217708684

    authors: Fan X,Liu M,Fang K,Huang Y,Ma S

    updated:2017-10-01 00:00:00

  • Multi-state Markov models in cancer screening evaluation: a brief review and case study.

    abstract::This work presents a brief overview of Markov models in cancer screening evaluation and focuses on two specific models. A three-state model was first proposed to estimate jointly the sensitivity of the screening procedure and the average duration in the preclinical phase, i.e. the period when the cancer is asymptomati...

    journal_title:Statistical methods in medical research

    pub_type: Journal Article, Review

    doi:10.1177/0962280209359848

    authors: Uhry Z,Hédelin G,Colonna M,Asselain B,Arveux P,Rogel A,Exbrayat C,Guldenfels C,Courtial I,Soler-Michel P,Molinié F,Eilstein D,Duffy SW

    updated:2010-10-01 00:00:00

  • Reference-based pattern-mixture models for analysis of longitudinal binary data.

    abstract::Pattern-mixture model (PMM)-based controlled imputations have become a popular tool to assess the sensitivity of primary analysis inference to different post-dropout assumptions or to estimate treatment effectiveness. The methodology is well established for continuous responses but less well established for binary res...

    journal_title:Statistical methods in medical research

    pub_type: Journal Article

    doi:10.1177/0962280220941880

    authors: Lu K

    updated:2020-12-01 00:00:00

  • Hierarchical mixture models for longitudinal immunologic data with heterogeneity, non-normality, and missingness.

    abstract::It is a common practice to analyze longitudinal data frequently arising in medical studies using various mixed-effects models in the literature. However, the following issues may stand out in longitudinal data analysis: (i) In clinical practice, the profile of each subject's response from a longitudinal study may follow...

    journal_title:Statistical methods in medical research

    pub_type: Journal Article

    doi:10.1177/0962280214544207

    authors: Huang Y,Chen J,Yin P

    updated:2017-02-01 00:00:00

  • Inferential tools in penalized logistic regression for small and sparse data: A comparative study.

    abstract::This paper focuses on inferential tools in the logistic regression model fitted by the Firth penalized likelihood. In this context, the Likelihood Ratio statistic is often reported to be the preferred choice as compared to the 'traditional' Wald statistic. In this work, we consider and discuss a wider range of test st...

    journal_title:Statistical methods in medical research

    pub_type: Journal Article

    doi:10.1177/0962280216661213

    authors: Siino M,Fasola S,Muggeo VM

    updated:2018-05-01 00:00:00

  • A transformation class for spatio-temporal survival data with a cure fraction.

    abstract::We propose a hierarchical Bayesian methodology to model spatially or spatio-temporal clustered survival data with possibility of cure. A flexible continuous transformation class of survival curves indexed by a single parameter is used. This transformation model is a larger class of models containing two special cases ...

    journal_title:Statistical methods in medical research

    pub_type: Journal Article

    doi:10.1177/0962280212445658

    authors: Hurtado Rúa SM,Dey DK

    updated:2016-02-01 00:00:00

  • Measuring agreement in method comparison studies.

    abstract::Agreement between two methods of clinical measurement can be quantified using the differences between observations made using the two methods on the same subjects. The 95% limits of agreement, estimated by mean difference +/- 1.96 standard deviation of the differences, provide an interval within which 95% of differenc...

    journal_title:Statistical methods in medical research

    pub_type: Journal Article, Review

    doi:10.1177/096228029900800204

    authors: Bland JM,Altman DG

    updated:1999-06-01 00:00:00

  • Analysis of phase II methodologies for single-arm clinical trials with multiple endpoints in rare cancers: An example in Ewing's sarcoma.

    abstract::Trials run in either rare diseases, such as rare cancers, or rare sub-populations of common diseases are challenging in terms of identifying, recruiting and treating sufficient patients in a sensible period. Treatments for rare diseases are often designed for other disease areas and then later proposed as possible tre...

    journal_title:Statistical methods in medical research

    pub_type: Journal Article

    doi:10.1177/0962280216662070

    authors: Dutton P,Love SB,Billingham L,Hassan AB

    updated:2018-05-01 00:00:00

  • Bayesian sample size calculation for estimation of the difference between two binomial proportions.

    abstract::In this study, we discuss a decision theoretic or fully Bayesian approach to the sample size question in clinical trials with binary responses. Data are assumed to come from two binomial distributions. A Dirichlet distribution is assumed to describe prior knowledge of the two success probabilities p1 and p2. The param...

    journal_title:Statistical methods in medical research

    pub_type: Journal Article

    doi:10.1177/0962280211399562

    authors: Pezeshk H,Nematollahi N,Maroufy V,Marriott P,Gittins J

    updated:2013-12-01 00:00:00

  • The application of multidimensional scaling methods to epidemiological data.

    abstract::This paper illustrates the use of multidimensional scaling methods (MDS) to examine space-time patterns in epidemic data. The paper begins by outlining the principles of MDS. The model is then formally specified and illustrated by application to two data sets. The first is partly a tutorial example. It uses monthly re...

    journal_title:Statistical methods in medical research

    pub_type: Journal Article

    doi:10.1177/096228029500400202

    authors: Cliff AD,Haggett P,Smallman-Raynor MR,Stroup DF,Williamson GD

    updated:1995-06-01 00:00:00

  • A frequentist approach to estimating the force of infection for a respiratory disease using repeated measurement data from a birth cohort.

    abstract::This article aims to develop a probability-based model involving the use of direct likelihood formulation and generalised linear modelling (GLM) approaches useful in estimating important disease parameters from longitudinal or repeated measurement data. The current application is based on infection with respiratory sy...

    journal_title:Statistical methods in medical research

    pub_type: Journal Article

    doi:10.1177/0962280210385749

    authors: Mwambi H,Ramroop S,White LJ,Okiro E,Nokes DJ,Shkedy Z,Molenberghs G

    updated:2011-10-01 00:00:00

  • A robust imputation method for missing responses and covariates in sample selection models.

    abstract::Sample selection arises when the outcome of interest is partially observed in a study. Although sophisticated statistical methods in the parametric and non-parametric framework have been proposed to solve this problem, it is yet unclear how to deal with selectively missing covariate data using simple multiple imputati...

    journal_title:Statistical methods in medical research

    pub_type: Journal Article

    doi:10.1177/0962280217715663

    authors: Ogundimu EO,Collins GS

    updated:2019-01-01 00:00:00

  • Survival forests for data with dependent censoring.

    abstract::Tree-based methods are very powerful and popular tools for analysing survival data with right-censoring. The existing methods assume that the true time-to-event and the censoring times are independent given the covariates. We propose different ways to build survival forests when dependent censoring is suspected, by us...

    journal_title:Statistical methods in medical research

    pub_type: Journal Article

    doi:10.1177/0962280217727314

    authors: Moradian H,Larocque D,Bellavance F

    updated:2019-02-01 00:00:00

  • Model selection in multivariate semiparametric regression.

    abstract::Variable selection in semiparametric mixed models for longitudinal data remains a challenge, especially in the presence of multiple correlated outcomes. In this paper, we propose a model selection procedure that simultaneously selects fixed and random effects using a maximum penalized likelihood method with the adapti...

    journal_title:Statistical methods in medical research

    pub_type: Journal Article

    doi:10.1177/0962280217690769

    authors: Li Z,Liu H,Tu W

    updated:2018-10-01 00:00:00

  • Stratified and randomized play-the-winner rule.

    abstract::In this paper, a new allocation rule for treatment assignments in sequential clinical trials is proposed. The stratified and randomized play-the-winner rule (SRPWR) is an extension of the randomized play-the-winner rule to more than two treatments. It is applicable to cases where the probabilities of success of a trea...

    journal_title:Statistical methods in medical research

    pub_type: Journal Article

    doi:10.1177/0962280207081606

    authors: Liang Y,Carriere KC

    updated:2008-12-01 00:00:00

  • The application of methods to quantify attributable risk in medical practice.

    abstract::Several epidemiological parameters have been introduced for quantifying the population impact of a certain exposure on morbidity on a population level, termed 'attributable risk' (AR). Of these definitions, the AR as suggested by Levin in 1953 or some algebraic transformations of it are most commonly used. A structure...

    journal_title:Statistical methods in medical research

    pub_type: Journal Article

    doi:10.1177/096228020101000305

    authors: Uter W,Pfahlberg A

    updated:2001-06-01 00:00:00

  • Estimation of regression quantiles in complex surveys with data missing at random: An application to birthweight determinants.

    abstract::The estimation of population parameters using complex survey data requires careful statistical modelling to account for the design features. This is further complicated by unit and item nonresponse for which a number of methods have been developed in order to reduce estimation bias. In this paper, we address some issu...

    journal_title:Statistical methods in medical research

    pub_type: Journal Article

    doi:10.1177/0962280213484401

    authors: Geraci M

    updated:2016-08-01 00:00:00

  • Estimation of half-life periods in nonlinear data with fractional polynomials.

    abstract::Regression models are frequently used to model the functional relationship between an interesting outcome parameter and one or more potentially relevant explanatory variables. Objectives can be to set up as a prognostic model, for example, or an estimation model for a certain parameter of interest. Determining half-li...

    journal_title:Statistical methods in medical research

    pub_type: Journal Article

    doi:10.1177/0962280213502403

    authors: Mayer B,Keller F,Syrovets T,Wittau M

    updated:2016-10-01 00:00:00

  • Statistical challenges in assessing potential efficacy of complex interventions in pilot or feasibility studies.

    abstract::Early phase trials of complex interventions currently focus on assessing the feasibility of a large randomised control trial and on conducting pilot work. Assessing the efficacy of the proposed intervention is generally discouraged, due to concerns of underpowered hypothesis testing. In contrast, early assessment of e...

    journal_title:Statistical methods in medical research

    pub_type: Journal Article

    doi:10.1177/0962280215589507

    authors: Wilson DT,Walwyn RE,Brown J,Farrin AJ,Brown SR

    updated:2016-06-01 00:00:00

  • Forensic inference from genetic markers.

    abstract::This review provides an overview of forensic inference from genetic markers. Because the judge and jurors are charged with decision-making, the forensic expert's job is to provide a useful summary of the evidence to the court. Hence, this review focuses on the likelihood ratio as a means of summarizing the genetic dat...

    journal_title:Statistical methods in medical research

    pub_type: Journal Article, Review

    doi:10.1177/096228029300200304

    authors: Devlin B

    updated:1993-01-01 00:00:00

  • Estimating the average treatment effects of nutritional label use using subclassification with regression adjustment.

    abstract::Propensity score methods are common for estimating a binary treatment effect when treatment assignment is not randomized. When exposure is measured on an ordinal scale (i.e. low-medium-high), however, propensity score inference requires extensions which have received limited attention. Estimands of possible interest w...

    journal_title:Statistical methods in medical research

    pub_type: Journal Article

    doi:10.1177/0962280214560046

    authors: Lopez MJ,Gutman R

    updated:2017-04-01 00:00:00

  • Separating variability in healthcare practice patterns from random error.

    abstract::Improving the quality of care that patients receive is a major focus of clinical research, particularly in the setting of cardiovascular hospitalization. Quality improvement studies seek to estimate and visualize the degree of variability in dichotomous treatment patterns and outcomes across different providers, where...

    journal_title:Statistical methods in medical research

    pub_type: Journal Article

    doi:10.1177/0962280217754230

    authors: Thomas LE,Schulte PJ

    updated:2019-04-01 00:00:00

  • Continuous(ly) missing outcome data in network meta-analysis: A one-stage pattern-mixture model approach.

    abstract::Appropriate handling of aggregate missing outcome data is necessary to minimise bias in the conclusions of systematic reviews. The two-stage pattern-mixture model has been already proposed to address aggregate missing continuous outcome data. While this approach is more proper compared with the exclusion of missing co...

    journal_title:Statistical methods in medical research

    pub_type: Journal Article

    doi:10.1177/0962280220983544

    authors: Spineli LM,Kalyvas C,Papadimitropoulou K

    updated:2021-01-06 00:00:00