Abstract:
We propose a cross-validated area under the receiver operating characteristic (ROC) curve (CV-AUC) criterion for selecting the tuning parameter of penalized methods in sparse, high-dimensional logistic regression models. We use this criterion in combination with the minimax concave penalty (MCP) method for variable selection. The CV-AUC criterion is specifically designed to optimize classification performance for binary outcome data. To implement the proposed approach, we derive an efficient coordinate descent algorithm to compute the MCP-logistic regression solution surface. Simulation studies evaluate the finite-sample performance of the proposed method and compare it with existing criteria, including the Akaike information criterion (AIC), the Bayesian information criterion (BIC) and the extended BIC (EBIC). The model selected by the CV-AUC criterion tends to have a larger predictive AUC and a smaller classification error than models whose tuning parameters are selected using the AIC, BIC or EBIC. We illustrate the application of MCP-logistic regression with the CV-AUC criterion on three microarray datasets from studies that attempt to identify genes related to cancers. Our simulation studies and data examples demonstrate that the CV-AUC is an attractive method for tuning parameter selection for penalized methods in high-dimensional logistic regression models.
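The selection rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `fit` argument is a hypothetical callable standing in for an MCP-logistic solver, the fold construction and the choice to average per-fold AUCs are assumptions (the abstract does not say how the folds are combined), and all function names are invented for illustration.

```python
import random

def auc(scores, labels):
    """Empirical AUC: share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def stratified_folds(labels, k, seed=0):
    """Split indices into k folds, spreading each class evenly (assumption)."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds

def cv_auc(fit, X, y, lam, k=5, seed=0):
    """Average held-out AUC over k folds for one tuning parameter value.

    `fit(X_train, y_train, lam)` is a hypothetical solver interface: it must
    return a function mapping one feature vector to a risk score.
    """
    aucs = []
    for test_idx in stratified_folds(y, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        predict = fit([X[i] for i in train_idx], [y[i] for i in train_idx], lam)
        scores = [predict(X[i]) for i in test_idx]
        aucs.append(auc(scores, [y[i] for i in test_idx]))
    return sum(aucs) / len(aucs)

def select_lambda(fit, X, y, lambdas, k=5, seed=0):
    """Pick the candidate tuning parameter with the largest CV-AUC."""
    return max(lambdas, key=lambda lam: cv_auc(fit, X, y, lam, k, seed))
```

Any penalized logistic solver can be plugged in through `fit`, as long as it returns a scoring function. Each class needs at least `k` observations so that every fold contains both outcomes.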
journal_name: Stat Methods Med Res
journal_title: Statistical methods in medical research
authors: Jiang D, Huang J, Zhang Y
doi: 10.1177/0962280211428385
subject: Has Abstract
pub_date: 2013-10-01 00:00:00
pages: 505-18
issue: 5
eissn: 1477-0334
issn: 0962-2802
pii: 0962280211428385
journal_volume: 22
pub_type: journal article
abstract::The change-point model has drawn much attention over the past few decades. It can accommodate the jump process, which allows for changes of the effects before and after the change point. Intellectual disability is a long-term disability that impacts performance in cognitive aspects of life and usually has its onset pr...
journal_title:Statistical methods in medical research
pub_type: journal article
doi:10.1177/0962280212463415
update_date:2016-04-01 00:00:00
abstract::This paper presents and compares several methods of measuring continuous baseline covariate imbalance in clinical trial data. Simulations illustrate that though the t-test is an inappropriate method of assessing continuous baseline covariate imbalance, the test statistic itself is a robust measure in capturing imbalan...
journal_title:Statistical methods in medical research
pub_type: journal article
doi:10.1177/0962280211416038
update_date:2015-04-01 00:00:00
abstract::This paper reviews models for incomplete continuous and categorical longitudinal data. In terms of Rubin's classification of missing value processes we are specifically concerned with the problem of nonrandom missingness. A distinction is drawn between the classes of selection and pattern-mixture models and, using sev...
journal_title:Statistical methods in medical research
pub_type: journal article, review
doi:10.1177/096228029900800105
update_date:1999-03-01 00:00:00
abstract::We review the origins of backcalculation (or back projection) methods developed for the analysis of AIDS (acquired immunodeficiency syndrome) incidence data. These techniques have been used extensively for >15 years to deconvolute clinical case incidence, given knowledge of the incubation period distribution, to obtai...
journal_title:Statistical methods in medical research
pub_type: journal article, review
doi:10.1191/0962280203sm337ra
update_date:2003-06-01 00:00:00
abstract::We propose a novel likelihood method for analyzing time-to-event data when multiple events and multiple missing data intervals are possible prior to the first observed event for a given subject. This research is motivated by data obtained from a heart monitor used to track the recovery process of subjects experiencing...
journal_title:Statistical methods in medical research
pub_type: journal article
doi:10.1177/0962280212466089
update_date:2016-04-01 00:00:00
abstract::Identification of cancer patient subgroups using high throughput genomic data is of critical importance to clinicians and scientists because it can offer opportunities for more personalized treatment and overlapping treatments of cancers. In spite of tremendous efforts, this problem still remains challenging because o...
journal_title:Statistical methods in medical research
pub_type: journal article
doi:10.1177/0962280217752980
update_date:2019-07-01 00:00:00
abstract::Trials run in either rare diseases, such as rare cancers, or rare sub-populations of common diseases are challenging in terms of identifying, recruiting and treating sufficient patients in a sensible period. Treatments for rare diseases are often designed for other disease areas and then later proposed as possible tre...
journal_title:Statistical methods in medical research
pub_type: journal article
doi:10.1177/0962280216662070
update_date:2018-05-01 00:00:00
abstract::Clinical trials are expensive and time-consuming and so should also be used to study how treatments work, allowing for the evaluation of theoretical treatment models and refinement and improvement of treatments. These treatment processes can be studied using mediation analysis. Randomised treatment makes some of the a...
journal_title:Statistical methods in medical research
pub_type: journal article
doi:10.1177/0962280216666111
update_date:2018-06-01 00:00:00
abstract::Conditional two-part random-effects models have been proposed for the analysis of healthcare cost panel data that contain both zero costs from the non-users of healthcare facilities and positive costs from the users. These models have been extended to accommodate more flexible data structures when using the generalize...
journal_title:Statistical methods in medical research
pub_type: journal article
doi:10.1177/0962280217690770
update_date:2018-10-01 00:00:00
abstract::Sample size calculations are needed to design and assess the feasibility of case-control studies. Although such calculations are readily available for simple case-control designs and univariate analyses, there is limited theory and software for multivariate unconditional logistic analysis of case-control data. Here we...
journal_title:Statistical methods in medical research
pub_type: journal article
doi:10.1177/0962280217737157
update_date:2019-03-01 00:00:00
abstract::This paper reviews the application of statistical models to outbreaks of two common respiratory viral diseases, measles and influenza. For each disease, we look first at its epidemiological characteristics and assess the extent to which these either aid or hinder modelling. We then turn to the models that have been de...
journal_title:Statistical methods in medical research
pub_type: journal article, review
doi:10.1177/096228029300200104
update_date:1993-01-01 00:00:00
abstract::This paper illustrates the use of multidimensional scaling methods (MDS) to examine space-time patterns in epidemic data. The paper begins by outlining the principles of MDS. The model is then formally specified and illustrated by application to two data sets. The first is partly a tutorial example. It uses monthly re...
journal_title:Statistical methods in medical research
pub_type: journal article
doi:10.1177/096228029500400202
update_date:1995-06-01 00:00:00
abstract::Estimation of the effect of a binary exposure on an outcome in the presence of confounding is often carried out via outcome regression modelling. An alternative approach is to use propensity score methodology. The propensity score is the conditional probability of receiving the exposure given the observed covariates a...
journal_title:Statistical methods in medical research
pub_type: journal article
doi:10.1177/0962280210394483
update_date:2012-06-01 00:00:00
abstract::The three-class Youden index serves both as a measure of medical test accuracy and a criterion to choose the optimal pair of cutoff values for classifying subjects into three ordinal disease categories (e.g. no disease, mild disease, advanced disease). We present a Bayesian nonparametric approach for estimating the th...
journal_title:Statistical methods in medical research
pub_type: journal article
doi:10.1177/0962280217742538
update_date:2018-03-01 00:00:00
abstract::The accuracy of a diagnostic test, which is often quantified by a pair of measures such as sensitivity and specificity, is critical for medical decision making. Separate studies of an investigational diagnostic test can be combined through meta-analysis; however, such an analysis can be threatened by publication bias....
journal_title:Statistical methods in medical research
pub_type: journal article
doi:10.1177/0962280218791602
update_date:2019-10-01 00:00:00
abstract::The maximal procedure is a restricted randomization method that maximizes the number of feasible allocation sequences under the constraints of the maximum tolerated imbalance and the allocation sequence length. It assigns an equal probability to all feasible sequences. However, its implementation is not easy due to th...
journal_title:Statistical methods in medical research
pub_type: journal article
doi:10.1177/0962280216677107
update_date:2018-07-01 00:00:00
abstract::Aim: To present a flexible model for repeated measures longitudinal growth data within individuals that allows trends over time to incorporate individual-specific random effects. These may reflect the timing of growth events and characterise within-individual variability which can be modelled as a function of age. Subj...
journal_title:Statistical methods in medical research
pub_type: journal article
doi:10.1177/0962280217706728
update_date:2018-11-01 00:00:00
abstract::Couples with diseases associated with the sexual chromosomes, as well as families in countries where the desire for a male is extreme, are interested in influencing the sex of the baby. We propose an original composite likelihood approach to analyse the relation between sex of the newborn and timing of the intercourse...
journal_title:Statistical methods in medical research
pub_type: journal article
doi:10.1177/0962280217702415
update_date:2018-11-01 00:00:00
abstract::Binary logistic regression is one of the most frequently applied statistical approaches for developing clinical prediction models. Developers of such models often rely on an Events Per Variable criterion (EPV), notably EPV ≥10, to determine the minimal sample size required and the maximum number of candidate predictor...
journal_title:Statistical methods in medical research
pub_type: journal article
doi:10.1177/0962280218784726
update_date:2019-08-01 00:00:00
abstract::In many experiments and especially in translational and preclinical research, sample sizes are (very) small. In addition, data designs are often high dimensional, i.e. more dependent than independent replications of the trial are observed. The present paper discusses the applicability of max t-test-type statistics (mu...
journal_title:Statistical methods in medical research
pub_type: journal article
doi:10.1177/0962280220970228
update_date:2020-11-24 00:00:00
abstract::The mixed effects model for repeated measures has been widely used for the analysis of longitudinal clinical data collected at a number of fixed time points. We propose a robust extension of the mixed effects model for repeated measures for skewed and heavy-tailed data on the basis of the multivariate skew-t distribution,...
journal_title:Statistical methods in medical research
pub_type: journal article
doi:10.1177/0962280219865579
update_date:2020-06-01 00:00:00
abstract::The analysis of walking behavior in a physical activity intervention is considered. A Bayesian latent structure modeling approach is proposed whereby the ability and willingness of participants is modeled via latent effects. The dropout process is jointly modeled via a linked survival model. Computational issues are a...
journal_title:Statistical methods in medical research
pub_type: journal article
doi:10.1177/0962280214529932
update_date:2016-12-01 00:00:00
abstract::This article aims to develop a probability-based model involving the use of direct likelihood formulation and generalised linear modelling (GLM) approaches useful in estimating important disease parameters from longitudinal or repeated measurement data. The current application is based on infection with respiratory sy...
journal_title:Statistical methods in medical research
pub_type: journal article
doi:10.1177/0962280210385749
update_date:2011-10-01 00:00:00
abstract::Researchers and clinicians often need to know whether a new method of measurement is equivalent to an established one that is already in use. For this problem, the estimation of limits of agreement advocated by Bland and Altman is a widely used solution. However, this approach ignores two vital issues in method compar...
journal_title:Statistical methods in medical research
pub_type: journal article
doi:10.1177/0962280210379365
update_date:2012-08-01 00:00:00
abstract::In this paper, a new allocation rule for treatment assignments in sequential clinical trials is proposed. The stratified and randomized play-the-winner rule (SRPWR) is an extension of the randomized play-the-winner rule to more than two treatments. It is applicable to cases where the probabilities of success of a trea...
journal_title:Statistical methods in medical research
pub_type: journal article
doi:10.1177/0962280207081606
update_date:2008-12-01 00:00:00
abstract::One of the main advantages of Bayesian analyses of clinical trials is their ability to formally incorporate skepticism about large treatment effects through the use of informative priors. We conducted a simulation study to assess the performance of informative normal, Student-t, and beta distributions in estimating r...
journal_title:Statistical methods in medical research
pub_type: journal article
doi:10.1177/0962280215620828
update_date:2018-01-01 00:00:00
abstract::Monte Carlo evaluation of resampling-based tests is often conducted in statistical analysis. However, this procedure is generally computationally intensive. The pooling resampling-based method has been developed to reduce the computational burden but the validity of the method has not been studied before. In this arti...
journal_title:Statistical methods in medical research
pub_type: journal article
doi:10.1177/0962280216661876
update_date:2018-05-01 00:00:00
abstract::Comparison of sequences that have descended from a common ancestor based on an explicit stochastic model of substitutions, insertions and deletions has risen to prominence in the last decade. Making statements about the positions of insertions-deletions (abbr. indels) is central in sequence and genome analysis and is ...
journal_title:Statistical methods in medical research
pub_type: journal article
doi:10.1177/0962280208099500
update_date:2009-10-01 00:00:00
abstract::Stochastic transmission dynamic models are needed to quantify the uncertainty in estimates and predictions during outbreaks of infectious diseases. We previously developed a calibration method for stochastic epidemic compartmental models, called Multiple Shooting for Stochastic Systems (MSS), and demonstrated its comp...
journal_title:Statistical methods in medical research
pub_type: journal article
doi:10.1177/0962280218805780
update_date:2019-12-01 00:00:00
abstract::Nonlinear mixed-effects modeling is a popular approach to describe the temporal trajectory of repeated measurements of clinical endpoints collected over time in clinical trials, to distinguish the within-subject and the between-subject variabilities, and to investigate clinically important risk factors (covariates) th...
journal_title:Statistical methods in medical research
pub_type: journal article
doi:10.1177/0962280218812595
update_date:2019-12-01 00:00:00