Investigating the prediction ability of survival models based on both clinical and omics data: two case studies.

Abstract:

In biomedical literature, numerous prediction models for clinical outcomes have been developed based either on clinical data or, more recently, on high-throughput molecular data (omics data). Prediction models based on both types of data, however, are less common, although some recent studies suggest that a suitable combination of clinical and molecular information may lead to models with better predictive abilities. This is probably due to the fact that it is not straightforward to combine data with different characteristics and dimensions (poorly characterized high-dimensional omics data, well-investigated low-dimensional clinical data). In this paper, we analyze two publicly available datasets related to breast cancer and neuroblastoma, respectively, in order to show some possible ways to combine clinical and omics data into a prediction model of time-to-event outcome. Different strategies and statistical methods are exploited. The results are compared and discussed according to different criteria, including the discriminative ability of the models, computed on a validation dataset.

journal_name

Stat Med

journal_title

Statistics in medicine

authors

De Bin R,Sauerbrei W,Boulesteix AL

doi

10.1002/sim.6246

subject

Has Abstract

pub_date

2014-12-30 00:00:00

pages

5310-29

issue

30

eissn

1097-0258

issn

0277-6715

journal_volume

33

pub_type

Journal Article
  • Analysis of the ratio of marginal probabilities in a matched-pair setting.

    abstract::Statistical methods for testing and interval estimation of the ratio of marginal probabilities in the matched-pair setting are considered in this paper. We are especially interested in the situation where the null value is not one, as in one-sided equivalence trials. We propose a Fieller-type statistic based on constr...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.1017

    authors: Nam JM,Blackwelder WC

    update_date:2002-03-15 00:00:00

  • Model selection techniques for the covariance matrix for incomplete longitudinal data.

    abstract::In longitudinal studies with incomplete data, where the number of time points can become numerous, it is often advantageous to model the covariance matrix. We describe several covariance models (for example, mixed models, compound symmetry, AR(1)-type models, and combination models) that offer parsimonious alternative...

    journal_title:Statistics in medicine

    pub_type: Journal Article, Review

    doi:10.1002/sim.4780141302

    authors: Grady JJ,Helms RW

    update_date:1995-07-15 00:00:00

  • Estimating the mean hazard ratio parameters for clustered survival data with random clusters.

    abstract::We consider a latent variable hazard model for clustered survival data where clusters are a random sample from an underlying population. We allow interactions between the random cluster effect and covariates. We use a maximum pseudo-likelihood estimator to estimate the mean hazard ratio parameters. We propose a bootst...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(sici)1097-0258(19970915)16:17<2009::aid-s

    authors: Cai J,Zhou H,Davis CE

    update_date:1997-09-15 00:00:00

  • Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond.

    abstract::Identification of key factors associated with the risk of developing cardiovascular disease and quantification of this risk using multivariable prediction algorithms are among the major advances made in preventive cardiology and cardiovascular epidemiology in the 20th century. The ongoing discovery of new risk markers...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.2929

    authors: Pencina MJ,D'Agostino RB Sr,D'Agostino RB Jr,Vasan RS

    update_date:2008-01-30 00:00:00

  • Estimating time-dependent ROC curves using data under prevalent sampling.

    abstract::Prevalent sampling is frequently a convenient and economical sampling technique for the collection of time-to-event data and thus is commonly used in studies of the natural history of a disease. However, it is biased by design because it tends to recruit individuals with longer survival times. This paper considers est...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.7184

    authors: Li S

    update_date:2017-04-15 00:00:00

  • Designs for phase I trials in ordered groups.

    abstract::We propose a new design for dose finding for cytotoxic agents in two ordered groups of patients. By ordered groups, we mean that prior to the study there is clinical information that would indicate that for a given dose one group would be more susceptible to toxicities than patients in the other group. The designs are...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.7133

    authors: Conaway MR,Wages NA

    update_date:2017-01-30 00:00:00

  • Combining mortality and longitudinal measures in clinical trials.

    abstract::Clinical trials often assess therapeutic benefit on the basis of an event such as death or the diagnosis of disease. Usually, there are several additional longitudinal measures of clinical status which are collected to be used in the treatment comparison. This paper proposes a simple non-parametric test which combines...

    journal_title:Statistics in medicine

    pub_type: Clinical Trial, Journal Article, Randomized Controlled Trial

    doi:10.1002/(sici)1097-0258(19990615)18:11<1341::aid-s

    authors: Finkelstein DM,Schoenfeld DA

    update_date:1999-06-15 00:00:00

  • Performance of analytical methods for overdispersed counts in cluster randomized trials: sample size, degree of clustering and imbalance.

    abstract::Many different methods have been proposed for the analysis of cluster randomized trials (CRTs) over the last 30 years. However, the evaluation of methods on overdispersed count data has been based mostly on the comparison of results using empiric data; i.e. when the true model parameters are not known. In this study, ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.3681

    authors: Durán Pacheco G,Hattendorf J,Colford JM Jr,Mäusezahl D,Smith T

    update_date:2009-10-30 00:00:00

  • Construction and validation of a prognostic model across several studies, with an application in superficial bladder cancer.

    abstract::Many models for clinical prediction (prognosis or diagnosis) are published in the medical literature every year but few such models find their way into clinical practice. The reason may be that since in most cases models have not been validated in independent data, they lack generality and/or credibility. In this pape...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.1691

    authors: Royston P,Parmar MK,Sylvester R

    update_date:2004-03-30 00:00:00

  • A method to test for a recent increase in HIV-1 seroconversion incidence: results from the Multicenter AIDS Cohort Study (MACS).

    abstract::We have formulated the problem of determining whether there has been an upturn in HIV-1 seroconversion incidence over the first five years of follow-up in the Multicenter AIDS Cohort Study (MACS) as that of locating the minimum of a quadratic regression or examination of two-knot piecewise spline models. Under a quadr...

    journal_title:Statistics in medicine

    pub_type: Journal Article, Multicenter Study

    doi:10.1002/sim.4780120207

    authors: Zhou SY,Kingsley LA,Taylor JM,Chmiel JS,He DY,Hoover DR

    update_date:1993-01-30 00:00:00

  • Semiparametric transformation models for panel count data with correlated observation and follow-up times.

    abstract::The statistical analysis of panel count data has recently attracted a great deal of attention, and a number of approaches have been developed. However, most of these approaches are for situations where the observation and follow-up processes are independent of the underlying recurrent event process unconditional or co...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.5724

    authors: Li N,Zhao H,Sun J

    update_date:2013-07-30 00:00:00

  • Estimating net transition probabilities from cross-sectional data with application to risk factors in chronic disease modeling.

    abstract::A problem occurring in chronic disease modeling is the estimation of transition probabilities of moving from one state of a categorical risk factor to another. Transitions could be obtained from a cohort study, but often such data may not be available. However, under the assumption that transitions remain stable over ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4423

    authors: Kassteele Jv,Hoogenveen RT,Engelfriet PM,Baal PH,Boshuizen HC

    update_date:2012-03-15 00:00:00

  • Nonparametric covariate hypothesis tests for the cure rate in mixture cure models.

    abstract::In lifetime data, like cancer studies, there may be long term survivors, which lead to heavy censoring at the end of the follow-up period. Since a standard survival model is not appropriate to handle these data, a cure model is needed. In the literature, covariate hypothesis tests for cure models are limited to parame...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.8530

    authors: López-Cheda A,Jácome MA,Van Keilegom I,Cao R

    update_date:2020-07-30 00:00:00

  • Alpha calculus in clinical trials: considerations and commentary for the new millennium.

    abstract::Regardless of whether a statistician believes in letting a data set speak for itself through nominal p-values or believes in strict alpha conservation, the interpretation of experiments which are negative for the primary endpoint but positive for secondary endpoints is the source of some angst. The purpose of this pap...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(sici)1097-0258(20000330)19:6<767::aid-sim

    authors: Moyé LA

    update_date:2000-03-30 00:00:00

  • Testing the equality of two survival functions with right truncated data.

    abstract::To compare the survival functions based on right-truncated data, Lagakos et al. proposed a weighted logrank test based on a reverse time scale. This is in contrast to Bilker and Wang, who suggested a semi-parametric version of the Mann-Whitney test by assuming that the distribution of truncation times is known or can ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.2556

    authors: Chi Y,Tsai WY,Chiang CL

    update_date:2007-02-20 00:00:00

  • STRengthening analytical thinking for observational studies: the STRATOS initiative.

    abstract::The validity and practical utility of observational medical research depends critically on good study design, excellent data quality, appropriate statistical methods and accurate interpretation of results. Statistical methodology has seen substantial development in recent times. Unfortunately, many of these methodolog...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.6265

    authors: Sauerbrei W,Abrahamowicz M,Altman DG,le Cessie S,Carpenter J,STRATOS initiative.

    update_date:2014-12-30 00:00:00

  • Cutoff designs for community-based intervention studies.

    abstract::Public health interventions are often designed to target communities defined either geographically (e.g. cities, counties) or socially (e.g. schools or workplaces). The group randomized trial (GRT) is regarded as the gold standard for evaluating these interventions. However, community leaders may object to randomizati...

    journal_title:Statistics in medicine

    pub_type: Journal Article, Randomized Controlled Trial

    doi:10.1002/sim.4237

    authors: Pennell ML,Hade EM,Murray DM,Rhoda DA

    update_date:2011-07-10 00:00:00

  • Accounting for competing risks in randomized controlled trials: a review and recommendations for improvement.

    abstract::In studies with survival or time-to-event outcomes, a competing risk is an event whose occurrence precludes the occurrence of the primary event of interest. Specialized statistical methods must be used to analyze survival data in the presence of competing risks. We conducted a review of randomized controlled trials wi...

    journal_title:Statistics in medicine

    pub_type: Journal Article, Review

    doi:10.1002/sim.7215

    authors: Austin PC,Fine JP

    update_date:2017-04-15 00:00:00

  • Nonparametric meta-analysis for diagnostic accuracy studies.

    abstract::Summarizing the information of many studies using a meta-analysis becomes more and more important, also in the field of diagnostic studies. The special challenge in meta-analysis of diagnostic accuracy studies is that in general sensitivity and specificity are co-primary endpoints. Across the studies both endpoints ar...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.6583

    authors: Zapf A,Hoyer A,Kramer K,Kuss O

    update_date:2015-12-20 00:00:00

  • A practical introduction to Bayesian estimation of causal effects: Parametric and nonparametric approaches.

    abstract::Substantial advances in Bayesian methods for causal inference have been made in recent years. We provide an introduction to Bayesian inference for causal effects for practicing statisticians who have some familiarity with Bayesian models and would like an overview of what it can add to causal estimation in practical s...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.8761

    authors: Oganisian A,Roy JA

    update_date:2021-01-30 00:00:00

  • Statistical estimation of parameters in a disease transmission model: analysis of a Cryptosporidium outbreak.

    abstract::Population dynamic models, commonly used tools in the study of epidemics and other complex population processes, are implicit non-linear mathematical equations. Inference based on such models can be difficult due to the problems associated with high dimensional parameters that may be non-identified and complex likelih...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.1258

    authors: Brookhart MA,Hubbard AE,van der Laan MJ,Colford JM Jr,Eisenberg JN

    update_date:2002-12-15 00:00:00

  • An application of Harrell's C-index to PH frailty models.

    abstract::Frailty models are encountered in many medical applications, yet little research has been devoted to develop measures that quantify the predictive ability of these models. In this paper, we elaborate on the concept of the concordance probability to clustered data, resulting in an 'Overall Conditional C-index' or bfC(O...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4058

    authors: Van Oirbeek R,Lesaffre E

    update_date:2010-12-30 00:00:00

  • Model diagnostics for censored regression via randomized survival probabilities.

    abstract::Residuals in normal regression are used to assess a model's goodness-of-fit (GOF) and discover directions for improving the model. However, there is a lack of residuals with a characterized reference distribution for censored regression. In this article, we propose to diagnose censored regression with normalized rando...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.8852

    authors: Li L,Wu T,Feng C

    update_date:2020-12-13 00:00:00

  • Spatial clustering of the failure to geocode and its implications for the detection of disease clustering.

    abstract::Geocoding a study population as completely as possible is an important data assimilation component of many spatial epidemiologic studies. Unfortunately, complete geocoding is rare in practice. The failure of a substantial proportion of study subjects' addresses to geocode has consequences for spatial analyses, some of...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.3288

    authors: Zimmerman DL,Fang X,Mazumdar S

    update_date:2008-09-20 00:00:00

  • Bayesian joint ordinal and survival modeling for breast cancer risk assessment.

    abstract::We propose a joint model to analyze the structure and intensity of the association between longitudinal measurements of an ordinal marker and time to a relevant event. The longitudinal process is defined in terms of a proportional-odds cumulative logit model. Time-to-event is modeled through a left-truncated proportio...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.7065

    authors: Armero C,Forné C,Rué M,Forte A,Perpiñán H,Gómez G,Baré M

    update_date:2016-12-10 00:00:00

  • Comparison of tests for categorical data from a stratified cluster randomized trial.

    abstract::Two features commonly exhibited by randomized trials of health promotion interventions are cluster randomization and stratification. Ignoring correlations between individuals within clusters can lead to an inflated type I error rate and hence a P-value which overstates the significance of the result. This paper compar...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.1256

    authors: Dobbins TA,Simpson JM

    update_date:2002-12-30 00:00:00

  • Multinomial goodness-of-fit tests for logistic regression models.

    abstract::We examine the properties of several tests for goodness-of-fit for multinomial logistic regression. One test is based on a strategy of sorting the observations according to the complement of the estimated probability for the reference outcome category and then grouping the subjects into g equal-sized groups. A g x c c...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.3202

    authors: Fagerland MW,Hosmer DW,Bofin AM

    update_date:2008-09-20 00:00:00

  • Hypothesis testing in the polychotomous logistic model with an application to detecting gastrointestinal cancer.

    abstract::We discuss the use of the trichotomous logistic model to discriminate between patients with gastrointestinal (GI) cancer, patients with benign GI disease and 'normal' subjects, using symptoms and the concentrations of some serum proteins that are potentially indicative of malignancy as covariates. A parsimonious model...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780040313

    authors: Marshall RJ,Chisholm EM

    update_date:1985-07-01 00:00:00

  • Statistical comparison of two handwashing protocols.

    abstract::This paper describes statistical procedures for use in an experiment that compares two handwashing protocols. The evaluation of a handwashing protocol entails collection of the wash effluent. Colony counts for the effluent reflect the number of flora removed by the wash protocol. The analysis aims to formulate and est...

    journal_title:Statistics in medicine

    pub_type: Clinical Trial, Journal Article, Randomized Controlled Trial

    doi:10.1002/sim.4780050412

    authors: Le CT

    update_date:1986-07-01 00:00:00

  • Measurement error correction for nutritional exposures with correlated measurement error: use of the method of triads in a longitudinal setting.

    abstract::Nutritional exposures are often measured with considerable error in commonly used surrogate instruments such as the food frequency questionnaire (FFQ) (denoted by Q(i) for the ith subject). The error can be both systematic and random. The diet record (DR) denoted by R(i) for the ith subject is considered an alloyed go...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.3238

    authors: Rosner B,Michels KB,Chen YH,Day NE

    update_date:2008-08-15 00:00:00