Classification using ensemble learning under weighted misclassification loss.

Abstract:

Binary classification rules based on covariates typically depend on simple loss functions such as zero-one misclassification. Some cases may require more complex loss functions. For example, individual-level monitoring of HIV-infected individuals on antiretroviral therapy requires periodic assessment of treatment failure, defined as having a viral load (VL) value above a certain threshold. In some resource-limited settings, VL tests may be limited by cost or technology, and diagnoses are based on other clinical markers. Depending on the scenario, a higher premium may be placed on avoiding false positives, which bring greater cost and reduced treatment options. Here, the optimal rule is determined by minimizing a weighted misclassification loss/risk. We propose a method for finding and cross-validating optimal binary classification rules under weighted misclassification loss. We focus on rules comprising a prediction score and an associated threshold, where the score is derived using an ensemble learner. Simulations and examples show that our method, which derives the score and threshold jointly, more accurately estimates overall risk and has better operating characteristics than methods that derive the score first and the cutoff conditionally on the score, especially for finite samples.
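To make the weighted misclassification loss in the abstract concrete, the following is a minimal sketch, not the authors' procedure. It assumes a score has already been computed and only searches for the cutoff, which is the conditional, score-first approach the abstract contrasts with the proposed joint method. The function names and the weights `w_fp` and `w_fn` are hypothetical, chosen for illustration.

```python
import numpy as np


def weighted_risk(y, score, cutoff, w_fp=1.0, w_fn=1.0):
    """Empirical weighted misclassification risk of the rule 1{score >= cutoff}.

    w_fp and w_fn are illustrative weights on false positives and false negatives.
    """
    y = np.asarray(y).astype(bool)
    pred = np.asarray(score) >= cutoff
    fp_rate = np.mean(pred & ~y)   # predicted failure, truly no failure
    fn_rate = np.mean(~pred & y)   # predicted no failure, truly failure
    return w_fp * fp_rate + w_fn * fn_rate


def best_cutoff(y, score, w_fp=1.0, w_fn=1.0):
    """Grid-search the cutoff over observed scores (plus 'classify nobody')."""
    candidates = np.append(np.unique(score), np.inf)
    risks = [weighted_risk(y, score, c, w_fp, w_fn) for c in candidates]
    i = int(np.argmin(risks))
    return float(candidates[i]), float(risks[i])


if __name__ == "__main__":
    y = [0, 0, 1, 1]
    score = [0.1, 0.6, 0.4, 0.9]
    print(best_cutoff(y, score))            # equal weights
    print(best_cutoff(y, score, w_fp=5.0))  # false positives cost 5x more
```

Raising `w_fp` relative to `w_fn` pushes the selected cutoff upward, so fewer individuals are flagged; this mirrors the abstract's scenario in which false positives carry the greater cost.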

journal_name

Stat Med

journal_title

Statistics in medicine

authors

Xu Y,Liu T,Daniels MJ,Kantor R,Mwangi A,Hogan JW

doi

10.1002/sim.8082

pub_date

2019-05-20 00:00:00

pages

2002-2012

issue

11

eissn

1097-0258

issn

0277-6715

journal_volume

38

pub_type

Journal Article
  • A new proposal to adjust Moran's I for population density.

    abstract::We analyse the effect of using prevalence rates based on populations with different sizes in the power of spatial independence tests. We compare the well known spatial correlation Moran's index to three indexes obtained after adjusting for population density, one proposed by Oden, another proposed by Waldhör, and a th...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(sici)1097-0258(19990830)18:16<2147::aid-s

    authors: Assunção RM,Reis EA

    update_date:1999-08-30 00:00:00

  • Estimation of dynamic treatment strategies for maintenance therapy of children with acute lymphoblastic leukaemia: an application of history-adjusted marginal structural models.

    abstract::Childhood acute lymphoblastic leukaemia is treated with long-term intensive chemotherapy. During the latter part of the treatment, the maintenance therapy, the patients receive oral doses of two cytostatics. The doses are tailored to blood counts measured on a weekly basis, and the treatment is therefore highly dynami...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4393

    authors: Rosthøj S,Keiding N,Schmiegelow K

    update_date:2012-02-28 00:00:00

  • Case-control analysis with a continuous outcome variable.

    abstract::It is not uncommon for a continuous outcome variable Y to be dichotomized and analysed using logistic regression. Moser and Coombs (Statist. Med. 2004; 23:1843-1860) provide a method for converting the output from a standard linear regression analysis using the original continuous outcome Y to give much more efficient...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.3474

    authors: Jiang Y,Scott A,Wild CJ

    update_date:2009-01-30 00:00:00

  • Statistical education for medical students--concepts are what remain when the details are forgotten.

    abstract::Teaching statistics to medical students is a challenging and often unrewarding task. However, few would argue the need for statistics in the medical school curriculum. In recent years, there has been a growing call for teaching only statistical concepts in medical schools. We strongly oppose this opinion and offer an ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.2906

    authors: Herman A,Notzer N,Libman Z,Braunstein R,Steinberg DM

    update_date:2007-10-15 00:00:00

  • Estimating treated prevalence and service utilization rates: assessing disparities in mental health.

    abstract::There is considerable public concern about health disparities among different cultural/racial/ethnic groups. Important process measures that might reflect inequities are treated prevalence and the service utilization rate in a defined period of time. We have previously described a method for estimating N, the distinct...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.3904

    authors: Laska EM,Meisner M,Wanderling J,Siegel C

    update_date:2010-07-20 00:00:00

  • Estimation of the mediation effect with a binary mediator.

    abstract::A mediator acts as a third variable in the causal pathway between a risk factor and an outcome. In this paper, we consider the estimation of the mediation effect when the mediator is a binary variable. We give a precise definition of the mediation effect and examine asymptotic properties of five different estimators o...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.2730

    authors: Li Y,Schneider JA,Bennett DA

    update_date:2007-08-15 00:00:00

  • Bounding the bias of unmeasured factors with confounding and effect-modifying potentials.

    abstract::Confounding is a major concern in observational studies. To adjust for confounding bias, the potential confounder(s) for a study must first be identified and measured. But this is not always possible. The unmeasured factors may also exhibit effect modification, and this further complicates the situation. In this paper...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4151

    authors: Lee WC

    update_date:2011-04-30 00:00:00

  • Marginal versus conditional versus 'structural source' models: a rationale for an alternative to log-linear methods for capture-recapture estimates.

    abstract::Log-linear models for capture-recapture type data are widely used for estimating sizes of populations. Log-linear methods model conditional interactions between the sources. Often, however, the marginal associations are more appropriate and easier for the practitioner to conceptualize. Analyses here of previously publ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(sici)1097-0258(19980115)17:1<69::aid-sim7

    authors: Regal RR,Hook EB

    update_date:1998-01-15 00:00:00

  • A functional-model-adjusted spatial scan statistic.

    abstract::This paper introduces a new spatial scan statistic designed to adjust cluster detection for longitudinal confounding factors indexed in space. The functional-model-adjusted statistic was developed using generalized functional linear models in which longitudinal confounding factors were considered to be functional cova...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.8459

    authors: Ahmed MS,Genin M

    update_date:2020-04-15 00:00:00

  • Two-sample rank tests for acceleration in cure models.

    abstract::I derive the locally most powerful rank tests for acceleration against semi-parametric alternatives when some patients are cured of the disease. I consider some particular classes of alternatives and present simulation results to verify the validity of the proposed tests. Real data from clinical trials for childhood l...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780141905

    authors: Lee JW

    update_date:1995-10-15 00:00:00

  • Testing departure from additivity in Tukey's model using shrinkage: application to a longitudinal setting.

    abstract::While there has been extensive research developing gene-environment interaction (GEI) methods in case-control studies, little attention has been given to sparse and efficient modeling of GEI in longitudinal studies. In a two-way table for GEI with rows and columns as categorical variables, a conventional saturated int...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.6281

    authors: Ko YA,Mukherjee B,Smith JA,Park SK,Kardia SL,Allison MA,Vokonas PS,Chen J,Diez-Roux AV

    update_date:2014-12-20 00:00:00

  • Non-parametric methods for comparing multiple treatment groups to a control group, based on incomplete non-decreasing repeated measurements.

    abstract::In the comparison of two or more treatment groups to a control group, consider a study with non-decreasing repeated measurements of the same characteristic taken over a common set of time points for each subject. Based on the vector of possibly incomplete responses from each subject, this paper considers asymptoticall...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(SICI)1097-0258(19961215)15:23<2509::AID-S

    authors: Davis CS

    update_date:1996-12-15 00:00:00

  • Automated time series forecasting for biosurveillance.

    abstract::For robust detection performance, traditional control chart monitoring for biosurveillance is based on input data free of trends, day-of-week effects, and other systematic behaviour. Time series forecasting methods may be used to remove this behaviour by subtracting forecasts from observations to form residuals for al...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.2835

    authors: Burkom HS,Murphy SP,Shmueli G

    update_date:2007-09-30 00:00:00

  • A Bayesian analysis of mixture structural equation models with non-ignorable missing responses and covariates.

    abstract::In behavioral, biomedical, and social-psychological sciences, it is common to encounter latent variables and heterogeneous data. Mixture structural equation models (SEMs) are very useful methods to analyze these kinds of data. Moreover, the presence of missing data, including both missing responses and missing covaria...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.3915

    authors: Cai JH,Song XY,Hser YI

    update_date:2010-08-15 00:00:00

  • The many weak instruments problem and Mendelian randomization.

    abstract::Instrumental variable estimates of causal effects can be biased when using many instruments that are only weakly associated with the exposure. We describe several techniques to reduce this bias and estimate corrected standard errors. We present our findings using a simulation study and an empirical application. For th...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.6358

    authors: Davies NM,von Hinke Kessler Scholder S,Farbmacher H,Burgess S,Windmeijer F,Smith GD

    update_date:2015-02-10 00:00:00

  • The effect of unbalanced randomization on the progressively censored Savage test.

    abstract::Equal allocation of patients to treatment in a randomized clinical trial may have disadvantages ethically if the new treatment is believed to be at least as beneficial as the standard treatment. Others have considered, in a non-sequential setting, unbalanced randomized designs which allocate fewer patients to the pote...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780010309

    authors: Lesser ML

    update_date:1982-07-01 00:00:00

  • Scientific considerations for assessing biosimilar products.

    abstract::The problem for assessing biosimilarity and drug interchangeability of follow-on biologics (biosimilar products) is studied. Unlike the generic products, the development of biosimilar products is much more complicated because of fundamental differences in functional structures and manufacturing processes. As a result,...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.5571

    authors: Chow SC,Wang J,Endrenyi L,Lachenbruch PA

    update_date:2013-02-10 00:00:00

  • Variance estimators for attributable fraction estimates consistent in both large strata and sparse data.

    abstract::A number of variance formulae for the attributable fraction have been presented, but none is consistent in sparse data, such as found in individually matched case-control studies. This paper employs Mantel-Haenszel estimation to derive variance estimators for attributable fractions that are dually consistent, that is,...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780060607

    authors: Greenland S

    update_date:1987-09-01 00:00:00

  • A score test for establishing non-inferiority with respect to short-term survival in two-sample comparisons with identical proportions of long-term survivors.

    abstract::In recent years randomized trials designed to establish non-inferiority of a new treatment as compared to a standard one have been more widely used. Two-sample statistics have been proposed for this equivalence testing problem. However, they are not suited to situations where a long-term survivor fraction is expected....

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.1453

    authors: Broët P,Tubert-Bitter P,De Rycke Y,Moreau T

    update_date:2003-03-30 00:00:00

  • A multiple imputation strategy for incomplete longitudinal data.

    abstract::Longitudinal studies are commonly used to study processes of change. Because data are collected over time, missing data are pervasive in longitudinal studies, and complete ascertainment of all variables is rare. In this paper a new imputation strategy for completing longitudinal data sets is proposed. The proposed met...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.740

    authors: Landrum MB,Becker MP

    update_date:2001-09-15 00:00:00

  • Causal inference in paired two-arm experimental studies under noncompliance with application to prognosis of myocardial infarction.

    abstract::Motivated by a study about prompt coronary angiography in myocardial infarction, we propose a method to estimate the causal effect of a treatment in two-arm experimental studies with possible noncompliance in both treatment and control arms. We base the method on a causal model for repeated binary outcomes (before and...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.5856

    authors: Bartolucci F,Farcomeni A

    update_date:2013-11-10 00:00:00

  • Multi-state models for colon cancer recurrence and death with a cured fraction.

    abstract::In cancer clinical trials, patients often experience a recurrence of disease prior to the outcome of interest, overall survival. Additionally, for many cancers, there is a cured fraction of the population who will never experience a recurrence. There is often interest in how different covariates affect the probability...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.6056

    authors: Conlon AS,Taylor JM,Sargent DJ

    update_date:2014-05-10 00:00:00

  • Dunnett-type inference in the frailty Cox model with covariates.

    abstract::A frequent objective in medical research is the investigation of differences in patient survival between several experimental treatments and one standard treatment. In order to assess these differences statistically, we have to apply adjustments for multiple comparisons to prevent an increased number of false-positive...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4403

    authors: Herberich E,Hothorn T

    update_date:2012-01-13 00:00:00

  • Accounting for informatively missing data in logistic regression by means of reassessment sampling.

    abstract::We explore the 'reassessment' design in a logistic regression setting, where a second wave of sampling is applied to recover a portion of the missing data on a binary exposure and/or outcome variable. We construct a joint likelihood function based on the original model of interest and a model for the missing data mech...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.6456

    authors: Lin J,Lyles RH

    update_date:2015-05-20 00:00:00

  • Small clinical trials: are they all bad?

    abstract::Statisticians have long argued that randomized controlled trials should be sufficiently large to achieve their purpose, and for common diseases with major public health implications this has brought many benefits. However, there are many instances where it is unrealistic to expect clinicians to provide the information...

    journal_title:Statistics in medicine

    pub_type: Journal Article, Review

    doi:10.1002/sim.4780140204

    authors: Matthews JN

    update_date:1995-01-30 00:00:00

  • How should meta-regression analyses be undertaken and interpreted?

    abstract::Appropriate methods for meta-regression applied to a set of clinical trials, and the limitations and pitfalls in interpretation, are insufficiently recognized. Here we summarize recent research focusing on these issues, and consider three published examples of meta-regression in the light of this work. One principal m...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.1187

    authors: Thompson SG,Higgins JP

    update_date:2002-06-15 00:00:00

  • Methods for dose finding studies in cancer clinical trials: a review and results of a Monte Carlo study.

    abstract::We discuss some of the statistical approaches to the design and analysis of phase I clinical trials in cancer. An attempt is made to identify the issues, particular to this type of trial, that should be addressed by an appropriate methodology. A brief review of schemes currently in use is provided together with our vi...

    journal_title:Statistics in medicine

    pub_type: Journal Article, Review

    doi:10.1002/sim.4780101104

    authors: O'Quigley J,Chevret S

    update_date:1991-11-01 00:00:00

  • Comparison of methods for the analysis of longitudinal interval count data.

    abstract::Longitudinal studies are often concerned with estimating the recurrence rate of a non-fatal event. In many cases, only the total number of events occurring during successive time intervals is known. We compared a mixed Poisson-gamma regression method proposed by Thall and a quasi-likelihood method proposed by Zeger an...

    journal_title:Statistics in medicine

    pub_type: Clinical Trial, Journal Article, Randomized Controlled Trial

    doi:10.1002/sim.4780121406

    authors: Stukel TA

    update_date:1993-07-30 00:00:00

  • Redesign of trials under different enrollment mixes.

    abstract::A few large multi-centre male-only heart trials done in the 1970s and 1980s have been seen as ill-conceived because they did not include females. The purpose here is to revisit two of those trials and to consider consequences in terms of cost and power had they been designed to include females. ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(sici)1097-0258(19990215)18:3<241::aid-sim

    authors: Meinert CL

    update_date:1999-02-15 00:00:00

  • Effect of regression to the mean in the presence of within-subject variability.

    abstract::Regression to the mean arises often in statistical applications where the units chosen for study relate to some observed characteristic in the extreme of its distribution. Gardner and Heady attribute the effect of regression to the mean to measurement errors. They assume the model Yi = U + ei, where U is a fixed withi...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780100812

    authors: Johnson WD,George VT

    update_date:1991-08-01 00:00:00