Abstract:
BACKGROUND: Modern modelling techniques may provide more accurate predictions of binary outcomes than classical techniques. We aimed to study the predictive performance of different modelling techniques in relation to the effective sample size ("data hungriness").
METHODS: We performed simulation studies based on three clinical cohorts: 1282 patients with head and neck cancer (46.9% 5-year survival), 1731 patients with traumatic brain injury (22.3% 6-month mortality) and 3181 patients with minor head injury (7.6% with CT scan abnormalities). We compared three relatively modern modelling techniques, support vector machines (SVM), neural nets (NN) and random forests (RF), with two classical techniques, logistic regression (LR) and classification and regression trees (CART). We created three large artificial databases with 20-fold, 10-fold and 6-fold replication of subjects, where we generated dichotomous outcomes according to different underlying models. We applied each modelling technique to increasingly larger development parts (100 repetitions). The area under the ROC curve (AUC) indicated the performance of each model in the development part and in an independent validation part. Data hungriness was defined by plateauing of AUC and small optimism (difference between the mean apparent AUC and the mean validated AUC <0.01).
RESULTS: A stable AUC was reached by LR at approximately 20 to 50 events per variable, followed by CART, SVM, NN and RF models. Optimism decreased with increasing sample size, with the same ranking of techniques. The RF, SVM and NN models showed instability and high optimism even with >200 events per variable.
CONCLUSIONS: Modern modelling techniques such as SVM, NN and RF may need over 10 times as many events per variable as classical modelling techniques such as LR to achieve a stable AUC and a small optimism. This implies that such modern techniques should only be used in medical prediction problems if very large data sets are available.
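The optimism criterion in the abstract can be made concrete with a short sketch. The Python below is illustrative only and is not the authors' code: the function names (`auc`, `optimism`, `events_per_variable`, `meets_optimism_criterion`) are hypothetical, events per variable is assumed to mean the number of subjects with the outcome divided by the number of candidate predictors, and only the optimism half of the data-hungriness definition (mean apparent AUC minus mean validated AUC below 0.01) is checked, not the plateauing of the AUC.

```python
from statistics import mean


def auc(scores, labels):
    """Rank-based AUC: the fraction of (event, non-event) pairs in which
    the event has the higher predicted score; ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def events_per_variable(labels, n_predictors):
    """Assumed definition: number of events (label 1) per candidate predictor."""
    return sum(labels) / n_predictors


def optimism(apparent_aucs, validated_aucs):
    """Mean apparent (development) AUC minus mean validated AUC
    over the repetitions of one simulation setting."""
    return mean(apparent_aucs) - mean(validated_aucs)


def meets_optimism_criterion(apparent_aucs, validated_aucs, threshold=0.01):
    """True when optimism is below the 0.01 threshold quoted in the abstract."""
    return optimism(apparent_aucs, validated_aucs) < threshold
```

In use, one would collect the apparent and validated AUC from each repetition at a given development sample size and pass both lists to `meets_optimism_criterion`; the pairwise AUC is quadratic in the number of subjects, so a library routine would be preferable for cohorts of the size described here.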
journal_name:BMC Med Res Methodol
journal_title:BMC medical research methodology
authors:van der Ploeg T, Austin PC, Steyerberg EW
doi:10.1186/1471-2288-14-137
subject:Has Abstract
pub_date:2014-12-22 00:00:00
pages:137
issn:1471-2288
pii:1471-2288-14-137
journal_volume:14
pub_type: Journal Article

abstract:BACKGROUND:When discussing results, medical research articles often tear substantive and statistical (methodical) contributions apart, just as if both were independent. Consequently, reasoning on bias tends to be vague, unclear and superficial. This can lead to over-generalized, too narrow and misleading conclusions, es...
journal_title:BMC medical research methodology
pub_type: Journal Article, Review
doi:10.1186/s12874-018-0490-1
Updated:2018-04-17 00:00:00
abstract:BACKGROUND:Self-administered questionnaires are becoming increasingly common in general practice. Much research has explored methods to increase response rates but comparatively few studies have explored the effect of questionnaire administration on reported answers. METHODS:The aim of this study was to determine the ...
journal_title:BMC medical research methodology
pub_type: Journal Article, Randomized Controlled Trial
doi:10.1186/1471-2288-8-42
Updated:2008-06-27 00:00:00
abstract:BACKGROUND:Randomised controlled trials (RCTs) are the gold standard for evidence-based practice. However, RCTs can have limitations. For example, translation of findings into practice can be limited by design features, such as inclusion criteria, not accurately reflecting clinical populations. In addition, it is expen...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/s12874-020-01078-9
Updated:2020-07-25 00:00:00
abstract:BACKGROUND:Multiple Imputation as usually implemented assumes that data are Missing At Random (MAR), meaning that the underlying missing data mechanism, given the observed data, is independent of the unobserved data. To explore the sensitivity of the inferences to departures from the MAR assumption, we applied the meth...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/1471-2288-12-73
Updated:2012-06-08 00:00:00
abstract:BACKGROUND:The Community Reintegration of Service Members (CRIS) is a new measure of community reintegration developed to measure veterans' participation in life roles. It consists of three sub-scales: Extent of Participation (Extent), Perceived Limitations with Participation (Perceived), and Satisfaction with Particip...
journal_title:BMC medical research methodology
pub_type: Journal Article, Randomized Controlled Trial
doi:10.1186/1471-2288-11-98
Updated:2011-06-25 00:00:00
abstract:BACKGROUND:The mechanisms and pathways to impacts from public health research in the UK have not been widely studied. Through the lens of one funder (NIHR), our aims are to map the diversity of public health research, in terms of funding mechanisms, disciplinary contributions, and public health impacts, identify exampl...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/s12874-020-0905-7
Updated:2020-02-19 00:00:00
abstract:BACKGROUND:Crowding in the emergency department (ED) is associated with increased mortality, increased treatment cost, and reduced quality of care. Crowding arises when demand exceeds resources in the ED and a first sign may be increasing waiting time. We aimed to quantify predictors for departure from the ED, and relat...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/s12874-019-0710-3
Updated:2019-03-29 00:00:00
abstract:BACKGROUND:Diabetes-related lower limb amputations are associated with considerable morbidity and mortality and are usually preceded by foot ulceration. The available systematic reviews of aggregate data are compromised because the primary studies report both adjusted and unadjusted estimates. As adjusted meta-analyses...
journal_title:BMC medical research methodology
pub_type: Journal Article, Meta-Analysis
doi:10.1186/1471-2288-13-22
Updated:2013-02-15 00:00:00
abstract:BACKGROUND:In order to accurately measure and monitor levels of moderate-to-vigorous physical activity (MVPA) and sedentary behaviour (SB) in older adults, cost efficient and valid instruments are required. To date, the International Physical Activity Questionnaire (IPAQ) has not been validated with older adults (aged ...
journal_title:BMC medical research methodology
pub_type: Journal Article, Multicenter Study
doi:10.1186/s12874-018-0642-3
Updated:2018-12-22 00:00:00
abstract:BACKGROUND:Mediation is an important issue considered in the behavioral, medical, and social sciences. It addresses situations where the effect of a predictor variable X on an outcome variable Y is explained to some extent by an intervening, mediator variable M. Methods for addressing mediation have been available for ...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/s12874-017-0296-6
Updated:2017-03-21 00:00:00
abstract:BACKGROUND:Cluster randomised controlled trials (CRCTs) are frequently used in health service evaluation. Assuming an average cluster size, required sample sizes are readily computed for both binary and continuous outcomes, by estimating a design effect or inflation factor. However, where the number of clusters are fix...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/1471-2288-11-102
Updated:2011-06-30 00:00:00
abstract:BACKGROUND:To provide empirical evidence about prevalence, reporting and handling of missing outcome data in systematic reviews with network meta-analysis and acknowledgement of their impact on the conclusions. METHODS:We conducted a systematic survey including all published systematic reviews of randomized controlled...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/s12874-018-0576-9
Updated:2018-10-24 00:00:00
abstract:BACKGROUND:When conducting a meta-analysis of a continuous outcome, estimated means and standard deviations from the selected studies are required in order to obtain an overall estimate of the mean effect and its confidence interval. If these quantities are not directly reported in the publications, they must be estima...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/s12874-015-0055-5
Updated:2015-08-12 00:00:00
abstract:BACKGROUND:This paper focuses on measuring the efficiency and effectiveness of two diagramming methods employed in key informant interviews with clinicians and health care administrators. The two methods are 'participatory diagramming', where the respondent creates a diagram that assists in their communication of answe...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/1471-2288-8-53
Updated:2008-08-08 00:00:00
abstract:BACKGROUND:Generic quality of life (QoL) instruments provide important measures of self-reported wellbeing that can be compared across healthy and clinical populations. The aim of this analysis is to validate the ten-item QoL instrument "QOL10", as well as to confirm the validity of the embedded "QOL5" questionnaire an...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/s12874-016-0163-x
Updated:2016-05-23 00:00:00
abstract:BACKGROUND:The BSI-18 contains the three six-item scales somatization, depression, and anxiety as well as the Global Severity Index (GSI), including all 18 items. The BSI-18 is the latest and shortest of the multidimensional versions of the Symptom-Checklist 90-R, but its psychometric properties have not been sufficien...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/s12874-016-0283-3
Updated:2017-01-26 00:00:00
abstract:BACKGROUND:Patient-reported outcome (PRO) measures play a key role in the advancement of patient-centered care research. The accuracy of inferences, relevance of predictions, and the true nature of the associations made with PRO data depend on the validity of these measures. Errors inherent to self-report measures can ...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/s12874-016-0161-z
Updated:2016-05-26 00:00:00
abstract:BACKGROUND:Although in health services survey research we strive for a high response rate, this must be balanced against the need to recruit participants ethically and considerately, particularly in surveys with a sensitive nature. In survey research there are no established recommendations to guide recruitment approac...
journal_title:BMC medical research methodology
pub_type: Clinical Trial, Journal Article
doi:10.1186/1471-2288-13-3
Updated:2013-01-11 00:00:00
abstract:BACKGROUND:To determine if the search technique that is used to sample randomized controlled trial (RCT) manuscripts from a field of medical science can influence the measurement of the change in quality over time in that field. METHODS:RCT manuscripts in the field of brain injury were identified using two readily-ava...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/1471-2288-5-7
Updated:2005-02-07 00:00:00
abstract:BACKGROUND:Dynamic risk models, which incorporate disease-free survival and repeated measurements over time, might yield more accurate predictions of future health status compared to static models. The objective of this study was to develop and apply a dynamic prediction model to estimate the risk of developing type 2 ...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/s12874-019-0812-y
Updated:2019-08-14 00:00:00
abstract:BACKGROUND:A zero-inflated continuous outcome is characterized by occurrence of "excess" zeros that more than a single distribution can explain, with the positive observations forming a skewed distribution. Mixture models are employed for regression analysis of zero-inflated data. Moreover, for repeated measures zero-i...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/1471-2288-10-55
Updated:2010-06-14 00:00:00
abstract:BACKGROUND:Regular and timely monitoring of blood glucose (BG) levels in hospitalized patients with diabetes mellitus is crucial to optimizing inpatient glycaemic control. However, methods to quantify timeliness as a measurement of quality of care are lacking. We propose an analytical approach that utilizes BG measurem...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/s12874-016-0142-2
Updated:2016-04-08 00:00:00
abstract:BACKGROUND:To describe how frequently harm is reported in the abstract of high impact factor medical journals. METHODS: DESIGN AND POPULATION:We carried out a blinded structured review of a random sample of 363 Randomised Controlled Trials (RCTs) carried out on human beings, and published in high impact factor medica...
journal_title:BMC medical research methodology
pub_type: Letter, Review
doi:10.1186/1471-2288-8-14
Updated:2008-03-27 00:00:00
abstract:BACKGROUND:Small number of clusters and large variation of cluster sizes commonly exist in cluster-randomized trials (CRTs) and are often the critical factors affecting the validity and efficiency of statistical analyses. F tests are commonly used in the generalized linear mixed model (GLMM) to test intervention effect...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/s12874-015-0026-x
Updated:2015-04-23 00:00:00
abstract:BACKGROUND:Emergency Departments (EDs) are a first point-of-contact for many youth with mental health and suicidality concerns and can serve as an effective recruitment source for randomized controlled trials (RCTs) of mental health interventions. However, recruitment in acute care settings is impeded by several challe...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/s12874-020-01117-5
Updated:2020-09-14 00:00:00
abstract:BACKGROUND:It is notoriously difficult to recruit patients to randomised controlled trials in primary care. This is particularly true when the disease process under investigation occurs relatively infrequently and must be investigated during a brief time window. Bell's palsy, an acute unilateral paralysis of the facial...
journal_title:BMC medical research methodology
pub_type: Journal Article, Randomized Controlled Trial
doi:10.1186/1471-2288-7-15
Updated:2007-03-28 00:00:00
abstract:BACKGROUND:This study aimed to investigate whether awareness of being monitored by an accelerometer has an effect on physical activity in young people. METHODS:Eighty healthy participants aged 10-18 years were randomized between blinded and nonblinded groups. The blinded participants were informed that we were testing...
journal_title:BMC medical research methodology
pub_type: Journal Article, Randomized Controlled Trial
doi:10.1186/s12874-017-0378-5
Updated:2017-07-11 00:00:00
abstract:BACKGROUND:Although some nonparametric methods have been proposed in the literature to test for the equality of median survival times for censored data in medical research, in general they have inflated type I error rates, which make their use limited in practice, especially when the sample sizes are small. METHODS:In...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/s12874-016-0133-3
Updated:2016-03-16 00:00:00
abstract:BACKGROUND:How overall physical activity relates to specific activities and how reported activity changes over time may influence interpretation of observed associations between physical activity and health. We examine the relationships between various physical activities self-reported at different times in a large coh...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/1471-2288-11-97
Updated:2011-06-22 00:00:00
abstract:BACKGROUND:The intraclass correlation coefficient (ICC) is widely used in biomedical research to assess the reproducibility of measurements between raters, labs, technicians, or devices. For example, in an inter-rater reliability study, a high ICC value means that noise variability (between-raters and within-raters) is...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/1471-2288-14-121
Updated:2014-11-22 00:00:00