Abstract:
BACKGROUND: Social-environmental data obtained from the US Census is an important resource for understanding health disparities, but rarely is the full dataset utilized for analysis. A barrier to incorporating the full data is a lack of solid recommendations for variable selection, with researchers often hand-selecting a few variables. Thus, we evaluated the ability of empirical machine learning approaches to identify social-environmental factors having a true association with a health outcome.
METHODS: We compared several popular machine learning methods, including penalized regressions (e.g. lasso, elastic net) and tree ensemble methods. Via simulation, we assessed the methods' ability to identify census variables truly associated with binary and continuous outcomes while minimizing false positive results (10 true associations, 1000 total variables). We applied the most promising method to the full census data (p = 14,663 variables) linked to prostate cancer registry data (n = 76,186 cases) to identify social-environmental factors associated with advanced prostate cancer.
RESULTS: In simulations, we found that elastic net identified many true-positive variables, while lasso provided good control of false positives. Using a combined measure of accuracy, hierarchical clustering based on Spearman's correlation with sparse group lasso regression performed the best overall. Bayesian Additive Regression Trees outperformed other tree ensemble methods, but not the sparse group lasso. In the full dataset, the sparse group lasso successfully identified a subset of variables, three of which replicated earlier findings.
CONCLUSIONS: This analysis demonstrated the potential of empirical machine learning approaches to identify a small subset of census variables that have a true association with the outcome and that replicate across empirical methods. Sparse clustered regression models performed best, as they identified many true positive variables while controlling false positive discoveries.
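The simulation idea in METHODS (plant a few true associations among many candidate variables, then count true and false positives among the variables a penalized regression selects) can be sketched as follows. This is a minimal illustration, not the authors' code, and it makes several assumptions that are not in the abstract: the problem is scaled down to 5 true associations among 50 variables, predictors are independent standard normal draws rather than correlated census variables, the outcome is continuous, the penalty `lam` is fixed by hand rather than tuned, and the lasso is fitted with a plain coordinate-descent routine. All function and parameter names are hypothetical.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; the core lasso update."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=50):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    columns = [[X[i][j] for i in range(n)] for j in range(p)]
    col_sq = [sum(v * v for v in col) / n for col in columns]
    beta = [0.0] * p
    residual = list(y)  # equals y - X b while b is all zeros
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            # Covariance of column j with the partial residual
            # (the residual with column j's current contribution added back).
            rho = sum(c * r for c, r in zip(col, residual)) / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                residual = [r - delta * c for r, c in zip(residual, col)]
                beta[j] = new_bj
    return beta


def run_simulation(n=200, p=50, n_true=5, effect=1.0, lam=0.2, seed=0):
    """Simulate data, fit the lasso, and return (true positives, false positives)."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    # Assumption: the first n_true columns carry the signal; the rest are noise.
    y = [effect * sum(row[:n_true]) + rng.gauss(0.0, 1.0) for row in X]
    beta = lasso_coordinate_descent(X, y, lam)
    selected = {j for j, b in enumerate(beta) if b != 0.0}
    true_set = set(range(n_true))
    return len(selected & true_set), len(selected - true_set)


true_pos, false_pos = run_simulation()
print(f"true positives: {true_pos}, false positives: {false_pos}")
```

Raising `lam` trades true positives for fewer false positives, which is the same tension the abstract reports between elastic net and lasso. A study-scale version would use correlated predictors and a tuned penalty.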
journal_name: BMC Med Res Methodol
journal_title: BMC medical research methodology
authors: Handorf E, Yin Y, Slifker M, Lynch S
doi: 10.1186/s12874-020-01183-9
subject: Has Abstract
pub_date: 2020-12-10 00:00:00
pages: 302
issue: 1
issn: 1471-2288
pii: 10.1186/s12874-020-01183-9
journal_volume: 20
pub_type: Journal Article

abstract:BACKGROUND:The study design and protocol that underpin a randomised controlled trial (RCT) are critical for the ultimate success of the trial. Although RCTs are considered the gold standard for research, there are multiple threats to their validity such as participant recruitment and retention, identifying a meaningful...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/s12874-019-0772-2
update_date:2019-06-18 00:00:00
abstract:BACKGROUND:The systematic collection of high-quality mortality data is a prerequisite in designing relevant drowning prevention programmes. This descriptive study aimed to assess the quality (i.e., level of specificity) of cause-of-death reporting using ICD-10 drowning codes across 69 countries. METHODS:World Health O...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/1471-2288-10-30
update_date:2010-04-08 00:00:00
abstract:BACKGROUND:Occupational stress and specifically job anxiety are crucial factors in determining health outcomes, job satisfaction as well as performance. In order to assess this phenomenon, the Job Anxiety Scale is one of the instruments available. It consists of 70 items that are clustered in 14 subscales and five dime...
journal_title:BMC medical research methodology
pub_type: Journal Article, Indexed Publication (收录出版)
doi:10.1186/s12874-020-00974-4
update_date:2020-04-21 00:00:00
abstract:BACKGROUND:Exposure to unhealthy environments and inadequate child stimulation are main risk factors that affect children's health and wellbeing in low- and middle-income countries. Interventions that simultaneously address several risk factors at the household level have great potential to reduce these negative effect...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/s12874-020-00950-y
update_date:2020-04-02 00:00:00
abstract:BACKGROUND:The occurrence of communicable diseases (CD) depends on exposure to contagious persons. The effects of exposure to CD are delayed in time and contagious persons remain contagious for several days during which their contagiousness varies. Moreover when multiple exposures occur, it is difficult to know which e...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/1471-2288-13-26
update_date:2013-02-20 00:00:00
abstract:BACKGROUND:Typically, randomization software should allow users to exert control over the different aspects of randomization including block design, provision of unique identifiers and control over the format and type of program output. While some of these characteristics have been addressed by available software, none...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/1471-2288-4-26
update_date:2004-11-09 00:00:00
abstract:BACKGROUND:Cluster randomised controlled trials (CRCTs) are frequently used in health service evaluation. Assuming an average cluster size, required sample sizes are readily computed for both binary and continuous outcomes, by estimating a design effect or inflation factor. However, where the number of clusters are fix...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/1471-2288-11-102
update_date:2011-06-30 00:00:00
abstract:BACKGROUND:Systematic reviewers seek to comprehensively search for relevant studies and summarize these to present the most valid estimate of intervention effectiveness. The more resources searched, the higher the yield, and thus time and costs required to conduct a systematic review. While there is an abundance of evi...
journal_title:BMC medical research methodology
pub_type: Journal Article, Review
doi:10.1186/1471-2288-5-24
update_date:2005-08-10 00:00:00
abstract:BACKGROUND:In Routine Outcome Monitoring (ROM) there is a high demand for short assessments. Computerized Adaptive Testing (CAT) is a promising method for efficient assessment. In this article, the efficiency of a CAT version of the Mood and Anxiety Symptom Questionnaire, - Anhedonic Depression scale (MASQ-AD) for use ...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/1471-2288-12-4
update_date:2012-01-10 00:00:00
abstract:BACKGROUND:Disease incidence and prevalence are both core indicators of population health. Incidence is generally not as readily accessible as prevalence. Cohort studies and electronic health record systems are two major ways to estimate disease incidence. The former is time-consuming and expensive; the latter is not av...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/s12874-016-0288-y
update_date:2017-01-23 00:00:00
abstract:BACKGROUND:Missing data present a challenge to many research projects. The problem is often pronounced in studies utilizing self-report scales, and literature addressing different strategies for dealing with missing data in such circumstances is scarce. The objective of this study was to compare six different imputatio...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/1471-2288-6-57
update_date:2006-12-13 00:00:00
abstract:BACKGROUND:There is increasing awareness that meta-analyses require a sufficiently large information size to detect or reject an anticipated intervention effect. The required information size in a meta-analysis may be calculated from an anticipated a priori intervention effect or from an intervention effect suggested b...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/1471-2288-9-86
update_date:2009-12-30 00:00:00
abstract:BACKGROUND:A recent paper found that terminal digits of statistical values in Nature deviated significantly from an equiprobable distribution, indicating errors or inconsistencies in rounding. This finding, as well as the discovery that a large percentage of p values were inconsistent with reported test statistics, led...
journal_title:BMC medical research methodology
pub_type: Comment, Journal Article
doi:10.1186/1471-2288-6-45
update_date:2006-09-13 00:00:00
abstract:BACKGROUND:Outcomes in observational studies may not best estimate those expected in the HIV vaccine efficacy trials. We compared retention in Simulated HIV Vaccine Efficacy Trials (SiVETs) and observational cohorts drawn from two key populations in Uganda. METHODS:Two SiVETs were nested within two observational cohor...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/s12874-020-00920-4
update_date:2020-02-12 00:00:00
abstract:BACKGROUND:Surveys are established methods for collecting population data that are unavailable from other sources; however, response rates to surveys are declining. A number of methods have been identified to increase survey returns yet response rates remain low. This paper evaluates the impact of five selected methods...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/s12874-019-0702-3
update_date:2019-03-20 00:00:00
abstract:BACKGROUND:To accurately predict the response to treatment, we need a stable and effective risk score that can be calculated from patient characteristics. When we evaluate such risks from time-to-event data with right-censoring, Cox's proportional hazards model is the most popular for estimating the linear risk score. ...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/s12874-020-01063-2
update_date:2020-07-06 00:00:00
abstract:BACKGROUND:Multiple Imputation as usually implemented assumes that data are Missing At Random (MAR), meaning that the underlying missing data mechanism, given the observed data, is independent of the unobserved data. To explore the sensitivity of the inferences to departures from the MAR assumption, we applied the meth...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/1471-2288-12-73
update_date:2012-06-08 00:00:00
abstract:BACKGROUND:Network meta-analysis can be used to combine results from several randomized trials involving more than two treatments. Potential inconsistency among different types of trial (designs) differing in the set of treatments tested is a major challenge, and application of procedures for detecting and locating inc...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/1471-2288-14-61
update_date:2014-05-10 00:00:00
abstract:BACKGROUND:Modern modelling techniques may potentially provide more accurate predictions of binary outcomes than classical techniques. We aimed to study the predictive performance of different modelling techniques in relation to the effective sample size ("data hungriness"). METHODS:We performed simulation studies bas...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/1471-2288-14-137
update_date:2014-12-22 00:00:00
abstract:BACKGROUND:It has been demonstrated that the enclosure of money with a mailed questionnaire increases the response rate significantly. We evaluated scratch lottery tickets as an alternative to cash. METHODS:1500 randomly selected Norwegians between the ages of 40 and 65 years were sent a short questionnaire. 250 recei...
journal_title:BMC medical research methodology
pub_type: Journal Article, Randomized Controlled Trial
doi:10.1186/1471-2288-6-19
update_date:2006-04-28 00:00:00
abstract:BACKGROUND:Information and theory beyond copula concepts are essential to understand the dependence relationship between several marginal covariates distributions. In a therapeutic trial data scheme, most of the time, censoring occurs. That could lead to a biased interpretation of the dependence relationship between ma...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/s12874-017-0305-9
update_date:2017-02-15 00:00:00
abstract:BACKGROUND:The Patient Reported Outcomes Measurement Information System 43-item short form (PROMIS-43) and the five-level EQ-5D (EQ-5D-5L) are recently developed measures of health-related quality of life (HRQL) that have potentially broad application in evaluating treatments and capturing burden of respiratory-related...
journal_title:BMC medical research methodology
pub_type: Journal Article, Multicenter Study
doi:10.1186/1471-2288-14-78
update_date:2014-06-16 00:00:00
abstract:BACKGROUND:Emergency Departments (EDs) are a first point-of-contact for many youth with mental health and suicidality concerns and can serve as an effective recruitment source for randomized controlled trials (RCTs) of mental health interventions. However, recruitment in acute care settings is impeded by several challe...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/s12874-020-01117-5
update_date:2020-09-14 00:00:00
abstract:BACKGROUND:Amyotrophic Lateral Sclerosis (ALS), also known as Lou Gehrig's disease, is a rare disease with extreme between-subject variability, especially with respect to rate of disease progression. This makes modelling a subject's disease progression, which is measured by the ALS Functional Rating Scale (ALSFRS), ver...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/s12874-018-0479-9
update_date:2018-02-06 00:00:00
abstract:BACKGROUND:The reporting of randomised controlled trial (RCT) abstracts is of vital importance. The primary objective of this study was to investigate the association between structure format and RCT abstracts' quality of methodology reporting, informed by the current requirement and usage of structure formats by leadi...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/s12874-017-0469-3
update_date:2018-01-10 00:00:00
abstract:BACKGROUND:Randomized controlled trials are the gold-standard for clinical trials. However, randomization is not always feasible. In this article we propose a prospective and adaptive matched case-control trial design assuming that a control group already exists. METHODS:We propose and discuss an interim analysis step...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/s12874-019-0763-3
update_date:2019-07-16 00:00:00
abstract:BACKGROUND:Multistate models have become increasingly useful to study the evolution of a patient's state over time in intensive care units ICU (e.g. admission, infections, alive discharge or death in ICU). In addition, in critically-ill patients, data come from different ICUs, and because observations are clustered int...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/1471-2288-12-79
update_date:2012-06-15 00:00:00
abstract:BACKGROUND:This study aimed to investigate whether awareness of being monitored by an accelerometer has an effect on physical activity in young people. METHODS:Eighty healthy participants aged 10-18 years were randomized between blinded and nonblinded groups. The blinded participants were informed that we were testing...
journal_title:BMC medical research methodology
pub_type: Journal Article, Randomized Controlled Trial
doi:10.1186/s12874-017-0378-5
update_date:2017-07-11 00:00:00
abstract:BACKGROUND:While randomised controlled trials (RCTs) provide high-quality evidence to guide practice, much routine care is not based upon available RCTs. This disconnect between evidence and practice is not sufficiently well understood. This case study explores this relationship using a novel approach. Better understan...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/s12874-020-01009-8
update_date:2020-05-12 00:00:00
abstract:BACKGROUND:Multilevel models for non-normal outcomes are widely used in medical and health sciences research. While methods for interpreting fixed effects are well-developed, methods to quantify and interpret random cluster variation and compare it with other sources of variation are less established. Random cluster va...
journal_title:BMC medical research methodology
pub_type: Journal Article
doi:10.1186/s12874-018-0517-7
update_date:2018-07-06 00:00:00