Abstract:
In recent years, increasing attention has been devoted to the problem of the stability of multivariable regression models, understood as the resistance of the model to small changes in the data on which it has been fitted. Resampling techniques, mainly based on the bootstrap, have been developed to address this issue. In particular, the approaches based on the idea of "inclusion frequency" consider the repeated implementation of a variable selection procedure, for example backward elimination, on several bootstrap samples. The analysis of the variables selected in each iteration provides useful information on the model stability and on the variables' importance. Recent findings, nevertheless, show possible pitfalls in the use of the bootstrap, and alternatives such as subsampling have begun to be taken into consideration in the literature. Using model selection frequencies and variable inclusion frequencies, we empirically compare these two different resampling techniques, investigating the effect of their use in selected classical model selection procedures for multivariable regression. We conduct our investigations by analyzing two real data examples and by performing a simulation study. Our results reveal some advantages in using a subsampling technique rather than the bootstrap in this context.
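The inclusion-frequency idea described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' procedure: the selection rule is a simple marginal-correlation threshold standing in for backward elimination, the subsample fraction of 0.632 is an assumed choice, and all function names (`select_variables`, `resample_indices`, `selection_frequencies`, `make_toy_data`) are hypothetical.

```python
import random
from collections import Counter


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if degenerate)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def select_variables(X, y, threshold=0.3):
    """Stand-in selection rule (assumption): keep column j if |corr(x_j, y)| > threshold.

    Any selection procedure, e.g. backward elimination, could be plugged in here.
    """
    p = len(X[0])
    return frozenset(
        j for j in range(p)
        if abs(pearson([row[j] for row in X], y)) > threshold
    )


def resample_indices(n, scheme, rng, fraction=0.632):
    """Bootstrap: n draws with replacement. Subsampling: round(fraction * n) draws without."""
    if scheme == "bootstrap":
        return [rng.randrange(n) for _ in range(n)]
    if scheme == "subsampling":
        return rng.sample(range(n), int(round(fraction * n)))
    raise ValueError(f"unknown scheme: {scheme}")


def selection_frequencies(X, y, scheme, n_rep=200, seed=0):
    """Return (variable inclusion frequencies, model selection frequencies)."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    var_counts, model_counts = Counter(), Counter()
    for _ in range(n_rep):
        idx = resample_indices(n, scheme, rng)
        model = select_variables([X[i] for i in idx], [y[i] for i in idx])
        model_counts[model] += 1   # how often this exact variable set is chosen
        var_counts.update(model)   # how often each single variable is chosen
    inclusion = [var_counts[j] / n_rep for j in range(p)]
    model_freq = {m: c / n_rep for m, c in model_counts.items()}
    return inclusion, model_freq


def make_toy_data(n=100, seed=1):
    """Toy data (assumption): 4 predictors, only the first two affect y."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(n)]
    y = [2.0 * r[0] + 1.0 * r[1] + rng.gauss(0, 1) for r in X]
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    for scheme in ("bootstrap", "subsampling"):
        inclusion, _ = selection_frequencies(X, y, scheme)
        print(scheme, [round(f, 2) for f in inclusion])
```

Running the script prints one list of inclusion frequencies per resampling scheme, so the two can be compared variable by variable; the model selection frequencies returned alongside them show how often each complete variable set was chosen.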
journal_name:Biometrics
journal_title:Biometrics
authors:De Bin R, Janitza S, Sauerbrei W, Boulesteix AL
doi:10.1111/biom.12381
subject:Has Abstract
pub_date:2016-03-01 00:00:00
pages:272-80
issue:1
eissn:0006-341X
issn:1541-0420
journal_volume:72
pub_type: journal article
Related literature
Biometrics literature collection
abstract::In genetic family studies, ages at onset of diseases are routinely collected. Often one is interested in assessing the familial association of ages at the onset of a certain disease type. However, when a competing risk is present and is related to the disease of interest, the usual measure of association by treating t...
journal_title:Biometrics
pub_type: journal article
doi:10.1111/j.1541-0420.2009.01372.x
update_date:2010-12-01 00:00:00
abstract::Linear multivariate theory is applied to the problem of combining several multivariate bioassays. Results are an asymptotic test of the hypothesis of a common log relative potency; the maximum likelihood estimator of the common log relative potency; and an exact and asymptotic confidence interval estimator for log rel...
journal_title:Biometrics
pub_type: journal article
doi:
update_date:1986-06-01 00:00:00
abstract::The objective of this article is to develop a hypothesis-testing procedure to determine whether a common source outbreak has ended. We consider the case when neither the calendar date of exposure to the pathogen nor the exact incubation period distribution is known. The hypothesis-testing procedure is based on the spa...
journal_title:Biometrics
pub_type: journal article
doi:10.1111/j.1541-0420.2005.00421.x
update_date:2006-03-01 00:00:00
abstract::I discuss diagnostic methods for discriminant analysis. The equivalence with linear regression is noted and regression diagnostics are considered. The leverage is a function of the linear discriminant function and the Mahalanobis distance of the observation from the group mean. The distribution of this distance is app...
journal_title:Biometrics
pub_type: journal article
doi:
update_date:1997-12-01 00:00:00
abstract::A trend test is often employed to analyze ordered categorical data, in which a set of increasing scores is assigned a priori. There is a drawback in this approach, because how to choose a set of scores is not clear. There have been debates on which scores should be used (e.g., Graubard and Korn, 1987, Biometric...
journal_title:Biometrics
pub_type: journal article
doi:10.1111/j.1541-0420.2008.00992.x
update_date:2008-12-01 00:00:00
abstract::Treatments are frequently evaluated in terms of their effect on patient survival. In settings where randomization of treatment is not feasible, observational data are employed, necessitating correction for covariate imbalances. Treatments are usually compared using a hazard ratio. Most existing methods which quantify ...
journal_title:Biometrics
pub_type: journal article
doi:10.1111/biom.12542
update_date:2017-03-01 00:00:00
abstract::The fitting of finite mixture models via the EM algorithm is considered for data which are available only in grouped form and which may also be truncated. A practical example is presented where a mixture of two doubly truncated log-normal distributions is adopted to model the distribution of the volume of red blood ce...
journal_title:Biometrics
pub_type: journal article
doi:
update_date:1988-06-01 00:00:00
abstract::Unless the true association is very strong, simple large-sample confidence intervals for the odds ratio based on the delta method perform well even for small samples. Such intervals include the Woolf logit interval and the related Gart interval based on adding .5 before computing the log odds ratio estimate and its st...
journal_title:Biometrics
pub_type: journal article
doi:10.1111/j.0006-341x.1999.00597.x
update_date:1999-06-01 00:00:00
abstract::The distribution of ventilation-perfusion ratio over the lung is a useful indicator of the efficiency of lung function. Information about this distribution can be obtained by observing the retention in blood of inert gases passed through the lung. These retentions are related to the ventilation-perfusion distribution ...
journal_title:Biometrics
pub_type: journal article
doi:
update_date:1992-03-01 00:00:00
abstract::Identifying disease-associated changes in DNA methylation can help us gain a better understanding of disease etiology. Bisulfite sequencing allows the generation of high-throughput methylation profiles at single-base resolution of DNA. However, optimally modeling and analyzing these sparse and discrete sequencing data...
journal_title:Biometrics
pub_type: journal article
doi:10.1111/biom.13307
update_date:2020-05-21 00:00:00
abstract::A method purporting to provide optimal allocations in bioequivalence studies fails to do so on both statistical and practical grounds. Reasons as to why this is so are given. ...
journal_title:Biometrics
pub_type: comment, journal article
doi:10.1111/j.0006-341x.1999.01314.x
update_date:1999-12-01 00:00:00
abstract::Brain evoked potential (EP) data consist of a true response ("signal") and random background activity ("noise"), which are observed over repeated stimulus presentations ("trials"). A signal that changes slowly from trial to trial can be estimated by smoothing across trials and over time within trials. We present a met...
journal_title:Biometrics
pub_type: journal article
doi:
update_date:1989-09-01 00:00:00
abstract::Tag loss in mark-recapture experiments is a violation of one of the Jolly-Seber model assumptions. It causes bias in parameter estimates and has only been dealt with in an ad hoc manner. We develop methodology to estimate tag retention and abundance in double-tagging mark-recapture experiments. We apply this methodolo...
journal_title:Biometrics
pub_type: journal article
doi:10.1111/j.1541-0420.2006.00523.x
update_date:2006-09-01 00:00:00
abstract::The class of admissible tests for Hardy-Weinberg equilibrium in a multi-allelic system is characterized. The standard goodness-of-fit chi-square test is shown to be admissible for systems of two or more alleles. The conditional probability distribution required to determine the exact significance level of this test i...
journal_title:Biometrics
pub_type: journal article
doi:
update_date:1980-03-01 00:00:00
abstract::Dose-response models are intensively used in herbicide bioassays. Despite recent advancements in the development of new herbicides, statistical analyses are commonly based on asymptotic approximations that are sometimes poor. This paper presents the use of recent results in higher order asymptotics for likelihood-base...
journal_title:Biometrics
pub_type: journal article
doi:10.1111/j.0006-341x.2000.01204.x
update_date:2000-12-01 00:00:00
abstract::The accuracy of a new diagnostic test is often determined by comparison with a reference test which also has unknown error rates. Maximum likelihood estimation of the error rates of both tests is possible if they are simultaneously applied to two populations with different disease prevalences. The estimation procedure...
journal_title:Biometrics
pub_type: journal article
doi:
update_date:1985-12-01 00:00:00
abstract::Brownie, Boos, and Hughes-Oliver (1990, Biometrics 46, 259-266) suggested a modification to the fixed-effects analysis of variance (ANOVA) F test for use in situations where treatments are likely to affect mean response while simultaneously increasing between-subject variability. These authors suggest that the modifie...
journal_title:Biometrics
pub_type: journal article
doi:
update_date:1993-09-01 00:00:00
abstract::Geographic information about the levels of toxics in environmental media is commonly used in regional environmental health studies when direct measurements of personal exposure are limited or unavailable. In this article, we propose a statistical framework for analyzing the spatial distribution of topsoil geochemical p...
journal_title:Biometrics
pub_type: journal article
doi:10.1111/j.1541-0420.2008.01041.x
update_date:2009-03-01 00:00:00
abstract::In this article, we propose a graphical technique for assessing the goodness-of-fit of a stationary hidden Markov model (HMM). We show that plots of the estimated distribution against the empirical distribution detect lack of fit with high probability for large sample sizes. By considering plots of the univariate and ...
journal_title:Biometrics
pub_type: journal article
doi:10.1111/j.0006-341X.2004.00189.x
update_date:2004-06-01 00:00:00
abstract::The use of the uniformly most powerful unbiased (UMPU) test was recently suggested for the study of gametic association between two polymorphic loci as an alternative to Fisher's exact test (Zapata and Alvarez, 1997, Annals of Human Genetics 61, 71-77). However, the proposed test is not UMPU for two-side...
journal_title:Biometrics
pub_type: journal article
doi:10.1111/j.0006-341x.2001.00535.x
update_date:2001-06-01 00:00:00
abstract::In this paper, a group sequential procedure for all-pairwise comparisons of the means of k independent normal populations with a common known variance is proposed. A repeated range test is defined and its critical points are tabulated. The power function is studied and minimum group size needed to achieve a desirable p...
journal_title:Biometrics
pub_type: journal article
doi:
update_date:1995-09-01 00:00:00
abstract::For patients on dialysis, hospitalizations remain a major risk factor for mortality and morbidity. We use data from a large national database, United States Renal Data System, to model time-varying effects of hospitalization risk factors as functions of time since initiation of dialysis. To account for the three-level...
journal_title:Biometrics
pub_type: journal article
doi:10.1111/biom.13205
update_date:2020-09-01 00:00:00
abstract::It has become increasingly common in epidemiological studies to pool specimens across subjects to achieve accurate quantitation of biomarkers and certain environmental chemicals. In this article, we consider the problem of fitting a binary regression model when an important exposure is subject to pooling. We take a re...
journal_title:Biometrics
pub_type: journal article
doi:10.1111/j.1541-0420.2010.01464.x
update_date:2011-06-01 00:00:00
abstract::Zhao and Tsiatis (1997) consider the problem of estimation of the distribution of the quality-adjusted lifetime when the chronological survival time is subject to right censoring. The quality-adjusted lifetime is typically defined as a weighted sum of the times spent in certain states up until death or some other fail...
journal_title:Biometrics
pub_type: journal article
doi:10.1111/j.0006-341x.1999.00530.x
update_date:1999-06-01 00:00:00
abstract::Starting around 2007, Hand, Foot, and Mouth Disease (HFMD) became widespread in China and seriously threatened Chinese public health. To prevent the outbreak of infectious diseases like HFMD, effective disease surveillance systems would be especially helpful to give signals of disease outb...
journal_title:Biometrics
pub_type: journal article
doi:10.1111/biom.12301
update_date:2015-09-01 00:00:00
abstract::In the calibration problem, the need to construct a confidence interval to estimate the unknown chi 0 arises when the null hypothesis of zero slope is rejected. Otherwise, the resulting confidence interval will be infinite to reflect the fact that the slope of the regression line may be zero. Under the condition of re...
journal_title:Biometrics
pub_type: journal article
doi:
update_date:1991-12-01 00:00:00
abstract::Statistical methods for the detection of genes influencing quantitative traits with the aid of genetic markers are well developed for normally distributed, fully observed phenotypes. Many experiments are concerned with failure-time phenotypes, which have skewed distributions and which are usually subject to censoring ...
journal_title:Biometrics
pub_type: journal article
doi:10.1111/j.1541-0420.2005.00346.x
update_date:2005-09-01 00:00:00
abstract::The feasibility and cost-effectiveness of estimation of kappa using a case-control method of sampling, proposed by Jannarone, Macera, and Garrison (1987, Biometrics 43, 433-437), is provided support. However, in this article unrealistic assumptions in their presentation are identified and more general results for more...
journal_title:Biometrics
pub_type: journal article
doi:
update_date:1990-03-01 00:00:00
abstract::At any age the mean residual life function gives the expected remaining life at that age. Reliabilists and biometricians have found it useful to categorize failure distributions by the monotonicity properties of the mean residual life function. Hollander and Proschan (1975, Biometrika 62, 585-593) have derived tests o...
journal_title:Biometrics
pub_type: journal article
doi:
update_date:1983-03-01 00:00:00
abstract::Many problems that appear in biomedical decision-making, such as diagnosing disease and predicting response to treatment, can be expressed as binary classification problems. The support vector machine (SVM) is a popular classification technique that is robust to model misspecification and effectively handles high-dime...
journal_title:Biometrics
pub_type: journal article
doi:10.1111/biom.13365
update_date:2020-08-31 00:00:00