Ultra high-dimensional semiparametric longitudinal data analysis.

Abstract:

As ultra high-dimensional longitudinal data become increasingly common in fields such as public health and bioinformatics, developing flexible methods with a sparse model is of high interest. In this setting, the dimension of the covariates can potentially grow exponentially, as exp(n^{1/2}), with respect to the number of clusters n. We consider a flexible semiparametric approach, namely, partially linear single-index models, for ultra high-dimensional longitudinal data. Most importantly, we allow not only the partially linear covariates but also the single-index covariates within the unknown flexible function estimated nonparametrically to be ultra high dimensional. Using penalized generalized estimating equations, this approach can capture correlation within subjects, can perform simultaneous variable selection and estimation with a smoothly clipped absolute deviation penalty, and can capture nonlinearity and potentially some interactions among predictors. We establish asymptotic theory for the estimators, including the oracle property in ultra high dimension for both the partially linear and nonparametric components, and we present an efficient algorithm to handle the computational challenges. We show the effectiveness of our method and algorithm via a simulation study and a yeast cell cycle gene expression data set.
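For readers unfamiliar with the smoothly clipped absolute deviation (SCAD) penalty named in the abstract, the following is a minimal sketch of its standard textbook form and its derivative for a single coefficient. It is not the authors' code. The function names `scad_penalty` and `scad_derivative` are hypothetical, and the default `a = 3.7` is the conventional choice from the general SCAD literature, not a value taken from this record.

```python
def scad_penalty(theta, lam, a=3.7):
    """Standard SCAD penalty evaluated at |theta| (illustrative sketch).

    lam is the tuning parameter (lam > 0); a > 2 controls where the
    penalty flattens out. a = 3.7 is a conventional default, assumed here.
    """
    if lam <= 0 or a <= 2:
        raise ValueError("requires lam > 0 and a > 2")
    t = abs(theta)
    if t <= lam:
        # Lasso-like linear region near zero
        return lam * t
    if t <= a * lam:
        # Quadratic transition region
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Constant region: large coefficients receive no further penalty
    return lam * lam * (a + 1) / 2


def scad_derivative(theta, lam, a=3.7):
    """Derivative of the SCAD penalty with respect to |theta|."""
    if lam <= 0 or a <= 2:
        raise ValueError("requires lam > 0 and a > 2")
    t = abs(theta)
    if t <= lam:
        return lam
    # Decays linearly to zero on (lam, a*lam], then stays at zero
    return max(a * lam - t, 0.0) / (a - 1)


if __name__ == "__main__":
    for value in (0.5, 2.0, 10.0):
        print(value, scad_penalty(value, 1.0), scad_derivative(value, 1.0))
```

The derivative vanishes for coefficients larger than `a * lam` in absolute value, which is why large effects are left essentially unshrunk while small ones are pushed toward zero.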

journal_name

Biometrics

journal_title

Biometrics

authors

Green B,Lian H,Yu Y,Zu T

doi

10.1111/biom.13348

subject

Has Abstract

pub_date

2020-08-04 00:00:00

eissn

1541-0420

issn

0006-341X

pub_type

Journal Article
  • Further aspects of a Markovian sampling policy for water quality monitoring.

    abstract::In this paper, a Markov process is developed as a mathematical model to study the general problem of quality control monitoring. This approach was previously used by Arnold (1970) in development of sampling plans to study the water quality monitoring of streams. Arnold considered the expected sample size required for ...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:

    authors: Smeach SC,Jernigan RW

    updated: 1977-03-01 00:00:00

  • Breeding return times and abundance in capture-recapture models.

    abstract::For many long-lived animal species, individuals do not breed every year, and are often not accessible during non-breeding periods. Individuals exhibit site fidelity if they return to the same breeding colony or spawning ground when they breed. If capture and recapture is only possible at the breeding site, temporary e...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/biom.12094

    authors: Pledger S,Baker E,Scribner K

    updated: 2013-12-01 00:00:00

  • Criteria for the validation of surrogate endpoints in randomized experiments.

    abstract::The validation of surrogate endpoints has been studied by Prentice (1989, Statistics in Medicine 8, 431-440) and Freedman, Graubard, and Schatzkin (1992, Statistics in Medicine 11, 167-178). We extended their proposals in the cases where the surrogate and the final endpoints are both binary or normally distributed. Le...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:

    authors: Buyse M,Molenberghs G

    updated: 1998-09-01 00:00:00

  • Combining multivariate bioassays.

    abstract::Linear multivariate theory is applied to the problem of combining several multivariate bioassays. Results are an asymptotic test of the hypothesis of a common log relative potency; the maximum likelihood estimator of the common log relative potency; and an exact and asymptotic confidence interval estimator for log rel...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:

    authors: Meisner M,Kushner HB,Laska EM

    updated: 1986-06-01 00:00:00

  • A group sequential procedure for all-pairwise comparisons of k treatments based on the range statistic.

    abstract::In this paper, a group sequential procedure for all-pairwise comparisons of the means of k independent normal populations with a common known variance is proposed. A repeated range test is defined and its critical points are tabulated. The power function is studied and minimum group size needed to achieve a desirable p...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:

    authors: Liu W

    updated: 1995-09-01 00:00:00

  • Prediction in the presence of measurement error: general discussion and an example predicting defoliation.

    abstract::Motivated by the particular problem of predicting defoliation based on a measure of gypsy moth egg mass density, prediction in the presence of measurement error is discussed. The measurement error variances and covariances are allowed to vary from unit to unit and are estimated by some type of within unit sampling. A ...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:

    authors: Buonaccorsi JP

    updated: 1995-12-01 00:00:00

  • Model selection and inference for censored lifetime medical expenditures.

    abstract::Identifying factors associated with increased medical cost is important for many micro- and macro-institutions, including the national economy and public health, insurers and the insured. However, assembling comprehensive national databases that include both the cost and individual-level predictors can prove challengi...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/biom.12464

    authors: Johnson BA,Long Q,Huang Y,Chansky K,Redman M

    updated: 2016-09-01 00:00:00

  • A logistic-bivariate normal model for overdispersed two-state Markov processes.

    abstract::We describe a logistic-bivariate normal mixture model for a two-state Markov chain in which each individual makes transitions between states according to a subject-specific transition probability matrix. The use of the bivariate normal mixing distribution facilitates inferences regarding the correlation of the random ...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:

    authors: Cook RJ,Ng ET

    updated: 1997-03-01 00:00:00

  • Regression models for disease prevalence with diagnostic tests on pools of serum samples.

    abstract::Whether the aim is to diagnose individuals or estimate prevalence, many epidemiological studies have demonstrated the successful use of tests on pooled sera. These tests detect whether at least one sample in the pool is positive. Although originally designed to reduce diagnostic costs, testing pools also lowers false ...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/j.0006-341x.2000.01126.x

    authors: Vansteelandt S,Goetghebeur E,Verstraeten T

    updated: 2000-12-01 00:00:00

  • Heterogeneity models of disease susceptibility, with application to diabetic nephropathy.

    abstract::It is not, in general, possible to include all relevant risk factors in a model of survival or disease incidence. This heterogeneity must be accounted for in the interpretation, as it can imply otherwise unexpected results. This is illustrated by diabetic nephropathy, a serious complication experienced by some diabeti...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:

    authors: Hougaard P,Myglegaard P,Borch-Johnsen K

    updated: 1994-12-01 00:00:00

  • On Bayesian methods for bioequivalence.

    abstract::Bayesian methods are presented for assessing bioequivalence for studies in which a new formulation and a standard are administered simultaneously, and for Latin square designs which compare two or more new formulations to a standard. Two examples illustrate the application of the methods. ...

    journal_title:Biometrics

    pub_type: Clinical Trial, Journal Article

    doi:

    authors: Selwyn MR,Hall NR

    updated: 1984-12-01 00:00:00

  • A two-stage stepwise estimation procedure.

    abstract::This article proposes a two-stage simultaneous confidence procedure for the comparisons of k pairs of population means, without using multiplicity adjustment of more than two populations. The proposed procedure can be broadly applied to parametric or nonparametric models. It is robust and versatile because its derivat...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/j.1541-0420.2007.00902.x

    authors: Chen JT

    updated: 2008-06-01 00:00:00

  • Abundance-based similarity indices and their estimation when there are unseen species in samples.

    abstract::A wide variety of similarity indices for comparing two assemblages based on species incidence (i.e., presence/absence) data have been proposed in the literature. These indices are generally based on three simple incidence counts: the number of species shared by two assemblages and the number of species unique to each ...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/j.1541-0420.2005.00489.x

    authors: Chao A,Chazdon RL,Colwell RK,Shen TJ

    updated: 2006-06-01 00:00:00

  • Models for circular-linear and circular-circular data constructed from circular distributions based on nonnegative trigonometric sums.

    abstract::Johnson and Wehrly (1978, Journal of the American Statistical Association 73, 602-606) and Wehrly and Johnson (1980, Biometrika 67, 255-256) show one way to construct the joint distribution of a circular and a linear random variable, or the joint distribution of a pair of circular random variables from their marginal ...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/j.1541-0420.2006.00716.x

    authors: Fernández-Durán JJ

    updated: 2007-06-01 00:00:00

  • Selecting the smoothing parameter for estimation of slowly changing evoked potential signals.

    abstract::Brain evoked potential (EP) data consist of a true response ("signal") and random background activity ("noise"), which are observed over repeated stimulus presentations ("trials"). A signal that changes slowly from trial to trial can be estimated by smoothing across trials and over time within trials. We present a met...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:

    authors: Raz J,Turetsky B,Fein G

    updated: 1989-09-01 00:00:00

  • Sample-size formula for the proportional-hazards regression model.

    abstract::A formula is derived for determining the number of observations necessary to test the equality of two survival distributions when concomitant information is incorporated. This formula should be useful in designing clinical trials with a heterogeneous patient population. Schoenfeld (1981, Biometrika 68, 316-319) derive...

    journal_title:Biometrics

    pub_type: Clinical Trial, Journal Article

    doi:

    authors: Schoenfeld DA

    updated: 1983-06-01 00:00:00

  • Combined maximum likelihood estimates for the equicorrelation coefficient.

    abstract::Combined maximum likelihood estimates for equicorrelation covariance matrices are considered. The case of a common equicorrelation rho and possibly different standard deviations sigma 1, ..., sigma k among k experimental groups is examined first, and the estimation of (rho, sigma 1, ..., sigma k) is discussed. Second,...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:

    authors: Viana MA

    updated: 1994-09-01 00:00:00

  • Objective Bayesian search of Gaussian directed acyclic graphical models for ordered variables with non-local priors.

    abstract::Directed acyclic graphical (DAG) models are increasingly employed in the study of physical and biological systems to model direct influences between variables. Identifying the graph from data is a challenging endeavor, which can be more reasonably tackled if the variables are assumed to satisfy a given ordering; in th...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/biom.12018

    authors: Altomare D,Consonni G,La Rocca L

    updated: 2013-06-01 00:00:00

  • Approximate Bayesian inference for discretely observed continuous-time multi-state models.

    abstract::Inference for continuous time multi-state models presents considerable computational difficulties when the process is only observed at discrete time points with no additional information about the state transitions. In fact, for general multi-state Markov model, evaluation of the likelihood function is possible only v...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/biom.13019

    authors: Tancredi A

    updated: 2019-09-01 00:00:00

  • FPCA-based method to select optimal sampling schedules that capture between-subject variability in longitudinal studies.

    abstract::A critical component of longitudinal study design involves determining the sampling schedule. Criteria for optimal design often focus on accurate estimation of the mean profile, although capturing the between-subject variance of the longitudinal process is also important since variance patterns may be associated with ...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/biom.12714

    authors: Wu M,Diez-Roux A,Raghunathan TE,Sánchez BN

    updated: 2018-03-01 00:00:00

  • Assessing the goodness-of-fit of hidden Markov models.

    abstract::In this article, we propose a graphical technique for assessing the goodness-of-fit of a stationary hidden Markov model (HMM). We show that plots of the estimated distribution against the empirical distribution detect lack of fit with high probability for large sample sizes. By considering plots of the univariate and ...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/j.0006-341X.2004.00189.x

    authors: MacKay Altman R

    updated: 2004-06-01 00:00:00

  • Sample size determination for testing whether an identified treatment is best.

    abstract::Laska and Meisner (1989, Biometrics 45, 1139-1151) dealt with the problem of testing whether an identified treatment belonging to a set of k + 1 treatments is better than each of the other k treatments. They calculated sample size tables for k = 2 when using multiple t-tests or Wilcoxon-Mann-Whitney tests, both under ...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/j.0006-341x.2000.00879.x

    authors: Horn M,Vollandt R,Dunnett CW

    updated: 2000-09-01 00:00:00

  • Connecting the latent multinomial.

    abstract::Link et al. (2010, Biometrics 66, 178-185) define a general framework for analyzing capture-recapture data with potential misidentifications. In this framework, the observed vector of counts, y, is considered as a linear function of a vector of latent counts, x, such that y=Ax, with x assumed to follow a multinomial d...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/biom.12333

    authors: Schofield MR,Bonner SJ

    updated: 2015-12-01 00:00:00

  • Asymptotic confidence bands for generalized nonlinear regression models.

    abstract::Asymptotic confidence bands for generalized nonlinear regression models are developed. These are based on a combination of the S method of Scheffe, together with the delta method which is used to approximate the mean function by a linear combination of the parameters. The approach can be used in any situation where la...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:

    authors: Cox C,Ma G

    updated: 1995-03-01 00:00:00

  • To use or not to use? Backward equations in stochastic carcinogenesis models.

    abstract::The method based on the Kolmogorov backward equations of Little (1995, Biometrics 51, 1278-1291) for computing hazard functions for the multistage carcinogenesis models fails when model parameters are time-dependent. In addition to suggesting an alternative method based on the Kolmogorov forward equation, this note hi...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:

    authors: Zheng Q

    updated: 1998-03-01 00:00:00

  • Maximum likelihood estimation for N-mixture models.

    abstract::The focus of this article is on the nature of the likelihood associated with N-mixture models for repeated count data. It is shown that the infinite sum embedded in the likelihood associated with the Poisson mixing distribution can be expressed in terms of a hypergeometric function and, thence, in closed form. The res...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/biom.12521

    authors: Haines LM

    updated: 2016-12-01 00:00:00

  • Bayesian semiparametric models for survival data with a cure fraction.

    abstract::We propose methods for Bayesian inference for a new class of semiparametric survival models with a cure fraction. Specifically, we propose a semiparametric cure rate model with a smoothing parameter that controls the degree of parametricity in the right tail of the survival distribution. We show that such a parameter ...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:10.1111/j.0006-341x.2001.00383.x

    authors: Ibrahim JG,Chen MH,Sinha D

    updated: 2001-06-01 00:00:00

  • A Monte Carlo investigation of homogeneity tests of the odds ratio under various sample size configurations.

    abstract::Epidemiologic data for case-control studies are often summarized into K 2 x 2 tables. Given a fixed number of cases and controls, the degree of sparseness in the data depends on the number of strata, K. The effect of increasing stratification on size and power of seven tests of homogeneity of the odds ratio is studied...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:

    authors: Jones MP,O'Gorman TW,Lemke JH,Woolson RF

    updated: 1989-03-01 00:00:00

  • An implicitly defined parametric model for censored survival data and covariates.

    abstract::Parametric survival functions are usually defined as explicit functions of time and covariates. However, consideration of some simple differential equations describing certain survival curves leads to a descriptive equation which cannot be explicitly solved for the survival function. Nevertheless, the resulting surviv...

    journal_title:Biometrics

    pub_type: Journal Article

    doi:

    authors: Piantadosi S,Crowley J

    updated: 1995-03-01 00:00:00

  • Sequential monitoring for comparison of changes in a response variable in clinical studies.

    abstract::The spending function approach proposed by Lan and DeMets (1983, Biometrika 70, 659-663) for sequential monitoring of clinical trials is applied to situations where comparison of changes in a continuous response variable between two groups is the primary concern. Death, loss to follow-up, and missed visits could cause...

    journal_title:Biometrics

    pub_type: Clinical Trial, Journal Article

    doi:

    authors: Wu MC,Lan KK

    updated: 1992-09-01 00:00:00