Methods for analyzing data from probabilistic linkage strategies based on partially identifying variables.

Abstract:

In record linkage studies, unique identifiers are often not available, and therefore, the linkage procedure depends on combinations of partially identifying variables with low discriminating power. As a consequence, wrongly linked covariate and outcome pairs will be created and bias further analysis of the linked data. In this article, we investigated two estimators that correct for linkage error in regression analysis. We extended the estimators developed by Lahiri and Larsen and also suggested a weighted least squares approach to deal with linkage error. We considered both linear and logistic regression problems and evaluated the performance of both methods with simulations. Our results show that all wrong covariate and outcome pairs need to be removed from the analysis in order to calculate unbiased regression coefficients in both approaches. This removal requires strong assumptions on the structure of the data. In addition, the bias significantly increases when the assumptions do not hold and wrongly linked records influence the coefficient estimation. Our simulations showed that both methods had similar performance in linear regression problems. With logistic regression problems, the weighted least squares method showed less bias. Because the specific structure of the data in record linkage problems often leads to different assumptions, it is necessary that the analyst has prior knowledge on the nature of the data. These assumptions are more easily introduced in the weighted least squares approach than in the Lahiri and Larsen estimator.
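
Illustrative sketch (not taken from the article): the basic idea behind a Lahiri and Larsen style correction can be shown with a one-covariate linear model without intercept. The sketch assumes the linkage probabilities are known and stored in a hypothetical matrix `Q`, where `Q[i][j]` is the probability that the outcome in linked record i truly belongs to covariate j. The outcome is then regressed on the probability-weighted covariate instead of the observed one. All function and variable names are made up for this example, the outcomes are noise-free expected values so that the effect is easy to see, and the extensions and the weighted least squares method studied in the article are not reproduced here.

```python
def linked_covariate_means(Q, x):
    """Return w_i = sum_j Q[i][j] * x[j], the expected covariate behind outcome i."""
    return [sum(q_ij * x_j for q_ij, x_j in zip(row, x)) for row in Q]


def slope_through_origin(u, z):
    """Least squares slope of z on u for a model without intercept."""
    return sum(a * b for a, b in zip(u, z)) / sum(a * a for a in u)


def naive_estimate(x, z):
    """Ignore linkage error: regress outcomes on the covariates as linked."""
    return slope_through_origin(x, z)


def corrected_estimate(Q, x, z):
    """Regress outcomes on Q-weighted covariates (Lahiri and Larsen style)."""
    return slope_through_origin(linked_covariate_means(Q, x), z)


# Hypothetical toy data: four records, each correctly linked with
# probability 0.7 and linked to each other record with probability 0.1.
x = [1.0, 2.0, 3.0, 4.0]
Q = [[0.7 if i == j else 0.1 for j in range(4)] for i in range(4)]
true_beta = 2.0

# Noise-free expected outcomes under linkage error: E[z_i] = beta * sum_j Q[i][j] * x[j].
z = [true_beta * w for w in linked_covariate_means(Q, x)]

naive = naive_estimate(x, z)
corrected = corrected_estimate(Q, x, z)
print("naive:", naive, "corrected:", corrected)
```

In this toy setting the naive slope falls below the true value of 2.0, while the corrected slope recovers it. With real data, `Q` has to be estimated from the linkage procedure, which is where the assumptions discussed in the abstract come in.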

journal_name

Stat Med

journal_title

Statistics in medicine

authors

Hof MH,Zwinderman AH

doi

10.1002/sim.5498

subject

Has Abstract

pub_date

2012-12-30 00:00:00

pages

4231-42

issue

30

eissn

1097-0258

issn

0277-6715

journal_volume

31

pub_type

Journal Article
  • The ghosts of departed quantities: approaches to dealing with observations below the limit of quantitation.

    abstract::A common but not necessarily logical requirement in drug development is that a 'limit of quantitation' be set for chemical assays and that observations that fall below the limit should not be treated as real data but should be labelled as below the limit and set aside for special treatment. We examine five of seven ap...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.5515

    authors: Senn S,Holford N,Hockey H

    update_date:2012-12-30 00:00:00

  • Power and money in cluster randomized trials: when is it worth measuring a covariate?

    abstract::The power to detect a treatment effect in cluster randomized trials can be increased by increasing the number of clusters. An alternative is to include covariates into the regression model that relates treatment condition to outcome. In this paper, formulae are derived in order to evaluate both strategies on basis of ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.2297

    authors: Moerbeek M

    update_date:2006-08-15 00:00:00

  • British 1990 growth reference centiles for weight, height, body mass index and head circumference fitted by maximum penalized likelihood.

    abstract::To update the British growth reference, anthropometric data for weight, height, body mass index (weight/height²) and head circumference from 17 distinct surveys representative of England, Scotland and Wales (37,700 children, age range 23 weeks gestation to 23 years) were analysed by maximum penalized likelihood using ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:

    authors: Cole TJ,Freeman JV,Preece MA

    update_date:1998-02-28 00:00:00

  • Drug treatment of mild hypertension to reduce the risk of CHD: is it worth-while?

    abstract::Although hypertension is regarded as a causal factor for coronary heart disease (CHD) a reduction in the risk of CHD as a result of lowering blood pressure in mild hypertension could not be demonstrated. This conclusion is based on an overview analysis of all published randomized trials in mild hypertension, including...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780071104

    authors: Holme I

    update_date:1988-11-01 00:00:00

  • Improved tests for a random effects meta-regression with a single covariate.

    abstract::The explanation of heterogeneity plays an important role in meta-analysis. The random effects meta-regression model allows the inclusion of trial-specific covariates which may explain a part of the heterogeneity. We examine the commonly used tests on the parameters in the random effects meta-regression with one covari...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.1482

    authors: Knapp G,Hartung J

    update_date:2003-09-15 00:00:00

  • Nonparametric regression of state occupation, entry, exit, and waiting times with multistate right-censored data.

    abstract::We construct nonparametric regression estimators of a number of temporal functions in a multistate system based on a continuous univariate baseline covariate. These estimators include state occupation probabilities, state entry, exit, and waiting (sojourn) time distribution functions of a general progressive (e.g., ac...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.5703

    authors: Mostajabi F,Datta S

    update_date:2013-07-30 00:00:00

  • A prediction-based test for multiple endpoints.

    abstract::This article introduces a global hypothesis test intended for studies with multiple endpoints. Our test makes use of a priori predictions about the direction of the result of each endpoint and we weight these predictions using the sample correlation matrix. The global alternative hypothesis concerns a parameter, ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.8724

    authors: Montgomery RN,Mahnken JD

    update_date:2020-12-10 00:00:00

  • Modelling age-specific risk: application to dementia.

    abstract::We give up-to-date methods for estimating the age-specific incidence of a disease and for estimating the effect of risk factors. We recommend taking age as the basic time scale of the analysis; then, the hazard function can be interpreted as the age-specific incidence of the disease. This choice raises a delayed entry...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(sici)1097-0258(19980915)17:17<1973::aid-s

    authors: Commenges D,Letenneur L,Joly P,Alioum A,Dartigues JF

    update_date:1998-09-15 00:00:00

  • Variable length testing using the ordinal regression model.

    abstract::Health questionnaires are often built up from sets of questions that are totaled to obtain a sum score. An important consideration in designing questionnaires is to minimize respondent burden. An increasingly popular method for efficient measurement is computerized adaptive testing; unfortunately, many health question...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.5936

    authors: Smits N,Finkelman MD

    update_date:2014-02-10 00:00:00

  • On the relationship between association and surrogacy when both the surrogate and true endpoint are binary outcomes.

    abstract::The relationship between association and surrogacy has been the focus of much debate in the surrogate marker literature. Recently, the individual causal association (ICA) has been introduced as a metric of surrogacy in the causal inference framework, when both the surrogate and the true endpoint are normally distribut...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.8698

    authors: Meyvisch P,Alonso A,Van der Elst W,Molenberghs G

    update_date:2020-11-20 00:00:00

  • Model diagnostics for censored regression via randomized survival probabilities.

    abstract::Residuals in normal regression are used to assess a model's goodness-of-fit (GOF) and discover directions for improving the model. However, there is a lack of residuals with a characterized reference distribution for censored regression. In this article, we propose to diagnose censored regression with normalized rando...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.8852

    authors: Li L,Wu T,Feng C

    update_date:2020-12-13 00:00:00

  • Bayesian clinical trials in action.

    abstract::Although the frequentist paradigm has been the predominant approach to clinical trial design since the 1940s, it has several notable limitations. Advancements in computational algorithms and computer hardware have greatly enhanced the alternative Bayesian paradigm. Compared with its frequentist counterpart, the Bayesi...

    journal_title:Statistics in medicine

    pub_type: Journal Article, Review

    doi:10.1002/sim.5404

    authors: Lee JJ,Chu CT

    update_date:2012-11-10 00:00:00

  • Optimizing and evaluating biomarker combinations as trial-level general surrogates.

    abstract::We extend the method proposed in a recent work by the Authors for trial-level general surrogate evaluation to allow combinations of biomarkers and provide a procedure for finding the "best" combination of biomarkers based on the absolute prediction error summary of surrogate quality. We use a nonparametric Bayesian mo...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.7996

    authors: Gabriel EE,Sachs MC,Daniels MJ,Halloran ME

    update_date:2019-03-30 00:00:00

  • Non-parametric bootstrap confidence intervals for the intraclass correlation coefficient.

    abstract::The intraclass correlation coefficient rho plays a key role in the design of cluster randomized trials. Estimates of rho obtained from previous cluster trials and used to inform sample size calculation in planned trials may be imprecise due to the typically small numbers of clusters in such studies. It may be useful t...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.1643

    authors: Ukoumunne OC,Davison AC,Gulliford MC,Chinn S

    update_date:2003-12-30 00:00:00

  • An extension of the continual reassessment method using decision theory.

    abstract::The primary goal of a phase I trial is to find the maximally tolerated dose (MTD) of a treatment. The MTD is usually defined in terms of a tolerable probability, q(*), of toxicity. Our objective is to find the highest dose with toxicity risk that does not exceed q(*), a criterion that is often desired in designing pha...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.970

    authors: Leung DH,Wang YG

    update_date:2002-01-15 00:00:00

  • Bootstrap confidence intervals for medical costs with censored observations.

    abstract::Medical costs data with administratively censored observations often arise in cost-effectiveness studies of treatments for life-threatening diseases. Mean of medical costs incurred from the start of a treatment until death or a certain time point after the implementation of treatment is frequently of interest. In many...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.1556

    authors: Jiang H,Zhou XH

    update_date:2004-11-15 00:00:00

  • Viral load detectability profiles for HIV infection.

    abstract::The introduction of potent antiretroviral therapies for treatment of HIV infection typically results in a dramatic reduction in plasma HIV RNA concentration, often to levels undetectable by current measurement practices. However, although a high proportion of patients achieve 'undetectability', many then experience a ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.1325

    authors: McKinnon EJ,James IR,John M,Mallal SA

    update_date:2003-02-15 00:00:00

  • Planning future studies based on the conditional power of a meta-analysis.

    abstract::Systematic reviews often provide recommendations for further research. When meta-analyses are inconclusive, such recommendations typically argue for further studies to be conducted. However, the nature and amount of future research should depend on the nature and amount of the existing research. We propose a method ba...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.5524

    authors: Roloff V,Higgins JP,Sutton AJ

    update_date:2013-01-15 00:00:00

  • Dose-interpolation of immunoassay data: uncertainties associated with curve-fitting.

    abstract::Estimates of analyte concentrations, obtained by immunoassay, have error distributions which are generally underestimated. Better estimates, which take into account the distribution of the response metameter of the calibration curve and uncertainties associated with the location of the fitted curve, have been obtained...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780050208

    authors: Kay C,Nix AB,Kemp KW,Rowlands RJ,Richards G,Groom GV,Griffiths K,Wilson DW

    update_date:1986-03-01 00:00:00

  • Analytical, practical and regulatory issues in prevention studies.

    abstract::Prevention studies, as distinguished from studies investigating treatments for established disease, present some distinct challenges. Perhaps the most extensive experience with preventive agents is in the area of infectious diseases; vaccines have been extremely effective in preventing many such diseases. Vaccines hav...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.1717

    authors: Ellenberg SS

    update_date:2004-01-30 00:00:00

  • Longitudinal quantile regression in the presence of informative dropout through longitudinal-survival joint modeling.

    abstract::We propose a joint model for a time-to-event outcome and a quantile of a continuous response repeatedly measured over time. The quantile and survival processes are associated via shared latent and manifest variables. Our joint model provides a flexible approach to handle informative dropout in quantile regression. A M...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.6393

    authors: Farcomeni A,Viviani S

    update_date:2015-03-30 00:00:00

  • A comparison of methods for determining HIV viral set point.

    abstract::During a course of human immunodeficiency virus (HIV-1) infection, the viral load usually increases sharply to a peak following infection and then drops rapidly to a steady state, where it remains until progression to AIDS. This steady state is often referred to as the viral set point. It is believed that the HIV vira...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.3038

    authors: Mei Y,Wang L,Holte SE

    update_date:2008-01-15 00:00:00

  • Robust and efficient estimation in the parametric proportional hazards model under random censoring.

    abstract::Cox proportional hazard regression model is a popular tool to analyze the relationship between a censored lifetime variable with other relevant factors. The semiparametric Cox model is widely used to study different types of data arising from applied disciplines such as medical science, biology, and reliability studie...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.8377

    authors: Ghosh A,Basu A

    update_date:2019-11-30 00:00:00

  • Assessing the effect of interventions in the context of mixture distributions with detection limits.

    abstract::Many quantitative assay measurements of metabolites of environmental toxicants in clinical investigations are subject to left censoring due to values falling below assay detection limits. Moreover, when observations occur in both unexposed individuals and exposed individuals who reflect a mixture of two distributions ...

    journal_title:Statistics in medicine

    pub_type: Clinical Trial, Journal Article, Randomized Controlled Trial

    doi:10.1002/sim.2079

    authors: Chu H,Kensler TW,Muñoz A

    update_date:2005-07-15 00:00:00

  • Competing approaches to analysis of failure times with competing risks.

    abstract::For the analysis of time to event data in contraceptive studies when individuals are subject to competing causes for discontinuation, some authors have recently advocated the use of the cumulative incidence rate as a more appropriate measure to summarize data than the complement of the Kaplan-Meier estimate of discont...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.1135

    authors: Farley TM,Ali MM,Slaymaker E

    update_date:2001-12-15 00:00:00

  • Emergence of childhood psychiatric disorders: a multivariate probit analysis.

    abstract::We applied a computationally practical form of probit analysis for multiple response variables to data on early childhood development of four psychiatric disorders: disruptive disorders (DD-attention deficit disorders, oppositional defiant disorder, conduct disorder); adjustment disorders (ADJ); emotional disorders (E...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/(sici)1097-0258(19981115)17:21<2487::aid-s

    authors: Gibbons RD,Lavigne JV

    update_date:1998-11-15 00:00:00

  • Accumulating evidence from independent studies: what we can win and what we can lose.

    abstract::When asking 'what is known' about a drug or therapy or program at any time, both researchers and practitioners often confront more than a single study. Facing a variety of findings, where conflicts may outweigh agreement, how can a reviewer constructively approach the task? In this discussion, I will outline some ques...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780060304

    authors: Light RJ

    update_date:1987-04-01 00:00:00

  • Testing whether genetic variation explains correlation of quantitative measures of gene expression, and application to genetic network analysis.

    abstract::Genetic networks for gene expression data are often built by graphical models, which in turn are built from pair-wise correlations of gene expression levels. A key feature of building graphical models is the evaluation of conditional independence of two traits, given other traits. When conditional independence can be ...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.3274

    authors: Yu Z,Wang L,Hildebrandt MA,Schaid DJ

    update_date:2008-08-30 00:00:00

  • Adjusting for confounding by neighborhood using generalized linear mixed models and complex survey data.

    abstract::When investigating health disparities, it can be of interest to explore whether adjustment for socioeconomic factors at the neighborhood level can account for, or even reverse, an unadjusted difference. Recently, we proposed new methods to adjust the effect of an individual-level covariate for confounding by unmeasure...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.5624

    authors: Brumback BA,Zheng HW,Dailey AB

    update_date:2013-04-15 00:00:00

  • An extension of the continual reassessment methods using a preliminary up-and-down design in a dose finding study in cancer patients, in order to investigate a greater range of doses.

    abstract::In a phase I clinical trial in cancer patients, the drug involved had one known main adverse effect, which also occurs spontaneously in cancer patients with a fairly high frequency. Experiments in rats have shown marked effects of the drug on tumour growth in high doses, but also dose-dependent toxicity. Consequently,...

    journal_title:Statistics in medicine

    pub_type: Journal Article

    doi:10.1002/sim.4780140909

    authors: Møller S

    update_date:1995-05-15 00:00:00