A Bayesian hidden Markov model for detecting differentially methylated regions.

Abstract:

Alterations in DNA methylation have been linked to the development and progression of many diseases. The bisulfite sequencing technique presents methylation profiles at base resolution. Count data on methylated and unmethylated reads provide information on the methylation level at each CpG site. As more bisulfite sequencing data become available, these data are increasingly needed to infer methylation aberrations in diseases. Automated and powerful algorithms also need to be developed to accurately identify differentially methylated regions between treatment groups. This study adopts a Bayesian approach using the hidden Markov model to account for inherent dependence in read count data. Given the expense of sequencing experiments, few replicates are available for each treatment group. A Bayesian approach that borrows information across an entire chromosome improves the reliability of statistical inferences. The proposed hidden Markov model considers location dependence among genomic loci by incorporating correlation structures as a function of genomic distance. An iterative algorithm based on expectation-maximization is designed for parameter estimation. Methylation states are inferred by identifying the optimal sequence of latent states from observations. Real datasets and simulation studies that mimic the real datasets are used to illustrate the reliability and success of the proposed method.
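The decoding idea in the abstract (finding the most likely sequence of latent methylation states when dependence between loci weakens with genomic distance) can be illustrated with a small sketch. This is not the paper's model: the two fixed methylation levels, the exponential decay with a `scale` parameter, and the function names `log_binom` and `viterbi_methylation` are all assumptions made for illustration, and the parameters are fixed by hand rather than estimated by expectation-maximization.

```python
import math


def log_binom(k, n, p):
    """Log binomial probability of k methylated reads out of n at level p."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))


def viterbi_methylation(positions, meth, total, levels=(0.2, 0.8), scale=500.0):
    """Most likely state path (0 = low, 1 = high methylation) for one sample.

    Hypothetical setup: neighbouring sites share a state with a probability
    that decays towards independence as their genomic distance grows.
    """
    n_states = len(levels)
    # Uniform initial distribution plus the emission at the first site.
    score = [math.log(1.0 / n_states) + log_binom(meth[0], total[0], p)
             for p in levels]
    back = []
    for t in range(1, len(positions)):
        d = positions[t] - positions[t - 1]
        assert d > 0, "positions must be strictly increasing"
        rho = math.exp(-d / scale)          # dependence between adjacent sites
        stay = rho + (1.0 - rho) / n_states
        move = (1.0 - rho) / n_states
        new_score, ptr = [], []
        for j, p in enumerate(levels):
            cands = [score[i] + math.log(stay if i == j else move)
                     for i in range(n_states)]
            best = max(range(n_states), key=cands.__getitem__)
            ptr.append(best)
            new_score.append(cands[best] + log_binom(meth[t], total[t], p))
        back.append(ptr)
        score = new_score
    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=score.__getitem__)
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    positions = [100, 150, 200, 5000, 5050, 5100]   # CpG coordinates (made up)
    meth = [1, 2, 0, 9, 8, 10]                      # methylated read counts
    total = [10] * 6                                # total read counts
    print(viterbi_methylation(positions, meth, total))
```

In this sketch, closely spaced sites are pulled towards the same state, while sites separated by a large gap are decoded almost independently; a full treatment would model differences between treatment groups and estimate the parameters from the data.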

journal_name

Biometrics

authors

Ji T

doi

10.1111/biom.13000

pub_date

2019-06-01 00:00:00

pages

663-673

issue

2

eissn

1541-0420

issn

0006-341X

journal_volume

75

pub_type

Journal article
  • A score regression approach to assess calibration of continuous probabilistic predictions.

    abstract::Calibration, the statistical consistency of forecast distributions and the observations, is a central requirement for probabilistic predictions. Calibration of continuous forecasts is typically assessed using the probability integral transform histogram. In this article, we propose significance tests based on scoring ...

    journal_title:Biometrics

    pub_type: Journal article

    doi:10.1111/j.1541-0420.2010.01406.x

    authors: Held L,Rufibach K,Balabdaoui F

    update_date:2010-12-01 00:00:00

  • Prediction in censored survival data: a comparison of the proportional hazards and linear regression models.

    abstract::Although the analysis of censored survival data using the proportional hazards and linear regression models is common, there has been little work examining the ability of these estimators to predict time to failure. This is unfortunate, since a predictive plot illustrating the relationship between time to failure and ...

    journal_title:Biometrics

    pub_type: Journal article

    doi:

    authors: Heller G,Simonoff JS

    update_date:1992-03-01 00:00:00

  • A two-stage experimental design for dilution assays.

    abstract::Dilution assays to determine solute concentration have found wide use in biomedical research. Many dilution assays return imprecise concentration estimates because they are only done to orders of magnitude. Previous statistical work has focused on how to design efficient experiments that can return more precise estima...

    journal_title:Biometrics

    pub_type: Journal article

    doi:10.1111/biom.13032

    authors: Ferguson JM,Miura TA,Miller CR

    update_date:2019-09-01 00:00:00

  • Catch estimation with restricted randomization in the effort survey.

    abstract::One common method for estimating total catch is to multiply an estimate for CPUE, the catch per unit effort, by an estimate of total effort obtained from an independent second survey. In general, estimating total effort requires that sample times are chosen at random over the full fishing period; however, in practice,...

    journal_title:Biometrics

    pub_type: Journal article

    doi:10.1111/j.0006-341x.2001.00461.x

    authors: Dauk PC,Schwarz CJ

    update_date:2001-06-01 00:00:00

  • Functional multiple indicators, multiple causes measurement error models.

    abstract::Objective measures of oxygen consumption and carbon dioxide production by mammals are used to predict their energy expenditure. Since energy expenditure is not directly observable, it can be viewed as a latent construct with multiple physical indirect measures such as respiratory quotient, volumetric oxygen consumptio...

    journal_title:Biometrics

    pub_type: Journal article

    doi:10.1111/biom.12706

    authors: Tekwe CD,Zoh RS,Bazer FW,Wu G,Carroll RJ

    update_date:2018-03-01 00:00:00

  • A semiparametric empirical likelihood method for biased sampling schemes with auxiliary covariates.

    abstract::We consider a semiparametric inference procedure for data from epidemiologic studies conducted with a two-component sampling scheme where both a simple random sample and multiple outcome- or outcome-/auxiliary-dependent samples are observed. This sampling scheme allows the investigators to oversample certain subpopula...

    journal_title:Biometrics

    pub_type: Journal article

    doi:10.1111/j.1541-0420.2006.00612.x

    authors: Wang X,Zhou H

    update_date:2006-12-01 00:00:00

  • Extension of the rank sum test for clustered data: two-group comparisons with group membership defined at the subunit level.

    abstract::The Wilcoxon rank sum test is widely used for two-group comparisons for nonnormal data. An assumption of this test is independence of sampling units both between and within groups. In ophthalmology, data are often collected on two eyes of an individual, which are highly correlated. In ophthalmological clinical trials,...

    journal_title:Biometrics

    pub_type: Journal article

    doi:10.1111/j.1541-0420.2006.00582.x

    authors: Rosner B,Glynn RJ,Lee ML

    update_date:2006-12-01 00:00:00

  • Sparse generalized eigenvalue problem with application to canonical correlation analysis for integrative analysis of methylation and gene expression data.

    abstract::We present a method for individual and integrative analysis of high dimension, low sample size data that capitalizes on the recurring theme in multivariate analysis of projecting higher dimensional data onto a few meaningful directions that are solutions to a generalized eigenvalue problem. We propose a general framew...

    journal_title:Biometrics

    pub_type: Journal article

    doi:10.1111/biom.12886

    authors: Safo SE,Ahn J,Jeon Y,Jung S

    update_date:2018-12-01 00:00:00

  • Modification of the Greenwood formula for correlated response times.

    abstract::Life-table methodology for interval-censored survival times is used to estimate marginal survival probabilities from data consisting of independent cohorts of correlated responses. We restrict our attention to situations where response times within cohorts are exchangeable and the marginal survival distributions are t...

    journal_title:Biometrics

    pub_type: Journal article

    doi:

    authors: Kang SS,Koehler KJ

    update_date:1997-09-01 00:00:00

  • A statistical method for detecting differentially expressed SNVs based on next-generation RNA-seq data.

    abstract::In this article, we propose a new statistical method-MutRSeq-for detecting differentially expressed single nucleotide variants (SNVs) based on RNA-seq data. Specifically, we focus on nonsynonymous mutations and employ a hierarchical likelihood approach to jointly model observed mutation events as well as read count me...

    journal_title:Biometrics

    pub_type: Journal article

    doi:10.1111/biom.12548

    authors: Fu R,Wang P,Ma W,Taguchi A,Wong CH,Zhang Q,Gazdar A,Hanash SM,Zhou Q,Zhong H,Feng Z

    update_date:2017-03-01 00:00:00

  • Estimation and interpretation of heterogeneous vaccine efficacy against recurrent infections.

    abstract::Vaccine-induced protection may not be homogeneous across individuals. It is possible that a vaccine gives complete protection for a portion of individuals, while the rest acquire only incomplete (leaky) protection of varying magnitude. If vaccine efficacy is estimated under wrong assumptions about such individual leve...

    journal_title:Biometrics

    pub_type: Journal article

    doi:10.1111/biom.12473

    authors: Mehtälä J,Dagan R,Auranen K

    update_date:2016-09-01 00:00:00

  • A spatial Bayesian latent factor model for image-on-image regression.

    abstract::Image-on-image regression analysis, using images to predict images, is a challenging task, due to (1) the high dimensionality and (2) the complex spatial dependence structures in image predictors and image outcomes. In this work, we propose a novel image-on-image regression model, by extending a spatial Bayesian laten...

    journal_title:Biometrics

    pub_type: Journal article

    doi:10.1111/biom.13420

    authors: Guo C,Kang J,Johnson TD

    update_date:2020-12-27 00:00:00

  • Heterogeneity models of disease susceptibility, with application to diabetic nephropathy.

    abstract::It is not, in general, possible to include all relevant risk factors in a model of survival or disease incidence. This heterogeneity must be accounted for in the interpretation, as it can imply otherwise unexpected results. This is illustrated by diabetic nephropathy, a serious complication experienced by some diabeti...

    journal_title:Biometrics

    pub_type: Journal article

    doi:

    authors: Hougaard P,Myglegaard P,Borch-Johnsen K

    update_date:1994-12-01 00:00:00

  • A semiparametric joint model for longitudinal and survival data with application to hemodialysis study.

    abstract::In many longitudinal clinical studies, the level and progression rate of repeatedly measured biomarkers on each subject quantify the severity of the disease and that subject's susceptibility to progression of the disease. It is of scientific and clinical interest to relate such quantities to a later time-to-event clin...

    journal_title:Biometrics

    pub_type: Journal article

    doi:10.1111/j.1541-0420.2008.01168.x

    authors: Li L,Hu B,Greene T

    update_date:2009-09-01 00:00:00

  • Basal body temperature, ovulation and the risk of conception, with special reference to the lifetimes of sperm and egg.

    abstract::The risks of conception, due to sexual intercourse at various times before and after the periovulatory rise in the woman's basal body temperature, are evaluated. In general, the risk is small nine or more days before, and two or more days after, the first day of elevated temperature. The model for the conception proba...

    journal_title:Biometrics

    pub_type: Journal article

    doi:

    authors: Royston JP

    update_date:1982-06-01 00:00:00

  • Group sequential tests for bivariate response: interim analyses of clinical trials with both efficacy and safety endpoints.

    abstract::We describe group sequential tests for a bivariate response. The tests are defined in terms of the two response components jointly, rather than through a single summary statistic. Such methods are appropriate when the two responses concern different aspects of a treatment; for example, one might wish to show that a ne...

    journal_title:Biometrics

    pub_type: Journal article

    doi:

    authors: Jennison C,Turnbull BW

    update_date:1993-09-01 00:00:00

  • Dynamic models for estimating the effect of HAART on CD4 in observational studies: Application to the Aquitaine Cohort and the Swiss HIV Cohort Study.

    abstract::Highly active antiretroviral therapy (HAART) has proved efficient in increasing CD4 counts in many randomized clinical trials. Because randomized trials have some limitations (e.g., short duration, highly selected subjects), it is interesting to assess the effect of treatments using observational studies. This is chal...

    journal_title:Biometrics

    pub_type: Journal article

    doi:10.1111/biom.12564

    authors: Prague M,Commenges D,Gran JM,Ledergerber B,Young J,Furrer H,Thiébaut R

    update_date:2017-03-01 00:00:00

  • Two-stage designs for gene-disease association studies with sample size constraints.

    abstract::Gene-disease association studies based on case-control designs may often be used to identify candidate polymorphisms (markers) conferring disease risk. If a large number of markers are studied, genotyping all markers on all samples is inefficient in resource utilization. Here, we propose an alternative two-stage metho...

    journal_title:Biometrics

    pub_type: Journal article

    doi:10.1111/j.0006-341X.2004.00207.x

    authors: Satagopan JM,Venkatraman ES,Begg CB

    update_date:2004-09-01 00:00:00

  • Nonparametric Bayesian covariate-adjusted estimation of the Youden index.

    abstract::A novel nonparametric regression model is developed for evaluating the covariate-specific accuracy of a continuous biological marker. Accurately screening diseased from nondiseased individuals and correctly diagnosing disease stage are critically important to health care on several fronts, including guiding recommenda...

    journal_title:Biometrics

    pub_type: Journal article

    doi:10.1111/biom.12686

    authors: Inácio de Carvalho V,de Carvalho M,Branscum AJ

    update_date:2017-12-01 00:00:00

  • Procedures for comparing samples with multiple endpoints.

    abstract::Five procedures are considered for the comparison of two or more multivariate samples. These procedures include a newly proposed nonparametric rank-sum test and a generalized least squares test. Also considered are the following tests: ordinary least squares, Hotelling's T², and a Bonferroni per-experiment error-rate ...

    journal_title:Biometrics

    pub_type: Journal article

    doi:

    authors: O'Brien PC

    update_date:1984-12-01 00:00:00

  • Receiver operating characteristic curves and confidence bands for support vector machines.

    abstract::Many problems that appear in biomedical decision-making, such as diagnosing disease and predicting response to treatment, can be expressed as binary classification problems. The support vector machine (SVM) is a popular classification technique that is robust to model misspecification and effectively handles high-dime...

    journal_title:Biometrics

    pub_type: Journal article

    doi:10.1111/biom.13365

    authors: Luckett DJ,Laber EB,El-Kamary SS,Fan C,Jhaveri R,Perou CM,Shebl FM,Kosorok MR

    update_date:2020-08-31 00:00:00

  • On the use of the variogram in checking for independence in spatial data.

    abstract::The variogram is a standard tool in the analysis of spatial data, and its shape provides useful information on the form of spatial correlation that may be present. However, it is also useful to be able to assess the evidence for the presence of any spatial correlation. A method of doing this, based on an assessment of...

    journal_title:Biometrics

    pub_type: Journal article

    doi:10.1111/j.0006-341x.2001.00211.x

    authors: Diblasi A,Bowman AW

    update_date:2001-03-01 00:00:00

  • Incorporating marginal covariate information in a nonparametric regression model for a sample of R x C tables.

    abstract::Nonparametric regression models are proposed in the framework of ecological inference for exploratory modeling of disease prevalence rates adjusted for variables, such as age, ethnicity/race, and socio-economic status. Ecological inference is needed when a response variable and covariate are not available at th...

    journal_title:Biometrics

    pub_type: Journal article

    doi:10.1111/j.1541-0420.2008.00997.x

    authors: Staniswalis JG

    update_date:2008-12-01 00:00:00

  • Robust inference for the stepped wedge design.

    abstract::Stepped wedge designed trials are a type of cluster-randomized study in which the intervention is introduced to each cluster in a random order over time. This design is often used to assess the effect of a new intervention as it is rolled out across a series of clinics or communities. Based on a permutation argument, ...

    journal_title:Biometrics

    pub_type: Journal article

    doi:10.1111/biom.13106

    authors: Hughes JP,Heagerty PJ,Xia F,Ren Y

    update_date:2020-03-01 00:00:00

  • A unifying family of group sequential test designs.

    abstract::Currently, the design of group sequential clinical trials requires choosing among several distinct design categories, design scales, and strategies for determining stopping rules. This approach can limit the design selection process so that clinical issues are not fully addressed. This paper describes a family of desi...

    journal_title:Biometrics

    pub_type: Journal article

    doi:10.1111/j.0006-341x.1999.00874.x

    authors: Kittelson JM,Emerson SS

    update_date:1999-09-01 00:00:00

  • A note on permutation tests for variance components in multilevel generalized linear mixed models.

    abstract::In many applications of generalized linear mixed models to multilevel data, it is of interest to test whether a random effects variance component is zero. It is well known that the usual asymptotic chi-square distribution of the likelihood ratio and score statistics under the null does not necessarily hold. In this no...

    journal_title:Biometrics

    pub_type: Journal article

    doi:10.1111/j.1541-0420.2007.00775.x

    authors: Fitzmaurice GM,Lipsitz SR,Ibrahim JG

    update_date:2007-09-01 00:00:00

  • Duration of ventilating tubes: a test for comparing two clustered samples of censored data.

    abstract::A study of otitis media that requires a test for the comparison of two clustered samples of censored data is described. A method is proposed taking into account the within-subject correlation in the formation of the log-rank statistic. ...

    journal_title:Biometrics

    pub_type: Clinical trial, Journal article, Randomized controlled trial

    doi:

    authors: Le CT,Lindgren BR

    update_date:1996-03-01 00:00:00

  • A method for estimating incidence rates of onchocerciasis from skin-snip biopsies with consideration of false negatives.

    abstract::The aim of this study is to estimate incidence rates of onchocerciasis from skin-snip biopsies, based on incomplete data obtained in field surveys, with consideration of false negatives. The method of maximum likelihood is employed and the effect of false negatives on the incidence rates is discussed. ...

    journal_title:Biometrics

    pub_type: Journal article

    doi:

    authors: Yanagawa T,Kasagi F,Yoshimura T

    update_date:1984-06-01 00:00:00

  • Bayesian experimental design for nonlinear mixed-effects models with application to HIV dynamics.

    abstract::Bayesian experimental design is investigated for Bayesian analysis of nonlinear mixed-effects models. Existence of the posterior risk for parameter estimation is shown. When the same prior distribution is used for both design and inference, existence of the preposterior risk for design is also proven. If the prior dis...

    journal_title:Biometrics

    pub_type: Journal article

    doi:10.1111/j.0006-341X.2004.00148.x

    authors: Han C,Chaloner K

    update_date:2004-03-01 00:00:00

  • An adaptive trial design to optimize dose-schedule regimes with delayed outcomes.

    abstract::This paper proposes a two-stage phase I-II clinical trial design to optimize dose-schedule regimes of an experimental agent within ordered disease subgroups in terms of the toxicity-efficacy trade-off. The design is motivated by settings where prior biological information indicates it is certain that efficacy will imp...

    journal_title:Biometrics

    pub_type: Journal article

    doi:10.1111/biom.13116

    authors: Lin R,Thall PF,Yuan Y

    update_date:2020-03-01 00:00:00