Abstract:
Multiple testing procedures are commonly used in gene expression studies for the detection of differential expression, where typically thousands of genes are measured over at least two experimental conditions. Given the need for powerful testing procedures, and the attendant danger of false positives in multiple testing, the False Discovery Rate (FDR) controlling procedure of Benjamini and Hochberg (1995) has become a popular tool. When simultaneously testing hypotheses, suppose that R rejections are made, of which Fp are false positives. The Benjamini and Hochberg procedure ensures that the expectation of Fp/R is bounded above by some pre-specified proportion. In practice, the procedure is applied to a single experiment. In this paper we investigate the across-experiment variability of the proportion Fp/R as a function of three experimental parameters. The operational characteristics of the procedure when applied to dependent hypotheses are also considered.
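The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the simulation settings (`m`, `m0`, `q`, `n_experiments`) and the model for p-values under the alternative (a uniform draw raised to the 8th power) are all assumptions chosen only for illustration, and the tests are simulated as independent. The sketch applies the Benjamini and Hochberg step-up rule and records the realised proportion Fp/R in each simulated experiment, taking the proportion to be 0 when no rejections are made.

```python
import random
import statistics


def benjamini_hochberg(p_values, q=0.05):
    """Return the sorted indices rejected by the BH step-up rule at level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    # Step-up: find the largest rank whose ordered p-value is <= rank * q / m.
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k = rank
    return sorted(order[:k])


def false_discovery_proportion(rejected, true_nulls):
    """Realised Fp/R for one experiment, defined as 0 when R == 0."""
    if not rejected:
        return 0.0
    fp = sum(1 for i in rejected if i in true_nulls)
    return fp / len(rejected)


def simulate_fdp(n_experiments=200, m=1000, m0=900, q=0.05, seed=0):
    """Simulate Fp/R across experiments under a hypothetical p-value model."""
    rng = random.Random(seed)
    true_nulls = set(range(m0))  # the first m0 hypotheses are true nulls
    fdps = []
    for _ in range(n_experiments):
        # Null p-values are uniform on [0, 1).
        p = [rng.random() for _ in range(m0)]
        # Assumed alternative model: p-values concentrated near zero.
        p += [rng.random() ** 8 for _ in range(m - m0)]
        rejected = benjamini_hochberg(p, q)
        fdps.append(false_discovery_proportion(rejected, true_nulls))
    return fdps


if __name__ == "__main__":
    fdps = simulate_fdp()
    print("mean Fp/R:", statistics.mean(fdps))
    print("std  Fp/R:", statistics.stdev(fdps))
```

The mean of the simulated proportions estimates the FDR that the procedure controls, while their standard deviation gives a rough picture of how much Fp/R can vary from one experiment to the next under these assumed settings.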
journal_name: Stat Appl Genet Mol Biol
authors: Green GH, Diggle PJ
doi: 10.2202/1544-6115.1302
subject: Has Abstract
pub_date: 2007-01-01 00:00:00
pages: Article 27
eissn: 2194-6302
issn: 1544-6115
journal_volume: 6
pub_type: Journal article
abstract::In a hidden Markov model, one "estimates" the state of the hidden Markov chain at t by computing via the forwards-backwards algorithm the conditional distribution of the state vector given the observed data. The covariance matrix of this conditional distribution measures the information lost by failure to observe dire...
journal_title:Statistical applications in genetics and molecular biology
pub_type: Journal article, Review
doi:10.2202/1544-6115.1296
update_date:2007-01-01 00:00:00
abstract::Mass spectrometry is an important high-throughput technique for profiling small molecular compounds in biological samples and is widely used to identify potential diagnostic and prognostic compounds associated with disease. Commonly, this data generated by mass spectrometry has many missing values resulting when a com...
journal_title:Statistical applications in genetics and molecular biology
pub_type: Journal article
doi:10.1515/sagmb-2013-0021
update_date:2013-12-01 00:00:00
abstract::We address a potential shortcoming of three probabilistic models for detecting interspecific recombination in DNA sequence alignments: the multiple change-point model (MCP) of Suchard et al. (2003), the dual multiple change-point model (DMCP) of Minin et al. (2005), and the phylogenetic factorial hidden Markov model (...
journal_title:Statistical applications in genetics and molecular biology
pub_type: Journal article
doi:10.2202/1544-6115.1399
update_date:2008-01-01 00:00:00
abstract::The ENCODE project has funded the generation of a diverse collection of methylation profiles using reduced representation bisulfite sequencing (RRBS) technology, enabling the analysis of epigenetic variation on a genomic scale at single-site resolution. A standard application of RRBS experiments is in the location of ...
journal_title:Statistical applications in genetics and molecular biology
pub_type: Journal article
doi:10.1515/sagmb-2013-0027
update_date:2013-12-01 00:00:00
abstract::Output from analysis of a high-throughput 'omics' experiment very often is a ranked list. One commonly encountered example is a ranked list of differentially expressed genes from a gene expression experiment, with a length of many hundreds of genes. There are numerous situations where interest is in the comparison of ...
journal_title:Statistical applications in genetics and molecular biology
pub_type: Journal article
doi:10.1515/sagmb-2016-0036
update_date:2017-03-01 00:00:00
abstract::Polytomous phenotypes arise when a disease has multiple subtypes or when two dichotomous phenotypes are analyzed simultaneously. Few software programs offer the option to analyze such phenotypes in family studies, and none implements conditional polytomous logistic regression for within-family analysis robust to popul...
journal_title:Statistical applications in genetics and molecular biology
pub_type: Journal article
doi:10.1515/sagmb-2016-0035
update_date:2017-03-01 00:00:00
abstract::Usually, a pedigree is sampled and included in the sample that is analyzed after following a predefined non-random sampling design comprising several specific procedures. To obtain a pedigree analysis result free from the bias caused by the sampling procedures, a correction is applied to the pedigree likelihood. The s...
journal_title:Statistical applications in genetics and molecular biology
pub_type: Journal article
doi:10.2202/1544-6115.1003
update_date:2003-01-01 00:00:00
abstract::Germline mosaicism is a genetic condition in which some germ cells of an individual contain a mutation. This condition violates the assumptions underlying classic genetic analysis and may lead to failure of such analysis. In this work we extend the statistical model used for genetic linkage analysis in order to incorp...
journal_title:Statistical applications in genetics and molecular biology
pub_type: Journal article
doi:10.2202/1544-6115.1709
update_date:2011-10-04 00:00:00
abstract::Gametic models for fitting breeding values at QTL as random effects in outbred populations have become popular because they require few assumptions about the number and distribution of QTL alleles segregating. The covariance matrix of the gametic effects has an inverse that is sparse and can be constructed rapidly by ...
journal_title:Statistical applications in genetics and molecular biology
pub_type: Journal article
doi:10.2202/1544-6115.1048
update_date:2004-01-01 00:00:00
abstract::We are concerned with statistical inference for 2 × C × K contingency tables in the context of genetic case-control association studies. Multivariate methods based on asymptotic Gaussianity of vectors of test statistics require information about the asymptotic correlation structure among these test statistics under th...
journal_title:Statistical applications in genetics and molecular biology
pub_type: Journal article
doi:10.1515/sagmb-2015-0024
update_date:2015-11-01 00:00:00
abstract::Many gene- and pathway-based association tests have been proposed in the literature. Among them, the SKAT is widely used, especially for rare variants association studies. In this paper, we investigate the connection between SKAT and a principal component analysis. This investigation leads to a procedure that encompas...
journal_title:Statistical applications in genetics and molecular biology
pub_type: Journal article
doi:10.1515/sagmb-2016-0061
update_date:2017-07-26 00:00:00
abstract::Integrative analysis of copy number and gene expression data can help in understanding the cis and trans effect of copy number aberrations on transcription levels of genes involved in a pathway. To analyse how these copy number mediated gene-gene interactions differ between groups of samples we propose a new method, n...
journal_title:Statistical applications in genetics and molecular biology
pub_type: Journal article
doi:10.1515/sagmb-2017-0058
update_date:2018-07-31 00:00:00
abstract::Making sound proteomic inferences using ELISA microarray assay requires both an accurate prediction of protein concentration and a credible estimate of its error. We present a method using monotonic spline statistical models (MS), penalized constrained least squares fitting (PCLS) and Monte Carlo simulation (MC) to pr...
journal_title:Statistical applications in genetics and molecular biology
pub_type: Journal article
doi:10.2202/1544-6115.1364
update_date:2008-01-01 00:00:00
abstract::In recent years, alignment-free methods have been widely applied in comparing genome sequences, as these methods compute efficiently and provide desirable phylogenetic analysis results. These methods have been successfully combined with hierarchical clustering methods for finding phylogenetic trees. However, it may no...
journal_title:Statistical applications in genetics and molecular biology
pub_type: Journal article
doi:10.1515/sagmb-2018-0045
update_date:2019-02-15 00:00:00
abstract::In candidate gene association studies, usually several elementary hypotheses are tested simultaneously using one particular set of data. The data normally consist of partly correlated SNP information. Every SNP can be tested for association with the disease, e.g., using the Cochran-Armitage test for trend. To account ...
journal_title:Statistical applications in genetics and molecular biology
pub_type: Journal article
doi:10.2202/1544-6115.1729
update_date:2011-01-01 00:00:00
abstract::In cellular biology, node-and-edge graph or "network" data collection often uses bait-prey technologies such as co-immunoprecipitation (CoIP). Bait-prey technologies assay relationships or "interactions" between protein pairs, with CoIP specifically measuring protein complex co-membership. Analyses of CoIP data freque...
journal_title:Statistical applications in genetics and molecular biology
pub_type: Journal article
doi:10.1515/sagmb-2015-0007
update_date:2015-08-01 00:00:00
abstract::Various discriminant methods have been applied for classification of tumors based on gene expression profiles, among which the nearest neighbor (NN) method has been reported to perform relatively well. Usually cross-validation (CV) is used to select the neighbor size as well as the number of variables for the NN metho...
journal_title:Statistical applications in genetics and molecular biology
pub_type: Journal article
doi:10.2202/1544-6115.1054
update_date:2004-01-01 00:00:00
abstract::We evaluate variable selection by multiple tests controlling the false discovery rate (FDR) to build a linear score for prediction of clinical outcome in high-dimensional data. Quality of prediction is assessed by the receiver operating characteristic curve (ROC) for prediction in independent patients. Thus we try to ...
journal_title:Statistical applications in genetics and molecular biology
pub_type: Journal article
doi:10.2202/1544-6115.1462
update_date:2009-01-01 00:00:00
abstract::We develop an approach for microarray differential expression analysis, i.e. identifying genes whose expression levels differ between two or more groups. Current approaches to inference rely either on full parametric assumptions or on permutation-based techniques for sampling under the null distribution. In some situa...
journal_title:Statistical applications in genetics and molecular biology
pub_type: Journal article
doi:10.2202/1544-6115.1333
update_date:2008-01-01 00:00:00
abstract::In this study, we propose a novel statistical framework for detecting progressive changes in molecular traits as response to a pathogenic stimulus. In particular, we propose to employ Bayesian hierarchical models to analyse changes in mean level, variance and correlation of metabolic traits in relation to covariates. ...
journal_title:Statistical applications in genetics and molecular biology
pub_type: Journal article
doi:10.1515/sagmb-2013-0041
update_date:2014-04-01 00:00:00
abstract::Multi-color optical mapping is a new technique being developed to obtain detailed physical maps (indicating relative positions of various recognition sites) of DNA molecules. We consider a study design in which the data consist of noisy observations of multiple copies of a DNA molecule marked with colors at recognitio...
journal_title:Statistical applications in genetics and molecular biology
pub_type: Journal article
doi:10.2202/1544-6115.1266
update_date:2007-01-01 00:00:00
abstract::Accurately measuring epigenetic marks such as 5-methylcytosine (5-mC) and 5-hydroxymethylcytosine (5-hmC) at the single-nucleotide level requires combining data from DNA processing methods including traditional (BS), oxidative (oxBS) or Tet-Assisted (TAB) bisulfite conversion. We introduce the R package MLML2R, which...
journal_title:Statistical applications in genetics and molecular biology
pub_type: Journal article
doi:10.1515/sagmb-2018-0031
update_date:2019-01-17 00:00:00
abstract::Locus heterogeneity is one of the most important issues in gene mapping and can cause significant reductions in statistical power for gene mapping, yet no research to date has provided power and sample size calculations for family-based association methods in the presence of locus heterogeneity. The purpose of this re...
journal_title:Statistical applications in genetics and molecular biology
pub_type: Journal article
doi:10.2202/1544-6115.1501
update_date:2009-01-01 00:00:00
abstract::The problem of finding periodically expressed genes from time course microarray experiments is at the center of numerous efforts to identify the molecular components of biological clocks. We present a new approach to this problem based on the cyclohedron test, which is a rank test inspired by recent advances in algebr...
journal_title:Statistical applications in genetics and molecular biology
pub_type: Journal article
doi:10.2202/1544-6115.1286
update_date:2007-01-01 00:00:00
abstract::Likelihood-based cross-validation is a statistical tool for selecting a density estimate based on n i.i.d. observations from the true density among a collection of candidate density estimators. General examples are the selection of a model indexing a maximum likelihood estimator, and the selection of a bandwidth index...
journal_title:Statistical applications in genetics and molecular biology
pub_type: Journal article
doi:10.2202/1544-6115.1036
update_date:2004-01-01 00:00:00
abstract::Approaches based upon sequence weights, to construct a position weight matrix of nucleotides from aligned inputs, are popular but little effort has been expended to measure their quality. We derive optimal sequence weights that minimize the sum of the variances of the estimators of base frequency parameters for sequen...
journal_title:Statistical applications in genetics and molecular biology
pub_type: Journal article
doi:10.2202/1544-6115.1135
update_date:2005-01-01 00:00:00
abstract::In many population genetic problems, parameter estimation is obstructed by an intractable likelihood function. Therefore, approximate estimation methods have been developed, and with growing computational power, sampling-based methods became popular. However, these methods such as Approximate Bayesian Computation (ABC...
journal_title:Statistical applications in genetics and molecular biology
pub_type: Journal article
doi:10.1515/sagmb-2017-0016
update_date:2017-11-27 00:00:00
abstract::There has been increasing interest in predicting patients' survival after therapy by investigating gene expression microarray data. In the regression and classification models with high-dimensional genomic data, boosting has been successfully applied to build accurate predictive models and conduct variable selection s...
journal_title:Statistical applications in genetics and molecular biology
pub_type: Journal article
doi:10.2202/1544-6115.1550
update_date:2010-01-01 00:00:00
abstract::Consider the standard multiple testing problem where many hypotheses are to be tested, each hypothesis is associated with a test statistic, and large test statistics provide evidence against the null hypotheses. One proposal to provide probabilistic control of Type-I errors is the use of procedures ensuring that the e...
journal_title:Statistical applications in genetics and molecular biology
pub_type: Journal article
doi:10.2202/1544-6115.1148
update_date:2006-01-01 00:00:00
abstract::The Dirichlet Process (DP) mixture model has become a popular choice for model-based clustering, largely because it allows the number of clusters to be inferred. The sequential updating and greedy search (SUGS) algorithm (Wang & Dunson, 2011) was proposed as a fast method for performing approximate Bayesian inference ...
journal_title:Statistical applications in genetics and molecular biology
pub_type: Journal article
doi:10.1515/sagmb-2018-0065
update_date:2019-12-12 00:00:00