AP-SKAT: highly-efficient genome-wide rare variant association test.

Abstract:

BACKGROUND: Genome-wide association studies have revealed associations between single-nucleotide polymorphisms (SNPs) and phenotypes such as disease symptoms and drug tolerance. To address the small sample sizes available for rare variants, association studies tend to group variants at the gene or pathway level and evaluate the effect of the variant set. One such strategy, the sequence kernel association test (SKAT), is a widely used collapsing method. However, the p-values reported by SKAT tend to be biased because they are calculated from the asymptotic property of the statistic. Although this bias can be corrected by applying permutation procedures to the test statistics, the computational cost of obtaining p-values with high resolution is prohibitive.

RESULTS: To address this problem, we devise an adaptive SKAT procedure, termed AP-SKAT, that efficiently classifies significant SNP sets and ranks them according to their permuted p-values. The procedure adaptively stops the permutation test once the significance level falls outside a binomial confidence interval for the estimated p-value. To evaluate its performance, we first compare the power and sample size calculations and the type I error rate estimates of SKAT, SKAT-O, and the proposed procedure, using genotype data from the SKAT R package and from the 1000 Genomes Project. Through computational experiments using whole genome sequencing and SNP array data, we show that the proposed procedure is highly efficient and has accuracy comparable to the standard procedure.

CONCLUSIONS: For several types of genetic data, the procedure achieves competitive power and sample size under both small and large sample size conditions while controlling type I error rates, and it estimates p-values of significant SNP sets that are consistent with those from the standard permutation test within a realistic time. This demonstrates that the procedure is sufficiently powerful for recent whole genome sequencing and SNP array data with increasing numbers of phenotypes. Additionally, the procedure can be used in other association tests by employing alternative methods to calculate the statistics.
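
The adaptive stopping rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which binomial confidence interval, batch size, or permutation cap AP-SKAT uses, so the Wilson score interval, the `batch` and `max_perm` parameters, and the `toy_statistic` function (a squared covariance standing in for the SKAT statistic) are all assumptions made for illustration.

```python
import math
import random
from typing import Callable, Sequence, Tuple


def wilson_interval(k: int, n: int, z: float = 3.29) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion k/n.

    Assumption: z = 3.29 (roughly a 99.9% interval) is an illustrative choice.
    """
    p = k / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def toy_statistic(x: Sequence[float], y: Sequence[float]) -> float:
    """Hypothetical stand-in for a set-level statistic: squared covariance."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) ** 2


def adaptive_permutation_pvalue(
    statistic: Callable[[Sequence[float], Sequence[float]], float],
    genotype_score: Sequence[float],
    phenotype: Sequence[float],
    alpha: float = 0.05,
    batch: int = 100,
    max_perm: int = 100_000,
    seed: int = 0,
) -> Tuple[float, int]:
    """Permutation p-value with early stopping.

    Permutations run in batches. After each batch, the loop stops as soon as
    alpha lies outside the confidence interval of the estimated p-value,
    because the significant / not significant call is then already settled.
    Returns (estimated p-value, number of permutations used).
    """
    rng = random.Random(seed)
    observed = statistic(genotype_score, phenotype)
    shuffled = list(phenotype)
    exceed = 0  # permuted statistics at least as extreme as the observed one
    n = 0
    while n < max_perm:
        for _ in range(batch):
            rng.shuffle(shuffled)
            if statistic(genotype_score, shuffled) >= observed:
                exceed += 1
        n += batch
        low, high = wilson_interval(exceed, n)
        if alpha < low or alpha > high:
            break
    # Add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n + 1), n


if __name__ == "__main__":
    demo_rng = random.Random(1)
    scores = [demo_rng.random() for _ in range(50)]
    noise = [demo_rng.random() for _ in range(50)]
    print(adaptive_permutation_pvalue(toy_statistic, scores, noise, max_perm=5000))
```

Under this rule, sets that are clearly non-significant stop after a few batches, so most of the permutation budget goes to sets whose p-values lie near the threshold. Replacing `toy_statistic` with another test statistic leaves the stopping logic unchanged, in line with the abstract's remark that the procedure applies to other association tests.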

journal_name

BMC Genomics

authors

Hasegawa T,Kojima K,Kawai Y,Misawa K,Mimori T,Nagasaki M

doi

10.1186/s12864-016-3094-3

pub_date

2016-09-21 00:00:00

pages

745

issue

1

issn

1471-2164

pii

10.1186/s12864-016-3094-3

journal_volume

17

pub_type

Journal Article

  • Genomic mechanisms for cold tolerance and production of exopolysaccharides in the Arctic cyanobacterium Phormidesmis priestleyi BC1401.

    abstract:BACKGROUND:Cyanobacteria are major primary producers in extreme cold ecosystems. Many lineages of cyanobacteria thrive in these harsh environments, but it is not fully understood how they survive in these conditions and whether they have evolved specific mechanisms of cold adaptation. Phormidesmis priestleyi is a cyano...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-016-2846-4

    authors: Chrismas NA,Barker G,Anesio AM,Sánchez-Baracaldo P

    update_date:2016-08-02 00:00:00

  • Pseudomonas aeruginosa clinical and environmental isolates constitute a single population with high phenotypic diversity.

    abstract:BACKGROUND:Pseudomonas aeruginosa is an opportunistic pathogen with a high incidence of hospital infections that represents a threat to immune compromised patients. Genomic studies have shown that, in contrast to other pathogenic bacteria, clinical and environmental isolates do not show particular genomic differences. ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-15-318

    authors: Grosso-Becerra MV,Santos-Medellín C,González-Valdez A,Méndez JL,Delgado G,Morales-Espinosa R,Servín-González L,Alcaraz LD,Soberón-Chávez G

    update_date:2014-04-28 00:00:00

  • Biological networks in Parkinson's disease: an insight into the epigenetic mechanisms associated with this disease.

    abstract:BACKGROUND:Parkinson's disease (PD) is the second most prevalent neurodegenerative disorder in the world. Studying PD from a systems biology perspective involving genes and their regulators might provide deeper insights into the complex molecular interactions associated with this disease. RESULT:We have studied gene co...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-017-4098-3

    authors: Chatterjee P,Roy D,Bhattacharyya M,Bandyopadhyay S

    update_date:2017-09-12 00:00:00

  • Genome-wide SNP scan of pooled DNA reveals nonsense mutation in FGF20 in the scaleless line of featherless chickens.

    abstract:BACKGROUND:Scaleless (sc/sc) chickens carry a single recessive mutation that causes a lack of almost all body feathers, as well as foot scales and spurs, due to a failure of skin patterning during embryogenesis. This spontaneous mutant line, first described in the 1950s, has been used extensively to explore the tissue ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-13-257

    authors: Wells KL,Hadad Y,Ben-Avraham D,Hillel J,Cahaner A,Headon DJ

    update_date:2012-06-19 00:00:00

  • Alpha tubulin genes from Leishmania braziliensis: genomic organization, gene structure and insights on their expression.

    abstract:BACKGROUND:Alpha tubulin is a fundamental component of the cytoskeleton which is responsible for cell shape and is involved in cell division, ciliary and flagellar motility and intracellular transport. Alpha tubulin gene expression varies according to the morphological changes suffered by Leishmania in its life cycle. ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-14-454

    authors: Ramírez CA,Requena JM,Puerta CJ

    update_date:2013-07-06 00:00:00

  • Whole genome sequencing analysis of multiple Salmonella serovars provides insights into phylogenetic relatedness, antimicrobial resistance, and virulence markers across humans, food animals and agriculture environmental sources.

    abstract:BACKGROUND:Salmonella enterica is a significant foodborne pathogen, which can be transmitted via several distinct routes, and reports on acquisition of antimicrobial resistance (AMR) are increasing. To better understand the association between human Salmonella clinical isolates and the potential environmental/animal re...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-018-5137-4

    authors: Pornsukarom S,van Vliet AHM,Thakur S

    update_date:2018-11-06 00:00:00

  • Analysis of functional variants in mitochondrial DNA of Finnish athletes.

    abstract:BACKGROUND:We have previously reported on paucity of mitochondrial DNA (mtDNA) haplogroups J and K among Finnish endurance athletes. Here we aimed to further explore differences in mtDNA variants between elite endurance and sprint athletes. For this purpose, we determined the rate of functional variants and the mutatio...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-019-6171-6

    authors: Kiiskilä J,Moilanen JS,Kytövuori L,Niemi AK,Majamaa K

    update_date:2019-10-29 00:00:00

  • Bayesian prediction of bacterial growth temperature range based on genome sequences.

    abstract:BACKGROUND:The preferred habitat of a given bacterium can provide a hint of which types of enzymes of potential industrial interest it might produce. These might include enzymes that are stable and active at very high or very low temperatures. Being able to accurately predict this based on a genomic sequence, would thu...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-13-S7-S3

    authors: Jensen DB,Vesth TC,Hallin PF,Pedersen AG,Ussery DW

    update_date:2012-01-01 00:00:00

  • Genome-wide survey and expression profiles of the AP2/ERF family in castor bean (Ricinus communis L.).

    abstract:BACKGROUND:The AP2/ERF transcription factor, one of the largest gene families in plants, plays a crucial role in the regulation of growth and development, metabolism, and responses to biotic and abiotic stresses. Castor bean (Ricinus communis L., Euphorbiaceae) is one of the most important non-edible oilseed crops and its s...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-14-785

    authors: Xu W,Li F,Ling L,Liu A

    update_date:2013-11-13 00:00:00

  • DArT markers: diversity analyses and mapping in Sorghum bicolor.

    abstract:BACKGROUND:The sequential nature of gel-based marker systems entails low throughput and high costs per assay. Commonly used marker systems such as SSR and SNP are also dependent on sequence information. These limitations result in high cost per data point and significantly limit the capacity of breeding programs to obt...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-9-26

    authors: Mace ES,Xia L,Jordan DR,Halloran K,Parh DK,Huttner E,Wenzl P,Kilian A

    update_date:2008-01-22 00:00:00

  • The GAMYB gene in rye: sequence, polymorphisms, map location, allele-specific markers, and relationship with α-amylase activity.

    abstract:BACKGROUND:Transcription factor (TF) GAMYB, belonging to MYB family (named after the gene of the avian myeloblastosis virus) is a master gibberellin (GA)-induced regulatory protein that is crucial for development and germination of cereal grain and involved in anther formation. It activates many genes including high-mo...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-020-06991-3

    authors: Bienias A,Góralska M,Masojć P,Milczarski P,Myśków B

    update_date:2020-08-24 00:00:00

  • A systematic model of the LC-MS proteomics pipeline.

    abstract:MOTIVATION:Mass spectrometry is a complex technique used for large-scale protein profiling with clinical and pharmaceutical applications. While individual components in the system have been studied extensively, little work has been done to integrate various modules and evaluate them from a systems point of view. RESUL...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-13-S6-S2

    authors: Sun Y,Braga-Neto U,Dougherty ER

    update_date:2012-01-01 00:00:00

  • σ54-dependent regulome in Desulfovibrio vulgaris Hildenborough.

    abstract:BACKGROUND:The σ(54) subunit controls a unique class of promoters in bacteria. Such promoters, without exception, require enhancer binding proteins (EBPs) for transcription initiation. Desulfovibrio vulgaris Hildenborough, a model bacterium for sulfate reduction studies, has a high number of EBPs, more than most sequen...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-015-2176-y

    authors: Kazakov AE,Rajeev L,Chen A,Luning EG,Dubchak I,Mukhopadhyay A,Novichkov PS

    update_date:2015-11-10 00:00:00

  • Involvement of potential pathways in malignant transformation from oral leukoplakia to oral squamous cell carcinoma revealed by proteomic analysis.

    abstract:BACKGROUND:Oral squamous cell carcinoma (OSCC) is one of the most common forms of cancer associated with the presence of precancerous oral leukoplakia. Given the poor prognosis associated with oral leukoplakia, and the difficulties in distinguishing it from cancer lesions, there is an urgent need to elucidate the molec...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-10-383

    authors: Wang Z,Feng X,Liu X,Jiang L,Zeng X,Ji N,Li J,Li L,Chen Q

    update_date:2009-08-19 00:00:00

  • Analysis of whole genome sequencing for the Escherichia coli O157:H7 typing phages.

    abstract:BACKGROUND:Shiga toxin-producing Escherichia coli O157 can cause severe bloody diarrhea and haemolytic uraemic syndrome. Phage typing of E. coli O157 facilitates public health surveillance and outbreak investigations; certain phage types are more likely to occupy specific niches and are associated with specific age gro...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-015-1470-z

    authors: Cowley LA,Beckett SJ,Chase-Topping M,Perry N,Dallman TJ,Gally DL,Jenkins C

    update_date:2015-04-08 00:00:00

  • Orthonome - a new pipeline for predicting high quality orthologue gene sets applicable to complete and draft genomes.

    abstract:BACKGROUND:Distinguishing orthologous and paralogous relationships between genes across multiple species is essential for comparative genomic analyses. Various computational approaches have been developed to resolve these evolutionary relationships, but strong trade-offs between precision and recall of orthologue predi...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-017-4079-6

    authors: Rane RV,Oakeshott JG,Nguyen T,Hoffmann AA,Lee SF

    update_date:2017-08-31 00:00:00

  • Identification of Nicotiana benthamiana microRNAs and their targets using high throughput sequencing and degradome analysis.

    abstract:BACKGROUND:Nicotiana benthamiana is a widely used model plant species for research on plant-pathogen interactions as well as other areas of plant science. It can be easily transformed or agroinfiltrated, therefore it is commonly used in studies requiring protein localization, interaction, or plant-based systems for pro...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-015-2209-6

    authors: Baksa I,Nagy T,Barta E,Havelda Z,Várallyay É,Silhavy D,Burgyán J,Szittya G

    update_date:2015-12-01 00:00:00

  • Cotranscription and intergenic splicing of the PPARG and TSEN2 genes in cattle.

    abstract:BACKGROUND:Intergenic splicing resulting in the combination of mRNAs sequences from distinct genes is a newly identified mechanism likely to contribute to protein diversity. Few cases have been described, most of them involving neighboring genes and thus suggesting a cotranscription event presumably due to transcriptio...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-7-71

    authors: Roux M,Levéziel H,Amarger V

    update_date:2006-04-04 00:00:00

  • Identification and evolutionary analysis of the nucleolar proteome of Giardia lamblia.

    abstract:BACKGROUND:The nucleoli of higher eukaryotes, including their proteomes, have been extensively studied, while few studies of the nucleoli of lower eukaryotes (protists) have been reported. Giardia lamblia, a protist with the controversy of whether it is an extreme primitive eukaryote or just a highly evolved parasite...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-020-6679-9

    authors: Feng JM,Yang CL,Tian HF,Wang JX,Wen JF

    update_date:2020-03-30 00:00:00

  • Brucella microti: the genome sequence of an emerging pathogen.

    abstract:BACKGROUND:Using a combination of pyrosequencing and conventional Sanger sequencing, the complete genome sequence of the recently described novel Brucella species, Brucella microti, was determined. B. microti is a member of the genus Brucella within the Alphaproteobacteria, which consists of medically important highly ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-10-352

    authors: Audic S,Lescot M,Claverie JM,Scholz HC

    update_date:2009-08-04 00:00:00

  • Large scale single nucleotide polymorphism discovery in unsequenced genomes using second generation high throughput sequencing technology: applied to turkey.

    abstract:BACKGROUND:The development of second generation sequencing methods has enabled large scale DNA variation studies at moderate cost. For the high throughput discovery of single nucleotide polymorphisms (SNPs) in species lacking a sequenced reference genome, we set up an analysis pipeline based on a short read de novo seq...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-10-479

    authors: Kerstens HH,Crooijmans RP,Veenendaal A,Dibbits BW,Chin-A-Woeng TF,den Dunnen JT,Groenen MA

    update_date:2009-10-16 00:00:00

  • OMGene: mutual improvement of gene models through optimisation of evolutionary conservation.

    abstract:BACKGROUND:The accurate determination of the genomic coordinates for a given gene - its gene model - is of vital importance to the utility of its annotation, and the accuracy of bioinformatic analyses derived from it. Currently-available methods of computational gene prediction, while on the whole successful, frequentl...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-018-4704-z

    authors: Dunne MP,Kelly S

    update_date:2018-04-27 00:00:00

  • Comparative evolutionary genomics of the HADH2 gene encoding Abeta-binding alcohol dehydrogenase/17beta-hydroxysteroid dehydrogenase type 10 (ABAD/HSD10).

    abstract:BACKGROUND:The Abeta-binding alcohol dehydrogenase/17beta-hydroxysteroid dehydrogenase type 10 (ABAD/HSD10) is an enzyme involved in pivotal metabolic processes and in the mitochondrial dysfunction seen in the Alzheimer's disease. Here we use comparative genomic analyses to study the evolution of the HADH2 gene encodin...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-7-202

    authors: Marques AT,Antunes A,Fernandes PA,Ramos MJ

    update_date:2006-08-09 00:00:00

  • Transferring knowledge of bacterial protein interaction networks to predict pathogen targeted human genes and immune signaling pathways: a case study on M. tuberculosis.

    abstract:BACKGROUND:Bacterial invasive infection and host immune response is fundamental to the understanding of pathogen pathogenesis and the discovery of effective therapeutic drugs. However, there are very few experimental studies on the signaling cross-talks between bacteria and human host to date. METHODS:In this work, ta...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-018-4873-9

    authors: Mei S,Flemington EK,Zhang K

    update_date:2018-06-28 00:00:00

  • Analysis of plant-derived miRNAs in animal small RNA datasets.

    abstract:BACKGROUND:Plants contain significant quantities of small RNAs (sRNAs) derived from various sRNA biogenesis pathways. Many of these sRNAs play regulatory roles in plants. Previous analysis revealed that numerous sRNAs in corn, rice and soybean seeds have high sequence similarity to animal genes. However, exogenous RNA ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-13-381

    authors: Zhang Y,Wiggins BE,Lawrence C,Petrick J,Ivashuta S,Heck G

    update_date:2012-08-08 00:00:00

  • Sequencing of a QTL-rich region of the Theobroma cacao genome using pooled BACs and the identification of trait specific candidate genes.

    abstract:BACKGROUND:BAC-based physical maps provide for sequencing across an entire genome or a selected sub-genomic region of biological interest. Such a region can be approached with next-generation whole-genome sequencing and assembly as if it were an independent small genome. Using the minimum tiling path as a guide, specif...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-12-379

    authors: Feltus FA,Saski CA,Mockaitis K,Haiminen N,Parida L,Smith Z,Ford J,Staton ME,Ficklin SP,Blackmon BP,Cheng CH,Schnell RJ,Kuhn DN,Motamayor JC

    update_date:2011-07-27 00:00:00

  • The carotenoid biosynthetic and catabolic genes in wheat and their association with yellow pigments.

    abstract:BACKGROUND:In plants carotenoids play an important role in the photosynthetic process and photo-oxidative protection, and are the substrate for the synthesis of abscisic acid and strigolactones. In addition to their protective role as antioxidants and precursors of vitamin A, in wheat carotenoids are important as they ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-016-3395-6

    authors: Colasuonno P,Lozito ML,Marcotuli I,Nigro D,Giancaspro A,Mangini G,De Vita P,Mastrangelo AM,Pecchioni N,Houston K,Simeone R,Gadaleta A,Blanco A

    update_date:2017-01-31 00:00:00

  • A multispecies comparison of the metazoan 3'-processing downstream elements and the CstF-64 RNA recognition motif.

    abstract:BACKGROUND:The Cleavage Stimulation Factor (CstF) is a required protein complex for eukaryotic mRNA 3'-processing. CstF interacts with 3'-processing downstream elements (DSEs) through its 64-kDa subunit, CstF-64; however, the exact nature of this interaction has remained unclear. We used EST-to-genome alignments to ide...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-7-55

    authors: Salisbury J,Hutchison KW,Graber JH

    update_date:2006-03-16 00:00:00

  • Drosophila melanogaster retrotransposon and inverted repeat-derived endogenous siRNAs are differentially processed in distinct cellular locations.

    abstract:BACKGROUND:Endogenous small interfering (esi)RNAs repress mRNA levels and retrotransposon mobility in Drosophila somatic cells by poorly understood mechanisms. 21 nucleotide esiRNAs are primarily generated from retrotransposons and two inverted repeat (hairpin) loci in Drosophila culture cells in a Dicer2 dependent man...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-017-3692-8

    authors: Harrington AW,McKain MR,Michalski D,Bauer KM,Daugherty JM,Steiniger M

    update_date:2017-04-17 00:00:00

  • Genome sequencing and assessment of plant growth-promoting properties of a Serratia marcescens strain isolated from vermicompost.

    abstract:BACKGROUND:Plant-bacteria associations have been extensively studied for their potential in increasing crop productivity in a sustainable manner. Serratia marcescens is a species of Enterobacteriaceae found in a wide range of environments, including soil. RESULTS:Here we describe the genome sequencing and assessment o...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-018-5130-y

    authors: Matteoli FP,Passarelli-Araujo H,Reis RJA,da Rocha LO,de Souza EM,Aravind L,Olivares FL,Venancio TM

    update_date:2018-10-16 00:00:00