The effect of rare variants on inflation of the test statistics in case-control analyses.

Abstract:

BACKGROUND: The detection of bias due to cryptic population structure is an important step in the evaluation of findings of genetic association studies. The standard method of measuring this bias in a genetic association study is to compare the observed median association test statistic to the expected median test statistic. This ratio is inflated in the presence of cryptic population structure. However, inflation may also be caused by the properties of the association test itself, particularly in the analysis of rare variants. Using simulated data, we compared the properties of the three most commonly used association tests (the likelihood ratio test, the Wald test and the score test) when testing rare variants for association. RESULTS: We found evidence of inflation in the median test statistics of the likelihood ratio and score tests for variants with fewer than 20 heterozygotes across the sample, regardless of the total sample size. The test statistics for the Wald test were under-inflated at the median for variants below the same threshold. CONCLUSIONS: In a genetic association study, if a substantial proportion of the genetic variants tested have low minor allele frequencies, the properties of the association test may mask the presence or absence of bias due to population structure. The use of either the likelihood ratio test or the score test is likely to lead to inflation in the median test statistic in the absence of population structure. In contrast, the use of the Wald test is likely to result in under-inflation of the median test statistic, which may mask the presence of population structure.
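
The median-ratio check described in the abstract can be illustrated with a minimal Python sketch. It assumes the association test statistics follow a chi-square distribution with 1 degree of freedom under the null; the function name `inflation_factor` and the simulated null statistics are hypothetical and are not taken from the paper.

```python
import random
from statistics import NormalDist, median

# Expected median of a 1-df chi-square distribution (assumed null here).
# A 1-df chi-square variable is a squared standard normal, so its median
# is the square of the normal 75th percentile (about 0.455).
EXPECTED_MEDIAN_CHI2_1DF = NormalDist().inv_cdf(0.75) ** 2


def inflation_factor(test_statistics):
    """Ratio of the observed median test statistic to the expected median."""
    return median(test_statistics) / EXPECTED_MEDIAN_CHI2_1DF


# Hypothetical example: statistics drawn under the null, so the ratio
# should be close to 1 (no inflation).
rng = random.Random(0)
null_stats = [rng.gauss(0.0, 1.0) ** 2 for _ in range(20000)]
lam = inflation_factor(null_stats)
print(f"inflation factor under the null: {lam:.3f}")
```

A ratio well above 1 suggests inflation and a ratio below 1 suggests under-inflation. As the abstract notes, for rare variants either pattern may reflect the properties of the test rather than population structure.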

journal_name

BMC Bioinformatics

journal_title

BMC bioinformatics

authors

Pirie A,Wood A,Lush M,Tyrer J,Pharoah PD

doi

10.1186/s12859-015-0496-1

subject

Has Abstract

pub_date

2015-02-20 00:00:00

pages

53

issn

1471-2105

pii

10.1186/s12859-015-0496-1

journal_volume

16

pub_type

Journal Article
  • Quantitative prediction of the effect of genetic variation using hidden Markov models.

    abstract:BACKGROUND:With the development of sequencing technologies, more and more sequence variants are available for investigation. Different classes of variants in the human genome have been identified, including single nucleotide substitutions, insertion and deletion, and large structural variations such as duplications and...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-15-5

    authors: Liu M,Watson LT,Zhang L

    updated:2014-01-09 00:00:00

  • Multi-resolution independent component analysis for high-performance tumor classification and biomarker discovery.

    abstract:BACKGROUND:Although high-throughput microarray based molecular diagnostic technologies show a great promise in cancer diagnosis, it is still far from a clinical application due to its low and instable sensitivities and specificities in cancer molecular pattern recognition. In fact, high-dimensional and heterogeneous tu...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-12-S1-S7

    authors: Han H,Li XL

    updated:2011-02-15 00:00:00

  • Structural analysis on mutation residues and interfacial water molecules for human TIM disease understanding.

    abstract:BACKGROUND:Human triosephosphate isomerase (HsTIM) deficiency is a genetic disease caused often by the pathogenic mutation E104D. This mutation, located at the side of an abnormally large cluster of water in the inter-subunit interface, reduces the thermostability of the enzyme. Why and how these water molecules are di...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-14-S16-S11

    authors: Li Z,He Y,Liu Q,Zhao L,Wong L,Kwoh CK,Nguyen H,Li J

    updated:2013-01-01 00:00:00

  • πBUSS: a parallel BEAST/BEAGLE utility for sequence simulation under complex evolutionary scenarios.

    abstract:BACKGROUND:Simulated nucleotide or amino acid sequences are frequently used to assess the performance of phylogenetic reconstruction methods. BEAST, a Bayesian statistical framework that focuses on reconstructing time-calibrated molecular evolutionary processes, supports a wide array of evolutionary models, but lacked ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-15-133

    authors: Bielejec F,Lemey P,Carvalho LM,Baele G,Rambaut A,Suchard MA

    updated:2014-05-07 00:00:00

  • OmicsARules: a R package for integration of multi-omics datasets via association rules mining.

    abstract:BACKGROUND:The improvements of high throughput technologies have produced large amounts of multi-omics experiments datasets. Initial analysis of these data has revealed many concurrent gene alterations within single dataset or/and among multiple omics datasets. Although powerful bioinformatics pipelines have been devel...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-3171-0

    authors: Chen D,Zhang F,Zhao Q,Xu J

    updated:2019-11-08 00:00:00

  • Correction to: Effective machine-learning assembly for next-generation amplicon sequencing with very low coverage.

    abstract:Following publication of the original article [1], the author reported that there are several errors in the original article. ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article, Published Erratum

    doi:10.1186/s12859-019-3318-z

    authors: Ranjard L,Wong TKF,Rodrigo AG

    updated:2020-01-22 00:00:00

  • An evidence-based approach to identify aging-related genes in Caenorhabditis elegans.

    abstract:BACKGROUND:Extensive studies have been carried out on Caenorhabditis elegans as a model organism to elucidate mechanisms of aging and the effects of perturbing known aging-related genes on lifespan and behavior. This research has generated large amounts of experimental data that is increasingly difficult to integrate a...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-015-0469-4

    authors: Callahan A,Cifuentes JJ,Dumontier M

    updated:2015-02-07 00:00:00

  • Fast discovery and visualization of conserved regions in DNA sequences using quasi-alignment.

    abstract:BACKGROUND:Next Generation Sequencing techniques are producing enormous amounts of biological sequence data and analysis becomes a major computational problem. Currently, most analysis, especially the identification of conserved regions, relies heavily on Multiple Sequence Alignment and its various heuristics such as p...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-14-S11-S2

    authors: Nagar A,Hahsler M

    updated:2013-01-01 00:00:00

  • Ciruvis: a web-based tool for rule networks and interaction detection using rule-based classifiers.

    abstract:BACKGROUND:The use of classification algorithms is becoming increasingly important for the field of computational biology. However, not only the quality of the classification, but also its biological interpretation is important. This interpretation may be eased if interacting elements can be identified and visualized, ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-15-139

    authors: Bornelöv S,Marillet S,Komorowski J

    updated:2014-05-12 00:00:00

  • BLAST+: architecture and applications.

    abstract:BACKGROUND:Sequence similarity searching is a very important bioinformatics task. While Basic Local Alignment Search Tool (BLAST) outperforms exact methods through its use of heuristics, the speed of the current BLAST software is suboptimal for very long queries or database sequences. There are also some shortcomings i...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-10-421

    authors: Camacho C,Coulouris G,Avagyan V,Ma N,Papadopoulos J,Bealer K,Madden TL

    updated:2009-12-15 00:00:00

  • A novel statistical approach for identification of the master regulator transcription factor.

    abstract:BACKGROUND:Transcription factors are known to play key roles in carcinogenesis and therefore, are gaining popularity as potential therapeutic targets in drug development. A 'master regulator' transcription factor often appears to control most of the regulatory activities of the other transcription factors and the assoc...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-017-1499-x

    authors: Sikdar S,Datta S

    updated:2017-02-02 00:00:00

  • proTRAC--a software for probabilistic piRNA cluster detection, visualization and analysis.

    abstract:BACKGROUND:Throughout the metazoan lineage, typically gonadal expressed Piwi proteins and their guiding piRNAs (~26-32nt in length) form a protective mechanism of RNA interference directed against the propagation of transposable elements (TEs). Most piRNAs are generated from genomic piRNA clusters. Annotation of experi...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-13-5

    authors: Rosenkranz D,Zischler H

    updated:2012-01-10 00:00:00

  • GOPET: a tool for automated predictions of Gene Ontology terms.

    abstract:BACKGROUND:Vast progress in sequencing projects has called for annotation on a large scale. A number of methods have been developed to address this challenging task. These methods, however, either apply to specific subsets, or their predictions are not formalised, or they do not provide precise confidence values for th...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-7-161

    authors: Vinayagam A,del Val C,Schubert F,Eils R,Glatting KH,Suhai S,König R

    updated:2006-03-20 00:00:00

  • PubFocus: semantic MEDLINE/PubMed citations analytics through integration of controlled biomedical dictionaries and ranking algorithm.

    abstract:BACKGROUND:Understanding research activity within any given biomedical field is important. Search outputs generated by MEDLINE/PubMed are not well classified and require lengthy manual citation analysis. Automation of citation analytics can be very useful and timesaving for both novices and experts. RESULTS:PubFocus w...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-7-424

    authors: Plikus MV,Zhang Z,Chuong CM

    updated:2006-10-02 00:00:00

  • Web-TCGA: an online platform for integrated analysis of molecular cancer data sets.

    abstract:BACKGROUND:The Cancer Genome Atlas (TCGA) is a pool of molecular data sets publicly accessible and freely available to cancer researchers anywhere around the world. However, wide spread use is limited since an advanced knowledge of statistics and statistical software is required. RESULTS:In order to improve accessibil...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-016-0917-9

    authors: Deng M,Brägelmann J,Schultze JL,Perner S

    updated:2016-02-06 00:00:00

  • Connectivity independent protein-structure alignment: a hierarchical approach.

    abstract:BACKGROUND:Protein-structure alignment is a fundamental tool to study protein function, evolution and model building. In the last decade several methods for structure alignment were introduced, but most of them ignore that structurally similar proteins can share the same spatial arrangement of secondary structure eleme...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-7-510

    authors: Kolbeck B,May P,Schmidt-Goenner T,Steinke T,Knapp EW

    updated:2006-11-21 00:00:00

  • Parameterizing sequence alignment with an explicit evolutionary model.

    abstract:BACKGROUND:Inference of sequence homology is inherently an evolutionary question, dependent upon evolutionary divergence. However, the insertion and deletion penalties in the most widely used methods for inferring homology by sequence alignment, including BLAST and profile hidden Markov models (profile HMMs), are not b...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-015-0832-5

    authors: Rivas E,Eddy SR

    updated:2015-12-10 00:00:00

  • Modeling of shotgun sequencing of DNA plasmids using experimental and theoretical approaches.

    abstract:BACKGROUND:Processing and analysis of DNA sequences obtained from next-generation sequencing (NGS) face some difficulties in terms of the correct prediction of DNA sequencing outcomes without the implementation of bioinformatics approaches. However, algorithms based on NGS perform inefficiently due to the generation of...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-3461-6

    authors: Shityakov S,Bencurova E,Förster C,Dandekar T

    updated:2020-04-03 00:00:00

  • Bayesian detection of periodic mRNA time profiles without use of training examples.

    abstract:BACKGROUND:Detection of periodically expressed genes from microarray data without use of known periodic and non-periodic training examples is an important problem, e.g. for identifying genes regulated by the cell-cycle in poorly characterised organisms. Commonly the investigator is only interested in genes expressed at...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-7-63

    authors: Andersson CR,Isaksson A,Gustafsson MG

    updated:2006-02-09 00:00:00

  • Graph-based prediction of Protein-protein interactions with attributed signed graph embedding.

    abstract:BACKGROUND:Protein-protein interactions (PPIs) are central to many biological processes. Considering that the experimental methods for identifying PPIs are time-consuming and expensive, it is important to develop automated computational methods to better predict PPIs. Various machine learning methods have been proposed...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03646-8

    authors: Yang F,Fan K,Song D,Lin H

    updated:2020-07-21 00:00:00

  • MapMi: automated mapping of microRNA loci.

    abstract:BACKGROUND:A large effort to discover microRNAs (miRNAs) has been under way. Currently miRBase is their primary repository, providing annotations of primary sequences, precursors and probable genomic loci. In many cases miRNAs are identical or very similar between related (or in some cases more distant) species. Howeve...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-133

    authors: Guerra-Assunção JA,Enright AJ

    updated:2010-03-16 00:00:00

  • Defiant: (DMRs: easy, fast, identification and ANnoTation) identifies differentially Methylated regions from iron-deficient rat hippocampus.

    abstract:BACKGROUND:Identification of differentially methylated regions (DMRs) is the initial step towards the study of DNA methylation-mediated gene regulation. Previous approaches to call DMRs suffer from false prediction, use extreme resources, and/or require library installation and input conversion. RESULTS:We developed a...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-018-2037-1

    authors: Condon DE,Tran PV,Lien YC,Schug J,Georgieff MK,Simmons RA,Won KJ

    updated:2018-02-05 00:00:00

  • The textual characteristics of traditional and Open Access scientific journals are similar.

    abstract:BACKGROUND:Recent years have seen an increased amount of natural language processing (NLP) work on full text biomedical journal publications. Much of this work is done with Open Access journal articles. Such work assumes that Open Access articles are representative of biomedical publications in general and that methods...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-10-183

    authors: Verspoor K,Cohen KB,Hunter L

    updated:2009-06-15 00:00:00

  • A computational approach for detecting peptidases and their specific inhibitors at the genome level.

    abstract:BACKGROUND:Peptidases are proteolytic enzymes responsible for fundamental cellular activities in all organisms. Apparently about 2-5% of the genes encode for peptidases, irrespectively of the organism source. The basic peptidase function is "protein digestion" and this can be potentially dangerous in living organisms w...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-8-S1-S3

    authors: Bartoli L,Calabrese R,Fariselli P,Mita DG,Casadio R

    updated:2007-03-08 00:00:00

  • Identifying module biomarker in type 2 diabetes mellitus by discriminative area of functional activity.

    abstract:BACKGROUND:Identifying diagnosis and prognosis biomarkers from expression profiling data is of great significance for achieving personalized medicine and designing therapeutic strategy in complex diseases. However, the reproducibility of identified biomarkers across tissues and experiments is still a challenge for this...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-015-0519-y

    authors: Zhang X,Gao L,Liu ZP,Chen L

    updated:2015-03-18 00:00:00

  • Automatic design of decision-tree induction algorithms tailored to flexible-receptor docking data.

    abstract:BACKGROUND:This paper addresses the prediction of the free energy of binding of a drug candidate with enzyme InhA associated with Mycobacterium tuberculosis. This problem is found within rational drug design, where interactions between drug candidates and target proteins are verified through molecular docking simulatio...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-13-310

    authors: Barros RC,Winck AT,Machado KS,Basgalupp MP,de Carvalho AC,Ruiz DD,de Souza ON

    updated:2012-11-21 00:00:00

  • Identification of sequence motifs significantly associated with antisense activity.

    abstract:BACKGROUND:Predicting the suppression activity of antisense oligonucleotide sequences is the main goal of the rational design of nucleic acids. To create an effective predictive model, it is important to know what properties of an oligonucleotide sequence associate significantly with antisense activity. Also, for the m...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-8-184

    authors: McQuisten KA,Peek AS

    updated:2007-06-07 00:00:00

  • TooT-T: discrimination of transport proteins from non-transport proteins.

    abstract:BACKGROUND:Membrane transport proteins (transporters) play an essential role in every living cell by transporting hydrophilic molecules across the hydrophobic membranes. While the sequences of many membrane proteins are known, their structure and function is still not well characterized and understood, owing to the imm...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-3311-6

    authors: Alballa M,Butler G

    updated:2020-04-23 00:00:00

  • FITBAR: a web tool for the robust prediction of prokaryotic regulons.

    abstract:BACKGROUND:The binding of regulatory proteins to their specific DNA targets determines the accurate expression of the neighboring genes. The in silico prediction of new binding sites in completely sequenced genomes is a key aspect in the deeper understanding of gene regulatory networks. Several algorithms have been des...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-554

    authors: Oberto J

    updated:2010-11-11 00:00:00

  • Discrimination of cell cycle phases in PCNA-immunolabeled cells.

    abstract:BACKGROUND:Protein function in eukaryotic cells is often controlled in a cell cycle-dependent manner. Therefore, the correct assignment of cellular phenotypes to cell cycle phases is a crucial task in cell biology research. Nuclear proteins whose localization varies during the cell cycle are valuable and frequently use...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-015-0618-9

    authors: Schönenberger F,Deutzmann A,Ferrando-May E,Merhof D

    updated:2015-05-29 00:00:00