Abstract:
BACKGROUND: Genome-wide association studies (GWAS) have identified many common polymorphisms associated with complex traits. However, these associated common variants explain only a small fraction of the phenotypic variances, leaving a substantial portion of genetic heritability unexplained. As a result, searches for "missing" heritability are drawing increasing attention, particularly for rare variant studies that often require a large sample size and, thus, extensive sequencing effort. Although the development of next generation sequencing (NGS) technologies has made it possible to sequence a large number of reads economically and efficiently, it is still often cost prohibitive to sequence thousands of individuals that are generally required for association studies. A more efficient and cost-effective design would involve pooling the genetic materials of multiple individuals together and then sequencing the pools, instead of the individuals. This pooled sequencing approach has improved the plausibility of association studies for rare variants, while, at the same time, posed a great challenge to the pooled sequencing data analysis, essentially because individual sample identity is lost, and NGS sequencing errors could be hard to distinguish from low frequency alleles.
RESULTS: A unified approach for estimating minor allele frequency, SNP calling and association studies based on pooled sequencing data using an expectation maximization (EM) algorithm is developed in this paper. This approach makes it possible to study the effects of minor allele frequency, sequencing error rate, number of pools, number of individuals in each pool, and the sequencing depth on the estimation accuracy of minor allele frequencies. We show that the naive method of estimating minor allele frequencies by taking the fraction of observed minor alleles can be significantly biased, especially for rare variants. In contrast, our EM approach can give an unbiased estimate of the minor allele frequency under all scenarios studied in this paper. A SNP calling approach, EM-SNP, for pooled sequencing data based on the EM algorithm is then developed and compared with another recent SNP calling method, SNVer. We show that EM-SNP outperforms SNVer in terms of the fraction of db-SNPs among the called SNPs, as well as transition/transversion (Ti/Tv) ratio. Finally, the EM approach is used to study the association between variants and type I diabetes.
CONCLUSIONS: The EM-based approach for the analysis of pooled sequencing data can accurately estimate minor allele frequencies, call SNPs, and find associations between variants and complex traits. This approach is especially useful for studies involving rare variants.
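The contrast between the naive read fraction and an EM estimate can be illustrated with a small sketch. This is not the authors' EM-SNP implementation, whose details are not given in the abstract. It assumes a deliberately simplified model: a biallelic site, a known symmetric sequencing error rate `err`, pools that each contain `n_chrom` chromosomes, and a binomial distribution for both the number of minor-allele chromosomes per pool and the number of minor-allele reads. All function names, parameters, and the toy data are hypothetical.

```python
import math


def read_minor_prob(m, n_chrom, err):
    """Probability that one read shows the minor allele when m of the
    n_chrom pooled chromosomes carry it and bases flip with rate err."""
    frac = m / n_chrom
    return frac * (1.0 - err) + (1.0 - frac) * err


def log_binom_pmf(k, n, p):
    """Log of the binomial probability of k successes in n trials."""
    log_comb = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_comb + k * math.log(p) + (n - k) * math.log(1.0 - p)


def naive_maf(minor_reads, depths):
    """Naive estimate: fraction of all reads that show the minor allele."""
    return sum(minor_reads) / sum(depths)


def em_maf(minor_reads, depths, n_chrom, err, n_iter=500, tol=1e-12):
    """EM estimate of the minor allele frequency under the assumed model.

    The latent variable is the number of minor-allele chromosomes in
    each pool; err is treated as known (an assumption of this sketch).
    """
    if not 0.0 < err < 0.5:
        raise ValueError("err must lie strictly between 0 and 0.5")
    eps = 1e-9
    f = min(max(naive_maf(minor_reads, depths), eps), 1.0 - eps)
    for _ in range(n_iter):
        expected_minor = 0.0
        for x, d in zip(minor_reads, depths):
            # E-step: posterior over the pool's minor-allele count m,
            # computed in log space to avoid underflow at high depth.
            log_w = []
            for m in range(n_chrom + 1):
                q = read_minor_prob(m, n_chrom, err)
                log_w.append(
                    log_binom_pmf(m, n_chrom, f)
                    + x * math.log(q)
                    + (d - x) * math.log(1.0 - q)
                )
            top = max(log_w)
            w = [math.exp(v - top) for v in log_w]
            expected_minor += sum(m * wm for m, wm in enumerate(w)) / sum(w)
        # M-step: expected minor-allele chromosomes over all chromosomes.
        new_f = expected_minor / (len(depths) * n_chrom)
        new_f = min(max(new_f, eps), 1.0 - eps)
        converged = abs(new_f - f) < tol
        f = new_f
        if converged:
            break
    return f


# Toy data (hypothetical): 10 pools of 20 diploid individuals, depth 1000.
# Nine pools carry no minor allele, so their 10 minor reads are pure error.
# One pool carries 4 minor-allele chromosomes, so the sample MAF is 4/400.
n_chrom = 40
err = 0.01
depths = [1000] * 10
minor_reads = [10] * 9 + [108]

print("naive:", naive_maf(minor_reads, depths))  # 198 / 10000 = 0.0198
print("EM:   ", em_maf(minor_reads, depths, n_chrom, err))
```

In this toy data set the naive fraction is roughly double the sample frequency of 0.01, because error reads are counted as minor alleles. Under the stated assumptions, the EM estimate should fall much closer to 0.01. A fuller treatment would also estimate the error rate rather than fix it.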
journal_name: BMC Genomics
journal_title: BMC genomics
authors: Chen Q, Sun F
doi: 10.1186/1471-2164-14-S1-S1
subject: Has Abstract
pub_date: 2013-01-01 00:00:00
pages: S1
issn: 1471-2164
pii: 1471-2164-14-S1-S1
journal_volume: 14 Suppl 1
pub_type: Journal article

Related literature
BMC Genomics literature collection
abstract:BACKGROUND:The Bacillus cereus sensu lato group currently includes seven species (B. cereus, B. anthracis, B. mycoides, B. pseudomycoides, B. thuringiensis, B. weihenstephanensis and B. cytotoxicus) that recent phylogenetic and phylogenomic analyses suggest are likely a single species, despite their varied phenotypes. ...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-13-564
updated:2012-10-22 00:00:00
abstract:BACKGROUND:The genus Populus includes poplars, aspens and cottonwoods, which will be collectively referred to as poplars hereafter unless otherwise specified. Poplars are the dominant tree species in many forest ecosystems in the Northern Hemisphere and are of substantial economic value in plantation forestry. Poplar h...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-9-57
updated:2008-01-29 00:00:00
abstract:BACKGROUND:Drought is a lifestyle disease. Plant metabolomics has been exercised for understanding the fine-tuning of the potential pathways to surmount the adverse effects of drought stress. A broad spectrum of morphological and metabolic responses from seven Triticeae species including wild types with different droug...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-017-4321-2
updated:2017-12-15 00:00:00
abstract:BACKGROUND:Cold tolerance is a key determinant of the geographical distribution range of a plant species and crop production. Cold acclimation can enhance freezing-tolerance of plant species through a period of exposure to low nonfreezing temperatures. As a subtropical evergreen broadleaf plant, oil-tea camellia demons...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-017-3570-4
updated:2017-02-28 00:00:00
abstract:BACKGROUND:Tumor angiogenesis is a highly regulated process involving intercellular communication as well as the interactions of multiple downstream signal transduction pathways. Disrupting one or even a few angiogenesis pathways is often insufficient to achieve sustained therapeutic benefits due to the complexity of a...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-9-264
updated:2008-06-02 00:00:00
abstract:BACKGROUND:Compromised intestinal barrier (CIB) has been associated with many enteropathies, including colorectal cancer (CRC) and inflammatory bowel disease (IBD). We hypothesized that CIB could lead to increased host-derived contents including epithelial cells into the gut, change its physio-metabolic properties, and...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-020-6749-z
updated:2020-05-11 00:00:00
abstract:BACKGROUND:Cytochrome P450s (CYPs) in animals fall into two categories: those that synthesize or metabolize endogenous molecules and those that interact with exogenous chemicals from the diet or the environment. The latter form a critical component of detoxification systems. RESULTS:Data mining and manual curation of ...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-10-169
updated:2009-04-21 00:00:00
abstract:BACKGROUND:Soybean cyst nematode (SCN) is the most economically devastating pathogen of soybean. Two resistance loci, Rhg1 and Rhg4 primarily contribute resistance to SCN race 3 in soybean. Peking and PI 88788 are the two major sources of SCN resistance with Peking requiring both Rhg1 and Rhg4 alleles and PI 88788 only...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-015-1531-3
updated:2015-04-18 00:00:00
abstract:BACKGROUND:Effective bioinformatics solutions are needed to tackle challenges posed by industrial-scale genome annotation. We present Bcheck, a wrapper tool which predicts RNase P RNA genes by combining the speed of pattern matching and sensitivity of covariance models. The core of Bcheck is a library of subfamily spec...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-11-432
updated:2010-07-13 00:00:00
abstract:BACKGROUND:Calcium-dependent protein kinase (CPK) is one of the main Ca2+ combined protein kinase that play significant roles in plant growth, development and response to multiple stresses. Despite an important member of the stress responsive gene family, little is known about the evolutionary history and expression pa...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-020-6501-8
updated:2020-01-23 00:00:00
abstract:BACKGROUND:Fatty liver is a high incidence of perinatal disease in dairy cows caused by negative energy balance, which seriously threatens the postpartum health and milk production. It has been reported that lysine acetylation plays an important role in substance and energy metabolism. Predictably, most metabolic proce...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-020-06837-y
updated:2020-06-26 00:00:00
abstract:BACKGROUND:Though Illumina has largely dominated the RNA-Seq field, the simultaneous availability of Ion Torrent has left scientists wondering which platform is most effective for differential gene expression (DGE) analysis. Previous investigations of this question have typically used reference samples derived from cel...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-017-4011-0
updated:2017-08-10 00:00:00
abstract:BACKGROUND:Large-scale comparison of metazoan genomes has revealed that a significant fraction of genes of the last common ancestor of Bilateria (Urbilateria) is lost in each animal lineage. This event could be one of the underlying mechanisms involved in generating metazoan diversity. However, the present functions of...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-10-17
updated:2009-01-12 00:00:00
abstract:BACKGROUND:In livestock species like the chicken, high throughput single nucleotide polymorphism (SNP) genotyping assays are increasingly being used for whole genome association studies and as a tool in breeding (referred to as genomic selection). To be of value in a wide variety of breeds and populations, the success ...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-12-274
updated:2011-05-31 00:00:00
abstract:BACKGROUND:Controlling and managing the breeding of bluefin tuna (Thunnus spp.) in captivity is an imperative step towards obtaining a sustainable supply of these fish in aquaculture production systems. Germ cell transplantation (GCT) is an innovative technology for the production of inter-species surrogates, by transp...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-016-2397-8
updated:2016-03-10 00:00:00
abstract:BACKGROUND:Despite the importance of chromosomal translocations in the initiation and/or progression of cancer, a comprehensive catalog of translocation breakpoints in which these are precisely located on the reference sequence of the human genome is not available at present. DESCRIPTION:We have created a database tha...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-8-33
updated:2007-01-26 00:00:00
abstract:BACKGROUND:Anthocyanins are a group of flavonoid compounds. As a group of important secondary metabolites, they perform several key biological functions in plants. Anthocyanins also play beneficial health roles as potentially protective factors against cancer and heart disease. To elucidate the anthocyanin biosynthetic...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-15-426
updated:2014-06-04 00:00:00
abstract:BACKGROUND:An efficient signal transduction system allows a bacterium to sense environmental cues and then to respond positively or negatively to those signals; this process is referred to as taxis. In addition to external cues, the internal metabolic state of any bacterium plays a major role in determining its ability...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-018-5151-6
updated:2018-10-19 00:00:00
abstract:BACKGROUND:Runs of Homozygosity (ROH) are genomic regions where identical haplotypes are inherited from each parent. Since their first detection due to technological advances in the late 1990s, ROHs have been shedding light on human population history and deciphering the genetic basis of monogenic and complex traits an...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-018-4489-0
updated:2018-01-30 00:00:00
abstract:BACKGROUND:Tandem mass spectrometry (MS/MS) has become a standard method for identification of proteins extracted from biological samples but the huge number and the noise contamination of MS/MS spectra obstruct swift and reliable computer-aided interpretation. Typically, a minor fraction of the spectra per sample (mos...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-11-S1-S13
updated:2010-02-10 00:00:00
abstract:BACKGROUND:Aberrant DNA methylation is a hallmark of many cancers. Classically there are two types of endometrial cancer, endometrioid adenocarcinoma (EAC), or Type I, and uterine papillary serous carcinoma (UPSC), or Type II. However, the whole genome DNA methylation changes in these two classical types of endometrial...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-15-868
updated:2014-10-06 00:00:00
abstract:BACKGROUND:The Spemann/Mangold organizer is a transient tissue critical for patterning the gastrula stage vertebrate embryo and formation of the three germ layers. Despite its important role during development, there are still relatively few genes with specific expression in the organizer and its derivatives. Foxa2 is ...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-9-511
updated:2008-10-30 00:00:00
abstract:BACKGROUND:Rapid acquisition of accurate genotyping information is essential for all genetic marker-based studies. For species with relatively small genomes, complete genome resequencing is a feasible approach for genotyping; however, for species with large and highly repetitive genomes, the acquisition of whole genome...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-14-448
updated:2013-07-05 00:00:00
abstract:BACKGROUND:We have previously reported on paucity of mitochondrial DNA (mtDNA) haplogroups J and K among Finnish endurance athletes. Here we aimed to further explore differences in mtDNA variants between elite endurance and sprint athletes. For this purpose, we determined the rate of functional variants and the mutatio...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-019-6171-6
updated:2019-10-29 00:00:00
abstract:BACKGROUND:Xanthomonas citri pv. citri (Xcc) is a citrus canker causing Gram-negative bacteria. Currently, little is known about the biological and molecular responses of Xcc to low temperatures. RESULTS:Results depicted that low temperature significantly reduced growth and increased biofilm formation and unsaturated ...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-019-6193-0
updated:2019-11-06 00:00:00
abstract:BACKGROUND:Genomic approaches provide unique opportunities to study interactions of insects with their pathogens. We developed a cDNA microarray to analyze the gene transcription profile of the lepidopteran pest Spodoptera frugiperda in response to injection of the polydnavirus HdIV associated with the ichneumonid wasp...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-7-160
updated:2006-06-21 00:00:00
abstract:BACKGROUND:Analysis of single nucleotide polymorphisms (SNPs) derived from whole-genome studies allows for rapid evaluation of genome-wide diversity, and genomic epidemiology studies of Plasmodium falciparum provide insights into parasite population structure, gene flow, drug resistance and vaccine development. In area...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-15-719
updated:2014-08-26 00:00:00
abstract:BACKGROUND:Natural accessions of Arabidopsis thaliana are characterized by a high level of phenotypic variation that can be used to investigate the extent and mode of selection on the primary metabolic traits. A collection of 54 A. thaliana natural accession-derived lines were subjected to deep genotyping through Singl...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-11-188
updated:2010-03-20 00:00:00
abstract:BACKGROUND:Cryptocaryon irritans is an obligate parasitic ciliate protozoan that can infect various commercially important mariculture fish species and cause high lethality and economic loss. Current methods of controlling this parasite with chemicals or antibiotics are widely considered to be environmentally harmful. ...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-018-4565-5
updated:2018-03-12 00:00:00
abstract:BACKGROUND:To develop evolutionary models for the free living bacterium Alteromonas the genome sequences of isolates of the genus have been extensively analyzed. However, the main genetic exchange drivers in these microbes, conjugative elements (CEs), have not been considered in detail thus far. In this work, CEs have ...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-016-3461-0
updated:2017-01-05 00:00:00