Abstract:
BACKGROUND: As crucial markers for identifying biological elements and processes in mammalian genomes, CpG islands (CGIs) play important roles in DNA methylation, gene regulation, epigenetic inheritance, gene mutation, chromosome inactivation and nucleosome retention. The generally accepted criteria for a CGI are: (a) G+C content ≥ 50%, (b) a ratio of observed to expected CpG content ≥ 0.6, and (c) a length greater than 200 nucleotides. Most existing computational methods for CpG island prediction are built on these rules. However, many experimentally verified CpG islands deviate from these artificial criteria. Experiments indicate that in many cases G+C content is < 50%, CpG obs/CpG exp varies, and CGI length ranges from eight nucleotides to a few thousand nucleotides. This implies that CGI detection is not a purely statistical task and that some as yet unrevealed rules are probably hidden. RESULTS: A novel Gaussian model, GaussianCpG, is developed for the detection of CpG islands in the human genome. We analyze the energy distribution over the genomic primary structure for each CpG site and adopt parameters from statistics of the human genome. The evaluation results show that the new model predicts CpG islands efficiently, balancing sensitivity and specificity on known human CGI data sets. Compared with other models, GaussianCpG achieves better performance in CGI detection. CONCLUSIONS: Our Gaussian model aims to simplify the complex interaction between nucleotides. The model is computed not by a linear statistical method but by Gaussian energy distribution and accumulation. The parameters of the Gaussian function are not arbitrarily designated but deliberately chosen by optimizing the biological statistics. Using pseudopotential analysis on CpG islands, the novel model is validated on both real and artificial data sets.
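The three classical criteria listed in the BACKGROUND can be checked with a few lines of code. The sketch below is only an illustration of those rule-based criteria, not of GaussianCpG itself, whose energy function and parameters are not given in the abstract. The function names `cpg_stats` and `meets_classic_criteria` are hypothetical, and the expected CpG count is assumed to be (number of C × number of G) / sequence length, a commonly used definition that the abstract does not state.

```python
def cpg_stats(seq):
    """Return (G+C fraction, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c = seq.count("C")
    g = seq.count("G")
    observed = seq.count("CG")  # CpG dinucleotides on this strand
    # Assumption: expected CpG count = (#C * #G) / length.
    expected = c * g / n
    ratio = observed / expected if expected > 0 else 0.0
    return (c + g) / n, ratio


def meets_classic_criteria(seq, min_gc=0.5, min_ratio=0.6, min_len=200):
    """Apply criteria (a)-(c): G+C >= 50%, obs/exp >= 0.6, length > 200 nt."""
    gc, ratio = cpg_stats(seq)
    return len(seq) > min_len and gc >= min_gc and ratio >= min_ratio


if __name__ == "__main__":
    candidate = "CG" * 101  # artificial 202-nt sequence, for illustration only
    print(cpg_stats(candidate))
    print(meets_classic_criteria(candidate))
```

Because the thresholds are plain keyword arguments, they can be relaxed to explore the experimentally verified islands that, as the abstract notes, fall outside these rules.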
journal_name: BMC Genomics
journal_title: BMC genomics
authors: Yu N, Guo X, Zelikovsky A, Pan Y
doi: 10.1186/s12864-017-3731-5
subject: Has Abstract
pub_date: 2017-05-24 00:00:00
pages: 392
issue: Suppl 4
issn: 1471-2164
pii: 10.1186/s12864-017-3731-5
journal_volume: 18
pub_type: Journal article
Related literature
BMC Genomics literature collection
abstract:BACKGROUND:Single-cell (sc) sequencing performs unbiased profiling of individual cells and enables evaluation of less prevalent cellular populations, often missed using bulk sequencing. However, the scale and the complexity of the sc datasets poses a great challenge in its utility and this problem is further exacerbate...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-020-07300-8
Updated: 2021-01-06 00:00:00
abstract:BACKGROUND:Human T cell leukemia virus type 1 (HTLV-1) Tax is a potent activator of viral and cellular gene expression that interacts with a number of cellular proteins. Many reports show that Tax is capable of regulating cell cycle progression and apoptosis both positively and negatively. However, it still remains to ...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-13-275
Updated: 2012-06-22 00:00:00
abstract:BACKGROUND:Clostridium sticklandii belongs to a cluster of non-pathogenic proteolytic clostridia which utilize amino acids as carbon and energy sources. Isolated by T.C. Stadtman in 1954, it has been generally regarded as a "gold mine" for novel biochemical reactions and is used as a model organism for studying metabol...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-11-555
Updated: 2010-10-11 00:00:00
abstract:BACKGROUND:By assaying hundreds of thousands of single nucleotide polymorphisms, genome wide association studies (GWAS) allow for a powerful, unbiased review of the entire genome to localize common genetic variants that influence health and disease. Although it is widely recognized that some correction for multiple tes...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-9-516
Updated: 2008-10-31 00:00:00
abstract:BACKGROUND:Alternative polyadenylation (APA) has emerged as a pervasive mechanism that contributes to the transcriptome complexity and dynamics of gene regulation. The current tsunami of whole genome poly(A) site data from various conditions generated by 3' end sequencing provides a valuable data source for the study o...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-019-5433-7
Updated: 2019-01-22 00:00:00
abstract:BACKGROUND:The human microbiome plays a significant role in maintaining normal physiology. Changes in its composition have been associated with bowel disease, metabolic disorders and atherosclerosis. Sequences of microbial origin have been observed within small RNA sequencing data obtained from blood samples. The aim o...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-15-933
Updated: 2014-10-25 00:00:00
abstract:BACKGROUND:The identification of cell type-specific genes (markers) is an essential step for the deconvolution of the cellular fractions, primarily, from the gene expression data of a bulk sample. However, the genes with significant changes identified by pair-wise comparisons cannot indeed represent the specificity of ...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-020-06888-1
Updated: 2020-09-23 00:00:00
abstract:BACKGROUND:The packaging of DNA into chromatin regulates transcription from initiation through 3' end processing. One aspect of transcription in which chromatin plays a poorly understood role is the co-transcriptional splicing of pre-mRNA. RESULTS:Here we provide evidence that H2B monoubiquitylation (H2BK123ub1) marks...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-12-627
Updated: 2011-12-22 00:00:00
abstract:UNLABELLED:Transcription-induced chimerism, a mechanism involving the transcription and intergenic splicing of two consecutive genes, has recently been estimated to account for approximately 5% of the human transcriptome. Despite this prevalence, the regulation and function of these fused transcripts remains largely un...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-8-348
Updated: 2007-10-01 00:00:00
abstract:BACKGROUND:Escherichia coli infections known as colibacillosis constitute a considerable challenge to poultry farmers worldwide, in terms of decreased animal welfare and production economy. Colibacillosis is caused by avian pathogenic E. coli (APEC). APEC strains are extraintestinal pathogenic E. coli and have in gener...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-016-3415-6
Updated: 2017-01-03 00:00:00
abstract:BACKGROUND:Prostaglandin E2 (PGE2) is involved in several chronic inflammatory diseases including periodontitis, which causes loss of the gingival tissue and alveolar bone supporting the teeth. We have previously shown that tumor necrosis factor alpha (TNFalpha) induces PGE2 synthesis in gingival fibroblasts. In this s...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-11-241
Updated: 2010-04-15 00:00:00
abstract:BACKGROUND:Identifying the set of intrinsically conserved genes, or the genomic core, among related genomes is crucial for understanding prokaryotic genomes where horizontal gene transfers are common. Although core genome identification appears to be obvious among very closely related genomes, it becomes more difficult...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-9-515
Updated: 2008-10-31 00:00:00
abstract:BACKGROUND:Pitayas are currently attracting considerable interest as a tropical fruit with numerous health benefits. However, as a long-day plant, pitaya plants cannot flower in the winter season from November to April in Hainan, China. To harvest pitayas with high economic value in the winter season, it is necessary t...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-020-6726-6
Updated: 2020-04-29 00:00:00
abstract:BACKGROUND:Betula platyphylla is a common tree species in northern China that has high economic and medicinal value. Our laboratory has been devoted to genome research on B. platyphylla for approximately 10 years. As primary organelle genomes, the complete genome sequences of chloroplasts are important to study the div...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-018-5346-x
Updated: 2018-12-20 00:00:00
abstract:BACKGROUND:The monophyly of flatfishes has not been supported in many molecular phylogenetic studies. The monophyly of Pleuronectoidei, which comprises all but one family of flatfishes, is broadly supported. However, the Psettodoidei, comprising the single family Psettodidae, is often found to be most closely related t...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-018-4788-5
Updated: 2018-05-25 00:00:00
abstract:BACKGROUND:Soybean seed weight is not only a yield component, but also a critical trait for various soybean food products such as sprouts, edamame, soy nuts, natto and miso. Linkage analysis and genome-wide association study (GWAS) are two complementary and powerful tools to connect phenotypic differences to the underl...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-017-3922-0
Updated: 2017-07-12 00:00:00
abstract:BACKGROUND:Tandem repeats are ubiquitous and abundant in higher eukaryotic genomes and constitute, along with transposable elements, much of DNA underlying centromeres and other heterochromatic domains. In maize, centromeric satellite repeat (CentC) and centromeric retrotransposons (CR), a class of Ty3/gypsy retrotrans...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-14-142
Updated: 2013-03-04 00:00:00
abstract:BACKGROUND:Horizontal gene transfer has shaped the evolution of the ammonium transporter/ammonia permease gene family. Horizontal transfers of ammonium transporter/ammonia permease genes into the fungi include one transfer from archaea to the filamentous ascomycetes associated with the adaptive radiation of the leotiom...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-14-225
Updated: 2013-04-04 00:00:00
abstract:BACKGROUND:Congenital heart disease (CHD) is the leading non-infectious cause of death in infants. Monozygotic (MZ) twins share nearly all of their genetic variants before and after birth. Nevertheless, MZ twins are sometimes discordant for common complex diseases. The goal of this study is to identify genomic and epig...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-018-4814-7
Updated: 2018-06-04 00:00:00
abstract:BACKGROUND:Progress in genetics and breeding in pea still suffers from the limited availability of molecular resources. SNP markers that can be identified through affordable sequencing processes, without the need for prior genome reduction or a reference genome to assemble sequencing data would allow the discovery and ...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-016-2447-2
Updated: 2016-02-18 00:00:00
abstract:BACKGROUND:The Bacillus cereus sensu lato group consists of six species (B. anthracis, B. cereus, B. mycoides, B. pseudomycoides, B. thuringiensis, and B. weihenstephanensis). While classical microbial taxonomy proposed these organisms as distinct species, newer molecular phylogenies and comparative genome sequencing s...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-12-430
Updated: 2011-08-24 00:00:00
abstract:BACKGROUND:During embryogenesis the liver is derived from endodermal cells lining the digestive tract. These endodermal progenitor cells contribute to forming the parenchyma of a number of organs including the liver and pancreas. Early in organogenesis the fetal liver is populated by hematopoietic stem cells, the sourc...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-13-33
Updated: 2012-01-19 00:00:00
abstract:BACKGROUND:Bifidobacterial genome analysis has provided insights as to how these gut commensals adapt to and persist in the human GIT, while also revealing genetic diversity among members of a given bifidobacterial (sub)species. Bifidobacteria are notoriously recalcitrant to genetic modification, which prevents explora...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-015-1968-4
Updated: 2015-10-21 00:00:00
abstract:BACKGROUND:In a previous genome-wide analysis of FXR binding to hepatic chromatin, we noticed that an extra nuclear receptor (NR) half-site was co-enriched close to the FXR binding IR-1 elements and we provided limited support that the monomeric LRH-1 receptor that binds to NR half-sites might function together with FX...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-13-51
Updated: 2012-02-01 00:00:00
abstract:BACKGROUND:The eyes and skin are obvious retinoid target organs. Vitamin A deficiency causes night blindness and retinoids are widely used to treat acne and psoriasis. However, more than 90% of total body retinol is stored in liver stellate cells. In addition, hepatocytes produce the largest amount of retinol binding p...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-14-575
Updated: 2013-08-28 00:00:00
abstract:BACKGROUND:The Pregnancy-associated glycoproteins (PAGs) belong to a large family of aspartic peptidases expressed exclusively in the placenta of species in the Artiodactyla order. In cattle, the PAG gene family is comprised of at least 22 transcribed genes, as well as some variants. Phylogenetic analyses have shown th...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-10-185
Updated: 2009-04-24 00:00:00
abstract:BACKGROUND:A fundamental requirement for genomic studies is the availability of genetic material of good quality and quantity. The desired quantity and quality are often hard to obtain when target DNA is composed of complex mixtures of relatively short DNA fragments. Here, we sought to develop a method to representativ...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-9-415
Updated: 2008-09-15 00:00:00
abstract:BACKGROUND:Abnormalities of pre-mRNA splicing are increasingly recognized as an important mechanism through which gene mutations cause disease. However, apart from the mutations in the donor and acceptor sites, the effects on splicing of other sequence variations are difficult to predict. Loosely defined exonic and int...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-7-243
Updated: 2006-09-22 00:00:00
abstract:BACKGROUND:Ammonia is one of the most common toxicological environment factors affecting shrimp health. Although ammonia tolerance in shrimp is closely related to successful industrial production, few genetic studies of this trait are available. RESULTS:In this study, we constructed a high-density genetic map of the P...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-020-07254-x
Updated: 2020-12-02 00:00:00
abstract:BACKGROUND:Calcineurin B-like protein (CBL)-interacting protein kinases (CIPKs) are the primary components of calcium sensors, and play crucial roles in plant developmental processes, hormone signaling transduction, and in the response to exogenous stresses. RESULTS:In this study, 48 CIPK genes (SsCIPKs) were identifi...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-020-07264-9
Updated: 2020-12-07 00:00:00