Abstract:
BACKGROUND: The amount of gene expression data available in public repositories has grown exponentially in recent years, and new data mining tools are now required to transform these data into information easily accessible to biologists. RESULTS: By exploiting expression data publicly available in the Gene Expression Omnibus (GEO) database, we developed a new bioinformatics tool aimed at identifying genes whose expression appears simultaneously altered in different experimental conditions, thus suggesting co-regulation or coordinated action in the same biological process. To accomplish this task, we used the 978 human GEO Curated DataSets and manually selected 2,109 pair-wise comparisons based on their biological rationale. The lists of differentially expressed genes obtained from the selected comparisons were stored in a PostgreSQL database and used as the data source for the CorrelaGenes tool. Our application uses a customized Association Rule Mining (ARM) algorithm to identify sets of genes showing expression profiles correlated with a gene of interest. The significance of the correlation is measured by coupling the Lift, a well-known standard ARM index, with the χ² p-value. The manually curated selection of the comparisons and the developed algorithm constitute a new approach in the field of gene expression profiling studies. A simulation performed on 100 randomly selected target genes allowed us to evaluate the efficiency of the procedure and to obtain preliminary data demonstrating the consistency of the results. CONCLUSIONS: The preliminary results of the simulation showed how CorrelaGenes could contribute to the characterization of molecular pathways and biological processes by integrating data obtained from other applications and available in public repositories.
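The abstract names two significance measures, the Lift and a χ² p-value, without giving formulas. As a rough illustration only, the sketch below computes the textbook versions of both for a single rule "target gene altered, so candidate gene altered" over a set of comparisons. It does not reproduce the customized CorrelaGenes algorithm, which the abstract does not describe. The function name `lift_and_chi2` and all counts except the 2,109 comparisons are hypothetical.

```python
import math


def lift_and_chi2(n_total, n_target, n_candidate, n_both):
    """Return (lift, chi2, p_value) for the rule target -> candidate.

    n_total     -- number of comparisons considered
    n_target    -- comparisons in which the target gene is altered
    n_candidate -- comparisons in which the candidate gene is altered
    n_both      -- comparisons in which both genes are altered

    Hypothetical helper: standard ARM Lift plus a Pearson chi-square
    test on the 2x2 table, not the paper's own procedure.
    """
    if not (0 < n_target < n_total and 0 < n_candidate < n_total):
        raise ValueError("each gene must be altered in some, but not all, comparisons")

    # 2x2 contingency table of altered / not altered for the two genes.
    a = n_both                                      # both altered
    b = n_target - n_both                           # only target altered
    c = n_candidate - n_both                        # only candidate altered
    d = n_total - n_target - n_candidate + n_both   # neither altered
    if min(a, b, c, d) < 0:
        raise ValueError("counts are inconsistent with each other")

    # Lift = P(target and candidate) / (P(target) * P(candidate)).
    lift = (n_both * n_total) / (n_target * n_candidate)

    # Pearson chi-square statistic for a 2x2 table, no continuity correction.
    chi2 = n_total * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return lift, chi2, p_value


# Illustrative counts only: 2,109 comparisons as in the abstract,
# the remaining numbers are made up.
lift, chi2, p_value = lift_and_chi2(2109, 100, 80, 20)
```

A Lift above 1 means the two genes are altered together more often than independence would predict; pairing it with the p-value guards against large Lift values that rest on very few comparisons.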
journal_name: BMC Bioinformatics
journal_title: BMC bioinformatics
authors: Cremaschi P, Rovida S, Sacchi L, Lisa A, Calvi F, Montecucco A, Biamonti G, Bione S, Sacchi G
doi: 10.1186/1471-2105-15-S1-S6
subject: Has Abstract
pub_date: 2014-01-01 00:00:00
pages: S6
issn: 1471-2105
pii: 1471-2105-15-S1-S6
journal_volume: 15 Suppl 1
pub_type: Journal article
abstract:BACKGROUND:Scaffold proteins are known for being crucial regulators of various cellular functions by assembling multiple proteins involved in signaling and metabolic pathways. Identification of scaffold proteins and the study of their molecular mechanisms can open a new aspect of cellular systemic regulation and the re...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-016-1079-5
update_date:2016-07-28 00:00:00
abstract:BACKGROUND:To infer gene regulatory networks from time series gene profiles, two important tasks that are related to biological systems must be undertaken. One task is to determine a valid network structure that has topological properties that can influence the network dynamics profoundly. The other task is to optimize...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-15-S15-S8
update_date:2014-01-01 00:00:00
abstract:BACKGROUND:Cancer is a complex disease which is characterized by the accumulation of genetic alterations during the patient's lifetime. With the development of the next-generation sequencing technology, multiple omics data, such as cancer genomic, epigenomic and transcriptomic data etc., can be measured from each indiv...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-016-1332-y
update_date:2016-12-23 00:00:00
abstract:BACKGROUND:MicroRNAs (miRNAs) are small ~22 nucleotide non-coding RNAs that function as post-transcriptional regulators of messenger RNA (mRNA) through base-pairing to 6-8 nucleotide long target sites, usually located within the mRNA 3' untranslated region. A common approach to validate and probe microRNA-mRNA interact...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-016-1057-y
update_date:2016-04-27 00:00:00
abstract:BACKGROUND:The Triplex cell vaccine is a cancer cellular vaccine that can prevent almost completely the mammary tumor onset in HER-2/neu transgenic mice. In a translational perspective, the activity of the Triplex vaccine was also investigated against lung metastases showing that the vaccine is an effective treatment a...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-11-S7-S13
update_date:2010-10-15 00:00:00
abstract:BACKGROUND:Protein-protein interactions (PPIs) play crucial roles in virtually every aspect of cellular function within an organism. Over the last decade, the development of novel high-throughput techniques has resulted in enormous amounts of data and provided valuable resources for studying protein interactions. Howev...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-13-S7-S3
update_date:2012-05-08 00:00:00
abstract:BACKGROUND:MicroRNAs (miRNAs) are recognized as one of the most important families of non-coding RNAs that serve as important sequence-specific post-transcriptional regulators of gene expression. Identification of miRNAs is an important requirement for understanding the mechanisms of post-transcriptional regulation. Hu...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-8-341
update_date:2007-09-17 00:00:00
abstract:BACKGROUND:Cross-platform analysis of gene expression data requires multiple, intricate processes at different layers with various platforms. However, existing tools handle only a single platform and are not flexible enough to support custom changes, which arise from the new statistical methods, updated versions of refere...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-12-75
update_date:2011-03-17 00:00:00
abstract:BACKGROUND:PCR clonal artefacts originating from NGS library preparation can affect both genomic as well as RNA-Seq applications when protocols are pushed to their limits. In RNA-Seq however the artifactual reads are not easy to tell apart from normal read duplication due to natural over-sequencing of highly expressed ...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-016-1276-2
update_date:2016-10-21 00:00:00
abstract:BACKGROUND:The development of high-throughput experimentation has led to astronomical growth in biologically relevant lipids and lipid derivatives identified, screened, and deposited in numerous online databases. Unfortunately, efforts to annotate, classify, and analyze these chemical entities have largely remained in ...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-12-303
update_date:2011-07-26 00:00:00
abstract:BACKGROUND:Primer design for highly variable DNA sequences is difficult, and experimental success requires attention to many interacting constraints. The advent of next-generation sequencing methods allows the investigation of rare variants otherwise hidden deep in large populations, but requires attention to populatio...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-14-255
update_date:2013-08-21 00:00:00
abstract:BACKGROUND:The application of high-throughput sequencing in a broad range of quantitative genomic assays (e.g., DNA-seq, ChIP-seq) has created a high demand for the analysis of large-scale read-count data. Typically, the genome is divided into tiling windows and windowed read-count data is generated for the entire geno...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-018-2077-6
update_date:2018-03-01 00:00:00
abstract:BACKGROUND:Protein quality assessment (QA) useful for ranking and selecting protein models has long been viewed as one of the major challenges for protein tertiary structure prediction. Especially, estimating the quality of a single protein model, which is important for selecting a few good models out of a large model ...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-016-1405-y
update_date:2016-12-05 00:00:00
abstract:BACKGROUND:Phenotype prediction problems are usually considered ill-posed, as the amount of samples is very limited with respect to the scrutinized genetic probes. This fact complicates the sampling of the defective genetic pathways due to the high number of possible discriminatory genetic networks involved. In this re...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-020-3356-6
update_date:2020-03-11 00:00:00
abstract:BACKGROUND:A phylogeny postulates shared ancestry relationships among organisms in the form of a binary tree. Phylogenies attempt to answer an important question posed in biology: what are the ancestor-descendent relationships between organisms? At the core of every biological problem lies a phylogenetic component. The...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-14-66
update_date:2013-02-26 00:00:00
abstract:BACKGROUND:Caspases are a family of proteases that have central functions in programmed cell death (apoptosis) and inflammation. Caspases mediate their effects through aspartate-specific cleavage of their target proteins, and at present almost 400 caspase substrates are known. There are several methods developed to pre...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-11-320
update_date:2010-06-15 00:00:00
abstract:BACKGROUND:Identification of the recombination hot/cold spots is critical for understanding the mechanism of recombination as well as the genome evolution process. However, experimental identification of recombination spots is both time-consuming and costly. Developing an accurate and automated method for reliably and ...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-15-340
update_date:2014-11-20 00:00:00
abstract:BACKGROUND:We developed an extendable open-source Loop-mediated isothermal AMPlification (LAMP) signature design program called LAVA (LAMP Assay Versatile Analysis). LAVA was created in response to limitations of existing LAMP signature programs. RESULTS:LAVA identifies combinations of six primer regions for basic LAM...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-12-240
update_date:2011-06-16 00:00:00
abstract:BACKGROUND:In the bioinformatics community, many tasks involve matching a set of protein query sequences against large sequence datasets. To conduct multiple queries in the database, a commonly used method is to run BLAST on each original query or on the concatenated queries. It is inefficient since it doesn't exploit the...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-017-1938-8
update_date:2017-11-21 00:00:00
abstract:BACKGROUND:Expression quantitative trait loci (eQTL) mapping is often used to identify genetic loci and candidate genes correlated with traits. Although usually a group of genes affect complex traits, genes in most eQTL mapping methods are considered as independent. Recently, some eQTL mapping methods have accounted fo...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-016-1387-9
update_date:2016-12-13 00:00:00
abstract:BACKGROUND:Gene expression experiments are common in molecular biology, for example in order to identify genes which play a certain role in a specified biological framework. For that purpose expression levels of several thousand genes are measured simultaneously using DNA microarrays. Comparing two distinct groups of t...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-12-288
update_date:2011-07-15 00:00:00
abstract:BACKGROUND:The protein-coding regions (coding exons) of a DNA sequence exhibit a triplet periodicity (TP) due to the fact that coding exons contain a series of three nucleotide codons that encode specific amino acid residues. Such periodicity is usually not observed in introns and intergenic regions. If a DNA sequence is d...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-11-550
update_date:2010-11-08 00:00:00
abstract:BACKGROUND:Word sense disambiguation (WSD) attempts to solve lexical ambiguities by identifying the correct meaning of a word based on its context. WSD has been demonstrated to be an important step in knowledge-based approaches to automatic summarization. However, the correlation between the accuracy of the WSD methods...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-12-355
update_date:2011-08-26 00:00:00
abstract:BACKGROUND:Visualization tools for deep learning models typically focus on discovering key input features without considering how such low level features are combined in intermediate layers to make decisions. Moreover, many of these methods examine a network's response to specific input examples that may be insufficien...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-019-2957-4
update_date:2019-07-19 00:00:00
abstract:BACKGROUND:Antibodies revolutionized cancer treatment over the past decades. Despite their successful application, there are still challenges to overcome to improve efficacy, such as the heterogeneous distribution of antibodies within tumors. Tumor microenvironment features, such as the distribution of tumor and othe...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-020-03854-2
update_date:2020-11-17 00:00:00
abstract:BACKGROUND:G-DOC Plus is a data integration and bioinformatics platform that uses cloud computing and other advanced computational tools to handle a variety of biomedical BIG DATA including gene expression arrays, NGS and medical images so that they can be analyzed in the full context of other omics and clinical inform...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-016-1010-0
update_date:2016-04-30 00:00:00
abstract:BACKGROUND:Long-range interactions between regulatory DNA elements such as enhancers, insulators and promoters play an important role in regulating transcription. As chromatin contacts have been found throughout the human genome and in different cell types, spatial transcriptional control is now viewed as a general mec...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-12-414
update_date:2011-10-25 00:00:00
abstract:BACKGROUND:This paper proposes an optimization approach for producing reduced alphabets for peptide classification, using a Genetic Algorithm. The classification task is performed by a multi-classifier system where each classifier (Linear or Radial Basis function Support Vector Machines) is trained using feat...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-9-45
update_date:2008-01-24 00:00:00
abstract:BACKGROUND:Cluster analysis is the most common unsupervised method for finding hidden groups in data. Clustering presents two main challenges: (1) finding the optimal number of clusters, and (2) removing "outliers" among the objects being clustered. Few clustering algorithms currently deal directly with the outlier pro...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-017-1998-9
update_date:2018-01-08 00:00:00
abstract:BACKGROUND:Predicting the suppression activity of antisense oligonucleotide sequences is the main goal of the rational design of nucleic acids. To create an effective predictive model, it is important to know what properties of an oligonucleotide sequence associate significantly with antisense activity. Also, for the m...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-8-184
update_date:2007-06-07 00:00:00