Abstract:
BACKGROUND:Logic Learning Machine (LLM) is an innovative method of supervised analysis capable of constructing models based on simple and intelligible rules. In this investigation the performance of LLM in classifying patients with cancer was evaluated using a set of eight publicly available gene expression databases for cancer diagnosis. LLM accuracy was assessed by summary ROC curve (sROC) analysis and estimated by the area under an sROC curve (sAUC). Its performance was compared in cross validation with that of standard supervised methods, namely: decision tree, artificial neural network, support vector machine (SVM) and k-nearest neighbor classifier. RESULTS:LLM showed an excellent accuracy (sAUC = 0.99, 95%CI: 0.98-1.0) and outperformed any other method except SVM. CONCLUSIONS:LLM is a new powerful tool for the analysis of gene expression data for cancer diagnosis. Simple rules generated by LLM could contribute to a better understanding of cancer biology, potentially addressing therapeutic approaches.
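The accuracy figure quoted in the abstract is built on the area under a ROC curve. As a minimal sketch only, the Python below computes a single-dataset AUC from classifier scores as the fraction of positive/negative pairs that are ranked correctly. It is not the authors' summary ROC (sROC) procedure, which pools results across datasets, and the function name `roc_auc` and the toy labels and scores are hypothetical illustrations rather than data from the paper.

```python
def roc_auc(labels, scores):
    """Return the AUC for binary labels (1 = positive, 0 = negative).

    Hypothetical helper: the AUC is computed as the probability that a
    randomly chosen positive sample receives a higher score than a randomly
    chosen negative sample, with ties counted as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # pair ranked correctly
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy example (made-up scores): 8 of the 9 positive/negative pairs are
# ranked correctly, so the AUC is 8/9.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)
```

In a cross-validation comparison such as the one the abstract describes, a function like this would be applied to the held-out predictions of each classifier, so that all methods are scored on the same folds.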
journal_name:BMC Bioinformatics
journal_title:BMC bioinformatics
authors:Verda D, Parodi S, Ferrari E, Muselli M
doi:10.1186/s12859-019-2953-8
subject:Has Abstract
pub_date:2019-11-22 00:00:00
pages:390
issue:Suppl 9
issn:1471-2105
pii:10.1186/s12859-019-2953-8
journal_volume:20
pub_type: Journal article
abstract:BACKGROUND:The specific recognition of genomic cis-regulatory elements by transcription factors (TFs) plays an essential role in the regulation of coordinated gene expression. Studying the mechanisms determining binding specificity in protein-DNA interactions is thus an important goal. Most current approaches for model...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-9-436
update_date:2008-10-16 00:00:00
abstract:BACKGROUND:Biocatalysis in organic solvents is nowadays a common practice with a large potential in Biotechnology. Several studies report that proteins which are co-crystallized or soaked in organic solvents preserve their fold integrity showing almost identical arrangements when compared to their aqueous forms. Howeve...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-018-2044-2
update_date:2018-01-30 00:00:00
abstract:BACKGROUND:Network enrichment analysis is a powerful method, which allows to integrate gene enrichment analysis with the information on relationships between genes that is provided by gene networks. Existing tests for network enrichment analysis deal only with undirected networks, they can be computationally slow and a...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-016-1203-6
update_date:2016-09-05 00:00:00
abstract:BACKGROUND:Comparative genomics has become an essential approach for identifying homologous gene candidates and their functions, and for studying genome evolution. There are many tools available for genome comparisons. Unfortunately, most of them are not applicable for the identification of unique genes and the inferen...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-7-S4-S18
update_date:2006-12-12 00:00:00
abstract:BACKGROUND:Although it is not difficult for state-of-the-art gene finders to identify coding regions in prokaryotic genomes, exact prediction of the corresponding translation initiation sites (TIS) is still a challenging problem. Recently a number of post-processing tools have been proposed for improving the annotation...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-7-121
update_date:2006-03-09 00:00:00
abstract:BACKGROUND:Somatic copy number alternation (SCNA) is a common feature of the cancer genome and is associated with cancer etiology and prognosis. The allele-specific SCNA analysis of a tumor sample aims to identify the allele-specific copy numbers of both alleles, adjusting for the ploidy and the tumor purity. Next gene...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-018-2412-y
update_date:2018-11-14 00:00:00
abstract:BACKGROUND:A number of software packages are available to generate DNA multiple sequence alignments (MSAs) evolved under continuous-time Markov processes on phylogenetic trees. On the other hand, methods of simulating the DNA MSA directly from the transition matrices do not exist. Moreover, existing software restricts ...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-13-216
update_date:2012-08-28 00:00:00
abstract:BACKGROUND:Proteins having similar functions from different sources can be identified by the occurrence in their sequences, a conserved cluster of amino acids referred to as pattern, motif, signature or fingerprint. The wide usage of protein sequence analysis in par with the growth of databases signifies the importance...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-5-127
update_date:2004-09-09 00:00:00
abstract:BACKGROUND:High-throughput screens comparing growth rates of arrays of distinct micro-organism cultures on solid agar are useful, rapid methods of quantifying genetic interactions. Growth rate is an informative phenotype which can be estimated by measuring cell densities at one or more times after inoculation. Precise ...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-11-287
update_date:2010-05-28 00:00:00
abstract:BACKGROUND:Although both conservation and correlated mutation (CM) are important information reflecting the different sorts of context in multiple sequence alignment, most of alignment methods use sequence profiles that only represent conservation. There is no general way to represent correlated mutation and incorporat...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-11-S2-S2
update_date:2010-04-16 00:00:00
abstract:BACKGROUND:Multiple data-analytic methods have been proposed for evaluating gene-expression levels in specific biological pathways, assessing differential expression associated with a binary phenotype. Following Goeman and Bühlmann's recent review, we compared statistical performance of three methods, namely Global Tes...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-8-431
update_date:2007-11-07 00:00:00
abstract:BACKGROUND:The third edition of the BioNLP Shared Task was held with the grand theme "knowledge base construction (KB)". The Genia Event (GE) task was re-designed and implemented in light of this theme. For its final report, the participating systems were evaluated from a perspective of annotation. To further explore t...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-16-S10-S3
update_date:2015-01-01 00:00:00
abstract:BACKGROUND:Recent studies in computational primary protein sequence analysis have leveraged the power of unlabeled data. For example, predictive models based on string kernels trained on sequences known to belong to particular folds or superfamilies, the so-called labeled data set, can attain significantly improved acc...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-10-S4-S2
update_date:2009-04-29 00:00:00
abstract:BACKGROUND:The creation of a complete genome-wide map of transcription factor binding sites is essential for understanding gene regulatory networks in vivo. However, current prediction methods generally rely on statistical models that imperfectly model transcription factor binding. Generation of new prediction methods ...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-12-62
update_date:2011-02-25 00:00:00
abstract:BACKGROUND:Reliable prediction of antibody, or B-cell, epitopes remains challenging yet highly desirable for the design of vaccines and immunodiagnostics. A correlation between antigenicity, solvent accessibility, and flexibility in proteins was demonstrated. Subsequently, Thornton and colleagues proposed a method for ...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-9-514
update_date:2008-12-02 00:00:00
abstract:BACKGROUND:Caspases are a family of proteases that have central functions in programmed cell death (apoptosis) and inflammation. Caspases mediate their effects through aspartate-specific cleavage of their target proteins, and at present almost 400 caspase substrates are known. There are several methods developed to pre...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-11-320
update_date:2010-06-15 00:00:00
abstract:BACKGROUND:Transposable elements (TE) are mobile genetic entities present in nearly all genomes. Previous work has shown that TEs tend to have a different nucleotide composition than the host genes, either considering codon usage bias or dinucleotide frequencies. We show here how these compositional differences can be ...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-5-94
update_date:2004-07-13 00:00:00
abstract:BACKGROUND:In microarray experiments the numbers of replicates are often limited due to factors such as cost, availability of sample or poor hybridization. There are currently few choices for the analysis of a pair of microarrays where N = 1 in each condition. In this paper, we demonstrate the effectiveness of a new al...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-9-489
update_date:2008-11-21 00:00:00
abstract:BACKGROUND:Although the use of clustering methods has rapidly become one of the standard computational approaches in the literature of microarray gene expression data analysis, little attention has been paid to uncertainty in the results obtained. RESULTS:We present an R/Bioconductor port of a fast novel algorithm for...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-10-242
update_date:2009-08-06 00:00:00
abstract:BACKGROUND:Knowledge of subcellular localization of proteins is crucial to proteomics, drug target discovery and systems biology since localization and biological function are highly correlated. In recent years, numerous computational prediction methods have been developed. Nevertheless, there is still a need for predi...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-10-274
update_date:2009-09-01 00:00:00
abstract:BACKGROUND:The canonical code, although prevailing in complex genomes, is not universal. It was shown the canonical genetic code superior robustness compared to random codes, but it is not clearly determined how it evolved towards its current form. The error minimization theory considers the minimization of point mutat...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-017-1608-x
update_date:2017-03-27 00:00:00
abstract:BACKGROUND:People with an autistic spectrum disorder (ASD) display a variety of characteristic behavioral traits, including impaired social interaction, communication difficulties and repetitive behavior. This complex neurodevelopment disorder is known to be associated with a combination of genetic and environmental fa...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-015-0622-0
update_date:2015-06-06 00:00:00
abstract:BACKGROUND:DNA methylation changes are associated with a wide array of biological processes. Bisulfite conversion of DNA followed by high-throughput sequencing is increasingly being used to assess genome-wide methylation at single-base resolution. The relative slowness of most commonly used aligners for processing such...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-15-337
update_date:2014-10-18 00:00:00
abstract:BACKGROUND:Time- and dose-to-event phenotypes used in basic science and translational studies are commonly measured imprecisely or incompletely due to limitations of the experimental design or data collection schema. For example, drug-induced toxicities are not reported by the actual time or dose triggering the event, ...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-019-2899-x
update_date:2019-05-28 00:00:00
abstract:BACKGROUND:Detection of genomic DNA copy number variations (CNVs) can provide a complete and more comprehensive view of human disease. It is interesting to identify and represent relevant CNVs from a genome-wide data due to high data volume and the complexity of interactions. RESULTS:In this paper, we incorporate the ...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-12-S5-S4
update_date:2011-01-01 00:00:00
abstract:BACKGROUND:One of the most important goals of the mathematical modeling of gene regulatory networks is to alter their behavior toward desirable phenotypes. Therapeutic techniques are derived for intervention in terms of stationary control policies. In large networks, it becomes computationally burdensome to derive an o...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-12-S10-S10
update_date:2011-10-18 00:00:00
abstract:BACKGROUND:Automated protein function prediction methods are the only practical approach for assigning functions to genes obtained from model organisms. Many of the previously reported function annotation methods are of limited utility for fungal protein annotation. They are often trained only to one species, are not a...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-11-215
update_date:2010-04-29 00:00:00
abstract:BACKGROUND:Cell-scaffold contact measurements are derived from pairs of co-registered volumetric fluorescent confocal laser scanning microscopy (CLSM) images (z-stacks) of stained cells and three types of scaffolds (i.e., spun coat, large microfiber, and medium microfiber). Our analysis of the acquired terabyte-sized c...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-017-1928-x
update_date:2017-11-28 00:00:00
abstract:BACKGROUND:The nucleosome is the fundamental packing unit of DNAs in eukaryotic cells. Its detailed positioning on the genome is closely related to chromosome functions. Increasing evidence has shown that genomic DNA sequence itself is highly predictive of nucleosome positioning genome-wide. Therefore a fast software t...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-11-346
update_date:2010-06-24 00:00:00
abstract:BACKGROUND:Selective pressures at the DNA level shape genes into profiles consisting of patterns of rapidly evolving sites and sites withstanding change. These profiles remain detectable even when protein sequences become extensively diverged. A common task in molecular biology is to infer functional, structural or evo...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-015-0688-8
update_date:2015-08-14 00:00:00