Abstract:
BACKGROUND: The lack of sufficient training data is the limiting factor for many Machine Learning applications in Computational Biology. If data is available for several different but related problem domains, Multitask Learning algorithms can be used to learn a model based on all available information. In Bioinformatics, many problems can be cast into the Multitask Learning scenario by incorporating data from several organisms. However, combining information from several tasks requires careful consideration of the degree of similarity between tasks. Our proposed method simultaneously learns or refines the similarity between tasks along with the Multitask Learning classifier. This is done by formulating the Multitask Learning problem as Multiple Kernel Learning, using the recently published q-Norm MKL algorithm.
RESULTS: We demonstrate the performance of our method on two problems from Computational Biology. First, we show that our method is able to improve performance on a splice site dataset with a given hierarchical task structure by refining the task relationships. Second, we consider an MHC-I dataset, for which we assume no knowledge about the degree of task relatedness. Here, we are able to learn the task similarities ab initio along with the Multitask classifiers. In both cases, we outperform the baseline methods that we compare against.
CONCLUSIONS: We present a novel approach to Multitask Learning that is capable of learning task similarity along with the classifiers. The framework is very general: it can incorporate prior knowledge about task relationships if available, but it can also identify task similarities in the absence of such prior information. Both variants show promising results in applications from Computational Biology.
journal_name:BMC Bioinformatics
journal_title:BMC bioinformatics
authors:Widmer C, Toussaint NC, Altun Y, Rätsch G
doi:10.1186/1471-2105-11-S8-S5
subject:Has Abstract
pub_date:2010-10-26 00:00:00
pages:S5
issn:1471-2105
pii:1471-2105-11-S8-S5
journal_volume:11 Suppl 8
pub_type: Journal article

abstract:BACKGROUND:Sequence comparison is one of the most prominent tools in biological research, and is instrumental in studying gene function and evolution. The rapid development of high-throughput technologies for measuring protein interactions calls for extending this fundamental operation to the level of pathways in prote...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-7-199
update_date:2006-04-10 00:00:00
abstract:BACKGROUND:Since RNA molecules regulate genes and control alternative splicing by allostery, it is important to develop algorithms to predict RNA conformational switches. Some tools, such as paRNAss, RNAshapes and RNAbor, can be used to predict potential conformational switches; nevertheless, no existent tool can detec...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-13-S5-S6
update_date:2012-04-12 00:00:00
abstract:BACKGROUND:In recent times, there has been an exponential rise in the number of protein structures in databases e.g. PDB. So, design of fast algorithms capable of querying such databases is becoming an increasingly important research issue. This paper reports an algorithm, motivated from spectral graph matching techniq...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-7-S5-S5
update_date:2006-12-18 00:00:00
abstract:BACKGROUND:Detection of periodically expressed genes from microarray data without use of known periodic and non-periodic training examples is an important problem, e.g. for identifying genes regulated by the cell-cycle in poorly characterised organisms. Commonly the investigator is only interested in genes expressed at...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-7-63
update_date:2006-02-09 00:00:00
abstract:BACKGROUND:MicroRNAs (miRNAs) are single-stranded non-coding RNAs known to regulate a wide range of cellular processes by silencing the gene expression at the protein and/or mRNA levels. Computational prediction of miRNA targets is essential for elucidating the detailed functions of miRNA. However, the prediction speci...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-11-476
update_date:2010-09-22 00:00:00
abstract:BACKGROUND:Testing the dependence of two variables is one of the fundamental tasks in statistics. In this work, we developed an open-source R package (knnAUC) for detecting nonlinear dependence between one continuous variable X and one binary dependent variable Y (0 or 1). RESULTS:We addressed this problem by using k...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-018-2427-4
update_date:2018-11-22 00:00:00
abstract:BACKGROUND:Heritability of a phenotypic or molecular trait measures the proportion of variance that is attributable to genotypic variance. It is an important concept in breeding and genetics. Few methods are available for calculating heritability for traits derived from high-throughput sequencing. RESULTS:We propose s...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-017-1539-6
update_date:2017-03-02 00:00:00
abstract:BACKGROUND:Several studies demonstrated the feasibility of predicting bacterial antibiotic resistance phenotypes from whole-genome sequences, the prediction process usually amounting to detecting the presence of genes involved in antibiotic resistance mechanisms, or of specific mutations, previously identified from a t...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-018-2403-z
update_date:2018-10-17 00:00:00
abstract:BACKGROUND:Histopathology image analysis is a gold standard for cancer recognition and diagnosis. Automatic analysis of histopathology images can help pathologists diagnose tumor and cancer subtypes, alleviating the workload of pathologists. There are two basic types of tasks in digital histopathology image analysis: i...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-017-1685-x
update_date:2017-05-26 00:00:00
abstract:BACKGROUND:Automated protein function prediction methods are the only practical approach for assigning functions to genes obtained from model organisms. Many of the previously reported function annotation methods are of limited utility for fungal protein annotation. They are often trained only to one species, are not a...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-11-215
update_date:2010-04-29 00:00:00
abstract:BACKGROUND:We establish that the occurrence of protein folds among genomes can be accurately described with a Weibull function. Systems which exhibit Weibull character can be interpreted with reliability theory commonly used in engineering analysis. For instance, Weibull distributions are widely used in reliability, ma...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-5-101
update_date:2004-07-26 00:00:00
abstract:BACKGROUND:The rate of protein structures being deposited in the Protein Data Bank surpasses the capacity to experimentally characterise them and therefore computational methods to analyse these structures have become increasingly important. Identifying the region of the protein most likely to be involved in function i...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-10-379
update_date:2009-11-18 00:00:00
abstract:BACKGROUND:Molecular biology data exist on diverse scales, from the level of molecules to -omics. At the same time, the data at each scale can be categorised into multiple layers, such as the genome, transcriptome, proteome, metabolome, and biochemical pathways. Due to the highly multi-layer and multi-dimensional natur...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-10-31
update_date:2009-01-23 00:00:00
abstract:BACKGROUND:Differences in cell-type composition across subjects and conditions often carry biological significance. Recent advancements in single cell sequencing technologies enable cell-types to be identified at the single cell level, and as a result, cell-type composition of tissues can now be studied in exquisite de...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-019-3211-9
update_date:2019-12-24 00:00:00
abstract:BACKGROUND:Probing the complex fusion of genetic and environmental interactions, metabolic profiling (or metabolomics/metabonomics), the study of small molecules involved in metabolic reactions, is a rapidly expanding 'omics' field. A major technique for capturing metabolite data is 1H-NMR spectroscopy and this yields ...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-11-496
update_date:2010-10-06 00:00:00
abstract:BACKGROUND:Amplified fragment length polymorphism (AFLP) is a PCR-based technique that involves restriction of genomic DNA followed by ligation of adaptors to the fragments generated and selective PCR amplification of a subset of these fragments. The amplified fragments are separated on a sequencing gel and visualized ...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-4-7
update_date:2003-02-25 00:00:00
abstract:BACKGROUND:Protein-protein docking is a valuable computational approach for investigating protein-protein interactions. Shape complementarity is the most basic component of a scoring function and plays an important role in protein-protein docking. Despite significant progress, shape representation remains an open que...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-019-3270-y
update_date:2019-12-24 00:00:00
abstract:BACKGROUND:Evolutionary trees are central to a wide range of biological studies. In many of these studies, tree nodes and branches need to be associated (or annotated) with various attributes. For example, in studies concerned with organismal relationships, tree nodes are associated with taxonomic names, whereas tree b...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-10-356
update_date:2009-10-27 00:00:00
abstract:BACKGROUND:The Triplex cell vaccine is a cancer cellular vaccine that can prevent almost completely the mammary tumor onset in HER-2/neu transgenic mice. In a translational perspective, the activity of the Triplex vaccine was also investigated against lung metastases showing that the vaccine is an effective treatment a...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-11-S7-S13
update_date:2010-10-15 00:00:00
abstract:BACKGROUND:Gene expression experiments are common in molecular biology, for example in order to identify genes which play a certain role in a specified biological framework. For that purpose expression levels of several thousand genes are measured simultaneously using DNA microarrays. Comparing two distinct groups of t...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-12-288
update_date:2011-07-15 00:00:00
abstract:BACKGROUND:Biological data have traditionally been stored and made publicly available through a variety of on-line databases, whereas biological knowledge has traditionally been found in the printed literature. With journals now on-line and providing an increasing amount of open access content, often free of copyright ...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-11-220
update_date:2010-04-29 00:00:00
abstract:BACKGROUND:Cryo-electron microscopy (Cryo-EM) is widely used in the determination of the three-dimensional (3D) structures of macromolecules. Particle picking from 2D micrographs remains a challenging early step in the Cryo-EM pipeline due to the diversity of particle shapes and the extremely low signal-to-noise ratio ...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-020-03809-7
update_date:2020-11-09 00:00:00
abstract:BACKGROUND:It is possible to predict whether a tuberculosis (TB) patient will fail to respond to specific antibiotics by sequencing the genome of the infecting Mycobacterium tuberculosis (Mtb) and observing whether the pathogen carries specific mutations at drug-resistance sites. This advancement has led to the collati...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-019-2658-z
update_date:2019-02-08 00:00:00
abstract:BACKGROUND:A common method for presenting and studying biological interaction networks is visualization. Software tools can enhance our ability to explore network visualizations and improve our understanding of biological systems, particularly when these tools offer analysis capabilities. However, most published networ...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-10-95
update_date:2009-03-26 00:00:00
abstract:BACKGROUND:Kernel-based learning algorithms are among the most advanced machine learning methods and have been successfully applied to a variety of sequence classification tasks within the field of bioinformatics. Conventional kernels utilized so far do not provide an easy interpretation of the learnt representations i...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-5-169
update_date:2004-10-28 00:00:00
abstract:BACKGROUND:Schizophrenia, bipolar disorder, and major depression are devastating mental diseases, each with distinctive yet overlapping epidemiologic characteristics. Microarray and proteomics data have revealed genes which are expressed abnormally in patients. Several single nucleotide polymorphisms (SNPs) and mutations a...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-12-S13-S20
update_date:2011-01-01 00:00:00
abstract:BACKGROUND:With the ever increasing use of computational models in the biosciences, the need to share models and reproduce the results of published studies efficiently and easily is becoming more important. To this end, various standards have been proposed that can be used to describe models, simulations, data or other...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-014-0369-z
update_date:2014-12-14 00:00:00
abstract:BACKGROUND:Metabolomics, petroleum and biodiesel chemistry, biomarker discovery, and other fields which rely on high-resolution profiling of complex chemical mixtures generate datasets which contain millions of detector intensity readings, each uniquely addressed along dimensions of time (e.g., retention time of chemic...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-9-S9-S15
update_date:2008-08-12 00:00:00
abstract:BACKGROUND:Protein function prediction is an important problem in the post-genomic era. Recent advances in experimental biology have enabled the production of vast amounts of protein-protein interaction (PPI) data. Thus, using PPI data to functionally annotate proteins has been extensively studied. However, most existi...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-14-S12-S4
update_date:2013-01-01 00:00:00
abstract:BACKGROUND:Inferring gene regulatory networks (GRNs) from gene expression data remains a challenge in systems biology. In the past decade, numerous methods have been developed for the inference of GRNs. It remains a challenge due to the fact that the data is noisy and high dimensional, and there exists a large number of pot...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-020-03639-7
update_date:2020-07-14 00:00:00