MultiLoc2: integrating phylogeny and Gene Ontology terms improves subcellular protein localization prediction.

Abstract:

BACKGROUND:Knowledge of subcellular localization of proteins is crucial to proteomics, drug target discovery and systems biology since localization and biological function are highly correlated. In recent years, numerous computational prediction methods have been developed. Nevertheless, there is still a need for prediction methods that show more robustness and higher accuracy. RESULTS:We extended our previous MultiLoc predictor by incorporating phylogenetic profiles and Gene Ontology terms. Two different datasets were used for training the system, resulting in two versions of this high-accuracy prediction method. One version is specialized for globular proteins and predicts up to five localizations, whereas a second version covers all eleven main eukaryotic subcellular localizations. In a benchmark study with five localizations, MultiLoc2 performs considerably better than other methods for animal and plant proteins and comparably for fungal proteins. Furthermore, MultiLoc2 performs clearly better when using a second dataset that extends the benchmark study to all eleven main eukaryotic subcellular localizations. CONCLUSION:MultiLoc2 is an extensive high-performance subcellular protein localization prediction system. By incorporating phylogenetic profiles and Gene Ontology terms MultiLoc2 yields higher accuracies compared to its previous version. Moreover, it outperforms other prediction systems in two benchmarks studies. MultiLoc2 is available as user-friendly and free web-service, available at: http://www-bs.informatik.uni-tuebingen.de/Services/MultiLoc2.

journal_name

BMC Bioinformatics

journal_title

BMC bioinformatics

authors

Blum T,Briesemeister S,Kohlbacher O

doi

10.1186/1471-2105-10-274

subject

Has Abstract

pub_date

2009-09-01 00:00:00

pages

274

issn

1471-2105

pii

1471-2105-10-274

journal_volume

10

pub_type

杂志文章
  • Visualising very large phylogenetic trees in three dimensional hyperbolic space.

    abstract:BACKGROUND:Common existing phylogenetic tree visualisation tools are not able to display readable trees with more than a few thousand nodes. These existing methodologies are based in two dimensional space. RESULTS:We introduce the idea of visualising phylogenetic trees in three dimensional hyperbolic space with the Wa...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-5-48

    authors: Hughes T,Hyun Y,Liberles DA

    更新日期:2004-04-29 00:00:00

  • DNLC: differential network local consistency analysis.

    abstract:BACKGROUND:The biological network is highly dynamic. Functional relations between genes can be activated or deactivated depending on the biological conditions. On the genome-scale network, subnetworks that gain or lose local expression consistency may shed light on the regulatory mechanisms related to the changing biol...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-019-3046-4

    authors: Lu J,Lu Y,Ding Y,Xiao Q,Liu L,Cai Q,Kong Y,Bai Y,Yu T

    更新日期:2019-12-24 00:00:00

  • Using distances between Top-n-gram and residue pairs for protein remote homology detection.

    abstract:BACKGROUND:Protein remote homology detection is one of the central problems in bioinformatics, which is important for both basic research and practical application. Currently, discriminative methods based on Support Vector Machines (SVMs) achieve the state-of-the-art performance. Exploring feature vectors incorporating...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-15-S2-S3

    authors: Liu B,Xu J,Zou Q,Xu R,Wang X,Chen Q

    更新日期:2014-01-01 00:00:00

  • Discovering motifs that induce sequencing errors.

    abstract:BACKGROUND:Elevated sequencing error rates are the most predominant obstacle in single-nucleotide polymorphism (SNP) detection, which is a major goal in the bulk of current studies using next-generation sequencing (NGS). Beyond routinely handled generic sources of errors, certain base calling errors relate to specific ...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-14-S5-S1

    authors: Allhoff M,Schönhuth A,Martin M,Costa IG,Rahmann S,Marschall T

    更新日期:2013-01-01 00:00:00

  • A platform for processing expression of short time series (PESTS).

    abstract:BACKGROUND:Time course microarray profiles examine the expression of genes over a time domain. They are necessary in order to determine the complete set of genes that are dynamically expressed under given conditions, and to determine the interaction between these genes. Because of cost and resource issues, most time se...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-12-13

    authors: Sinha A,Markatou M

    更新日期:2011-01-11 00:00:00

  • ArrayIDer: automated structural re-annotation pipeline for DNA microarrays.

    abstract:BACKGROUND:Systems biology modeling from microarray data requires the most contemporary structural and functional array annotation. However, microarray annotations, especially for non-commercial, non-traditional biomedical model organisms, are often dated. In addition, most microarray analysis tools do not readily acce...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-10-30

    authors: van den Berg BH,Konieczka JH,McCarthy FM,Burgess SC

    更新日期:2009-01-23 00:00:00

  • Predicting Bevirimat resistance of HIV-1 from genotype.

    abstract:BACKGROUND:Maturation inhibitors are a new class of antiretroviral drugs. Bevirimat (BVM) was the first substance in this class of inhibitors entering clinical trials. While the inhibitory function of BVM is well established, the molecular mechanisms of action and resistance are not well understood. It is known that mu...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-11-37

    authors: Heider D,Verheyen J,Hoffmann D

    更新日期:2010-01-20 00:00:00

  • Smith-Waterman peak alignment for comprehensive two-dimensional gas chromatography-mass spectrometry.

    abstract:BACKGROUND:Comprehensive two-dimensional gas chromatography coupled with mass spectrometry (GC × GC-MS) is a powerful technique which has gained increasing attention over the last two decades. The GC × GC-MS provides much increased separation capacity, chemical selectivity and sensitivity for complex sample analysis an...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-12-235

    authors: Kim S,Koo I,Fang A,Zhang X

    更新日期:2011-06-15 00:00:00

  • Robust detection of periodic time series measured from biological systems.

    abstract:BACKGROUND:Periodic phenomena are widespread in biology. The problem of finding periodicity in biological time series can be viewed as a multiple hypothesis testing of the spectral content of a given time series. The exact noise characteristics are unknown in many bioinformatics applications. Furthermore, the observed ...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-6-117

    authors: Ahdesmäki M,Lähdesmäki H,Pearson R,Huttunen H,Yli-Harja O

    更新日期:2005-05-13 00:00:00

  • DeepSort: deep convolutional networks for sorting haploid maize seeds.

    abstract:BACKGROUND:Maize is a leading crop in the modern agricultural industry that accounts for more than 40% grain production worldwide. THe double haploid technique that uses fewer breeding generations for generating a maize line has accelerated the pace of development of superior commercial seed varieties and has been tran...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-018-2267-2

    authors: Veeramani B,Raymond JW,Chanda P

    更新日期:2018-08-13 00:00:00

  • DART: Denoising Algorithm based on Relevance network Topology improves molecular pathway activity inference.

    abstract:BACKGROUND:Inferring molecular pathway activity is an important step towards reducing the complexity of genomic data, understanding the heterogeneity in clinical outcome, and obtaining molecular correlates of cancer imaging traits. Increasingly, approaches towards pathway activity inference combine molecular profiles (...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-12-403

    authors: Jiao Y,Lawler K,Patel GS,Purushotham A,Jones AF,Grigoriadis A,Tutt A,Ng T,Teschendorff AE

    更新日期:2011-10-19 00:00:00

  • Efficient inference of homologs in large eukaryotic pan-proteomes.

    abstract:BACKGROUND:Identification of homologous genes is fundamental to comparative genomics, functional genomics and phylogenomics. Extensive public homology databases are of great value for investigating homology but need to be continually updated to incorporate new sequences. As new sequences are rapidly being generated, th...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-018-2362-4

    authors: Sheikhizadeh Anari S,de Ridder D,Schranz ME,Smit S

    更新日期:2018-09-26 00:00:00

  • An automated framework for understanding structural variations in the binding grooves of MHC class II molecules.

    abstract:BACKGROUND:MHC/HLA class II molecules are important components of the immune system and play a critical role in processes such as phagocytosis. Understanding peptide recognition properties of the hundreds of MHC class II alleles is essential to appreciate determinants of antigenicity and ultimately to predict epitopes....

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-11-S1-S55

    authors: Yeturu K,Utriainen T,Kemp GJ,Chandra N

    更新日期:2010-01-18 00:00:00

  • Efficient estimation of grouped survival models.

    abstract:BACKGROUND:Time- and dose-to-event phenotypes used in basic science and translational studies are commonly measured imprecisely or incompletely due to limitations of the experimental design or data collection schema. For example, drug-induced toxicities are not reported by the actual time or dose triggering the event, ...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-019-2899-x

    authors: Li Z,Lin J,Sibley AB,Truong T,Chua KC,Jiang Y,McCarthy J,Kroetz DL,Allen A,Owzar K

    更新日期:2019-05-28 00:00:00

  • Accurate prediction of protein-lncRNA interactions by diffusion and HeteSim features across heterogeneous network.

    abstract:BACKGROUND:Identifying the interactions between proteins and long non-coding RNAs (lncRNAs) is of great importance to decipher the functional mechanisms of lncRNAs. However, current experimental techniques for detection of lncRNA-protein interactions are limited and inefficient. Many methods have been proposed to predi...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-018-2390-0

    authors: Deng L,Wang J,Xiao Y,Wang Z,Liu H

    更新日期:2018-10-11 00:00:00

  • Exploring community structure in biological networks with random graphs.

    abstract:BACKGROUND:Community structure is ubiquitous in biological networks. There has been an increased interest in unraveling the community structure of biological systems as it may provide important insights into a system's functional components and the impact of local structures on dynamics at a global scale. Choosing an a...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-15-220

    authors: Sah P,Singh LO,Clauset A,Bansal S

    更新日期:2014-06-25 00:00:00

  • BRCA-Pathway: a structural integration and visualization system of TCGA breast cancer data on KEGG pathways.

    abstract:BACKGROUND:Bioinformatics research for finding biological mechanisms can be done by analysis of transcriptome data with pathway based interpretation. Therefore, researchers have tried to develop tools to analyze transcriptome data with pathway based interpretation. Over the years, the amount of omics data has become hu...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-018-2016-6

    authors: Kim I,Choi S,Kim S

    更新日期:2018-02-19 00:00:00

  • ATMAD: robust image analysis for Automatic Tissue MicroArray De-arraying.

    abstract:BACKGROUND:Over the last two decades, an innovative technology called Tissue Microarray (TMA), which combines multi-tissue and DNA microarray concepts, has been widely used in the field of histology. It consists of a collection of several (up to 1000 or more) tissue samples that are assembled onto a single support - ty...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-018-2111-8

    authors: Nguyen HN,Paveau V,Cauchois C,Kervrann C

    更新日期:2018-04-19 00:00:00

  • InPrePPI: an integrated evaluation method based on genomic context for predicting protein-protein interactions in prokaryotic genomes.

    abstract:BACKGROUND:Although many genomic features have been used in the prediction of protein-protein interactions (PPIs), frequently only one is used in a computational method. After realizing the limited power in the prediction using only one genomic feature, investigators are now moving toward integration. So far, there hav...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-8-414

    authors: Sun J,Sun Y,Ding G,Liu Q,Wang C,He Y,Shi T,Li Y,Zhao Z

    更新日期:2007-10-26 00:00:00

  • Constructing a meaningful evolutionary average at the phylogenetic center of mass.

    abstract:BACKGROUND:As a consequence of the evolutionary process, data collected from related species tend to be similar. This similarity by descent can obscure subtler signals in the data such as the evidence of constraint on variation due to shared selective pressures. In comparative sequence analysis, for example, sequence s...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-8-222

    authors: Stone EA,Sidow A

    更新日期:2007-06-26 00:00:00

  • A stochastic context free grammar based framework for analysis of protein sequences.

    abstract:BACKGROUND:In the last decade, there have been many applications of formal language theory in bioinformatics such as RNA structure prediction and detection of patterns in DNA. However, in the field of proteomics, the size of the protein alphabet and the complexity of relationship between amino acids have mainly limited...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-10-323

    authors: Dyrka W,Nebel JC

    更新日期:2009-10-08 00:00:00

  • Big data analysis for evaluating bioinvasion risk.

    abstract:BACKGROUND:Global maritime trade plays an important role in the modern transportation industry. It brings significant economic profit along with bioinvasion risk. Species translocate and establish in a non-native area through ballast water and biofouling. Aiming at aquatic bioinvasion issue, people proposed various sug...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-018-2272-5

    authors: Wang S,Wang C,Wang S,Ma L

    更新日期:2018-08-13 00:00:00

  • IRSS: a web-based tool for automatic layout and analysis of IRES secondary structure prediction and searching system in silico.

    abstract:BACKGROUND:Internal ribosomal entry sites (IRESs) provide alternative, cap-independent translation initiation sites in eukaryotic cells. IRES elements are important factors in viral genomes and are also useful tools for bi-cistronic expression vectors. Most existing RNA structure prediction programs are unable to deal ...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-10-160

    authors: Wu TY,Hsieh CC,Hong JJ,Chen CY,Tsai YS

    更新日期:2009-05-27 00:00:00

  • Meta-aligner: long-read alignment based on genome statistics.

    abstract:BACKGROUND:Current development of sequencing technologies is towards generating longer and noisier reads. Evidently, accurate alignment of these reads play an important role in any downstream analysis. Similarly, reducing the overall cost of sequencing is related to the time consumption of the aligner. The tradeoff bet...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-017-1518-y

    authors: Nashta-Ali D,Aliyari A,Ahmadian Moghadam A,Edrisi MA,Motahari SA,Hossein Khalaj B

    更新日期:2017-02-23 00:00:00

  • Fast and robust group-wise eQTL mapping using sparse graphical models.

    abstract:BACKGROUND:Genome-wide expression quantitative trait loci (eQTL) studies have emerged as a powerful tool to understand the genetic basis of gene expression and complex traits. The traditional eQTL methods focus on testing the associations between individual single-nucleotide polymorphisms (SNPs) and gene expression tra...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-014-0421-z

    authors: Cheng W,Shi Y,Zhang X,Wang W

    更新日期:2015-01-16 00:00:00

  • GenomeCAT: a versatile tool for the analysis and integrative visualization of DNA copy number variants.

    abstract:BACKGROUND:The analysis of DNA copy number variants (CNV) has increasing impact in the field of genetic diagnostics and research. However, the interpretation of CNV data derived from high resolution array CGH or NGS platforms is complicated by the considerable variability of the human genome. Therefore, tools for multi...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-016-1430-x

    authors: Tebel K,Boldt V,Steininger A,Port M,Ebert G,Ullmann R

    更新日期:2017-01-06 00:00:00

  • Epiviz: a view inside the design of an integrated visual analysis software for genomics.

    abstract:BACKGROUND:Computational and visual data analysis for genomics has traditionally involved a combination of tools and resources, of which the most ubiquitous consist of genome browsers, focused mainly on integrative visualization of large numbers of big datasets, and computational environments, focused on data modeling ...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-16-S11-S4

    authors: Chelaru F,Corrada Bravo H

    更新日期:2015-01-01 00:00:00

  • Linear predictive coding representation of correlated mutation for protein sequence alignment.

    abstract:BACKGROUND:Although both conservation and correlated mutation (CM) are important information reflecting the different sorts of context in multiple sequence alignment, most of alignment methods use sequence profiles that only represent conservation. There is no general way to represent correlated mutation and incorporat...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-11-S2-S2

    authors: Jeong CS,Kim D

    更新日期:2010-04-16 00:00:00

  • Error, reproducibility and sensitivity: a pipeline for data processing of Agilent oligonucleotide expression arrays.

    abstract:BACKGROUND:Expression microarrays are increasingly used to obtain large scale transcriptomic information on a wide range of biological samples. Nevertheless, there is still much debate on the best ways to process data, to design experiments and analyse the output. Furthermore, many of the more sophisticated mathematica...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-11-344

    authors: Chain B,Bowen H,Hammond J,Posch W,Rasaiyaah J,Tsang J,Noursadeghi M

    更新日期:2010-06-24 00:00:00

  • A MATLAB tool for pathway enrichment using a topology-based pathway regulation score.

    abstract:BACKGROUND:Handling the vast amount of gene expression data generated by genome-wide transcriptional profiling techniques is a challenging task, demanding an informed combination of pre-processing, filtering and analysis methods if meaningful biological conclusions are to be drawn. For example, a range of traditional s...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-014-0358-2

    authors: Ibrahim M,Jassim S,Cawthorne MA,Langlands K

    更新日期:2014-11-04 00:00:00