SPdb--a signal peptide database.

Abstract:

BACKGROUND: The signal peptide plays an important role in protein targeting and protein translocation in both prokaryotic and eukaryotic cells. This transient, short peptide sequence functions like a postal address on an envelope by targeting proteins for secretion or for transfer to specific organelles for further processing. Understanding how signal peptides function is crucial in predicting where proteins are translocated. To support this understanding, we present SPdb, a signal peptide database (http://proline.bic.nus.edu.sg/spdb) that serves as a repository of experimentally determined and computationally predicted signal peptides. RESULTS: SPdb integrates information from two sources: (a) the Swiss-Prot protein sequence database, which is now part of UniProt, and (b) the EMBL nucleotide sequence database. The database update is semi-automated, with human checking and verification to ensure the correctness of the stored data. The latest release, SPdb 3.2, contains 18,146 entries, of which 2,584 are experimentally verified signal sequences; the remaining 15,562 entries are either signal sequences that fail to meet our filtering criteria or entries that contain unverified signal sequences. CONCLUSION: SPdb is a manually curated database constructed to support the understanding and analysis of signal peptides. SPdb tracks the major updates of the two underlying primary databases, thereby ensuring that its information remains up to date.

journal_name

BMC Bioinformatics

authors

Choo KH,Tan TW,Ranganathan S

doi

10.1186/1471-2105-6-249

pub_date

2005-10-13 00:00:00

pages

249

issn

1471-2105

pii

1471-2105-6-249

journal_volume

6

pub_type

Journal Article
  • Multiple sequence alignment accuracy and evolutionary distance estimation.

    abstract:BACKGROUND:Sequence alignment is a common tool in bioinformatics and comparative genomics. It is generally assumed that multiple sequence alignment yields better results than pair wise sequence alignment, but this assumption has rarely been tested, and never with the control provided by simulation analysis. This study ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-6-278

    authors: Rosenberg MS

    updated:2005-11-23 00:00:00

  • Asymmetric bagging and feature selection for activities prediction of drug molecules.

    abstract:BACKGROUND:Activities of drug molecules can be predicted by QSAR (quantitative structure activity relationship) models, which overcomes the disadvantages of high cost and long cycle by employing the traditional experimental method. With the fact that the number of drug molecules with positive activity is rather fewer t...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-S6-S7

    authors: Li GZ,Meng HH,Lu WC,Yang JY,Yang MQ

    updated:2008-05-28 00:00:00

  • Algebraic comparison of metabolic networks, phylogenetic inference, and metabolic innovation.

    abstract:BACKGROUND:Comparison of metabolic networks is typically performed based on the organisms' enzyme contents. This approach disregards functional replacements as well as orthologies that are misannotated. Direct comparison of the structure of metabolic networks can circumvent these problems. RESULTS:Metabolic networks a...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-7-67

    authors: Forst CV,Flamm C,Hofacker IL,Stadler PF

    updated:2006-02-14 00:00:00

  • Computing all hybridization networks for multiple binary phylogenetic input trees.

    abstract:BACKGROUND:The computation of phylogenetic trees on the same set of species that are based on different orthologous genes can lead to incongruent trees. One possible explanation for this behavior are interspecific hybridization events recombining genes of different species. An important approach to analyze such events ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-015-0660-7

    authors: Albrecht B

    updated:2015-07-30 00:00:00

  • Menoci: lightweight extensible web portal enhancing data management for biomedical research projects.

    abstract:BACKGROUND:Biomedical research projects deal with data management requirements from multiple sources like funding agencies' guidelines, publisher policies, discipline best practices, and their own users' needs. We describe functional and quality requirements based on many years of experience implementing data managemen...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03928-1

    authors: Suhr M,Lehmann C,Bauer CR,Bender T,Knopp C,Freckmann L,Öst Hansen B,Henke C,Aschenbrandt G,Kühlborn LK,Rheinländer S,Weber L,Marzec B,Hellkamp M,Wieder P,Sax U,Kusch H,Nussbeck SY

    updated:2020-12-17 00:00:00

  • Phylogenomics and sequence-structure-function relationships in the GmrSD family of Type IV restriction enzymes.

    abstract:BACKGROUND:GmrSD is a modification-dependent restriction endonuclease that specifically targets and cleaves glucosylated hydroxymethylcytosine (glc-HMC) modified DNA. It is encoded either as two separate single-domain GmrS and GmrD proteins or as a single protein carrying both domains. Previous studies suggested that G...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-015-0773-z

    authors: Machnicka MA,Kaminska KH,Dunin-Horkawicz S,Bujnicki JM

    updated:2015-10-23 00:00:00

  • Measure of synonymous codon usage diversity among genes in bacteria.

    abstract:BACKGROUND:In many bacteria, intragenomic diversity in synonymous codon usage among genes has been reported. However, no quantitative attempt has been made to compare the diversity levels among different genomes. Here, we introduce a mean dissimilarity-based index (Dmean) for quantifying the level of diversity in synon...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-10-167

    authors: Suzuki H,Saito R,Tomita M

    updated:2009-06-01 00:00:00

  • RocSampler: regularizing overlapping protein complexes in protein-protein interaction networks.

    abstract:BACKGROUND:In recent years, protein-protein interaction (PPI) networks have been well recognized as important resources to elucidate various biological processes and cellular mechanisms. In this paper, we address the problem of predicting protein complexes from a PPI network. This problem has two difficulties. One is r...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-017-1920-5

    authors: Maruyama O,Kuwahara Y

    updated:2017-12-06 00:00:00

  • Automating dChip: toward reproducible sharing of microarray data analysis.

    abstract:BACKGROUND:During the past decade, many software packages have been developed for analysis and visualization of various types of microarrays. We have developed and maintained the widely used dChip as a microarray analysis software package accessible to both biologist and data analysts. However, challenges arise when dC...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-231

    authors: Li C

    updated:2008-05-08 00:00:00

  • Extending the evaluation of Genia Event task toward knowledge base construction and comparison to Gene Regulation Ontology task.

    abstract:BACKGROUND:The third edition of the BioNLP Shared Task was held with the grand theme "knowledge base construction (KB)". The Genia Event (GE) task was re-designed and implemented in light of this theme. For its final report, the participating systems were evaluated from a perspective of annotation. To further explore t...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-16-S10-S3

    authors: Kim JD,Kim JJ,Han X,Rebholz-Schuhmann D

    updated:2015-01-01 00:00:00

  • In silico docking of urokinase plasminogen activator and integrins.

    abstract:BACKGROUND:Urokinase, its receptor and the integrins are functionally associated and involved in regulation of cell signaling, migration, adhesion and proliferation. No structural information is available on this potential multimolecular complex. However, the tri-dimensional structure of urokinase, urokinase receptor a...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-S2-S8

    authors: Degryse B,Fernandez-Recio J,Citro V,Blasi F,Cubellis MV

    updated:2008-03-26 00:00:00

  • Influence of microarrays experiments missing values on the stability of gene groups by hierarchical clustering.

    abstract:BACKGROUND:Microarray technologies produced large amount of data. The hierarchical clustering is commonly used to identify clusters of co-expressed genes. However, microarray datasets often contain missing values (MVs) representing a major drawback for the use of the clustering methods. Usually the MVs are not treated,...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-5-114

    authors: de Brevern AG,Hazout S,Malpertuy A

    updated:2004-08-23 00:00:00

  • TMB-Hunt: an amino acid composition based method to screen proteomes for beta-barrel transmembrane proteins.

    abstract:BACKGROUND:Beta-barrel transmembrane (bbtm) proteins are a functionally important and diverse group of proteins expressed in the outer membranes of bacteria (both gram negative and acid fast gram positive), mitochondria and chloroplasts. Despite recent publications describing reasonable levels of accuracy for discrimin...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-6-56

    authors: Garrow AG,Agnew A,Westhead DR

    updated:2005-03-15 00:00:00

  • Towards an automatic classification of protein structural domains based on structural similarity.

    abstract:BACKGROUND:Formal classification of a large collection of protein structures aids the understanding of evolutionary relationships among them. Classifications involving manual steps, such as SCOP and CATH, face the challenge of increasing volume of available structures. Automatic methods such as FSSP or Dali Domain Dict...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-74

    authors: Sam V,Tai CH,Garnier J,Gibrat JF,Lee B,Munson PJ

    updated:2008-01-31 00:00:00

  • Random forest versus logistic regression: a large-scale benchmark experiment.

    abstract:BACKGROUND AND GOAL:The Random Forest (RF) algorithm for regression and classification has considerably gained popularity since its introduction in 2001. Meanwhile, it has grown to a standard classification approach competing with logistic regression in many innovation-friendly scientific fields. RESULTS:In this conte...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-018-2264-5

    authors: Couronné R,Probst P,Boulesteix AL

    updated:2018-07-17 00:00:00

  • Maximizing Kolmogorov Complexity for accurate and robust bright field cell segmentation.

    abstract:BACKGROUND:Analysis of cellular processes with microscopic bright field defocused imaging has the advantage of low phototoxicity and minimal sample preparation. However, bright field images lack the contrast and nuclei reporting available with fluorescent approaches and therefore present a challenge to methods that segme...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-15-32

    authors: Mohamadlou H,Shope JC,Flann NS

    updated:2014-01-30 00:00:00

  • GenomeBlast: a web tool for small genome comparison.

    abstract:BACKGROUND:Comparative genomics has become an essential approach for identifying homologous gene candidates and their functions, and for studying genome evolution. There are many tools available for genome comparisons. Unfortunately, most of them are not applicable for the identification of unique genes and the inferen...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-7-S4-S18

    authors: Lu G,Jiang L,Helikar RM,Rowley TW,Zhang L,Chen X,Moriyama EN

    updated:2006-12-12 00:00:00

  • Local sequence and sequencing depth dependent accuracy of RNA-seq reads.

    abstract:BACKGROUND:Many biases and spurious effects are inherent in RNA-seq technology, resulting in a non-uniform distribution of sequencing read counts for each base position in a gene. Therefore, a base-level strategy is required to model the non-uniformity. Also, the properties of sequencing read counts can be leveraged to...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-017-1780-z

    authors: Cai G,Liang S,Zheng X,Xiao F

    updated:2017-08-09 00:00:00

  • The textual characteristics of traditional and Open Access scientific journals are similar.

    abstract:BACKGROUND:Recent years have seen an increased amount of natural language processing (NLP) work on full text biomedical journal publications. Much of this work is done with Open Access journal articles. Such work assumes that Open Access articles are representative of biomedical publications in general and that methods...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-10-183

    authors: Verspoor K,Cohen KB,Hunter L

    updated:2009-06-15 00:00:00

  • Unsupervised fuzzy pattern discovery in gene expression data.

    abstract:BACKGROUND:Discovering patterns from gene expression levels is regarded as a classification problem when tissue classes of the samples are given and solved as a discrete-data problem by discretizing the expression levels of each gene into intervals maximizing the interdependence between that gene and the class labels. ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-12-S5-S5

    authors: Wu GP,Chan KC,Wong AK

    updated:2011-01-01 00:00:00

  • Efficient error correction for next-generation sequencing of viral amplicons.

    abstract:BACKGROUND:Next-generation sequencing allows the analysis of an unprecedented number of viral sequence variants from infected patients, presenting a novel opportunity for understanding virus evolution, drug resistance and immune escape. However, sequencing in bulk is error prone. Thus, the generated data require error ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-13-S10-S6

    authors: Skums P,Dimitrova Z,Campo DS,Vaughan G,Rossi L,Forbi JC,Yokosawa J,Zelikovsky A,Khudyakov Y

    updated:2012-06-25 00:00:00

  • Predikin and PredikinDB: a computational framework for the prediction of protein kinase peptide specificity and an associated database of phosphorylation sites.

    abstract:BACKGROUND:We have previously described an approach to predicting the substrate specificity of serine-threonine protein kinases. The method, named Predikin, identifies key conserved substrate-determining residues in the kinase catalytic domain that contact the substrate in the region of the phosphorylation site and so ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-245

    authors: Saunders NF,Brinkworth RI,Huber T,Kemp BE,Kobe B

    updated:2008-05-26 00:00:00

  • Alignment-free clustering of large data sets of unannotated protein conserved regions using minhashing.

    abstract:BACKGROUND:Clustering of protein sequences is of key importance in predicting the structure and function of newly sequenced proteins and is also of use for their annotation. With the advent of multiple high-throughput sequencing technologies, new protein sequences are becoming available at an extraordinary rate. The ra...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-018-2080-y

    authors: Abnousi A,Broschat SL,Kalyanaraman A

    updated:2018-03-05 00:00:00

  • Identification of consensus RNA secondary structures using suffix arrays.

    abstract:BACKGROUND:The identification of a consensus RNA motif often consists in finding a conserved secondary structure with minimum free energy in an ensemble of aligned sequences. However, an alignment is often difficult to obtain without prior structural information. Thus the need for tools to automate this process. RESUL...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-7-244

    authors: Anwar M,Nguyen T,Turcotte M

    updated:2006-05-05 00:00:00

  • Smith-Waterman peak alignment for comprehensive two-dimensional gas chromatography-mass spectrometry.

    abstract:BACKGROUND:Comprehensive two-dimensional gas chromatography coupled with mass spectrometry (GC × GC-MS) is a powerful technique which has gained increasing attention over the last two decades. The GC × GC-MS provides much increased separation capacity, chemical selectivity and sensitivity for complex sample analysis an...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-12-235

    authors: Kim S,Koo I,Fang A,Zhang X

    updated:2011-06-15 00:00:00

  • Partial mixture model for tight clustering of gene expression time-course.

    abstract:BACKGROUND:Tight clustering arose recently from a desire to obtain tighter and potentially more informative clusters in gene expression studies. Scattered genes with relatively loose correlations should be excluded from the clusters. However, in the literature there is little work dedicated to this area of research. On...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-287

    authors: Yuan Y,Li CT,Wilson R

    updated:2008-06-18 00:00:00

  • DMDtoolkit: a tool for visualizing the mutated dystrophin protein and predicting the clinical severity in DMD.

    abstract:BACKGROUND:Dystrophinopathy is one of the most common human monogenic diseases which results in Duchenne muscular dystrophy (DMD) and Becker muscular dystrophy (BMD). Mutations in the dystrophin gene are responsible for both DMD and BMD. However, the clinical phenotypes and treatments are quite different in these two m...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-017-1504-4

    authors: Zhou J,Xin J,Niu Y,Wu S

    updated:2017-02-02 00:00:00

  • MiRFinder: an improved approach and software implementation for genome-wide fast microRNA precursor scans.

    abstract:BACKGROUND:MicroRNAs (miRNAs) are recognized as one of the most important families of non-coding RNAs that serve as important sequence-specific post-transcriptional regulators of gene expression. Identification of miRNAs is an important requirement for understanding the mechanisms of post-transcriptional regulation. Hu...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-8-341

    authors: Huang TH,Fan B,Rothschild MF,Hu ZL,Li K,Zhao SH

    updated:2007-09-17 00:00:00

  • Learning statistical models for annotating proteins with function information using biomedical text.

    abstract:BACKGROUND:The BioCreative text mining evaluation investigated the application of text mining methods to the task of automatically extracting information from text in biomedical research articles. We participated in Task 2 of the evaluation. For this task, we built a system to automatically annotate a given protein wit...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-6-S1-S18

    authors: Ray S,Craven M

    updated:2005-01-01 00:00:00

  • The tumor as an organ: comprehensive spatial and temporal modeling of the tumor and its microenvironment.

    abstract:BACKGROUND:Research related to cancer is vast, and continues in earnest in many directions. Due to the complexity of cancer, a better understanding of tumor growth dynamics can be gleaned from a dynamic computational model. We present a comprehensive, fully executable, spatial and temporal 3D computational model of the...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-016-1168-5

    authors: Bloch N,Harel D

    updated:2016-08-24 00:00:00