Abstract:
BACKGROUND: Leishmania and other members of the Trypanosomatidae family diverged early in eukaryotic evolution and consequently display unique cellular properties. Their apparent lack of transcriptional regulation is compensated by complex post-transcriptional control mechanisms, including the processing of polycistronic transcripts by means of coupled trans-splicing and polyadenylation. Trans-splicing signals are often U-rich polypyrimidine (poly(Y)) tracts, which precede AG splice acceptor sites. However, in contrast to higher eukaryotes, there is no consensus polyadenylation signal in trypanosomatid mRNAs.
RESULTS: We refined a previously reported method to target 5' splice junctions by incorporating the pyrimidine content of query sequences into a scoring function. We also investigated a novel approach for predicting polyadenylation (poly(A)) sites in silico, by comparing query sequences to polyadenylated expressed sequence tags (ESTs) using position-specific scanning matrices (PSSMs). An additional analysis of the distribution of putative splice junction to poly(A) distances helped to increase prediction rates by limiting the scanning range. These methods simplified splice junction prediction without loss of precision and increased the rate of polyadenylation site prediction within 100 nucleotides from 22% to 47%.
CONCLUSION: We propose a simplified trans-splicing prediction tool and a novel poly(A) prediction tool based on comparative sequence analysis. We discuss the impact of certain regions surrounding the poly(A) sites on prediction rates and consider biological mechanisms that may correlate with them. This work aims to sharpen the identification of potentially functional untranslated regions (UTRs) in a large-scale, comparative genomics framework.
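As a rough illustration of the idea of folding pyrimidine content into a splice-site score, the sketch below ranks candidate AG acceptor dinucleotides by the fraction of pyrimidines in a fixed window immediately upstream. It is not the scoring function used in the paper: the function names, the 20-nucleotide window, and the use of a plain pyrimidine fraction as the score are all assumptions made for this example.

```python
# Illustrative sketch only; NOT the scoring function from the paper.
# Names, window size, and the scoring rule are assumptions.
PYRIMIDINES = frozenset("CTU")


def pyrimidine_fraction(seq: str) -> float:
    """Return the fraction of C/T/U bases in seq (0.0 for an empty string)."""
    if not seq:
        return 0.0
    return sum(base in PYRIMIDINES for base in seq.upper()) / len(seq)


def rank_ag_sites(seq: str, window: int = 20) -> list[tuple[int, float]]:
    """Score every AG dinucleotide by the pyrimidine content upstream of it.

    Only AG sites with a full `window` of upstream sequence are considered.
    Returns (position, score) pairs sorted best first, where position is
    the 0-based index of the 'A' in the AG dinucleotide.
    """
    seq = seq.upper()
    hits = []
    for i in range(window, len(seq) - 1):
        if seq[i:i + 2] == "AG":
            hits.append((i, pyrimidine_fraction(seq[i - window:i])))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

Calling `rank_ag_sites(sequence)` on an intergenic region would list AG sites preceded by pyrimidine-rich stretches first. A realistic predictor would also need to weigh tract length, the distance between the tract and the AG, and the position relative to the downstream coding sequence.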
journal_name: BMC Bioinformatics
journal_title: BMC bioinformatics
authors: Smith M, Blanchette M, Papadopoulou B
doi: 10.1186/1471-2105-9-158
subject: Has Abstract
pub_date: 2008-03-20 00:00:00
pages: 158
issn: 1471-2105
pii: 1471-2105-9-158
journal_volume: 9
pub_type: Journal article

abstract:BACKGROUND:Conservation and variation scores are used when evaluating sites in a multiple sequence alignment, in order to identify residues critical for structure or function. A variety of scores are available today but it is not clear how different scores relate to each other. RESULTS:We applied 25 conservation and v...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-11-388
update_date:2010-07-21 00:00:00
abstract:BACKGROUND:Time-lapse analysis of cellular images is an important and growing need in biology. Algorithms for cell tracking are widely available; what researchers have been missing is a single open-source software package to visualize standard tracking output (from software like CellProfiler) in a way that allows conve...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-015-0759-x
update_date:2015-11-04 00:00:00
abstract:BACKGROUND:Graph theory provides a computational framework for modeling a variety of datasets including those emerging from genomics, proteomics, and chemical genetics. Networks of genes, proteins, small molecules, or other objects of study can be represented as graphs of nodes (vertices) and interactions (edges) that ...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-6-260
update_date:2005-10-19 00:00:00
abstract:BACKGROUND:Differences in cell-type composition across subjects and conditions often carry biological significance. Recent advancements in single cell sequencing technologies enable cell-types to be identified at the single cell level, and as a result, cell-type composition of tissues can now be studied in exquisite de...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-019-3211-9
update_date:2019-12-24 00:00:00
abstract:BACKGROUND:The exponential growth of gigantic biological data from various sources, such as protein-protein interaction (PPI), genome sequences scaffolding, Mass spectrometry (MS) molecular networking and metabolic flux, demands an efficient way for better visualization and interpretation beyond the conventional, two-d...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-14-322
update_date:2013-11-14 00:00:00
abstract:BACKGROUND:DNA methylation patterns store epigenetic information in the vast majority of eukaryotic species. The relatively high costs and technical challenges associated with the detection of DNA methylation however have created a bias in the number of methylation studies towards model organisms. Consequently, it rema...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-018-2115-4
update_date:2018-03-27 00:00:00
abstract:BACKGROUND:SARS coronavirus main proteinase (SARS CoVMpro) is an important enzyme for the replication of Severe Acute Respiratory Syndrome virus. The active site region of SARS CoVMpro is divided into 8 subsites. Understanding the binding mode of SARS CoVMpro with a specific substrate is useful and contributes to struc...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-10-S1-S48
update_date:2009-01-30 00:00:00
abstract:BACKGROUND:Sequence motifs representing transcription factor binding sites (TFBS) are commonly encoded as position frequency matrices (PFM) or degenerate consensus sequences (CS). These formats are used to represent the characterised TFBS profiles stored in transcription factor databases, as well as to represent the po...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-8-189
update_date:2007-06-08 00:00:00
abstract:BACKGROUND:Homology search is still a significant step in functional analysis for genomic data. Profile Hidden Markov Model-based homology search has been widely used in protein domain analysis in many different species. In particular, with the fast accumulation of transcriptomic data of non-model species and metagenom...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-017-1826-2
update_date:2017-10-16 00:00:00
abstract:BACKGROUND:In recent years, a considerable amount of research effort has been directed to the analysis of biological networks with the availability of genome-scale networks of genes and/or proteins of an increasing number of organisms. A protein-protein interaction (PPI) network is a particular biological network which...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-9-276
update_date:2008-06-11 00:00:00
abstract:BACKGROUND:Comparison of metabolic networks is typically performed based on the organisms' enzyme contents. This approach disregards functional replacements as well as orthologies that are misannotated. Direct comparison of the structure of metabolic networks can circumvent these problems. RESULTS:Metabolic networks a...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-7-67
update_date:2006-02-14 00:00:00
abstract:Transcript quantification is a long-standing problem in genomics and estimating the relative abundance of alternatively-spliced isoforms from the same transcript is an important special case. Both problems have recently been illuminated by high-throughput RNA sequencing experiments which are quickly generating large a...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-13-S6-S11
update_date:2012-04-19 00:00:00
abstract:BACKGROUND:Study on long non-coding RNAs (lncRNAs) has been promoted by high-throughput RNA sequencing (RNA-Seq). However, it is still not trivial to identify lncRNAs from the RNA-Seq data and it remains a challenge to uncover their functions. RESULTS:We present a computational pipeline for detecting novel lncRNAs fro...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-13-331
update_date:2012-12-13 00:00:00
abstract:BACKGROUND:Histopathology image analysis is a gold standard for cancer recognition and diagnosis. Automatic analysis of histopathology images can help pathologists diagnose tumor and cancer subtypes, alleviating the workload of pathologists. There are two basic types of tasks in digital histopathology image analysis: i...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-017-1685-x
update_date:2017-05-26 00:00:00
abstract:BACKGROUND:An increasing number of bioinformatics methods are considering the phylogenetic relationships between biological sequences. Implementing new methodologies using the maximum likelihood phylogenetic framework can be a time consuming task. RESULTS:The bioinformatics library libcov is a collection of C++ classe...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-6-138
update_date:2005-06-06 00:00:00
abstract:BACKGROUND:Deep mutational scanning is a technique to estimate the impacts of mutations on a gene by using deep sequencing to count mutations in a library of variants before and after imposing a functional selection. The impacts of mutations must be inferred from changes in their counts after selection. RESULTS:I desc...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-015-0590-4
update_date:2015-05-20 00:00:00
abstract:BACKGROUND:Bistability and ability to switch between two stable states is the hallmark of cellular responses. Cellular signaling pathways often contain bistable switches that regulate the transmission of the extracellular information to the nucleus where important biological functions are executed. RESULTS:In this wor...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-019-3155-0
update_date:2019-11-28 00:00:00
abstract:BACKGROUND:Detection of periodically expressed genes from microarray data without use of known periodic and non-periodic training examples is an important problem, e.g. for identifying genes regulated by the cell-cycle in poorly characterised organisms. Commonly the investigator is only interested in genes expressed at...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-7-63
update_date:2006-02-09 00:00:00
abstract:BACKGROUND:High-throughput screens comparing growth rates of arrays of distinct micro-organism cultures on solid agar are useful, rapid methods of quantifying genetic interactions. Growth rate is an informative phenotype which can be estimated by measuring cell densities at one or more times after inoculation. Precise ...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-11-287
update_date:2010-05-28 00:00:00
abstract:BACKGROUND:DNA methylation changes are associated with a wide array of biological processes. Bisulfite conversion of DNA followed by high-throughput sequencing is increasingly being used to assess genome-wide methylation at single-base resolution. The relative slowness of most commonly used aligners for processing such...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-15-337
update_date:2014-10-18 00:00:00
abstract:BACKGROUND:The identification of statistically overrepresented sequences in the upstream regions of coregulated genes should theoretically permit the identification of potential cis-regulatory elements. However, in practice many cis-regulatory elements are highly degenerate, precluding the use of an exhaustive word-cou...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-7-254
update_date:2006-05-15 00:00:00
abstract:BACKGROUND:GmrSD is a modification-dependent restriction endonuclease that specifically targets and cleaves glucosylated hydroxymethylcytosine (glc-HMC) modified DNA. It is encoded either as two separate single-domain GmrS and GmrD proteins or as a single protein carrying both domains. Previous studies suggested that G...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-015-0773-z
update_date:2015-10-23 00:00:00
abstract:BACKGROUND:Cluster analysis is the most common unsupervised method for finding hidden groups in data. Clustering presents two main challenges: (1) finding the optimal number of clusters, and (2) removing "outliers" among the objects being clustered. Few clustering algorithms currently deal directly with the outlier pro...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-017-1998-9
update_date:2018-01-08 00:00:00
abstract:BACKGROUND:Dystrophinopathy is one of the most common human monogenic diseases which results in Duchenne muscular dystrophy (DMD) and Becker muscular dystrophy (BMD). Mutations in the dystrophin gene are responsible for both DMD and BMD. However, the clinical phenotypes and treatments are quite different in these two m...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-017-1504-4
update_date:2017-02-02 00:00:00
abstract:BACKGROUND:Current malaria diagnosis relies primarily on microscopic examination of Giemsa-stained thick and thin blood films. This method requires vigorously trained technicians to efficiently detect and classify the malaria parasite species such as Plasmodium falciparum (Pf) and Plasmodium vivax (Pv) for an appropria...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-13-S17-S18
update_date:2012-01-01 00:00:00
abstract:BACKGROUND:The number of protein structures from structural genomics centers dramatically increases in the Protein Data Bank (PDB). Many of these structures are functionally unannotated because they have no sequence similarity to proteins of known function. However, it is possible to successfully infer function using o...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-7-53
update_date:2006-02-06 00:00:00
abstract:BACKGROUND:Recently, revealing the function of proteins with protein-protein interaction (PPI) networks has been regarded as one of the important issues in bioinformatics. With the development of experimental methods such as the yeast two-hybrid method, protein interaction data have been increasing rapidly. Many databas...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-12-S1-S39
update_date:2011-02-15 00:00:00
abstract:BACKGROUND:To understand the evolutionary role of Lateral Gene Transfer (LGT), accurate methods are needed to identify transferred genes and infer their timing of acquisition. Phylogenetic methods are particularly promising for this purpose, but the reconciliation of a gene tree with a reference (species) tree is compu...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-11-324
update_date:2010-06-15 00:00:00
abstract:BACKGROUND:Glycation is one of the post-translational modifications (PTMs) where sugar molecules and residues in protein sequences are covalently bonded. It has become one of the clinically important PTMs in recent times, attributed to many chronic and age-related complications. Being a non-enzymatic reaction, it is a g...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-018-2547-x
update_date:2019-02-04 00:00:00
abstract:BACKGROUND:High throughput experiments have generated a significantly large amount of protein interaction data, which is being used to study protein networks. Studying complete protein networks can reveal more insight about healthy/disease states than studying proteins in isolation. Similarly, a comparative study of pr...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-020-03827-5
update_date:2020-11-04 00:00:00