MATLIGN: a motif clustering, comparison and matching tool.

Abstract:

BACKGROUND: Sequence motifs representing transcription factor binding sites (TFBS) are commonly encoded as position frequency matrices (PFM) or degenerate consensus sequences (CS). These formats are used to represent the characterised TFBS profiles stored in transcription factor databases, as well as the potential motifs predicted using computational methods. To fill the gap between the known and predicted motifs, methods are needed for the post-processing of prediction results, i.e. for matching, comparison and clustering of pre-selected motifs. The computational identification of over-represented motifs in sets of DNA sequences is, in particular, a task where post-processing can dramatically simplify the analysis. Efficient post-processing, for example, reduces the redundancy of the predicted motifs and enables them to be annotated.

RESULTS: In order to facilitate the post-processing of motifs, in both PFM and CS formats, we have developed a tool called Matlign. The tool aligns and evaluates the similarity of motifs using a combination of scoring functions, and visualises the results using hierarchical clustering. By limiting the number of distinct gaps created (though not their length), the alignment algorithm also correctly aligns motifs with an internal spacer. The method selects the best non-redundant motif set, with repetitive motifs merged together, by cutting the hierarchical tree using silhouette values. Our analyses show that Matlign can reliably discover the most similar analogue from a collection of characterised regulatory elements, so the method is also useful for annotating motif predictions by PFM library searches.

CONCLUSION: Matlign is a user-friendly tool for post-processing large collections of DNA sequence motifs. Starting from a large number of potential regulatory motifs, Matlign provides a researcher with a non-redundant set of motifs, which can then be further associated with known regulatory elements.

A web-server is available at http://ekhidna.biocenter.helsinki.fi/poxo/matlign.
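
The pipeline outlined in the abstract (convert CS to PFM, score pairwise alignments, cluster hierarchically, cut the tree by silhouette values) can be illustrated with a short Python sketch. This is not Matlign's implementation. It assumes an ungapped sliding alignment, a single squared-Euclidean column distance and average linkage, whereas the abstract describes a gapped alignment and a combination of scoring functions. All function names and the `min_overlap` parameter are hypothetical.

```python
from itertools import combinations

# Standard IUPAC degenerate nucleotide codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def consensus_to_pfm(cs):
    """Convert a consensus string to per-column frequencies over A, C, G, T."""
    return [[1.0 / len(IUPAC[ch]) if b in IUPAC[ch] else 0.0 for b in "ACGT"]
            for ch in cs.upper()]

def column_distance(p, q):
    """Squared Euclidean distance between two frequency columns (assumed metric)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def motif_distance(m1, m2, min_overlap=3):
    """Smallest mean column distance over all ungapped offsets (simplification)."""
    min_overlap = min(min_overlap, len(m1), len(m2))
    best = float("inf")
    for offset in range(min_overlap - len(m2), len(m1) - min_overlap + 1):
        pairs = [(m1[i], m2[i - offset]) for i in range(len(m1))
                 if 0 <= i - offset < len(m2)]
        mean = sum(column_distance(p, q) for p, q in pairs) / len(pairs)
        best = min(best, mean)
    return best

def average_link(dist, c1, c2):
    """Average pairwise distance between the members of two clusters."""
    return sum(dist[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

def silhouette(dist, clusters):
    """Mean silhouette width of a partition; singleton members score 0."""
    total, n = 0.0, 0
    for ci, c in enumerate(clusters):
        for i in c:
            n += 1
            if len(c) == 1:
                continue
            a = sum(dist[i][j] for j in c if j != i) / (len(c) - 1)
            b = min(sum(dist[i][j] for j in o) / len(o)
                    for oi, o in enumerate(clusters) if oi != ci)
            if max(a, b) > 0:
                total += (b - a) / max(a, b)
    return total / n

def cluster_motifs(motifs):
    """Merge clusters bottom-up and keep the partition with the best silhouette."""
    n = len(motifs)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        dist[i][j] = dist[j][i] = motif_distance(motifs[i], motifs[j])
    clusters = [[i] for i in range(n)]
    best_score, best = 0.0, [list(c) for c in clusters]  # all singletons score 0
    while len(clusters) > 2:
        x, y = min(combinations(range(len(clusters)), 2),
                   key=lambda p: average_link(dist, clusters[p[0]], clusters[p[1]]))
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
        score = silhouette(dist, clusters)
        if score > best_score:
            best_score, best = score, [list(c) for c in clusters]
    return best

# Illustrative input: two AP-1-like and two E-box-like consensus strings.
names = ["TGACTCA", "TGASTCA", "CACGTG", "CACGTGG"]
groups = cluster_motifs([consensus_to_pfm(s) for s in names])
print([[names[i] for i in g] for g in groups])
```

Taking the mean distance over the overlapping columns keeps the score comparable across motifs of different lengths, but it can favour short overlaps. A production tool would need to penalise overhangs and handle internal spacers, which this sketch does not attempt.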

journal_name

BMC Bioinformatics

authors

Kankainen M, Löytynoja A

doi

10.1186/1471-2105-8-189

subject

Has Abstract

pub_date

2007-06-08

pages

189

issn

1471-2105

pii

1471-2105-8-189

journal_volume

pub_type

Journal Article
