Application of protein structure alignments to iterated hidden Markov model protocols for structure prediction.

Abstract:

BACKGROUND:One of the most powerful methods for the prediction of protein structure from sequence information alone is the iterative construction of profile-type models. Because profiles are built from sequence alignments, the sequences included in the alignment and the method used to align them will be important to the sensitivity of the resulting profile. The inclusion of highly diverse sequences will presumably produce a more powerful profile, but distantly related sequences can be difficult to align accurately using only sequence information. Therefore, it would be expected that the use of protein structure alignments to improve the selection and alignment of diverse sequence homologs might yield improved profiles. However, the actual utility of such an approach has remained unclear. RESULTS:We explored several iterative protocols for the generation of profile hidden Markov models. These protocols were tailored to allow the inclusion of protein structure alignments in the process, and were used for large-scale creation and benchmarking of structure alignment-enhanced models. We found that models using structure alignments did not provide an overall improvement over sequence-only models for superfamily-level structure predictions. However, the results also revealed that the structure alignment-enhanced models were complimentary to the sequence-only models, particularly at the edge of the "twilight zone". When the two sets of models were combined, they provided improved results over sequence-only models alone. In addition, we found that the beneficial effects of the structure alignment-enhanced models could not be realized if the structure-based alignments were replaced with sequence-based alignments. Our experiments with different iterative protocols for sequence-only models also suggested that simple protocol modifications were unable to yield equivalent improvements to those provided by the structure alignment-enhanced models. Finally, we found that models using structure alignments provided fold-level structure assignments that were superior to those produced by sequence-only models. CONCLUSION:When attempting to predict the structure of remote homologs, we advocate a combined approach in which both traditional models and models incorporating structure alignments are used.

journal_name

BMC Bioinformatics

journal_title

BMC bioinformatics

authors

Scheeff ED,Bourne PE

doi

10.1186/1471-2105-7-410

subject

Has Abstract

pub_date

2006-09-14 00:00:00

pages

410

issn

1471-2105

pii

1471-2105-7-410

journal_volume

7

pub_type

杂志文章
  • SitesIdentify: a protein functional site prediction tool.

    abstract:BACKGROUND:The rate of protein structures being deposited in the Protein Data Bank surpasses the capacity to experimentally characterise them and therefore computational methods to analyse these structures have become increasingly important. Identifying the region of the protein most likely to be involved in function i...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-10-379

    authors: Bray T,Chan P,Bougouffa S,Greaves R,Doig AJ,Warwicker J

    更新日期:2009-11-18 00:00:00

  • The exploration of disease-specific gene regulatory networks in esophageal carcinoma and stomach adenocarcinoma.

    abstract:BACKGROUND:Feed-forward loops (FFLs), consisting of miRNAs, transcription factors (TFs) and their common target genes, have been validated to be important for the initialization and development of complex diseases, including cancer. Esophageal Carcinoma (ESCA) and Stomach Adenocarcinoma (STAD) are two types of malignan...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-019-3230-6

    authors: Qin G,Yang L,Ma Y,Liu J,Huo Q

    更新日期:2019-12-30 00:00:00

  • Automating dChip: toward reproducible sharing of microarray data analysis.

    abstract:BACKGROUND:During the past decade, many software packages have been developed for analysis and visualization of various types of microarrays. We have developed and maintained the widely used dChip as a microarray analysis software package accessible to both biologist and data analysts. However, challenges arise when dC...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-9-231

    authors: Li C

    更新日期:2008-05-08 00:00:00

  • Mining published lists of cancer related microarray experiments: identification of a gene expression signature having a critical role in cell-cycle control.

    abstract:BACKGROUND:Routine application of gene expression microarray technology is rapidly producing large amounts of data that necessitate new approaches of analysis. The analysis of a specific microarray experiment profits enormously from cross-comparing to other experiments. This process is generally performed by numerical ...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-6-S4-S14

    authors: Finocchiaro G,Mancuso F,Muller H

    更新日期:2005-12-01 00:00:00

  • SPECS: a non-parametric method to identify tissue-specific molecular features for unbalanced sample groups.

    abstract:BACKGROUND:To understand biology and differences among various tissues or cell types, one typically searches for molecular features that display characteristic abundance patterns. Several specificity metrics have been introduced to identify tissue-specific molecular features, but these either require an equal number of...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-020-3407-z

    authors: Everaert C,Volders PJ,Morlion A,Thas O,Mestdagh P

    更新日期:2020-02-17 00:00:00

  • MRCQuant- an accurate LC-MS relative isotopic quantification algorithm on TOF instruments.

    abstract:BACKGROUND:Relative isotope abundance quantification, which can be used for peptide identification and differential peptide quantification, plays an important role in liquid chromatography-mass spectrometry (LC-MS)-based proteomics. However, several major issues exist in the relative isotopic quantification of peptides...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-12-74

    authors: Haskins WE,Petritis K,Zhang J

    更新日期:2011-03-15 00:00:00

  • MATLIGN: a motif clustering, comparison and matching tool.

    abstract:BACKGROUND:Sequence motifs representing transcription factor binding sites (TFBS) are commonly encoded as position frequency matrices (PFM) or degenerate consensus sequences (CS). These formats are used to represent the characterised TFBS profiles stored in transcription factor databases, as well as to represent the po...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-8-189

    authors: Kankainen M,Löytynoja A

    更新日期:2007-06-08 00:00:00

  • A comprehensive comparison of comparative RNA structure prediction approaches.

    abstract:BACKGROUND:An increasing number of researchers have released novel RNA structure analysis and prediction algorithms for comparative approaches to structure prediction. Yet, independent benchmarking of these algorithms is rarely performed as is now common practice for protein-folding, gene-finding and multiple-sequence-...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-5-140

    authors: Gardner PP,Giegerich R

    更新日期:2004-09-30 00:00:00

  • BLAST+: architecture and applications.

    abstract:BACKGROUND:Sequence similarity searching is a very important bioinformatics task. While Basic Local Alignment Search Tool (BLAST) outperforms exact methods through its use of heuristics, the speed of the current BLAST software is suboptimal for very long queries or database sequences. There are also some shortcomings i...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-10-421

    authors: Camacho C,Coulouris G,Avagyan V,Ma N,Papadopoulos J,Bealer K,Madden TL

    更新日期:2009-12-15 00:00:00

  • BAGEL: a computational framework for identifying essential genes from pooled library screens.

    abstract:BACKGROUND:The adaptation of the CRISPR-Cas9 system to pooled library gene knockout screens in mammalian cells represents a major technological leap over RNA interference, the prior state of the art. New methods for analyzing the data and evaluating results are needed. RESULTS:We offer BAGEL (Bayesian Analysis of Gene...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-016-1015-8

    authors: Hart T,Moffat J

    更新日期:2016-04-16 00:00:00

  • PanACEA: a bioinformatics tool for the exploration and visualization of bacterial pan-chromosomes.

    abstract:BACKGROUND:Bacterial pan-genomes, comprised of conserved and variable genes across multiple sequenced bacterial genomes, allow for identification of genomic regions that are phylogenetically discriminating or functionally important. Pan-genomes consist of large amounts of data, which can restrict researchers ability to...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-018-2250-y

    authors: Clarke TH,Brinkac LM,Inman JM,Sutton G,Fouts DE

    更新日期:2018-06-27 00:00:00

  • XenofilteR: computational deconvolution of mouse and human reads in tumor xenograft sequence data.

    abstract:BACKGROUND:Mouse xenografts from (patient-derived) tumors (PDX) or tumor cell lines are widely used as models to study various biological and preclinical aspects of cancer. However, analyses of their RNA and DNA profiles are challenging, because they comprise reads not only from the grafted human cancer but also from t...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-018-2353-5

    authors: Kluin RJC,Kemper K,Kuilman T,de Ruiter JR,Iyer V,Forment JV,Cornelissen-Steijger P,de Rink I,Ter Brugge P,Song JY,Klarenbeek S,McDermott U,Jonkers J,Velds A,Adams DJ,Peeper DS,Krijgsman O

    更新日期:2018-10-04 00:00:00

  • Logical development of the cell ontology.

    abstract:BACKGROUND:The Cell Ontology (CL) is an ontology for the representation of in vivo cell types. As biological ontologies such as the CL grow in complexity, they become increasingly difficult to use and maintain. By making the information in the ontology computable, we can use automated reasoners to detect errors and ass...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-12-6

    authors: Meehan TF,Masci AM,Abdulla A,Cowell LG,Blake JA,Mungall CJ,Diehl AD

    更新日期:2011-01-05 00:00:00

  • BicPAMS: software for biological data analysis with pattern-based biclustering.

    abstract:BACKGROUND:Biclustering has been largely applied for the unsupervised analysis of biological data, being recognised today as a key technique to discover putative modules in both expression data (subsets of genes correlated in subsets of conditions) and network data (groups of coherently interconnected biological entiti...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-017-1493-3

    authors: Henriques R,Ferreira FL,Madeira SC

    更新日期:2017-02-02 00:00:00

  • Meta-analysis of breast cancer microarray studies in conjunction with conserved cis-elements suggest patterns for coordinate regulation.

    abstract:BACKGROUND:Gene expression measurements from breast cancer (BrCa) tumors are established clinical predictive tools to identify tumor subtypes, identify patients showing poor/good prognosis, and identify patients likely to have disease recurrence. However, diverse breast cancer datasets in conjunction with diagnostic cl...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-9-63

    authors: Smith DD,Saetrom P,Snøve O Jr,Lundberg C,Rivas GE,Glackin C,Larson GP

    更新日期:2008-01-28 00:00:00

  • Inferring latent task structure for Multitask Learning by Multiple Kernel Learning.

    abstract:BACKGROUND:The lack of sufficient training data is the limiting factor for many Machine Learning applications in Computational Biology. If data is available for several different but related problem domains, Multitask Learning algorithms can be used to learn a model based on all available information. In Bioinformatics...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-11-S8-S5

    authors: Widmer C,Toussaint NC,Altun Y,Rätsch G

    更新日期:2010-10-26 00:00:00

  • TPMS: a set of utilities for querying collections of gene trees.

    abstract:BACKGROUND:The information in large collections of phylogenetic trees is useful for many comparative genomic studies. Therefore, there is a need for flexible tools that allow exploration of such collections in order to retrieve relevant data as quickly as possible. RESULTS:In this paper, we present TPMS (Tree Pattern-...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-14-109

    authors: Bigot T,Daubin V,Lassalle F,Perrière G

    更新日期:2013-03-27 00:00:00

  • Predicting the interactome of Xanthomonas oryzae pathovar oryzae for target selection and DB service.

    abstract:BACKGROUND:Protein-protein interactions (PPIs) play key roles in various cellular functions. In addition, some critical inter-species interactions such as host-pathogen interactions and pathogenicity occur through PPIs. Phytopathogenic bacteria infect hosts through attachment to host tissue, enzyme secretion, exopolysa...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-9-41

    authors: Kim JG,Park D,Kim BC,Cho SW,Kim YT,Park YJ,Cho HJ,Park H,Kim KB,Yoon KO,Park SJ,Lee BM,Bhak J

    更新日期:2008-01-24 00:00:00

  • Venn-diaNet : venn diagram based network propagation analysis framework for comparing multiple biological experiments.

    abstract:BACKGROUND:The main research topic in this paper is how to compare multiple biological experiments using transcriptome data, where each experiment is measured and designed to compare control and treated samples. Comparison of multiple biological experiments is usually performed in terms of the number of DEGs in an arbi...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-019-3302-7

    authors: Hur B,Kang D,Lee S,Moon JH,Lee G,Kim S

    更新日期:2019-12-27 00:00:00

  • Learning smoothing models of copy number profiles using breakpoint annotations.

    abstract:BACKGROUND:Many models have been proposed to detect copy number alterations in chromosomal copy number profiles, but it is usually not obvious to decide which is most effective for a given data set. Furthermore, most methods have a smoothing parameter that determines the number of breakpoints and must be chosen using v...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-14-164

    authors: Hocking TD,Schleiermacher G,Janoueix-Lerosey I,Boeva V,Cappo J,Delattre O,Bach F,Vert JP

    更新日期:2013-05-22 00:00:00

  • Correction to: Effective machine-learning assembly for next-generation amplicon sequencing with very low coverage.

    abstract::Following publication of the original article [1], the author reported that there are several errors in the original article. ...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章,已发布勘误

    doi:10.1186/s12859-019-3318-z

    authors: Ranjard L,Wong TKF,Rodrigo AG

    更新日期:2020-01-22 00:00:00

  • QPath: a method for querying pathways in a protein-protein interaction network.

    abstract:BACKGROUND:Sequence comparison is one of the most prominent tools in biological research, and is instrumental in studying gene function and evolution. The rapid development of high-throughput technologies for measuring protein interactions calls for extending this fundamental operation to the level of pathways in prote...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-7-199

    authors: Shlomi T,Segal D,Ruppin E,Sharan R

    更新日期:2006-04-10 00:00:00

  • REW-ISA: unveiling local functional blocks in epi-transcriptome profiling data via an RNA expression-weighted iterative signature algorithm.

    abstract:BACKGROUND:Recent studies have shown that N6-methyladenosine (m6A) plays a critical role in numbers of biological processes and complex human diseases. However, the regulatory mechanisms of most methylation sites remain uncharted. Thus, in-depth study of the epi-transcriptomic patterns of m6A may provide insights into ...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-020-03787-w

    authors: Zhang L,Chen S,Zhu J,Meng J,Liu H

    更新日期:2020-10-09 00:00:00

  • An integrative method to normalize RNA-Seq data.

    abstract:BACKGROUND:Transcriptome sequencing is a powerful tool for measuring gene expression, but as well as some other technologies, various artifacts and biases affect the quantification. In order to correct some of them, several normalization approaches have emerged, differing both in the statistical strategy employed and i...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-15-188

    authors: Filloux C,Cédric M,Romain P,Lionel F,Christophe K,Dominique R,Abderrahman M,Daniel P

    更新日期:2014-06-14 00:00:00

  • A computational diffusion model to study antibody transport within reconstructed tumor microenvironments.

    abstract:BACKGROUND:Antibodies revolutionized cancer treatment over the past decades. Despite their successfully application, there are still challenges to overcome to improve efficacy, such as the heterogeneous distribution of antibodies within tumors. Tumor microenvironment features, such as the distribution of tumor and othe...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-020-03854-2

    authors: Cartaxo AL,Almeida J,Gualda EJ,Marsal M,Loza-Alvarez P,Brito C,Isidro IA

    更新日期:2020-11-17 00:00:00

  • On pairwise distances and median score of three genomes under DCJ.

    abstract::In comparative genomics, the rearrangement distance between two genomes (equal the minimal number of genome rearrangements required to transform them into a single genome) is often used for measuring their evolutionary remoteness. Generalization of this measure to three genomes is known as the median score (while a re...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-13-S19-S1

    authors: Aganezov S Jr,Alekseyev MA

    更新日期:2012-01-01 00:00:00

  • Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments.

    abstract:BACKGROUND:High-throughput sequencing technologies, such as the Illumina Genome Analyzer, are powerful new tools for investigating a wide range of biological and medical questions. Statistical and computational methods are key for drawing meaningful and accurate conclusions from the massive and complex datasets generat...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-11-94

    authors: Bullard JH,Purdom E,Hansen KD,Dudoit S

    更新日期:2010-02-18 00:00:00

  • Predicting domain-domain interaction based on domain profiles with feature selection and support vector machines.

    abstract:BACKGROUND:Protein-protein interaction (PPI) plays essential roles in cellular functions. The cost, time and other limitations associated with the current experimental methods have motivated the development of computational methods for predicting PPIs. As protein interactions generally occur via domains instead of the ...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-11-537

    authors: González AJ,Liao L

    更新日期:2010-10-29 00:00:00

  • PFClust: a novel parameter free clustering algorithm.

    abstract:BACKGROUND:We present the algorithm PFClust (Parameter Free Clustering), which is able automatically to cluster data and identify a suitable number of clusters to group them into without requiring any parameters to be specified by the user. The algorithm partitions a dataset into a number of clusters that share some co...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-14-213

    authors: Mavridis L,Nath N,Mitchell JB

    更新日期:2013-07-03 00:00:00

  • scDC: single cell differential composition analysis.

    abstract:BACKGROUND:Differences in cell-type composition across subjects and conditions often carry biological significance. Recent advancements in single cell sequencing technologies enable cell-types to be identified at the single cell level, and as a result, cell-type composition of tissues can now be studied in exquisite de...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-019-3211-9

    authors: Cao Y,Lin Y,Ormerod JT,Yang P,Yang JYH,Lo KK

    更新日期:2019-12-24 00:00:00