Local search for the generalized tree alignment problem.

Abstract:

BACKGROUND: A phylogeny postulates shared ancestry relationships among organisms in the form of a binary tree. Phylogenies attempt to answer an important question posed in biology: what are the ancestor-descendant relationships between organisms? At the core of every biological problem lies a phylogenetic component. The patterns that can be observed in nature are the product of complex interactions, constrained by the template that our ancestors provide. The problem of simultaneous tree and alignment estimation under Maximum Parsimony is known in combinatorial optimization as the Generalized Tree Alignment Problem (GTAP). The GTAP is the Steiner Tree Problem for the sequence edit distance, whereas the Steiner Tree Problem is typically presented under the Manhattan or the Hamming distance. Like many biologically interesting problems, the GTAP is NP-Hard.

RESULTS: The accuracy of the GTAP has been evaluated experimentally. Results show that phylogenies selected using the GTAP from unaligned sequences are competitive with the best methods and algorithms available. Here, we implement and experimentally explore existing and new local search heuristics for the GTAP using simulated and real data.

CONCLUSIONS: When applied to real data, the methods presented here improve execution time by more than three orders of magnitude over the best local search heuristics existing to date.
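
The objective described in the abstract can be illustrated with a small sketch. It scores one fixed candidate solution, that is, a tree whose internal vertices have already been assigned sequences, as the sum of edit distances along its edges. The GTAP itself searches over both trees and internal sequences, which this sketch does not attempt. Unit costs for insertion, deletion, and substitution are an assumption, and the names edit_distance and tree_cost are hypothetical, not taken from the paper.

```python
from typing import Dict, List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Unit-cost edit distance (assumed cost scheme), computed row by row."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # deleting the first i characters of a
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # match or substitute
            ))
        prev = curr
    return prev[-1]


def tree_cost(edges: List[Tuple[str, str]], sequences: Dict[str, str]) -> int:
    """Sum of edit distances over all edges of a tree with labeled vertices."""
    return sum(edit_distance(sequences[u], sequences[v]) for u, v in edges)


# Three observed sequences (leaves) joined through one internal vertex "x".
# The sequence assigned to "x" is a candidate, not a proven optimum.
sequences = {"a": "ACGT", "b": "AGT", "c": "ACT", "x": "ACGT"}
edges = [("x", "a"), ("x", "b"), ("x", "c")]
cost = tree_cost(edges, sequences)
```

A local search heuristic would repeatedly perturb the tree or the internal sequences and keep a change whenever tree_cost decreases.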

journal_name

BMC Bioinformatics

authors

Varón A,Wheeler WC

doi

10.1186/1471-2105-14-66

pub_date

2013-02-26 00:00:00

pages

66

issn

1471-2105

pii

1471-2105-14-66

journal_volume

14

pub_type

Journal Article
  • The textual characteristics of traditional and Open Access scientific journals are similar.

    abstract:BACKGROUND:Recent years have seen an increased amount of natural language processing (NLP) work on full text biomedical journal publications. Much of this work is done with Open Access journal articles. Such work assumes that Open Access articles are representative of biomedical publications in general and that methods...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-10-183

    authors: Verspoor K,Cohen KB,Hunter L

    update_date:2009-06-15 00:00:00

  • Predicting and improving the protein sequence alignment quality by support vector regression.

    abstract:BACKGROUND:For successful protein structure prediction by comparative modeling, in addition to identifying a good template protein with known structure, obtaining an accurate sequence alignment between a query protein and a template protein is critical. It has been known that the alignment accuracy can vary significant...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-8-471

    authors: Lee M,Jeong CS,Kim D

    update_date:2007-12-03 00:00:00

  • DNAscan: personal computer compatible NGS analysis, annotation and visualisation.

    abstract:BACKGROUND:Next Generation Sequencing (NGS) is a commonly used technology for studying the genetic basis of biological processes and it underpins the aspirations of precision medicine. However, there are significant challenges when dealing with NGS data. Firstly, a huge number of bioinformatics tools for a wide range o...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-2791-8

    authors: Iacoangeli A,Al Khleifat A,Sproviero W,Shatunov A,Jones AR,Morgan SL,Pittman A,Dobson RJ,Newhouse SJ,Al-Chalabi A

    update_date:2019-04-27 00:00:00

  • A mixture of feature experts approach for protein-protein interaction prediction.

    abstract:BACKGROUND:High-throughput methods can directly detect the set of interacting proteins in model species but the results are often incomplete and exhibit high false positive and false negative rates. A number of researchers have recently presented methods for integrating direct and indirect data for predicting interacti...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-8-S10-S6

    authors: Qi Y,Klein-Seetharaman J,Bar-Joseph Z

    update_date:2007-01-01 00:00:00

  • Homology induction: the use of machine learning to improve sequence similarity searches.

    abstract:BACKGROUND:The inference of homology between proteins is a key problem in molecular biology. The current best approaches only identify approximately 50% of homologies (with a false positive rate set at 1/1000). RESULTS:We present Homology Induction (HI), a new approach to inferring homology. HI uses machine learning to...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-3-11

    authors: Karwath A,King RD

    update_date:2002-04-23 00:00:00

  • A novel approach for predicting protein S-glutathionylation.

    abstract:BACKGROUND:S-glutathionylation is the formation of disulfide bonds between the tripeptide glutathione and cysteine residues of the protein, protecting them from irreversible oxidation and in some cases causing change in their functions. Regulatory glutathionylation of proteins is a controllable and reversible process a...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03571-w

    authors: Anashkina AA,Poluektov YM,Dmitriev VA,Kuznetsov EN,Mitkevich VA,Makarov AA,Petrushanko IY

    update_date:2020-09-14 00:00:00

  • Evaluating methods of inferring gene regulatory networks highlights their lack of performance for single cell gene expression data.

    abstract:BACKGROUND:A fundamental fact in biology states that genes do not operate in isolation, and yet, methods that infer regulatory networks for single cell gene expression data have been slow to emerge. With single cell sequencing methods now becoming accessible, general network inference algorithms that were initially dev...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-018-2217-z

    authors: Chen S,Mar JC

    update_date:2018-06-19 00:00:00

  • AntiBP2: improved version of antibacterial peptide prediction.

    abstract:BACKGROUND:Antibacterial peptides are one of the effector molecules of the innate immune system. Over the last few decades several antibacterial peptides have been successfully approved as drugs by the FDA, which has prompted an interest in these antibacterial peptides. In our recent study we analyzed 999 antibacterial peptides, whi...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-S1-S19

    authors: Lata S,Mishra NK,Raghava GP

    update_date:2010-01-18 00:00:00

  • CLU: a new algorithm for EST clustering.

    abstract:BACKGROUND:The continuous flow of EST data remains one of the richest sources for discoveries in modern biology. The first step in EST data mining is usually associated with EST clustering, the process of grouping of original fragments according to their annotation, similarity to known genomic DNA or each other. Cluste...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-6-S2-S3

    authors: Ptitsyn A,Hide W

    update_date:2005-07-15 00:00:00

  • Development and tuning of an original search engine for patent libraries in medicinal chemistry.

    abstract:BACKGROUND:The large increase in the size of patent collections has led to the need of efficient search strategies. But the development of advanced text-mining applications dedicated to patents of the biomedical field remains rare, in particular to address the needs of the pharmaceutical & biotech industry, which inten...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-15-S1-S15

    authors: Pasche E,Gobeill J,Kreim O,Oezdemir-Zaech F,Vachon T,Lovis C,Ruch P

    update_date:2014-01-01 00:00:00

  • Statistical shape analysis of tap roots: a methodological case study on laser scanned sugar beets.

    abstract:BACKGROUND:The efficient and robust statistical analysis of the shape of plant organs of different cultivars is an important investigation issue in plant breeding and enables a robust cultivar description within the breeding progress. Laserscanning is a highly accurate and high resolution technique to acquire the 3D sh...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03654-8

    authors: Heeren B,Paulus S,Goldbach H,Kuhlmann H,Mahlein AK,Rumpf M,Wirth B

    update_date:2020-07-29 00:00:00

  • Evaluation of methods for differential expression analysis on multi-group RNA-seq count data.

    abstract:BACKGROUND:RNA-seq is a powerful tool for measuring transcriptomes, especially for identifying differentially expressed genes or transcripts (DEGs) between sample groups. A number of methods have been developed for this task, and several evaluation studies have also been reported. However, those evaluations so far have...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-015-0794-7

    authors: Tang M,Sun J,Shimizu K,Kadota K

    update_date:2015-11-04 00:00:00

  • Redundancy in electronic health record corpora: analysis, impact on text mining performance and mitigation strategies.

    abstract:BACKGROUND:The increasing availability of Electronic Health Record (EHR) data and specifically free-text patient notes presents opportunities for phenotype extraction. Text-mining methods in particular can help disease modeling by mapping named-entities mentions to terminologies and clustering semantically related term...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-14-10

    authors: Cohen R,Elhadad M,Elhadad N

    update_date:2013-01-16 00:00:00

  • Enhanced CellClassifier: a multi-class classification tool for microscopy images.

    abstract:BACKGROUND:Light microscopy is of central importance in cell biology. The recent introduction of automated high content screening has expanded this technology towards automation of experiments and performing large scale perturbation assays. Nevertheless, evaluation of microscopy data continues to be a bottleneck in man...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-30

    authors: Misselwitz B,Strittmatter G,Periaswamy B,Schlumberger MC,Rout S,Horvath P,Kozak K,Hardt WD

    update_date:2010-01-14 00:00:00

  • Prediction of heart disease and classifiers' sensitivity analysis.

    abstract:BACKGROUND:Heart disease (HD) is one of the most common diseases nowadays, and an early diagnosis of such a disease is a crucial task for many health care providers to prevent their patients for such a disease and to save lives. In this paper, a comparative analysis of different classifiers was performed for the classi...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03626-y

    authors: Almustafa KM

    update_date:2020-07-02 00:00:00

  • Comparative study of discretization methods of microarray data for inferring transcriptional regulatory networks.

    abstract:BACKGROUND:Microarray data discretization is a basic preprocess for many algorithms of gene regulatory network inference. Some common discretization methods in informatics are used to discretize microarray data. Selection of the discretization method is often arbitrary and no systematic comparison of different discreti...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-520

    authors: Li Y,Liu L,Bai X,Cai H,Ji W,Guo D,Zhu Y

    update_date:2010-10-19 00:00:00

  • Using mechanistic Bayesian networks to identify downstream targets of the sonic hedgehog pathway.

    abstract:BACKGROUND:The topology of a biological pathway provides clues as to how a pathway operates, but rationally using this topology information with observed gene expression data remains a challenge. RESULTS:We introduce a new general-purpose analytic method called Mechanistic Bayesian Networks (MBNs) that allows for the ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-10-433

    authors: Shah A,Tenzen T,McMahon AP,Woolf PJ

    update_date:2009-12-18 00:00:00

  • Normalized N50 assembly metric using gap-restricted co-linear chaining.

    abstract:BACKGROUND:For the development of genome assembly tools, some comprehensive and efficiently computable validation measures are required to assess the quality of the assembly. The mostly used N50 measure summarizes the assembly results by the length of the scaffold (or contig) overlapping the midpoint of the length-orde...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-13-255

    authors: Mäkinen V,Salmela L,Ylinen J

    update_date:2012-10-03 00:00:00

  • Multi-view feature selection for identifying gene markers: a diversified biological data driven approach.

    abstract:BACKGROUND:In recent years, to investigate challenging bioinformatics problems, the utilization of multiple genomic and proteomic sources has become immensely popular among researchers. One such issue is feature or gene selection and identifying relevant and non-redundant marker genes from high dimensional gene express...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03810-0

    authors: Acharya S,Cui L,Pan Y

    update_date:2020-12-30 00:00:00

  • VIO: ontology classification and study of vaccine responses given various experimental and analytical conditions.

    abstract:BACKGROUND:Different human responses to the same vaccine were frequently observed. For example, independent studies identified overlapping but different transcriptomic gene expression profiles in Yellow Fever vaccine 17D (YF-17D) immunized human subjects. Different experimental and analysis conditions were likely contr...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-3194-6

    authors: Ong E,Sun P,Berke K,Zheng J,Wu G,He Y

    update_date:2019-12-23 00:00:00

  • The G protein-coupled receptors in the pufferfish Takifugu rubripes.

    abstract:BACKGROUND:Guanine protein-coupled receptors (GPCRs) constitute a eukaryotic transmembrane protein family and function as "molecular switches" in the second messenger cascades and are found in all organisms between yeast and humans. They form the single, biggest drug-target family due to their versatility of action and...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-12-S1-S3

    authors: Sarkar A,Kumar S,Sundar D

    update_date:2011-02-15 00:00:00

  • Identification of sequence motifs significantly associated with antisense activity.

    abstract:BACKGROUND:Predicting the suppression activity of antisense oligonucleotide sequences is the main goal of the rational design of nucleic acids. To create an effective predictive model, it is important to know what properties of an oligonucleotide sequence associate significantly with antisense activity. Also, for the m...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-8-184

    authors: McQuisten KA,Peek AS

    update_date:2007-06-07 00:00:00

  • GO2MSIG, an automated GO based multi-species gene set generator for gene set enrichment analysis.

    abstract:BACKGROUND:Despite the widespread use of high throughput expression platforms and the availability of a desktop implementation of Gene Set Enrichment Analysis (GSEA) that enables non-experts to perform gene set based analyses, the availability of the necessary precompiled gene sets is rare for species other than human....

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-15-146

    authors: Powell JA

    update_date:2014-05-17 00:00:00

  • Advances in translational bioinformatics facilitate revealing the landscape of complex disease mechanisms.

    abstract:Advances of high-throughput technologies have rapidly produced more and more data from DNAs and RNAs to proteins, especially large volumes of genome-scale data. However, connection of the genomic information to cellular functions and biological behaviours relies on the development of effective approaches at higher sys...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-15-S17-I1

    authors: Yang JY,Dunker A,Liu JS,Qin X,Arabnia HR,Yang W,Niemierko A,Chen Z,Luo Z,Wang L,Liu Y,Xu D,Deng Y,Tong W,Yang M

    update_date:2014-01-01 00:00:00

  • Alternative mapping of probes to genes for Affymetrix chips.

    abstract:BACKGROUND:Short oligonucleotide arrays have several probes measuring the expression level of each target transcript. Therefore the selection of probes is a key component for the quality of measurements. However, once probes have been selected and synthesized on an array, it is still possible to re-evaluate the results...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-5-111

    authors: Gautier L,Møller M,Friis-Hansen L,Knudsen S

    update_date:2004-08-14 00:00:00

  • CellSim: a novel software to calculate cell similarity and identify their co-regulation networks.

    abstract:BACKGROUND:Cell direct reprogramming technology has developed rapidly owing to its low tumor risk and its avoidance of the ethical issues raised by stem cells, but it is still limited to specific cell types. Direct reprogramming from an original cell to target cell type needs the cell similarity and cell specific regu...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-2699-3

    authors: Li L,Che D,Wang X,Zhang P,Rahman SU,Zhao J,Yu J,Tao S,Lu H,Liao M

    update_date:2019-03-04 00:00:00

  • Measuring similarities between transcription factor binding sites.

    abstract:BACKGROUND:Collections of transcription factor binding profiles (Transfac, Jaspar) are essential to identify regulatory elements in DNA sequences. Subsets of highly similar profiles complicate large scale analysis of transcription factor binding sites. RESULTS:We propose to identify and group similar profiles using tw...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-6-237

    authors: Kielbasa SM,Gonze D,Herzel H

    update_date:2005-09-28 00:00:00

  • Application of whole genome data for in silico evaluation of primers and probes routinely employed for the detection of viral species by RT-qPCR using dengue virus as a case study.

    abstract:BACKGROUND:Viral infection by dengue virus is a major public health problem in tropical countries. Early diagnosis and detection are increasingly based on quantitative reverse transcriptase real-time polymerase chain reaction (RT-qPCR) directed against genomic regions conserved between different isolates. Genetic varia...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-018-2313-0

    authors: Vanneste K,Garlant L,Broeders S,Van Gucht S,Roosens NH

    update_date:2018-09-04 00:00:00

  • PFBNet: a priori-fused boosting method for gene regulatory network inference.

    abstract:BACKGROUND:Inferring gene regulatory networks (GRNs) from gene expression data remains a challenge in system biology. In past decade, numerous methods have been developed for the inference of GRNs. It remains a challenge due to the fact that the data is noisy and high dimensional, and there exists a large number of pot...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03639-7

    authors: Che D,Guo S,Jiang Q,Chen L

    update_date:2020-07-14 00:00:00

  • Discrimination of cell cycle phases in PCNA-immunolabeled cells.

    abstract:BACKGROUND:Protein function in eukaryotic cells is often controlled in a cell cycle-dependent manner. Therefore, the correct assignment of cellular phenotypes to cell cycle phases is a crucial task in cell biology research. Nuclear proteins whose localization varies during the cell cycle are valuable and frequently use...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-015-0618-9

    authors: Schönenberger F,Deutzmann A,Ferrando-May E,Merhof D

    update_date:2015-05-29 00:00:00