Abstract:
BACKGROUND:There are a number of different methods for generation of trees and algorithms for phylogenetic analysis in the study of bacterial taxonomy. Genotypic information, such as SSU rRNA gene sequences, now plays a more prominent role in microbial systematics than does phenotypic information. However, the integration of genotypic and phenotypic information for polyphasic studies is necessary for the classification and identification of microbes. Thus, we devised an algorithm that objectively identifies discriminative characteristics for focused clusters on generated trees from a dataset composed of coded data, such as phenotypic information. Moreover, this algorithm has been integrated into the polyphasic analysis software, InforBIO. RESULTS:We developed a differential-character-finding algorithm based on information measures and used this algorithm to identify the characteristic that best discriminates operational taxonomic unit clusters. For all characteristics in a dataset, the algorithm estimates commonality in focused clusters and diversity among clusters by scoring based on Shannon's and relative entropies. All the characteristics selected for scoring are equally weighted. Thresholds for the scores are defined to identify discriminative characteristics for clusters efficiently from a database. The unique feature of the algorithm, which is implemented in the InforBIO software, is that it can identify the phenotypic characteristics that discriminate and are associated with the clusters of a phylogenetic tree. We successfully applied this algorithm to the study of phylogenetic clusters of Pseudomonas species. CONCLUSION:The algorithm in the InforBIO software is a novel and useful approach for microbial polyphasic studies. The algorithm can also be applied to diverse cluster analyses. The InforBIO software is available from the download site http://wdcm.nig.ac.jp/inforbio/. This software is free for personal but not commercial use.
journal_name
BMC Bioinformaticsjournal_title
BMC bioinformaticsauthors
Tanaka N,Uchino M,Miyazaki S,Sugawara Hdoi
10.1186/1471-2105-8-281subject
Has Abstractpub_date
2007-08-02 00:00:00pages
281issn
1471-2105pii
1471-2105-8-281journal_volume
8pub_type
杂志文章abstract:BACKGROUND:Targeted resequencing offers a cost-effective alternative to whole-genome and whole-exome sequencing when investigating regions known to be associated with a trait or disease. There are a number of approaches to targeted resequencing, including microfluidic PCR amplification, which may be enhanced by multipl...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-016-1453-3
更新日期:2017-01-05 00:00:00
abstract:BACKGROUND:One of the most powerful methods for the prediction of protein structure from sequence information alone is the iterative construction of profile-type models. Because profiles are built from sequence alignments, the sequences included in the alignment and the method used to align them will be important to th...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-7-410
更新日期:2006-09-14 00:00:00
abstract:BACKGROUND:Biological data that are well-organized by an ontology, such as Gene Ontology, enables high-throughput availability of the semantic web. It can also be used to facilitate high throughput classification of biomedical information. However, to our knowledge, no evaluation has been published on automating classi...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-8-S3-S7
更新日期:2007-05-09 00:00:00
abstract:BACKGROUND:DNA methylation changes are associated with a wide array of biological processes. Bisulfite conversion of DNA followed by high-throughput sequencing is increasingly being used to assess genome-wide methylation at single-base resolution. The relative slowness of most commonly used aligners for processing such...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-15-337
更新日期:2014-10-18 00:00:00
abstract:BACKGROUND:Clustering of unannotated transcripts is an important task to identify novel families of noncoding RNAs (ncRNAs). Several hierarchical clustering methods have been developed using similarity measures based on the scores of structural alignment. However, the high computational cost of exact structural alignme...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-12-S1-S48
更新日期:2011-02-15 00:00:00
abstract:BACKGROUND:Profile-based comparison of multiple sequence alignments is a powerful methodology for the detection remote protein sequence similarity, which is essential for the inference and analysis of protein structure, function, and evolution. Accurate estimation of statistical significance of detected profile similar...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-10-399
更新日期:2009-12-04 00:00:00
abstract:BACKGROUND:T-cells are key players in regulating a specific immune response. Activation of cytotoxic T-cells requires recognition of specific peptides bound to Major Histocompatibility Complex (MHC) class I molecules. MHC-peptide complexes are potential tools for diagnosis and treatment of pathogens and cancer, as well...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-3-25
更新日期:2002-09-11 00:00:00
abstract:BACKGROUND:Many cancer genomes are extensively rearranged with highly aberrant chromosomal karyotypes. Structural and copy number variations in cancer genomes can be determined via abnormal mapping of sequenced reads to the reference genome. Recently it became possible to reconcile both of these types of large-scale va...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-019-3208-4
更新日期:2019-12-17 00:00:00
abstract:BACKGROUND:Homology is a crucial concept in comparative genomics. The algorithm probably most widely used for homology detection in comparative genomics, is BLAST. Usually a stringent score cutoff is applied to distinguish putative homologs from possible false positive hits. As a consequence, some BLAST hits are discar...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-11-86
更新日期:2010-02-12 00:00:00
abstract:BACKGROUND:Throughout the metazoan lineage, typically gonadal expressed Piwi proteins and their guiding piRNAs (~26-32nt in length) form a protective mechanism of RNA interference directed against the propagation of transposable elements (TEs). Most piRNAs are generated from genomic piRNA clusters. Annotation of experi...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-13-5
更新日期:2012-01-10 00:00:00
abstract:BACKGROUND:Phenotype prediction problems are usually considered ill-posed, as the amount of samples is very limited with respect to the scrutinized genetic probes. This fact complicates the sampling of the defective genetic pathways due to the high number of possible discriminatory genetic networks involved. In this re...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-020-3356-6
更新日期:2020-03-11 00:00:00
abstract:BACKGROUND:Mixed models have a long and fruitful history in statistics. They are pertinent to genomics problems because they are highly versatile, accommodating a wide variety of situations within the same theoretical and algorithmic framework. RESULTS:Qxpak is a package for versatile statistical genomics, specificall...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-12-202
更新日期:2011-05-25 00:00:00
abstract:BACKGROUND:Protein structure prediction has achieved a lot of progress during the last few decades and a greater number of models for a certain sequence can be predicted. Consequently, assessing the qualities of predicted protein models in perspective is one of the key components of successful protein structure predict...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-017-1691-z
更新日期:2017-05-25 00:00:00
abstract:BACKGROUND:The advance of next generation sequencing enables higher throughput with lower price, and as the basic of high-throughput sequencing data analysis, variant calling is widely used in disease research, clinical treatment and medicine research. However, current mainstream variant caller tools have a serious pro...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-019-2665-0
更新日期:2019-02-14 00:00:00
abstract:BACKGROUND:The PathoLogic program constructs Pathway/Genome databases by using a genome's annotation to predict the set of metabolic pathways present in an organism. PathoLogic determines the set of reactions composing those pathways from the enzymes annotated in the organism's genome. Most annotation efforts fail to a...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-5-76
更新日期:2004-06-09 00:00:00
abstract:BACKGROUND:Metabolomics, petroleum and biodiesel chemistry, biomarker discovery, and other fields which rely on high-resolution profiling of complex chemical mixtures generate datasets which contain millions of detector intensity readings, each uniquely addressed along dimensions of time (e.g., retention time of chemic...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-9-S9-S15
更新日期:2008-08-12 00:00:00
abstract:BACKGROUND:The challenge of classifying cortical interneurons is yet to be solved. Data-driven classification into established morphological types may provide insight and practical value. RESULTS:We trained models using 217 high-quality morphologies of rat somatosensory neocortex interneurons reconstructed by a single...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-018-2470-1
更新日期:2018-12-17 00:00:00
abstract:BACKGROUND:Recently there has been a lot of interest in identifying modules at the level of genetic and metabolic networks of organisms, as well as in identifying single genes and reactions that are essential for the organism. A goal of computational and systems biology is to go beyond identification towards an explana...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-7-118
更新日期:2006-03-08 00:00:00
abstract:BACKGROUND:Data are the evidentiary basis for scientific hypotheses, analyses and publication, for policy formation and for decision-making. They are essential to the evaluation and testing of results by peer scientists both present and future. There is broad consensus in the scientific and conservation communities tha...
journal_title:BMC bioinformatics
pub_type: 指南,杂志文章
doi:10.1186/1471-2105-12-S15-S1
更新日期:2011-01-01 00:00:00
abstract:BACKGROUND:Positron Emission Tomography (PET) is increasingly utilized in radiomics studies for treatment evaluation purposes. Nevertheless, lesion volume identification in PET images is a critical and still challenging step in the process of radiomics, due to the low spatial resolution and high noise level of PET imag...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-020-03647-7
更新日期:2020-09-16 00:00:00
abstract:BACKGROUND:Integrative network methods are commonly used for interpretation of high-throughput experimental biological data: transcriptomics, proteomics, metabolomics and others. One of the common approaches is finding a connected subnetwork of a global interaction network that best encompasses significant individual c...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-020-03572-9
更新日期:2020-11-18 00:00:00
abstract:BACKGROUND:Prioritizing disease genes is trying to identify potential disease causing genes for a given phenotype, which can be applied to reveal the inherited basis of human diseases and facilitate drug development. Our motivation is inspired by label propagation algorithm and the false positive protein-protein intera...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-018-2040-6
更新日期:2018-02-08 00:00:00
abstract:BACKGROUND:Tight clustering arose recently from a desire to obtain tighter and potentially more informative clusters in gene expression studies. Scattered genes with relatively loose correlations should be excluded from the clusters. However, in the literature there is little work dedicated to this area of research. On...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-9-287
更新日期:2008-06-18 00:00:00
abstract:BACKGROUND:A drug-drug interaction (DDI) occurs when one drug influences the level or activity of another drug. The increasing volume of the scientific literature overwhelms health care professionals trying to be kept up-to-date with all published studies on DDI. METHODS:This paper describes a hybrid linguistic approa...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-12-S2-S1
更新日期:2011-03-29 00:00:00
abstract:BACKGROUND:Transposable elements (TE) are mobile genetic entities present in nearly all genomes. Previous work has shown that TEs tend to have a different nucleotide composition than the host genes, either considering codon usage bias or dinucleotide frequencies. We show here how these compositional differences can be ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-5-94
更新日期:2004-07-13 00:00:00
abstract:BACKGROUND:Activities of drug molecules can be predicted by QSAR (quantitative structure activity relationship) models, which overcomes the disadvantages of high cost and long cycle by employing the traditional experimental method. With the fact that the number of drug molecules with positive activity is rather fewer t...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-9-S6-S7
更新日期:2008-05-28 00:00:00
abstract:BACKGROUND:Phylogenetic profiles record the occurrence of homologs of genes across fully sequenced organisms. Proteins with similar profiles are typically components of protein complexes or metabolic pathways. Various existing methods measure similarity between two profiles and, hence, the likelihood that the two prote...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-8-S4-S7
更新日期:2007-05-22 00:00:00
abstract:BACKGROUND:Aptamers are nucleic acids selected for their ability to bind to molecules of interest and may provide the basis for a whole new class of medicines. If the aptamer is simply a dsDNA molecule with a ssDNA overhang (a "sticky" end) then the segment of ssDNA that complements that overhang provides a known bindi...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-8-S7-S11
更新日期:2007-11-01 00:00:00
abstract:BACKGROUND:Biomedical processes can provide essential information about the (mal-) functioning of an organism and are thus frequently represented in biomedical terminologies and ontologies, including the GO Biological Process branch. These processes often need to be described and categorised in terms of their attribute...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-13-217
更新日期:2012-08-28 00:00:00
abstract:BACKGROUND:Identifying the interactions between proteins and long non-coding RNAs (lncRNAs) is of great importance to decipher the functional mechanisms of lncRNAs. However, current experimental techniques for detection of lncRNA-protein interactions are limited and inefficient. Many methods have been proposed to predi...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-018-2390-0
更新日期:2018-10-11 00:00:00