GenHtr: a tool for comparative assessment of genetic heterogeneity in microbial genomes generated by massive short-read sequencing.

Abstract:

BACKGROUND:Microevolution is the study of short-term changes of alleles within a population and their effects on the phenotype of organisms. The result of the below-species-level evolution is heterogeneity, where populations consist of subpopulations with a large number of structural variations. Heterogeneity analysis is thus essential to our understanding of how selective and neutral forces shape bacterial populations over a short period of time. The Solexa Genome Analyzer, a next-generation sequencing platform, allows millions of short sequencing reads to be obtained with great accuracy, allowing for the ability to study the dynamics of the bacterial population at the whole genome level. The tool referred to as GenHtr was developed for genome-wide heterogeneity analysis. RESULTS:For particular bacterial strains, GenHtr relies on a set of Solexa short reads on given bacteria pathogens and their isogenic reference genome to identify heterogeneity sites, the chromosomal positions with multiple variants of genes in the bacterial population, and variations that occur in large gene families. GenHtr accomplishes this by building and comparatively analyzing genome-wide heterogeneity genotypes for both the newly sequenced genomes (using massive short-read sequencing) and their isogenic reference (using simulated data). As proof of the concept, this approach was applied to SRX007711, the Solexa sequencing data for a newly sequenced Staphylococcus aureus subsp. USA300 cell line, and demonstrated that it could predict such multiple variants. They include multiple variants of genes critical in pathogenesis, e.g. genes encoding a LysR family transcriptional regulator, 23 S ribosomal RNA, and DNA mismatch repair protein MutS. The heterogeneity results in non-synonymous and nonsense mutations, leading to truncated proteins for both LysR and MutS. CONCLUSION:GenHtr was developed for genome-wide heterogeneity analysis. Although it is much more time-consuming when compared to Maq, a popular tool for SNP analysis, GenHtr is able to predict potential multiple variants that pre-exist in the bacterial population as well as SNPs that occur in the highly duplicated gene families. It is expected that, with the proper experimental design, this analysis can improve our understanding of the molecular mechanism underlying the dynamics and the evolution of drug-resistant bacterial pathogens.

journal_name

BMC Bioinformatics

journal_title

BMC bioinformatics

authors

Yu G

doi

10.1186/1471-2105-11-508

subject

Has Abstract

pub_date

2010-10-12 00:00:00

pages

508

issn

1471-2105

pii

1471-2105-11-508

journal_volume

11

pub_type

杂志文章
  • Ortholog-based protein-protein interaction prediction and its application to inter-species interactions.

    abstract:BACKGROUND:The rapid growth of protein-protein interaction (PPI) data has led to the emergence of PPI network analysis. Despite advances in high-throughput techniques, the interactomes of several model organisms are still far from complete. Therefore, it is desirable to expand these interactomes with ortholog-based and...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-9-S12-S11

    authors: Lee SA,Chan CH,Tsai CH,Lai JM,Wang FS,Kao CY,Huang CY

    更新日期:2008-12-12 00:00:00

  • Identification of sequence motifs significantly associated with antisense activity.

    abstract:BACKGROUND:Predicting the suppression activity of antisense oligonucleotide sequences is the main goal of the rational design of nucleic acids. To create an effective predictive model, it is important to know what properties of an oligonucleotide sequence associate significantly with antisense activity. Also, for the m...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-8-184

    authors: McQuisten KA,Peek AS

    更新日期:2007-06-07 00:00:00

  • Coordinates and intervals in graph-based reference genomes.

    abstract:BACKGROUND:It has been proposed that future reference genomes should be graph structures in order to better represent the sequence diversity present in a species. However, there is currently no standard method to represent genomic intervals, such as the positions of genes or transcription factor binding sites, on graph...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-017-1678-9

    authors: Rand KD,Grytten I,Nederbragt AJ,Storvik GO,Glad IK,Sandve GK

    更新日期:2017-05-18 00:00:00

  • Pushing the accuracy limit of shape complementarity for protein-protein docking.

    abstract:BACKGROUND:Protein-protein docking is a valuable computational approach for investigating protein-protein interactions. Shape complementarity is the most basic component of a scoring function and plays an important role in protein-protein docking. Despite significant progresses, shape representation remains an open que...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-019-3270-y

    authors: Yan Y,Huang SY

    更新日期:2019-12-24 00:00:00

  • Identifying and quantifying metabolites by scoring peaks of GC-MS data.

    abstract:BACKGROUND:Metabolomics is one of most recent omics technologies. It has been applied on fields such as food science, nutrition, drug discovery and systems biology. For this, gas chromatography-mass spectrometry (GC-MS) has been largely applied and many computational tools have been developed to support the analysis of...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-014-0374-2

    authors: Aggio RB,Mayor A,Reade S,Probert CS,Ruggiero K

    更新日期:2014-12-10 00:00:00

  • ImiRP: a computational approach to microRNA target site mutation.

    abstract:BACKGROUND:MicroRNAs (miRNAs) are small ~22 nucleotide non-coding RNAs that function as post-transcriptional regulators of messenger RNA (mRNA) through base-pairing to 6-8 nucleotide long target sites, usually located within the mRNA 3' untranslated region. A common approach to validate and probe microRNA-mRNA interact...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-016-1057-y

    authors: Ryan BC,Werner TS,Howard PL,Chow RL

    更新日期:2016-04-27 00:00:00

  • Graph-representation of oxidative folding pathways.

    abstract:BACKGROUND:The process of oxidative folding combines the formation of native disulfide bond with conformational folding resulting in the native three-dimensional fold. Oxidative folding pathways can be described in terms of disulfide intermediate species (DIS) which can also be isolated and characterized. Each DIS corr...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-6-19

    authors: Agoston V,Cemazar M,Kaján L,Pongor S

    更新日期:2005-01-27 00:00:00

  • Anatomy of enzyme channels.

    abstract:BACKGROUND:Enzyme active sites can be connected to the exterior environment by one or more channels passing through the protein. Despite our current knowledge of enzyme structure and function, surprisingly little is known about how often channels are present or about any structural features such channels may have in co...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-014-0379-x

    authors: Pravda L,Berka K,Svobodová Vařeková R,Sehnal D,Banáš P,Laskowski RA,Koča J,Otyepka M

    更新日期:2014-11-18 00:00:00

  • Methodology capture: discriminating between the "best" and the rest of community practice.

    abstract:BACKGROUND:The methodologies we use both enable and help define our research. However, as experimental complexity has increased the choice of appropriate methodologies has become an increasingly difficult task. This makes it difficult to keep track of available bioinformatics software, let alone the most suitable proto...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-9-359

    authors: Eales JM,Pinney JW,Stevens RD,Robertson DL

    更新日期:2008-09-01 00:00:00

  • BioNanoAnalyst: a visualisation tool to assess genome assembly quality using BioNano data.

    abstract:BACKGROUND:Reference genome assemblies are valuable, as they provide insights into gene content, genetic evolution and domestication. The higher the quality of a reference genome assembly the more accurate the downstream analysis will be. During the last few years, major efforts have been made towards improving the qua...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-017-1735-4

    authors: Yuan Y,Bayer PE,Scheben A,Chan CK,Edwards D

    更新日期:2017-06-30 00:00:00

  • The Lair: a resource for exploratory analysis of published RNA-Seq data.

    abstract::Increased emphasis on reproducibility of published research in the last few years has led to the large-scale archiving of sequencing data. While this data can, in theory, be used to reproduce results in papers, it is difficult to use in practice. We introduce a series of tools for processing and analyzing RNA-Seq data...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-016-1357-2

    authors: Pimentel H,Sturmfels P,Bray N,Melsted P,Pachter L

    更新日期:2016-12-01 00:00:00

  • A fast indexing approach for protein structure comparison.

    abstract:BACKGROUND:Protein structure comparison is a fundamental task in structural biology. While the number of known protein structures has grown rapidly over the last decade, searching a large database of protein structures is still relatively slow using existing methods. There is a need for new techniques which can rapidly...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-11-S1-S46

    authors: Zhang L,Bailey J,Konagurthu AS,Ramamohanarao K

    更新日期:2010-01-18 00:00:00

  • Extended analysis of benchmark datasets for Agilent two-color microarrays.

    abstract:BACKGROUND:As part of its broad and ambitious mission, the MicroArray Quality Control (MAQC) project reported the results of experiments using External RNA Controls (ERCs) on five microarray platforms. For most platforms, several different methods of data processing were considered. However, there was no similar consid...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-8-371

    authors: Kerr KF

    更新日期:2007-10-03 00:00:00

  • libgapmis: extending short-read alignments.

    abstract:BACKGROUND:A wide variety of short-read alignment programmes have been published recently to tackle the problem of mapping millions of short reads to a reference genome, focusing on different aspects of the procedure such as time and memory efficiency, sensitivity, and accuracy. These tools allow for a small number of ...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-14-S11-S4

    authors: Alachiotis N,Berger S,Flouri T,Pissis SP,Stamatakis A

    更新日期:2013-01-01 00:00:00

  • mRNA:guanine-N7 cap methyltransferases: identification of novel members of the family, evolutionary analysis, homology modeling, and analysis of sequence-structure-function relationships.

    abstract:BACKGROUND:The 5'-terminal cap structure plays an important role in many aspects of mRNA metabolism. Capping enzymes encoded by viruses and pathogenic fungi are attractive targets for specific inhibitors. There is a large body of experimental data on viral and cellular methyltransferases (MTases) that carry out guanine...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-2-2

    authors: Bujnicki JM,Feder M,Radlinska M,Rychlewski L

    更新日期:2001-01-01 00:00:00

  • GSV: a web-based genome synteny viewer for customized data.

    abstract:BACKGROUND:The analysis of genome synteny is a common practice in comparative genomics. With the advent of DNA sequencing technologies, individual biologists can rapidly produce their genomic sequences of interest. Although web-based synteny visualization tools are convenient for biologists to use, none of the existing...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-12-316

    authors: Revanna KV,Chiu CC,Bierschank E,Dong Q

    更新日期:2011-08-02 00:00:00

  • SAlign-a structure aware method for global PPI network alignment.

    abstract:BACKGROUND:High throughput experiments have generated a significantly large amount of protein interaction data, which is being used to study protein networks. Studying complete protein networks can reveal more insight about healthy/disease states than studying proteins in isolation. Similarly, a comparative study of pr...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-020-03827-5

    authors: Ayub U,Haider I,Naveed H

    更新日期:2020-11-04 00:00:00

  • Efficient reconstruction of biological networks via transitive reduction on general purpose graphics processors.

    abstract:BACKGROUND:Techniques for reconstruction of biological networks which are based on perturbation experiments often predict direct interactions between nodes that do not exist. Transitive reduction removes such relations if they can be explained by an indirect path of influences. The existing algorithms for transitive re...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-13-281

    authors: Bošnački D,Odenbrett MR,Wijs A,Ligtenberg W,Hilbers P

    更新日期:2012-10-30 00:00:00

  • Prior knowledge guided eQTL mapping for identifying candidate genes.

    abstract:BACKGROUND:Expression quantitative trait loci (eQTL) mapping is often used to identify genetic loci and candidate genes correlated with traits. Although usually a group of genes affect complex traits, genes in most eQTL mapping methods are considered as independent. Recently, some eQTL mapping methods have accounted fo...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-016-1387-9

    authors: Wang Y,Richard R,Pan Y

    更新日期:2016-12-13 00:00:00

  • Mining locus tags in PubMed Central to improve microbial gene annotation.

    abstract:BACKGROUND:The scientific literature contains millions of microbial gene identifiers within the full text and tables, but these annotations rarely get incorporated into public sequence databases. We propose to utilize the Open Access (OA) subset of PubMed Central (PMC) as a gene annotation database and have developed a...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-15-43

    authors: Stubben CJ,Challacombe JF

    更新日期:2014-02-05 00:00:00

  • InteractiVenn: a web-based tool for the analysis of sets through Venn diagrams.

    abstract:BACKGROUND:Set comparisons permeate a large number of data analysis workflows, in particular workflows in biological sciences. Venn diagrams are frequently employed for such analysis but current tools are limited. RESULTS:We have developed InteractiVenn, a more flexible tool for interacting with Venn diagrams includin...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-015-0611-3

    authors: Heberle H,Meirelles GV,da Silva FR,Telles GP,Minghim R

    更新日期:2015-05-22 00:00:00

  • Detection of biological switches using the method of Gröebner bases.

    abstract:BACKGROUND:Bistability and ability to switch between two stable states is the hallmark of cellular responses. Cellular signaling pathways often contain bistable switches that regulate the transmission of the extracellular information to the nucleus where important biological functions are executed. RESULTS:In this wor...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-019-3155-0

    authors: Arkun Y

    更新日期:2019-11-28 00:00:00

  • Using iterative cluster merging with improved gap statistics to perform online phenotype discovery in the context of high-throughput RNAi screens.

    abstract:BACKGROUND:The recent emergence of high-throughput automated image acquisition technologies has forever changed how cell biologists collect and analyze data. Historically, the interpretation of cellular phenotypes in different experimental conditions has been dependent upon the expert opinions of well-trained biologist...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-9-264

    authors: Yin Z,Zhou X,Bakal C,Li F,Sun Y,Perrimon N,Wong ST

    更新日期:2008-06-05 00:00:00

  • Integration of open access literature into the RCSB Protein Data Bank using BioLit.

    abstract:BACKGROUND:Biological data have traditionally been stored and made publicly available through a variety of on-line databases, whereas biological knowledge has traditionally been found in the printed literature. With journals now on-line and providing an increasing amount of open access content, often free of copyright ...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-11-220

    authors: Prlić A,Martinez MA,Dimitropoulos D,Beran B,Yukich BT,Rose PW,Bourne PE,Fink JL

    更新日期:2010-04-29 00:00:00

  • Identification of markers associated with global changes in DNA methylation regulation in cancers.

    abstract::DNA methylation exhibits different patterns in different cancers. DNA methylation rates at different genomic loci appear to be highly correlated in some samples but not in others. We call such phenomena conditional concordant relationships (CCRs). In this study, we explored DNA methylation patterns in 12 common cancer...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-13-S13-S7

    authors: Qiu P,Zhang L

    更新日期:2012-01-01 00:00:00

  • Deconvolution of gene expression from cell populations across the C. elegans lineage.

    abstract:BACKGROUND:Knowledge of when and in which cells each gene is expressed across multicellular organisms is critical in understanding both gene function and regulation of cell type diversity. However, methods for measuring expression typically involve a trade-off between imaging-based methods, which give the precise locat...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-14-204

    authors: Burdick JT,Murray JI

    更新日期:2013-06-22 00:00:00

  • SegCorr a statistical procedure for the detection of genomic regions of correlated expression.

    abstract:BACKGROUND:Detecting local correlations in expression between neighboring genes along the genome has proved to be an effective strategy to identify possible causes of transcriptional deregulation in cancer. It has been successfully used to illustrate the role of mechanisms such as copy number variation (CNV) or epigene...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/s12859-017-1742-5

    authors: Delatola EI,Lebarbier E,Mary-Huard T,Radvanyi F,Robin S,Wong J

    更新日期:2017-07-11 00:00:00

  • Machine learning for discovering missing or wrong protein function annotations : A comparison using updated benchmark datasets.

    abstract:BACKGROUND:A massive amount of proteomic data is generated on a daily basis, nonetheless annotating all sequences is costly and often unfeasible. As a countermeasure, machine learning methods have been used to automatically annotate new protein functions. More specifically, many studies have investigated hierarchical m...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章,评审

    doi:10.1186/s12859-019-3060-6

    authors: Nakano FK,Lietaert M,Vens C

    更新日期:2019-09-23 00:00:00

  • Automated peptide mapping and protein-topographical annotation of proteomics data.

    abstract:BACKGROUND:In quantitative proteomics, peptide mapping is a valuable approach to combine positional quantitative information with topographical and domain information of proteins. Quantitative proteomic analysis of cell surface shedding is an exemplary application area of this approach. RESULTS:We developed ImproViser...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-15-207

    authors: Videm P,Gunasekaran D,Schröder B,Mayer B,Biniossek ML,Schilling O

    更新日期:2014-06-19 00:00:00

  • Graph based fusion of miRNA and mRNA expression data improves clinical outcome prediction in prostate cancer.

    abstract:BACKGROUND:One of the main goals in cancer studies including high-throughput microRNA (miRNA) and mRNA data is to find and assess prognostic signatures capable of predicting clinical outcome. Both mRNA and miRNA expression changes in cancer diseases are described to reflect clinical characteristics like staging and pro...

    journal_title:BMC bioinformatics

    pub_type: 杂志文章

    doi:10.1186/1471-2105-12-488

    authors: Gade S,Porzelius C,Fälth M,Brase JC,Wuttig D,Kuner R,Binder H,Sültmann H,Beissbarth T

    更新日期:2011-12-21 00:00:00