Abstract:
BACKGROUND:New web-based technologies provide an excellent opportunity for sharing and accessing information and using web as a platform for interaction and collaboration. Although several specialized tools are available for analyzing DNA sequence information, conventional web-based tools have not been utilized for bioinformatics applications. We have developed a novel algorithm and implemented it for searching species-specific genomic sequences, DNA barcodes, by using popular web-based methods such as Google. RESULTS:We developed an alignment independent character based algorithm based on dividing a sequence library (DNA barcodes) and query sequence to words. The actual search is conducted by conventional search tools such as freely available Google Desktop Search. We implemented our algorithm in two exemplar packages. We developed pre and post-processing software to provide customized input and output services, respectively. Our analysis of all publicly available DNA barcode sequences shows a high accuracy as well as rapid results. CONCLUSION:Our method makes use of conventional web-based technologies for specialized genetic data. It provides a robust and efficient solution for sequence search on the web. The integration of our search method for large-scale sequence libraries such as DNA barcodes provides an excellent web-based tool for accessing this information and linking it to other available categories of information on the web.
journal_name
BMC Bioinformaticsjournal_title
BMC bioinformaticsauthors
Hajibabaei M,Singer GAdoi
10.1186/1471-2105-10-S14-S4subject
Has Abstractpub_date
2009-11-10 00:00:00pages
S4issn
1471-2105pii
1471-2105-10-S14-S4journal_volume
10 Suppl 14pub_type
杂志文章abstract:BACKGROUND:Protein-protein interactions play a key role in biological processes of proteins within a cell. Recent high-throughput techniques have generated protein-protein interaction data in a genome-scale. A wide range of computational approaches have been applied to interactome network analysis for uncovering functi...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-11-S3-S3
更新日期:2010-04-29 00:00:00
abstract:BACKGROUND:Recent advances in genomics and proteomics have allowed us to study the nuances of the Warburg effect--a long-standing puzzle in cancer energy metabolism--at an unprecedented level of detail. While modern next-generation sequencing technologies are extremely powerful, the lack of appropriate data analysis to...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-12-S10-S8
更新日期:2011-10-18 00:00:00
abstract:BACKGROUND:High-throughput Next-Generation Sequencing (NGS) techniques are advancing genomics and molecular biology research. This technology generates substantially large data which puts up a major challenge to the scientists for an efficient, cost and time effective solution to analyse such data. Further, for the dif...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-15-167
更新日期:2014-06-04 00:00:00
abstract:BACKGROUND:The structure conservation in various α-helix subclasses reveals the sequence and context dependent factors causing distortions in the α-helix. The sequence-structure relationship in these subclasses can be used to predict structural variations in α-helix purely based on its sequence. We train support vector...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-12-S1-S20
更新日期:2011-02-15 00:00:00
abstract:BACKGROUND:High throughput sequencing technology provides us unprecedented opportunities to study transcriptome dynamics. Compared to microarray-based gene expression profiling, RNA-Seq has many advantages, such as high resolution, low background, and ability to identify novel transcripts. Moreover, for genes with mult...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-12-290
更新日期:2011-07-19 00:00:00
abstract:BACKGROUND:Reliable prediction of antibody, or B-cell, epitopes remains challenging yet highly desirable for the design of vaccines and immunodiagnostics. A correlation between antigenicity, solvent accessibility, and flexibility in proteins was demonstrated. Subsequently, Thornton and colleagues proposed a method for ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-9-514
更新日期:2008-12-02 00:00:00
abstract:BACKGROUND:Next Generation Sequencing (NGS) is a commonly used technology for studying the genetic basis of biological processes and it underpins the aspirations of precision medicine. However, there are significant challenges when dealing with NGS data. Firstly, a huge number of bioinformatics tools for a wide range o...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-019-2791-8
更新日期:2019-04-27 00:00:00
abstract:BACKGROUND:There are many fewer genes in the human genome than there are expressed transcripts. Alternative splicing is the reason. Alternatively spliced transcripts are often specific to tissue type, developmental stage, environmental condition, or disease state. Accurate analysis of microarray expression data and des...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-8-75
更新日期:2007-03-05 00:00:00
abstract:BACKGROUND:The advent of RNA interference techniques enables the selective silencing of biologically interesting genes in an efficient way. In combination with DNA microarray technology this enables researchers to gain insights into signaling pathways by observing downstream effects of individual knock-downs on gene ex...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-8-386
更新日期:2007-10-15 00:00:00
abstract:BACKGROUND:Selecting a parsimonious set of informative genes to build highly generalized performance classifier is the most important task for the analysis of tumor microarray expression data. Many existing gene pair evaluation methods cannot highlight diverse patterns of gene pairs only used one strategy of vertical c...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-016-0893-0
更新日期:2016-01-20 00:00:00
abstract:BACKGROUND:Primer design for highly variable DNA sequences is difficult, and experimental success requires attention to many interacting constraints. The advent of next-generation sequencing methods allows the investigation of rare variants otherwise hidden deep in large populations, but requires attention to populatio...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-14-255
更新日期:2013-08-21 00:00:00
abstract:BACKGROUND:Sharing sets of chemical data (e.g., chemical properties, docking scores, etc.) among collaborators with diverse skill sets is a common task in computer-aided drug design and medicinal chemistry. The ability to associate this data with images of the relevant molecular structures greatly facilitates scientifi...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-15-159
更新日期:2014-05-23 00:00:00
abstract:BACKGROUND:Phylogenies capture the evolutionary ancestry linking extant species. Correlations and similarities among a set of species are mediated by and need to be understood in terms of the phylogenic tree. In a similar way it has been argued that biological networks also induce correlations among sets of interacting...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-11-470
更新日期:2010-09-20 00:00:00
abstract:BACKGROUND:To further our understanding of immunopeptidomics, improved tools are needed to identify peptides presented by major histocompatibility complex class I (MHC-I). Many existing tools are limited by their reliance upon chemical affinity data, which is less biologically relevant than sampling by mass spectrometr...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-018-2561-z
更新日期:2019-01-05 00:00:00
abstract::We developed a new computational technique called Step-Level Differential Response (SLDR) to identify genetic regulatory relationships. Our technique takes advantages of functional genomics data for the same species under different perturbation conditions, therefore complementary to current popular computational techn...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-15-S11-S1
更新日期:2014-01-01 00:00:00
abstract:BACKGROUND:Numerous functional genomics approaches have been developed to study the model organism yeast, Saccharomyces cerevisiae, with the aim of systematically understanding the biology of the cell. Some of these techniques are based on yeast growth differences under different conditions, such as those generated by ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-8-117
更新日期:2007-04-04 00:00:00
abstract:BACKGROUND:Maturation inhibitors are a new class of antiretroviral drugs. Bevirimat (BVM) was the first substance in this class of inhibitors entering clinical trials. While the inhibitory function of BVM is well established, the molecular mechanisms of action and resistance are not well understood. It is known that mu...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-11-37
更新日期:2010-01-20 00:00:00
abstract:BACKGROUND:Biological sequences play a major role in molecular and computational biology. They are studied as information-bearing entities that make up DNA, RNA or proteins. The Sequence Ontology, which is part of the OBO Foundry, contains descriptions and definitions of sequences and their properties. Yet the most bas...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-10-377
更新日期:2009-11-18 00:00:00
abstract:BACKGROUND:We carried out an analysis of intron length conservation across a diverse group of nineteen mammalian species. Motivated by recent research suggesting a role for time delays associated with intron transcription in gene expression oscillations required for early embryonic patterning, we searched for examples ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-12-S9-S16
更新日期:2011-10-05 00:00:00
abstract:BACKGROUND:Identifying similarities between datasets is a fundamental task in data mining and has become an integral part of modern scientific investigation. Whether the task is to identify co-expressed genes in large-scale expression surveys or to predict combinations of gene knockouts which would elicit a similar phe...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-018-2210-6
更新日期:2018-06-11 00:00:00
abstract:BACKGROUND:Next-generation sequencing technologies allow researchers to obtain millions of sequence reads in a single experiment. One important use of the technology is the sequencing of small non-coding regulatory RNAs and the identification of the genomic locales from which they originate. Currently, there is a pauci...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-11-93
更新日期:2010-02-18 00:00:00
abstract:BACKGROUND:In current comparative proteomics studies, the large number of images generated by 2D gels is currently compared using spot matching algorithms. Unfortunately, differences in gel migration and sample variability make efficient spot alignment very difficult to obtain, and, as consequence most of the software ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-9-460
更新日期:2008-10-28 00:00:00
abstract:BACKGROUND:De Bruijn graphs are key data structures for the analysis of next-generation sequencing data. They efficiently represent the overlap between reads and hence, also the underlying genome sequence. However, sequencing errors and repeated subsequences render the identification of the true underlying sequence dif...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-020-03740-x
更新日期:2020-09-14 00:00:00
abstract:BACKGROUND:Manual chemical data curation from publications is error-prone, time consuming, and hard to maintain up-to-date data sets. Automatic information extraction can be used as a tool to reduce these problems. Since chemical structures usually described in images, information extraction needs to combine structure ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-13-S17-S9
更新日期:2012-01-01 00:00:00
abstract:BACKGROUND:In many biomedical applications, there is a need for developing classification models based on noisy annotations. Recently, various methods addressed this scenario by relaying on unreliable annotations obtained from multiple sources. RESULTS:We proposed a probabilistic classification algorithm based on labe...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-14-S12-S5
更新日期:2013-01-01 00:00:00
abstract:BACKGROUND:Amplified fragment length polymorphism (AFLP) is a PCR-based technique that involves restriction of genomic DNA followed by ligation of adaptors to the fragments generated and selective PCR amplification of a subset of these fragments. The amplified fragments are separated on a sequencing gel and visualized ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-4-7
更新日期:2003-02-25 00:00:00
abstract:BACKGROUND:The binding of regulatory proteins to their specific DNA targets determines the accurate expression of the neighboring genes. The in silico prediction of new binding sites in completely sequenced genomes is a key aspect in the deeper understanding of gene regulatory networks. Several algorithms have been des...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-11-554
更新日期:2010-11-11 00:00:00
abstract:BACKGROUND:The large increase in the size of patent collections has led to the need of efficient search strategies. But the development of advanced text-mining applications dedicated to patents of the biomedical field remains rare, in particular to address the needs of the pharmaceutical & biotech industry, which inten...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-15-S1-S15
更新日期:2014-01-01 00:00:00
abstract:BACKGROUND:Several features are known to correlate with the GC-content in the human genome, including recombination rate, gene density and distance to telomere. However, by testing for pairwise correlation only, it is impossible to distinguish direct associations from indirect ones and to distinguish between causes and...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-10-S1-S66
更新日期:2009-01-30 00:00:00
abstract:BACKGROUND:Many methods for species tree inference require data from a sufficiently large sample of genomic loci in order to produce accurate estimates. However, few studies have attempted to use analytical theory to quantify "sufficiently large". RESULTS:Using the multispecies coalescent model, we report a general an...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-016-1266-4
更新日期:2016-11-11 00:00:00