Abstract:
BACKGROUND: The rapid pace of bioscience research makes it very challenging to track relevant articles in one's area of interest. MEDLINE, a primary source for biomedical literature, offers access to more than 20 million citations, with three-quarters of a million new ones added each year. It is therefore not surprising to see active research on new document retrieval and sentence retrieval systems. We present Ferret, a prototype retrieval system designed to retrieve and rank sentences (and their documents) conveying gene-centric relationships of interest to a scientist. The prototype has several features. For example, it is designed to handle gene name ambiguity and to perform query expansion. Inputs can be a list of genes with an optional list of keywords. Sentences are retrieved across species, but the species discussed in the records are identified. Results are presented as a heat map, and sentences corresponding to specific cells of the heat map may be selected for display. Ferret is designed to assist bioscientists at different stages of research, from early idea exploration to advanced analysis of results from bench experiments.
RESULTS: Three live case studies in plant biology, all related to Arabidopsis thaliana, are presented. The first is to discover genes that may relate to the open immature flower phenotype in Arabidopsis. The second is about finding reported associations between ethylene signaling and a set of 300+ Arabidopsis genes. The third is a search for potential gene targets of an Arabidopsis transcription factor hypothesized to be involved in plant stress responses. Ferret was successful in finding valuable information in all three cases. In the first case, the bZIP family of genes was identified. In the second case, sentences indicating relevant associations were found in other species such as potato and jasmine. In the third case, the retrieved sentences led to new research questions about the plant hormone salicylic acid.
CONCLUSIONS: Ferret successfully retrieved relevant gene-centric sentences from PubMed records. The three case studies demonstrate end-user satisfaction with the system.
journal_name: BMC Bioinformatics
journal_title: BMC bioinformatics
authors: Srinivasan P, Zhang XN, Bouten R, Chang C
doi: 10.1186/s12859-015-0630-0
subject: Has Abstract
pub_date: 2015-06-20 00:00:00
pages: 198
issn: 1471-2105
pii: 10.1186/s12859-015-0630-0
journal_volume: 16
pub_type: Journal Article

abstract:BACKGROUND:Two important challenges in the analysis of molecular biology information are data (multi-omic information) integration and the detection of patterns across large scale molecular networks and sequences. They are actually coupled because the integration of omic information may provide better means to detec...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-018-2175-5
update_date:2018-07-09 00:00:00
abstract:BACKGROUND:Differentially expressed genes are typically identified by analyzing the variation between replicate measurements. These procedures implicitly assume that there are no systematic errors in the data even though several sources of systematic error are known. RESULTS:OpWise estimates the amount of systematic e...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-7-19
update_date:2006-01-13 00:00:00
abstract:BACKGROUND:Quantitative polymerase chain reaction (qPCR) is the technique of choice for quantifying gene expression. While the technique itself is well established, approaches for the analysis of qPCR data continue to improve. RESULTS:Here we expand on the common base method to develop procedures for testing linear re...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-020-03696-y
update_date:2020-09-29 00:00:00
abstract:BACKGROUND:Transposable elements (TE) are mobile genetic entities present in nearly all genomes. Previous work has shown that TEs tend to have a different nucleotide composition than the host genes, either considering codon usage bias or dinucleotide frequencies. We show here how these compositional differences can be ...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-5-94
update_date:2004-07-13 00:00:00
abstract:BACKGROUND:Dihydrouridine (D) is a modified base found in conserved positions in the D-loop of tRNA in Bacteria, Eukaryota, and some Archaea. Despite the abundant occurrence of D, little is known about its biochemical roles in mediating tRNA function. It is assumed that D may destabilize the structure of tRNA and thus ...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-13-153
update_date:2012-06-28 00:00:00
abstract:BACKGROUND:During evolution, large-scale genome rearrangements of chromosomes shuffle the order of homologous genome sequences ("synteny blocks") across species. Some years ago, a controversy erupted in genome rearrangement studies over whether rearrangements recur, causing breakpoints to be reused. METHODS:We investi...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-12-S9-S1
update_date:2011-10-05 00:00:00
abstract:BACKGROUND:Discovering causal genetic variants from large genetic association studies poses many difficult challenges. Assessing which genetic markers are involved in determining trait status is a computationally demanding task, especially in the presence of gene-gene interactions. RESULTS:A non-parametric Bayesian ap...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-014-0368-0
update_date:2014-11-21 00:00:00
abstract:BACKGROUND:Identification of expression quantitative trait loci (eQTLs) is an emerging area in genomic study. The task requires an integrated analysis of genome-wide single nucleotide polymorphism (SNP) data and gene expression data, raising a new computational challenge due to the tremendous size of data. RESULTS:We ...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-11-S9-S2
update_date:2010-10-28 00:00:00
abstract:BACKGROUND:Protein-DNA interactions are important for many cellular processes; however, structural knowledge for a large fraction of known and putative complexes is still lacking. Computational docking methods aim at the prediction of complex architecture given detailed structures of its constituents. They are becoming ...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-13-228
update_date:2012-09-11 00:00:00
abstract:BACKGROUND:Data are the evidentiary basis for scientific hypotheses, analyses and publication, for policy formation and for decision-making. They are essential to the evaluation and testing of results by peer scientists both present and future. There is broad consensus in the scientific and conservation communities tha...
journal_title:BMC bioinformatics
pub_type: Guideline, Journal Article
doi:10.1186/1471-2105-12-S15-S1
update_date:2011-01-01 00:00:00
abstract:BACKGROUND:Knowledge of when and in which cells each gene is expressed across multicellular organisms is critical in understanding both gene function and regulation of cell type diversity. However, methods for measuring expression typically involve a trade-off between imaging-based methods, which give the precise locat...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-14-204
update_date:2013-06-22 00:00:00
abstract:BACKGROUND:Interpretation of quantitative metagenomics data is important for our understanding of ecosystem functioning and assessing differences between various environmental samples. There is a need for an easy to use tool to explore the often complex metagenomics data in taxonomic and functional context. RESULTS:He...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-14-38
update_date:2013-02-01 00:00:00
abstract:BACKGROUND:Stable isotope tracing can follow individual atoms through metabolic transformations through the detection of the incorporation of stable isotope within metabolites. This resulting data can be interpreted in terms related to metabolic flux. However, detection of a stable isotope in metabolites by mass spectr...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-019-3096-7
update_date:2019-10-28 00:00:00
abstract:BACKGROUND:MLPA method is a potentially useful semi-quantitative method to detect copy number alterations in targeted regions. In this paper, we propose a method for the normalization procedure based on a non-linear mixed-model, as well as a new approach for determining the statistical significance of altered probes ba...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-9-261
update_date:2008-06-04 00:00:00
abstract:BACKGROUND:In many biomedical applications, there is a need for developing classification models based on noisy annotations. Recently, various methods addressed this scenario by relying on unreliable annotations obtained from multiple sources. RESULTS:We proposed a probabilistic classification algorithm based on labe...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-14-S12-S5
update_date:2013-01-01 00:00:00
abstract:BACKGROUND:Normalization in real-time qRT-PCR is necessary to compensate for experimental variation. A popular normalization strategy employs reference gene(s), which may introduce additional variability into normalized expression levels due to innate variation (between tissues, individuals, etc). To minimize this inna...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-11-253
update_date:2010-05-14 00:00:00
abstract:BACKGROUND:Identification of the recombination hot/cold spots is critical for understanding the mechanism of recombination as well as the genome evolution process. However, experimental identification of recombination spots is both time-consuming and costly. Developing an accurate and automated method for reliably and ...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-15-340
update_date:2014-11-20 00:00:00
abstract:BACKGROUND:Whole exome sequencing (WES) has become the strategy of choice to identify a coding allelic variant for a rare human monogenic disorder. This approach is a revolution in medical genetics history, impacting both fundamental research, and diagnostic methods leading to personalized medicine. A plethora of effic...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-13-S14-S9
update_date:2012-01-01 00:00:00
abstract:BACKGROUND:Here we introduce the Protein Sequence Annotation Tool (PSAT), a web-based, sequence annotation meta-server for performing integrated, high-throughput, genome-wide sequence analyses. Our goals in building PSAT were to (1) create an extensible platform for integration of multiple sequence-based bioinformatics...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-016-0887-y
update_date:2016-01-20 00:00:00
abstract:BACKGROUND:DNA methylation of CpG dinucleotides is an essential epigenetic modification that plays a key role in transcription. Widely used DNA enrichment-based methods offer high coverage for measuring methylated CpG dinucleotides, with the lowest cost per CpG covered genome-wide. However, these methods measure the DN...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-018-2574-7
update_date:2018-12-22 00:00:00
abstract:BACKGROUND:Tandem repeats are multiple duplications of substrings in the DNA that occur contiguously, or at a short distance, and may involve some mutations (such as substitutions, insertions, and deletions). Tandem repeats have been extensively studied also for their association with the class of repeat expansion dise...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-13-S4-S3
update_date:2012-03-28 00:00:00
abstract:BACKGROUND:Barcode multiplexing is a key strategy for sharing the rising capacity of next-generation sequencing devices: Synthetic DNA tags, called barcodes, are attached to natural DNA fragments within the library preparation procedure. Different libraries can individually be labeled with barcodes for a joint sequenc...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-015-0482-7
update_date:2015-02-18 00:00:00
abstract:BACKGROUND:Next (second) generation sequencing is an increasingly important tool for many areas of molecular biology; however, care must be taken when interpreting its output. Even a low error rate can cause a large number of errors due to the high number of nucleotides being sequenced. Identifying sequencing errors fr...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-14-367
update_date:2013-12-18 00:00:00
abstract:BACKGROUND:Protein-coding gene detection in prokaryotic genomes is considered a much simpler problem than in intron-containing eukaryotic genomes. However there have been reports that prokaryotic gene finder programs have problems with small genes (either over-predicting or under-predicting). Therefore the question ari...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-11-131
update_date:2010-03-15 00:00:00
abstract:BACKGROUND:Identifying key components in biological processes and their associations is critical for deciphering cellular functions. Recently, numerous gene expression and molecular interaction experiments have been reported in Saccharomyces cerevisiae, and these have enabled systematic studies. Although a number of ap...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-12-281
update_date:2011-07-12 00:00:00
abstract:BACKGROUND:Emerging and re-emerging infectious diseases such as Zika, SARS, ncovid19 and Pertussis, pose a compelling challenge for epidemiologists due to their significant impact on global public health. In this context, computational models and computer simulations are one of the available research tools that epidemi...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-020-03648-6
update_date:2020-09-16 00:00:00
abstract:BACKGROUND:The ability to confidently predict health outcomes from gene expression would catalyze a revolution in molecular diagnostics. Yet, the goal of developing actionable, robust, and reproducible predictive signatures of phenotypes such as clinical outcome has not been attained in almost any disease area. Here, w...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-020-3427-8
update_date:2020-03-20 00:00:00
abstract:Increased emphasis on reproducibility of published research in the last few years has led to the large-scale archiving of sequencing data. While this data can, in theory, be used to reproduce results in papers, it is difficult to use in practice. We introduce a series of tools for processing and analyzing RNA-Seq data...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-016-1357-2
update_date:2016-12-01 00:00:00
abstract:BACKGROUND:Leishmaniasis is a virulent parasitic infection that causes a worldwide disease burden. Most treatments have toxic side-effects and efficacy has decreased due to the emergence of resistant strains. The outlook is worsened by the absence of promising drug targets for this disease. We have taken a computationa...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-11-484
update_date:2010-09-27 00:00:00
abstract:BACKGROUND:Cross-platform analysis of gene expression data requires multiple, intricate processes at different layers with various platforms. However, existing tools handle only a single platform and are not flexible enough to support custom changes, which arise from the new statistical methods, updated versions of refere...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-12-75
update_date:2011-03-17 00:00:00