Abstract:
BACKGROUND:Journal articles and databases are two major modes of communication in the biological sciences, and thus integrating these critical resources is of urgent importance to increase the pace of discovery. Projects focused on bridging the gap between journals and databases have been on the rise over the last five years and have resulted in the development of automated tools that can recognize entities within a document and link those entities to a relevant database. Unfortunately, automated tools cannot resolve ambiguities that arise from one term being used to signify entities that are quite distinct from one another. Instead, resolving these ambiguities requires some manual oversight. Finding the right balance between the speed and portability of automation and the accuracy and flexibility of manual effort is a crucial goal to making text markup a successful venture. RESULTS:We have established a journal article mark-up pipeline that links GENETICS journal articles and the model organism database (MOD) WormBase. This pipeline uses a lexicon built with entities from the database as a first step. The entity markup pipeline results in links from over nine classes of objects including genes, proteins, alleles, phenotypes and anatomical terms. New entities and ambiguities are discovered and resolved by a database curator through a manual quality control (QC) step, along with help from authors via a web form that is provided to them by the journal. New entities discovered through this pipeline are immediately sent to an appropriate curator at the database. Ambiguous entities that do not automatically resolve to one link are resolved by hand ensuring an accurate link. This pipeline has been extended to other databases, namely Saccharomyces Genome Database (SGD) and FlyBase, and has been implemented in marking up a paper with links to multiple databases. CONCLUSIONS:Our semi-automated pipeline hyperlinks articles published in GENETICS to model organism databases such as WormBase. Our pipeline results in interactive articles that are data rich with high accuracy. The use of a manual quality control step sets this pipeline apart from other hyperlinking tools and results in benefits to authors, journals, readers and databases.
journal_name
BMC Bioinformaticsjournal_title
BMC bioinformaticsauthors
Rangarajan A,Schedl T,Yook K,Chan J,Haenel S,Otis L,Faelten S,DePellegrin-Connelly T,Isaacson R,Skrzypek MS,Marygold SJ,Stefancsik R,Cherry JM,Sternberg PW,Müller HMdoi
10.1186/1471-2105-12-175subject
Has Abstractpub_date
2011-05-19 00:00:00pages
175issn
1471-2105pii
1471-2105-12-175journal_volume
12pub_type
杂志文章abstract:BACKGROUND:A recently proposed method for estimating qPCR amplification efficiency E analyzes fluorescence intensity ratios from pairs of points deemed to lie in the exponential growth region on the amplification curves for all reactions in a dilution series. This method suffers from a serious problem: The resulting ra...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-020-03604-4
更新日期:2020-07-08 00:00:00
abstract:BACKGROUND:Targeted resequencing offers a cost-effective alternative to whole-genome and whole-exome sequencing when investigating regions known to be associated with a trait or disease. There are a number of approaches to targeted resequencing, including microfluidic PCR amplification, which may be enhanced by multipl...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-016-1453-3
更新日期:2017-01-05 00:00:00
abstract:BACKGROUND:Reliable prediction of antibody, or B-cell, epitopes remains challenging yet highly desirable for the design of vaccines and immunodiagnostics. A correlation between antigenicity, solvent accessibility, and flexibility in proteins was demonstrated. Subsequently, Thornton and colleagues proposed a method for ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-9-514
更新日期:2008-12-02 00:00:00
abstract:BACKGROUND:With the ever increasing use of computational models in the biosciences, the need to share models and reproduce the results of published studies efficiently and easily is becoming more important. To this end, various standards have been proposed that can be used to describe models, simulations, data or other...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-014-0369-z
更新日期:2014-12-14 00:00:00
abstract:BACKGROUND:Although both conservation and correlated mutation (CM) are important information reflecting the different sorts of context in multiple sequence alignment, most of alignment methods use sequence profiles that only represent conservation. There is no general way to represent correlated mutation and incorporat...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-11-S2-S2
更新日期:2010-04-16 00:00:00
abstract:BACKGROUND:Fluorescence microscopy is widely used to determine the subcellular location of proteins. Efforts to determine location on a proteome-wide basis create a need for automated methods to analyze the resulting images. Over the past ten years, the feasibility of using machine learning methods to recognize all maj...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-8-210
更新日期:2007-06-19 00:00:00
abstract:BACKGROUND:Most state-of-the-art biomedical entity normalization systems, such as rule-based systems, merely rely on morphological information of entity mentions, but rarely consider their semantic information. In this paper, we introduce a novel convolutional neural network (CNN) architecture that regards biomedical e...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-017-1805-7
更新日期:2017-10-03 00:00:00
abstract:BACKGROUND:Intracellular signal transduction is achieved by networks of proteins and small molecules that transmit information from the cell surface to the nucleus, where they ultimately effect transcriptional changes. Understanding the mechanisms cells use to accomplish this important process requires a detailed molec...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-3-34
更新日期:2002-11-01 00:00:00
abstract:BACKGROUND:Viral infectious diseases are the serious threat for human health. The receptor-binding is the first step for the viral infection of hosts. To more effectively treat human viral infectious diseases, the hidden virus-receptor interactions must be discovered. However, current computational methods for predicti...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-019-3278-3
更新日期:2019-12-27 00:00:00
abstract:BACKGROUND:The ability to confidently predict health outcomes from gene expression would catalyze a revolution in molecular diagnostics. Yet, the goal of developing actionable, robust, and reproducible predictive signatures of phenotypes such as clinical outcome has not been attained in almost any disease area. Here, w...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-020-3427-8
更新日期:2020-03-20 00:00:00
abstract:BACKGROUND:Mouse xenografts from (patient-derived) tumors (PDX) or tumor cell lines are widely used as models to study various biological and preclinical aspects of cancer. However, analyses of their RNA and DNA profiles are challenging, because they comprise reads not only from the grafted human cancer but also from t...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-018-2353-5
更新日期:2018-10-04 00:00:00
abstract:BACKGROUND:Discovering patterns from gene expression levels is regarded as a classification problem when tissue classes of the samples are given and solved as a discrete-data problem by discretizing the expression levels of each gene into intervals maximizing the interdependence between that gene and the class labels. ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-12-S5-S5
更新日期:2011-01-01 00:00:00
abstract:BACKGROUND:Failure on Highly Active Anti-Retroviral Treatment is often accompanied with development of antiviral resistance to one or more drugs included in the treatment. In general, the virus is more likely to develop resistance to drugs with a lower genetic barrier. Previously, we developed a method to reverse engin...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-11-409
更新日期:2010-08-03 00:00:00
abstract:BACKGROUND:Systematic mutagenesis studies have shown that only a few interface residues termed hot spots contribute significantly to the binding free energy of protein-protein interactions. Therefore, hot spots prediction becomes increasingly important for well understanding the essence of proteins interactions and hel...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-12-311
更新日期:2011-07-29 00:00:00
abstract:BACKGROUND:The misregulation of microRNA (miRNA) has been shown to cause diseases. Recently, we have proposed a computational method based on a random walk framework on a miRNA-target gene network to predict disease-associated miRNAs. The prediction performance of our method is better than that of some existing state-o...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-020-03578-3
更新日期:2020-06-15 00:00:00
abstract:BACKGROUND:Activities of drug molecules can be predicted by QSAR (quantitative structure activity relationship) models, which overcomes the disadvantages of high cost and long cycle by employing the traditional experimental method. With the fact that the number of drug molecules with positive activity is rather fewer t...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-9-S6-S7
更新日期:2008-05-28 00:00:00
abstract:BACKGROUND:Protein function prediction is an important problem in the post-genomic era. Recent advances in experimental biology have enabled the production of vast amounts of protein-protein interaction (PPI) data. Thus, using PPI data to functionally annotate proteins has been extensively studied. However, most existi...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-14-S12-S4
更新日期:2013-01-01 00:00:00
abstract:BACKGROUND:The identification of statistically overrepresented sequences in the upstream regions of coregulated genes should theoretically permit the identification of potential cis-regulatory elements. However, in practice many cis-regulatory elements are highly degenerate, precluding the use of an exhaustive word-cou...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-7-254
更新日期:2006-05-15 00:00:00
abstract:BACKGROUND:It is now well established that nearly 20% of human cancers are caused by infectious agents, and the list of human oncogenic pathogens will grow in the future for a variety of cancer types. Whole tumor transcriptome and genome sequencing by next-generation sequencing technologies presents an unparalleled opp...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-13-206
更新日期:2012-08-17 00:00:00
abstract:BACKGROUND:Identifying key components in biological processes and their associations is critical for deciphering cellular functions. Recently, numerous gene expression and molecular interaction experiments have been reported in Saccharomyces cerevisiae, and these have enabled systematic studies. Although a number of ap...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-12-281
更新日期:2011-07-12 00:00:00
abstract:BACKGROUND:Interpretation of quantitative metagenomics data is important for our understanding of ecosystem functioning and assessing differences between various environmental samples. There is a need for an easy to use tool to explore the often complex metagenomics data in taxonomic and functional context. RESULTS:He...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-14-38
更新日期:2013-02-01 00:00:00
abstract:BACKGROUND:Protein function in eukaryotic cells is often controlled in a cell cycle-dependent manner. Therefore, the correct assignment of cellular phenotypes to cell cycle phases is a crucial task in cell biology research. Nuclear proteins whose localization varies during the cell cycle are valuable and frequently use...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-015-0618-9
更新日期:2015-05-29 00:00:00
abstract:BACKGROUND:Cell direct reprogramming technology has been rapidly developed with its low risk of tumor risk and avoidance of ethical issues caused by stem cells, but it is still limited to specific cell types. Direct reprogramming from an original cell to target cell type needs the cell similarity and cell specific regu...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-019-2699-3
更新日期:2019-03-04 00:00:00
abstract:BACKGROUND:The importance of biodiversity conservation has been increasing steadily due to its benefits to human beings. Recently, producing and managing biodiversity databases have become much easier because of the information technology (IT) advancement. This made the general public's participation in biodiversity co...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-10-S15-S11
更新日期:2009-12-03 00:00:00
abstract:BACKGROUND:Biological data often originate from samples containing mixtures of subpopulations, corresponding e.g. to distinct cellular phenotypes. However, identification of distinct subpopulations may be difficult if biological measurements yield distributions that are not easily separable. RESULTS:We present Multire...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-15-240
更新日期:2014-07-11 00:00:00
abstract:BACKGROUND:Throughout the metazoan lineage, typically gonadal expressed Piwi proteins and their guiding piRNAs (~26-32nt in length) form a protective mechanism of RNA interference directed against the propagation of transposable elements (TEs). Most piRNAs are generated from genomic piRNA clusters. Annotation of experi...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-13-5
更新日期:2012-01-10 00:00:00
abstract:BACKGROUND:The Codon Adaptation Index (CAI) is a measure of the synonymous codon usage bias for a DNA or RNA sequence. It quantifies the similarity between the synonymous codon usage of a gene and the synonymous codon frequency of a reference set. Extreme values in the nucleotide or in the amino acid composition have a...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-9-65
更新日期:2008-01-29 00:00:00
abstract:BACKGROUND:Next Generation Sequencing techniques are producing enormous amounts of biological sequence data and analysis becomes a major computational problem. Currently, most analysis, especially the identification of conserved regions, relies heavily on Multiple Sequence Alignment and its various heuristics such as p...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-14-S11-S2
更新日期:2013-01-01 00:00:00
abstract:BACKGROUND:The sequencing of many genomes and tiling arrays consisting of millions of DNA segments spanning entire genomes have made high-resolution copy number analysis possible. Microarray-based comparative genomic hybridization (array CGH) has enabled the high-resolution detection of DNA copy number aberrations. Whi...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-8-203
更新日期:2007-06-14 00:00:00
abstract:BACKGROUND:Metabolomics is one of most recent omics technologies. It has been applied on fields such as food science, nutrition, drug discovery and systems biology. For this, gas chromatography-mass spectrometry (GC-MS) has been largely applied and many computational tools have been developed to support the analysis of...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-014-0374-2
更新日期:2014-12-10 00:00:00