Abstract:
BACKGROUND:Very often genome-wide data analysis requires the interoperation of multiple databases and analytic tools. A large number of genome databases and bioinformatics applications are available through the web, but it is difficult to automate interoperation because: 1) the platforms on which the applications run are heterogeneous, 2) their web interface is not machine-friendly, 3) they use a non-standard format for data input and output, 4) they do not exploit standards to define application interface and message exchange, and 5) existing protocols for remote messaging are often not firewall-friendly. To overcome these issues, web services have emerged as a standard XML-based model for message exchange between heterogeneous applications. Web services engines have been developed to manage the configuration and execution of a web services workflow. RESULTS:To demonstrate the benefit of using web services over traditional web interfaces, we compare the two implementations of HAPI, a gene expression analysis utility developed by the University of California San Diego (UCSD) that allows visual characterization of groups or clusters of genes based on the biomedical literature. This utility takes a set of microarray spot IDs as input and outputs a hierarchy of MeSH Keywords that correlates to the input and is grouped by Medical Subject Heading (MeSH) category. While the HTML output is easy for humans to visualize, it is difficult for computer applications to interpret semantically. To facilitate the capability of machine processing, we have created a workflow of three web services that replicates the HAPI functionality. These web services use document-style messages, which means that messages are encoded in an XML-based format. We compared three approaches to the implementation of an XML-based workflow: a hard coded Java application, Collaxa BPEL Server and Taverna Workbench. The Java program functions as a web services engine and interoperates with these web services using a web services choreography language (BPEL4WS). CONCLUSION:While it is relatively straightforward to implement and publish web services, the use of web services choreography engines is still in its infancy. However, industry-wide support and push for web services standards is quickly increasing the chance of success in using web services to unify heterogeneous bioinformatics applications. Due to the immaturity of currently available web services engines, it is still most practical to implement a simple, ad-hoc XML-based workflow by hard coding the workflow as a Java application. For advanced web service users the Collaxa BPEL engine facilitates a configuration and management environment that can fully handle XML-based workflow.
journal_name
BMC Bioinformaticsjournal_title
BMC bioinformaticsauthors
de Knikker R,Guo Y,Li JL,Kwan AK,Yip KY,Cheung DW,Cheung KHdoi
10.1186/1471-2105-5-25keywords:
subject
Has Abstractpub_date
2004-03-10 00:00:00pages
25issn
1471-2105pii
1471-2105-5-25journal_volume
5pub_type
杂志文章abstract:BACKGROUND:Tight clustering arose recently from a desire to obtain tighter and potentially more informative clusters in gene expression studies. Scattered genes with relatively loose correlations should be excluded from the clusters. However, in the literature there is little work dedicated to this area of research. On...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-9-287
更新日期:2008-06-18 00:00:00
abstract:BACKGROUND:In recent years, to investigate challenging bioinformatics problems, the utilization of multiple genomic and proteomic sources has become immensely popular among researchers. One such issue is feature or gene selection and identifying relevant and non-redundant marker genes from high dimensional gene express...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-020-03810-0
更新日期:2020-12-30 00:00:00
abstract:BACKGROUND:Over one hundred different types of post-transcriptional RNA modifications have been identified in human. Researchers discovered that RNA modifications can regulate various biological processes, and RNA methylation, especially N6-methyladenosine, has become one of the most researched topics in epigenetics. ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-019-2840-3
更新日期:2019-05-02 00:00:00
abstract:BACKGROUND:Barcode multiplexing is a key strategy for sharing the rising capacity of next-generation sequencing devices: Synthetic DNA tags, called barcodes, are attached to natural DNA fragments within the library preparation procedure. Different libraries, can individually be labeled with barcodes for a joint sequenc...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-015-0482-7
更新日期:2015-02-18 00:00:00
abstract:BACKGROUND:The imputation of missing values is necessary for the efficient use of DNA microarray data, because many clustering algorithms and some statistical analysis require a complete data set. A few imputation methods for DNA microarray data have been introduced, but the efficiency of the methods was low and the va...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-5-160
更新日期:2004-10-26 00:00:00
abstract:BACKGROUND:Biomedical processes can provide essential information about the (mal-) functioning of an organism and are thus frequently represented in biomedical terminologies and ontologies, including the GO Biological Process branch. These processes often need to be described and categorised in terms of their attribute...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-13-217
更新日期:2012-08-28 00:00:00
abstract:BACKGROUND:Pattern matching is the core of bioinformatics; it is used in database searching, restriction enzyme mapping, and finding open reading frames. It is done repeatedly over increasingly long sequences, thus codes must be efficient and insensitive to sequence length. Such patterns of interest include simple moti...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-3-8
更新日期:2002-01-01 00:00:00
abstract:BACKGROUND:Several features are known to correlate with the GC-content in the human genome, including recombination rate, gene density and distance to telomere. However, by testing for pairwise correlation only, it is impossible to distinguish direct associations from indirect ones and to distinguish between causes and...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-10-S1-S66
更新日期:2009-01-30 00:00:00
abstract:BACKGROUND:Creating a scalable computational infrastructure to analyze the wealth of information contained in data repositories is difficult due to significant barriers in organizing, extracting and analyzing relevant data. Shared data science infrastructures like Boag is needed to efficiently process and parse data co...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-019-2967-2
更新日期:2019-08-22 00:00:00
abstract:BACKGROUND:The scientific literature contains millions of microbial gene identifiers within the full text and tables, but these annotations rarely get incorporated into public sequence databases. We propose to utilize the Open Access (OA) subset of PubMed Central (PMC) as a gene annotation database and have developed a...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-15-43
更新日期:2014-02-05 00:00:00
abstract:BACKGROUND:An increasing number of bioinformatics methods are considering the phylogenetic relationships between biological sequences. Implementing new methodologies using the maximum likelihood phylogenetic framework can be a time consuming task. RESULTS:The bioinformatics library libcov is a collection of C++ classe...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-6-138
更新日期:2005-06-06 00:00:00
abstract::Environmental shotgun sequencing (ESS) has potential to give greater insight into microbial communities than targeted sequencing of 16S regions, but requires much higher sequence coverage. The advent of next-generation sequencing has made it feasible for the Human Microbiome Project and other initiatives to generate E...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-14-S5-S2
更新日期:2013-01-01 00:00:00
abstract:BACKGROUND:Alternative splicing is the critical process in a single gene coding, which removes introns and joins exons, and splicing branchpoints are indicators for the alternative splicing. Wet experiments have identified a great number of human splicing branchpoints, but many branchpoints are still unknown. In order ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-017-1875-6
更新日期:2017-12-01 00:00:00
abstract:BACKGROUND:Tiling-arrays are applicable to multiple types of biological research questions. Due to its advantages (high sensitivity, resolution, unbiased), the technology is often employed in genome-wide investigations. A major challenge in the analysis of tiling-array data is to define regions-of-interest, i.e., conti...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-11-275
更新日期:2010-05-21 00:00:00
abstract:BACKGROUND:Next-generation sequencing allows the analysis of an unprecedented number of viral sequence variants from infected patients, presenting a novel opportunity for understanding virus evolution, drug resistance and immune escape. However, sequencing in bulk is error prone. Thus, the generated data require error ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-13-S10-S6
更新日期:2012-06-25 00:00:00
abstract:BACKGROUND:In this paper, it is proposed an optimization approach for producing reduced alphabets for peptide classification, using a Genetic Algorithm. The classification task is performed by a multi-classifier system where each classifier (Linear or Radial Basis function Support Vector Machines) is trained using feat...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-9-45
更新日期:2008-01-24 00:00:00
abstract:BACKGROUND:Machine learning has been utilized to predict cancer drug response from multi-omics data generated from sensitivities of cancer cell lines to different therapeutic compounds. Here, we build machine learning models using gene expression data from patients' primary tumor tissues to predict whether a patient wi...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-020-03690-4
更新日期:2020-09-30 00:00:00
abstract:BACKGROUND:Inferring molecular pathway activity is an important step towards reducing the complexity of genomic data, understanding the heterogeneity in clinical outcome, and obtaining molecular correlates of cancer imaging traits. Increasingly, approaches towards pathway activity inference combine molecular profiles (...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-12-403
更新日期:2011-10-19 00:00:00
abstract:BACKGROUND:Many statistical algorithms combine microarray expression data and genome sequence data to identify transcription factor binding motifs in the low eukaryotic genomes. Finding cis-regulatory elements in higher eukaryote genomes, however, remains a challenge, as searching in the promoter regions of genes with ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-7-44
更新日期:2006-01-26 00:00:00
abstract:BACKGROUND:Most known eukaryotic genomes contain mobile copied elements called transposable elements. In some species, these elements account for the majority of the genome sequence. They have been subject to many mutations and other genomic events (copies, deletions, captures) during transposition. The identification ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-11-474
更新日期:2010-09-22 00:00:00
abstract:BACKGROUND:Biological molecules are often asymmetric with respect to stereochemistry, and correct stereochemistry is essential to their function. Molecular dynamics simulations of biomolecules have increasingly become an integral part of biophysical research. However, stereochemical errors in biomolecular structures ca...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-12-190
更新日期:2011-05-23 00:00:00
abstract:BACKGROUND:Great strides have been made in the effective treatment of HIV-1 with the development of second-generation protease inhibitors (PIs) that are effective against historically multi-PI-resistant HIV-1 variants. Nevertheless, mutation patterns that confer decreasing susceptibility to available PIs continue to ar...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-12-477
更新日期:2011-12-15 00:00:00
abstract:BACKGROUND:Cancer is a complex disease which is characterized by the accumulation of genetic alterations during the patient's lifetime. With the development of the next-generation sequencing technology, multiple omics data, such as cancer genomic, epigenomic and transcriptomic data etc., can be measured from each indiv...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-016-1332-y
更新日期:2016-12-23 00:00:00
abstract:BACKGROUND:It is now well established that nearly 20% of human cancers are caused by infectious agents, and the list of human oncogenic pathogens will grow in the future for a variety of cancer types. Whole tumor transcriptome and genome sequencing by next-generation sequencing technologies presents an unparalleled opp...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-13-206
更新日期:2012-08-17 00:00:00
abstract:BACKGROUND:Amino acids in proteins are not used equally. Some of the differences in the amino acid composition of proteins are between species (mainly due to nucleotide composition and lifestyle) and some are between proteins from the same species (related to protein function, expression or subcellular localization, fo...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-7-257
更新日期:2006-05-18 00:00:00
abstract:BACKGROUND:The improvements of high throughput technologies have produced large amounts of multi-omics experiments datasets. Initial analysis of these data has revealed many concurrent gene alterations within single dataset or/and among multiple omics datasets. Although powerful bioinformatics pipelines have been devel...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-019-3171-0
更新日期:2019-11-08 00:00:00
abstract:BACKGROUND:Protein domains coordinate to perform multifaceted cellular functions, and domain combinations serve as the functional building blocks of the cell. The available methods to identify functional domain combinations are limited in their scope, e.g. to the identification of combinations falling within individual...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-8-390
更新日期:2007-10-16 00:00:00
abstract:BACKGROUND:Event sequences where different types of events often occur close together arise, e.g., when studying potential transcription factor binding sites (TFBS, events) of certain transcription factors (TF, types) in a DNA sequence. These events tend to occur in bursts: in some genomic regions there are more genes ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-9-336
更新日期:2008-08-08 00:00:00
abstract:BACKGROUND:Feed-forward loops (FFLs), consisting of miRNAs, transcription factors (TFs) and their common target genes, have been validated to be important for the initialization and development of complex diseases, including cancer. Esophageal Carcinoma (ESCA) and Stomach Adenocarcinoma (STAD) are two types of malignan...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-019-3230-6
更新日期:2019-12-30 00:00:00
abstract:BACKGROUND:Visualization tools for deep learning models typically focus on discovering key input features without considering how such low level features are combined in intermediate layers to make decisions. Moreover, many of these methods examine a network's response to specific input examples that may be insufficien...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-019-2957-4
更新日期:2019-07-19 00:00:00