Abstract:
BACKGROUND:Bioinformatics software quality assurance is essential in genomic medicine. Systematic verification and validation of bioinformatics software is difficult because it is often not possible to obtain a realistic "gold standard" for systematic evaluation. Here we apply a technique that originates from the software testing literature, namely Metamorphic Testing (MT), to systematically test three widely used short-read sequence alignment programs. RESULTS:MT alleviates the problems associated with the lack of gold standard by checking that the results from multiple executions of a program satisfy a set of expected or desirable properties that can be derived from the software specification or user expectations. We tested BWA, Bowtie and Bowtie2 using simulated data and one HapMap dataset. It is interesting to observe that multiple executions of the same aligner using slightly modified input FASTQ sequence file, such as after randomly re-ordering of the reads, may affect alignment results. Furthermore, we found that the list of variant calls can be affected unless strict quality control is applied during variant calling. CONCLUSION:Thorough testing of bioinformatics software is important in delivering clinical genomic medicine. This paper demonstrates a different framework to test a program that involves checking its properties, thus greatly expanding the number and repertoire of test cases we can apply in practice.
journal_name
BMC Bioinformaticsjournal_title
BMC bioinformaticsauthors
Giannoulatou E,Park SH,Humphreys DT,Ho JWdoi
10.1186/1471-2105-15-S16-S15subject
Has Abstractpub_date
2014-01-01 00:00:00pages
S15issn
1471-2105pii
1471-2105-15-S16-S15journal_volume
15 Suppl 16pub_type
杂志文章abstract:BACKGROUND:Most state-of-the-art biomedical entity normalization systems, such as rule-based systems, merely rely on morphological information of entity mentions, but rarely consider their semantic information. In this paper, we introduce a novel convolutional neural network (CNN) architecture that regards biomedical e...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-017-1805-7
更新日期:2017-10-03 00:00:00
abstract:BACKGROUND:Contact-guided protein structure prediction methods are becoming more and more successful because of the latest advances in residue-residue contact prediction. To support contact-driven structure prediction, effective tools that can quickly build tertiary structural models of good quality from predicted cont...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-018-2032-6
更新日期:2018-01-25 00:00:00
abstract:BACKGROUND:DNA methylation is an important heritable epigenetic mark that plays a crucial role in transcriptional regulation and the pathogenesis of various human disorders. The commonly used DNA methylation measurement approaches, e.g., Illumina Infinium HumanMethylation-27 and -450 BeadChip arrays (27 K and 450 K arr...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-020-03865-z
更新日期:2020-12-01 00:00:00
abstract:BACKGROUND:Most known eukaryotic genomes contain mobile copied elements called transposable elements. In some species, these elements account for the majority of the genome sequence. They have been subject to many mutations and other genomic events (copies, deletions, captures) during transposition. The identification ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-11-474
更新日期:2010-09-22 00:00:00
abstract:BACKGROUND:Mechanistic models that describe the dynamical behaviors of biochemical systems are common in computational systems biology, especially in the realm of cellular signaling. The development of families of such models, either by a single research group or by different groups working within the same area, presen...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-15-316
更新日期:2014-09-25 00:00:00
abstract:BACKGROUND:The imputation of genotypes increases the power of genome-wide association studies. However, the imputation quality should be assessed in each particular case. Nevertheless, not all imputation softwares control the error of output, e.g., the last release of fastPHASE program (1.4.8) lacks such an option. In ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-020-03589-0
更新日期:2020-07-24 00:00:00
abstract:BACKGROUND:Gene expression microarray experiments are expensive to conduct and guidelines for acceptable quality control at intermediate steps before and after the samples are hybridised to chips are vague. We conducted an experiment hybridising RNA from human brain to 117 U133A Affymetrix GeneChips and used these data...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-7-211
更新日期:2006-04-19 00:00:00
abstract:BACKGROUND:Recently, measuring phenotype similarity began to play an important role in disease diagnosis. Researchers have begun to pay attention to develop phenotype similarity measurement. However, existing methods ignore the interactions between phenotype-associated proteins, which may lead to inaccurate phenotype s...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-018-2102-9
更新日期:2018-04-11 00:00:00
abstract:BACKGROUND:Abruptness of pigment patterns at the periphery of a skin lesion is one of the most important dermoscopic features for detection of malignancy. In current clinical setting, abrupt cutoff of a skin lesion determined by an examination of a dermatologist. This process is subjective, nonquantitative, and error-p...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-017-1892-5
更新日期:2017-12-28 00:00:00
abstract:BACKGROUND:Typical evolutionary events like recombination, hybridization or gene transfer make necessary the use of phylogenetic networks to properly depict the evolution of DNA and protein sequences. Although several theoretical classes have been proposed to characterize these networks, they make stringent assumptions...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-11-268
更新日期:2010-05-20 00:00:00
abstract:BACKGROUND:Mecp2 null mice model Rett syndrome (RTT) a human neurological disorder affecting females after apparent normal pre- and peri-natal developmental periods. Neuroanatomical studies in cerebral cortex of RTT mouse models revealed delayed maturation of neuronal morphology and autonomous as well as non-cell auton...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-015-0859-7
更新日期:2016-01-20 00:00:00
abstract:BACKGROUND:Bioluminescent proteins (BLPs) widely exist in many living organisms. As BLPs are featured by the capability of emitting lights, they can be served as biomarkers and easily detected in biomedical research, such as gene expression analysis and signal transduction pathways. Therefore, accurate identification o...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-017-1709-6
更新日期:2017-06-05 00:00:00
abstract:BACKGROUND:The goal of class prediction studies is to develop rules to accurately predict the class membership of new samples. The rules are derived using the values of the variables available for each subject: the main characteristic of high-dimensional data is that the number of variables greatly exceeds the number o...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-11-523
更新日期:2010-10-20 00:00:00
abstract:BACKGROUND:Identifying protein complexes from protein-protein interaction (PPI) network is one of the most important tasks in proteomics. Existing computational methods try to incorporate a variety of biological evidences to enhance the quality of predicted complexes. However, it is still a challenge to integrate diffe...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-018-2555-x
更新日期:2018-12-20 00:00:00
abstract:BACKGROUND:RNA sequencing (RNA-seq) has become the standard means of analyzing gene and transcript expression in high-throughput. While previously sequence alignment was a time demanding step, fast alignment methods and even more so transcript counting methods which avoid mapping and quantify gene and transcript expres...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-019-2799-0
更新日期:2019-05-03 00:00:00
abstract:BACKGROUND:Zebrafish is a widely used model organism for studying heart development and cardiac-related pathogenesis. With the ability of surviving without a functional circulation at larval stages, strong genetic similarity between zebrafish and mammals, prolific reproduction and optically transparent embryos, zebrafi...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-018-2166-6
更新日期:2018-05-09 00:00:00
abstract:BACKGROUND:The central element of each enzyme is the catalytic site, which commonly catalyzes a single biochemical reaction with high specificity. It was unclear to us how often sites that catalyze the same or highly similar reactions evolved on different, i. e. non-homologous protein folds and how similar their 3D pos...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-015-0807-6
更新日期:2015-11-04 00:00:00
abstract::Complexes of physically interacting proteins are one of the fundamental functional units responsible for driving key biological mechanisms within the cell. With the advent of high-throughput techniques, significant amount of protein interaction (PPI) data has been catalogued for organisms such as yeast, which has in t...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-13-S17-S16
更新日期:2012-01-01 00:00:00
abstract:BACKGROUND:Beta-barrel transmembrane (bbtm) proteins are a functionally important and diverse group of proteins expressed in the outer membranes of bacteria (both gram negative and acid fast gram positive), mitochondria and chloroplasts. Despite recent publications describing reasonable levels of accuracy for discrimin...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-6-56
更新日期:2005-03-15 00:00:00
abstract:BACKGROUND:Evolutionary trees are central to a wide range of biological studies. In many of these studies, tree nodes and branches need to be associated (or annotated) with various attributes. For example, in studies concerned with organismal relationships, tree nodes are associated with taxonomic names, whereas tree b...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-10-356
更新日期:2009-10-27 00:00:00
abstract:BACKGROUND:Accurate annotation of translation initiation sites (TISs) is essential for understanding the translation initiation mechanism. However, the reliability of TIS annotation in widely used databases such as RefSeq is uncertain due to the lack of experimental benchmarks. RESULTS:Based on a homogeneity assumptio...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-9-160
更新日期:2008-03-25 00:00:00
abstract:BACKGROUND:Handling the vast amount of gene expression data generated by genome-wide transcriptional profiling techniques is a challenging task, demanding an informed combination of pre-processing, filtering and analysis methods if meaningful biological conclusions are to be drawn. For example, a range of traditional s...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-014-0358-2
更新日期:2014-11-04 00:00:00
abstract:BACKGROUND:Mass spectrometry is an essential analytical technique for high-throughput analysis in proteomics and metabolomics. The development of new separation techniques, precise mass analyzers and experimental protocols is a very active field of research. This leads to more complex experimental setups yielding ever ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-9-163
更新日期:2008-03-26 00:00:00
abstract:BACKGROUND:Fluorescent and luminescent gene reporters allow us to dynamically quantify changes in molecular species concentration over time on the single cell level. The mathematical modeling of their interaction through multivariate dynamical models requires the development of effective statistical methods to calibrat...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-10-343
更新日期:2009-10-19 00:00:00
abstract:BACKGROUND:Complex networks are studied across many fields of science and are particularly important to understand biological processes. Motifs in networks are small connected sub-graphs that occur significantly in higher frequencies than in random networks. They have recently gathered much attention as a useful concep...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-10-318
更新日期:2009-10-04 00:00:00
abstract:BACKGROUND:The diagnosis of many diseases can be often formulated as a decision problem; uncertainty affects these problems so that many computerized Diagnostic Decision Support Systems (in the following, DDSSs) have been developed to aid the physician in interpreting clinical data and thus to improve the quality of th...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-14-S1-S4
更新日期:2013-01-01 00:00:00
abstract:BACKGROUND:In recent times, there has been an exponential rise in the number of protein structures in databases e.g. PDB. So, design of fast algorithms capable of querying such databases is becoming an increasingly important research issue. This paper reports an algorithm, motivated from spectral graph matching techniq...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-7-S5-S5
更新日期:2006-12-18 00:00:00
abstract:BACKGROUND:Many biases and spurious effects are inherent in RNA-seq technology, resulting in a non-uniform distribution of sequencing read counts for each base position in a gene. Therefore, a base-level strategy is required to model the non-uniformity. Also, the properties of sequencing read counts can be leveraged to...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-017-1780-z
更新日期:2017-08-09 00:00:00
abstract:BACKGROUND:A number of tools for the examination of linkage disequilibrium (LD) patterns between nearby alleles exist, but none are available for quickly and easily investigating LD at longer ranges (>500 kb). We have developed a web-based query tool (GLIDERS: Genome-wide LInkage DisEquilibrium Repository and Search en...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-10-367
更新日期:2009-10-31 00:00:00
abstract:: ...
journal_title:BMC bioinformatics
pub_type: 历史文章,杂志文章
doi:10.1186/s12859-019-2618-7
更新日期:2019-03-14 00:00:00