Abstract:
BACKGROUND:Infectious disease modeling and computational power have evolved such that large-scale agent-based models (ABMs) have become feasible. However, the increasing hardware complexity requires adapted software designs to achieve the full potential of current high-performance workstations. RESULTS:We found large performance differences due to data locality in a discrete-time ABM for close-contact disease transmission. Sorting the population according to social contact clusters reduced simulation time by a factor of two. Data locality and model performance can also be improved by storing person attributes separately instead of using person objects. Next, decreasing the number of operations by sorting people by health status before processing disease transmission also has a large impact on model performance. Depending on the clinical attack rate, target population and computer hardware, introducing the sort phase decreased the run time by 26% up to more than 70%. We investigated the application of parallel programming techniques and found that the speedup is significant but drops quickly with the number of cores. We observed that the effect of scheduling and workload chunk size is model specific and can make a large difference. CONCLUSIONS:Investment in performance optimization of ABM simulator code can lead to significant run time reductions. The key steps are straightforward: choosing the data structure for the population and sorting people by health status before effecting disease propagation. We believe these conclusions to be valid for a wide range of infectious disease ABMs. We recommend that future studies evaluate the impact of data management, algorithmic procedures and parallelization on model performance.
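The idea of grouping people by health status before processing transmission can be illustrated with a minimal sketch. This is not the authors' simulator code: the function names, the status constants and the single-cluster setup are hypothetical, and a simple partition into index lists stands in for the sort phase mentioned in the abstract. The sketch only shows why the number of examined pairs drops when infectious and susceptible members are separated first.

```python
import random

# Hypothetical health-status codes, chosen for this illustration only.
SUSCEPTIBLE, INFECTIOUS, RECOVERED = 0, 1, 2


def transmit_all_pairs(status, p_transmit, rng):
    """Examine every ordered pair in the cluster and test its status inside the loop."""
    new_status = list(status)
    pairs_examined = 0
    for i in range(len(status)):
        for j in range(len(status)):
            pairs_examined += 1
            if status[i] == INFECTIOUS and status[j] == SUSCEPTIBLE:
                if rng.random() < p_transmit:
                    new_status[j] = INFECTIOUS
    return new_status, pairs_examined


def transmit_grouped(status, p_transmit, rng):
    """Group members by health status first, then visit only infectious-susceptible pairs."""
    infectious = [i for i, s in enumerate(status) if s == INFECTIOUS]
    susceptible = [j for j, s in enumerate(status) if s == SUSCEPTIBLE]
    new_status = list(status)
    pairs_examined = 0
    for i in infectious:
        for j in susceptible:
            pairs_examined += 1
            if rng.random() < p_transmit:
                new_status[j] = INFECTIOUS
    return new_status, pairs_examined


# One contact cluster of eight people: 3 susceptible, 2 infectious, 3 recovered.
cluster = [SUSCEPTIBLE, INFECTIOUS, RECOVERED, SUSCEPTIBLE,
           SUSCEPTIBLE, INFECTIOUS, RECOVERED, RECOVERED]

# A transmission probability of 1.0 makes both variants deterministic and comparable.
naive_result, naive_pairs = transmit_all_pairs(cluster, 1.0, random.Random(1))
grouped_result, grouped_pairs = transmit_grouped(cluster, 1.0, random.Random(1))
print(naive_pairs, grouped_pairs)
```

In this toy cluster the grouped variant examines 2 x 3 pairs instead of 8 x 8, while producing the same outcome when transmission is certain. How much a real simulator gains would depend on the factors the abstract lists, such as attack rate, population and hardware.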
journal_name: BMC Bioinformatics
journal_title: BMC bioinformatics
authors: Willem L, Stijven S, Tijskens E, Beutels P, Hens N, Broeckhove J
doi: 10.1186/s12859-015-0612-2
subject: Has Abstract
pub_date: 2015-06-02 00:00:00
pages: 183
issn: 1471-2105
pii: 10.1186/s12859-015-0612-2
journal_volume: 16
pub_type: Journal Article
abstract:BACKGROUND:Pseudouridylation is the most prevalent type of posttranscriptional modification in various stable RNAs of all organisms, which significantly affects many cellular processes that are regulated by RNA. Thus, accurate identification of pseudouridine (Ψ) sites in RNA will be of great benefit for understanding t...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-018-2321-0
update_date:2018-08-29 00:00:00
abstract:BACKGROUND:Using phylogenomic analysis tools for tracking pathogens has become standard practice in academia, public health agencies, and large industries. Using the same raw read genomic data as input, there are several different approaches being used to infer phylogenetic trees. These include many different SNP pipeli...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-017-1592-1
update_date:2017-03-20 00:00:00
abstract:BACKGROUND:Here we introduce the Protein Sequence Annotation Tool (PSAT), a web-based, sequence annotation meta-server for performing integrated, high-throughput, genome-wide sequence analyses. Our goals in building PSAT were to (1) create an extensible platform for integration of multiple sequence-based bioinformatics...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-016-0887-y
update_date:2016-01-20 00:00:00
abstract:BACKGROUND:The adaptive immune response intrinsically depends on hypervariable human leukocyte antigen (HLA) genes. Concomitantly, correct HLA phenotyping is crucial for successful donor-patient matching in organ transplantation. The cost and technical limitations of current laboratory techniques, together with advance...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-018-2239-6
update_date:2018-06-25 00:00:00
abstract:BACKGROUND:The amount of gene expression data available in public repositories has grown exponentially in recent years, now requiring new data mining tools to transform them into information easily accessible to biologists. RESULTS:By exploiting expression data publicly available in the Gene Expression Omnibus (GEO) d...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-15-S1-S6
update_date:2014-01-01 00:00:00
abstract:BACKGROUND:In recent years, protein-protein interaction (PPI) networks have been well recognized as important resources to elucidate various biological processes and cellular mechanisms. In this paper, we address the problem of predicting protein complexes from a PPI network. This problem has two difficulties. One is r...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-017-1920-5
update_date:2017-12-06 00:00:00
abstract:BACKGROUND:The low success rate and high cost of drug discovery require the development of new paradigms to identify molecules of therapeutic value. The Anatomical Therapeutic Chemical (ATC) Code System is a World Health Organization (WHO) proposed classification that assigns multi-level codes to compounds based on th...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-017-1660-6
update_date:2017-06-07 00:00:00
abstract:BACKGROUND:Post-transcriptional regulation is a complex mechanism that plays a central role in defining multiple cellular identities starting from a common genome. Modifications in the length of 3'UTRs have been found to play an important role in this context, since alternative 3' UTRs could lead to differences for exa...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-016-1254-8
update_date:2016-10-18 00:00:00
abstract:BACKGROUND:The biological network is highly dynamic. Functional relations between genes can be activated or deactivated depending on the biological conditions. On the genome-scale network, subnetworks that gain or lose local expression consistency may shed light on the regulatory mechanisms related to the changing biol...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-019-3046-4
update_date:2019-12-24 00:00:00
abstract:BACKGROUND:Identifying the interactions between proteins and long non-coding RNAs (lncRNAs) is of great importance to decipher the functional mechanisms of lncRNAs. However, current experimental techniques for detection of lncRNA-protein interactions are limited and inefficient. Many methods have been proposed to predi...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-018-2390-0
update_date:2018-10-11 00:00:00
abstract:BACKGROUND:Long-range interactions between regulatory DNA elements such as enhancers, insulators and promoters play an important role in regulating transcription. As chromatin contacts have been found throughout the human genome and in different cell types, spatial transcriptional control is now viewed as a general mec...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-12-414
update_date:2011-10-25 00:00:00
abstract:BACKGROUND:Phosphorylated histone H2AX, also known as γH2AX, forms μm-sized nuclear foci at the sites of DNA double-strand breaks (DSBs) induced by ionizing radiation and other agents. Due to their specificity and sensitivity, γH2AX immunoassays have become the gold standard for studying DSB induction and repair. One o...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-020-3370-8
update_date:2020-01-28 00:00:00
abstract:BACKGROUND:Integration of heterogeneous data types is a challenging problem, especially in biology, where the number of databases and data types increase rapidly. Amongst the problems that one has to face are integrity, consistency, redundancy, connectivity, expressiveness and updatability. DESCRIPTION:Here we present...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-7-70
update_date:2006-02-15 00:00:00
abstract:BACKGROUND:Accurate somatic mutation-calling is essential for insightful mutation analyses in cancer studies. Several mutation-callers are publicly available and more are likely to appear. Nonetheless, mutation-calling is still challenging and there is unlikely to be one established caller that systematically outperfor...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-15-154
update_date:2014-05-21 00:00:00
abstract:BACKGROUND:Non-negative matrix factorisation (NMF), a machine learning algorithm, has been applied to the analysis of microarray data. A key feature of NMF is the ability to identify patterns that together explain the data as a linear combination of expression signatures. Microarray data generally includes individual e...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-7-175
update_date:2006-03-28 00:00:00
abstract:BACKGROUND:Numerous tools have been developed to predict the fitness effects (i.e., neutral, deleterious, or beneficial) of genetic variants on corresponding proteins. However, prediction in terms of whether a variant causes the variant bearing protein to lose the original function or gain new function is also needed f...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-015-0781-z
update_date:2015-10-30 00:00:00
abstract:BACKGROUND:Tandem repeats are multiple duplications of substrings in the DNA that occur contiguously, or at a short distance, and may involve some mutations (such as substitutions, insertions, and deletions). Tandem repeats have been extensively studied also for their association with the class of repeat expansion dise...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-13-S4-S3
update_date:2012-03-28 00:00:00
abstract:BACKGROUND:A recently proposed method for estimating qPCR amplification efficiency E analyzes fluorescence intensity ratios from pairs of points deemed to lie in the exponential growth region on the amplification curves for all reactions in a dilution series. This method suffers from a serious problem: The resulting ra...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-020-03604-4
update_date:2020-07-08 00:00:00
abstract:BACKGROUND:Cross-platform analysis of gene expression data requires multiple, intricate processes at different layers with various platforms. However, existing tools handle only a single platform and are not flexible enough to support custom changes, which arise from the new statistical methods, updated versions of refere...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-12-75
update_date:2011-03-17 00:00:00
abstract:BACKGROUND:Regulation of gene expression, protein synthesis, replication and assembly of many viruses involve RNA-protein interactions. Although some successful computational tools have been reported to recognize RNA binding sites in proteins, the problem of specificity remains poorly investigated. After the nucleotide...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-12-S13-S5
update_date:2011-01-01 00:00:00
abstract:BACKGROUND:The imputation of missing values is necessary for the efficient use of DNA microarray data, because many clustering algorithms and some statistical analysis require a complete data set. A few imputation methods for DNA microarray data have been introduced, but the efficiency of the methods was low and the va...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-5-160
update_date:2004-10-26 00:00:00
abstract:BACKGROUND:Tumors have been hypothesized to be the result of a mixture of oncogenic events, some of which will be reflected in the gene expression of the tumor. Based on this hypothesis a variety of data-driven methods have been employed to decompose tumor expression profiles into component profiles, hypothetically lin...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-10-S1-S20
update_date:2009-01-30 00:00:00
abstract:BACKGROUND:RNA secondary structure prediction, or folding, is a classic problem in bioinformatics: given a sequence of nucleotides, the aim is to predict the base pairs formed in its three dimensional conformation. The inverse problem of designing a sequence folding into a particular target structure has only more rece...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-13-260
update_date:2012-10-09 00:00:00
abstract:BACKGROUND:To infer gene regulatory networks from time series gene profiles, two important tasks that are related to biological systems must be undertaken. One task is to determine a valid network structure that has topological properties that can influence the network dynamics profoundly. The other task is to optimize...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-15-S15-S8
update_date:2014-01-01 00:00:00
abstract:BACKGROUND:Over the last decade, next generation sequencing (NGS) has become widely available, and is now the sequencing technology of choice for most researchers. Nonetheless, NGS presents a challenge for the evolutionary biologists who wish to estimate evolutionary genetic parameters from a mixed sample of unlabelled...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-015-0810-y
update_date:2015-11-04 00:00:00
abstract:BACKGROUND:Atomic details of protein-DNA complexes can provide insightful information for better understanding of the function and binding specificity of DNA binding proteins. In addition to experimental methods for solving protein-DNA complex structures, protein-DNA docking can be used to predict native or near-native...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-018-2538-y
update_date:2018-12-21 00:00:00
abstract:BACKGROUND:There are many fewer genes in the human genome than there are expressed transcripts. Alternative splicing is the reason. Alternatively spliced transcripts are often specific to tissue type, developmental stage, environmental condition, or disease state. Accurate analysis of microarray expression data and des...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-8-75
update_date:2007-03-05 00:00:00
abstract:BACKGROUND:Understanding research activity within any given biomedical field is important. Search outputs generated by MEDLINE/PubMed are not well classified and require lengthy manual citation analysis. Automation of citation analytics can be very useful and timesaving for both novices and experts. RESULTS:PubFocus w...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-7-424
update_date:2006-10-02 00:00:00
abstract:We provide a 2007 update on the bioinformatics research in the Asia-Pacific from the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation set up in 1998. From 2002, APBioNet has organized the first International Conference on Bioinformatics (InCoB) bringing together scientists work...
journal_title:BMC bioinformatics
pub_type:
doi:10.1186/1471-2105-9-S1-S1
update_date:2008-01-01 00:00:00
abstract:BACKGROUND:Recently, measuring phenotype similarity began to play an important role in disease diagnosis. Researchers have begun to pay attention to developing phenotype similarity measures. However, existing methods ignore the interactions between phenotype-associated proteins, which may lead to inaccurate phenotype s...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-018-2102-9
update_date:2018-04-11 00:00:00