Abstract:
:The recent exponential growth of genomic databases has resulted in the common task of sequence alignment becoming one of the major bottlenecks in the field of computational biology. It is typical for these large datasets and complex computations to require cost prohibitive High Performance Computing (HPC) to function. As such, parallelised solutions have been proposed but many exhibit scalability limitations and are incapable of effectively processing "Big Data" - the name attributed to datasets that are extremely large, complex and require rapid processing. The Hadoop framework, comprised of distributed storage and a parallelised programming framework known as MapReduce, is specifically designed to work with such datasets but it is not trivial to efficiently redesign and implement bioinformatics algorithms according to this paradigm. The parallelisation strategy of "divide and conquer" for alignment algorithms can be applied to both data sets and input query sequences. However, scalability is still an issue due to memory constraints or large databases, with very large database segmentation leading to additional performance decline. Herein, we present Hadoop Blast (HBlast), a parallelised BLAST algorithm that proposes a flexible method to partition both databases and input query sequences using "virtual partitioning". HBlast presents improved scalability over existing solutions and well balanced computational work load while keeping database segmentation and recompilation to a minimum. Enhanced BLAST search performance on cheap memory constrained hardware has significant implications for in field clinical diagnostic testing; enabling faster and more accurate identification of pathogenic DNA in human blood or tissue samples.
journal_name
J Biomed Informjournal_title
Journal of biomedical informaticsauthors
O'Driscoll A,Belogrudov V,Carroll J,Kropp K,Walsh P,Ghazal P,Sleator RDdoi
10.1016/j.jbi.2015.01.008subject
Has Abstractpub_date
2015-04-01 00:00:00pages
58-64eissn
1532-0464issn
1532-0480pii
S1532-0464(15)00010-6journal_volume
54pub_type
杂志文章abstract::Microarray data is a key source of experimental data for modelling gene regulatory interactions from expression levels. With the rapid increase of publicly available microarray data comes the opportunity to produce regulatory network models based on multiple datasets. Such models are potentially more robust with great...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章,meta分析
doi:10.1016/j.jbi.2008.01.011
更新日期:2008-12-01 00:00:00
abstract::Simulators for honing procedural skills (such as surgical skills and central venous catheter placement) have proven to be valuable tools for medical educators and students. While such simulations represent an effective paradigm in surgical education, there is an opportunity to add a layer of cognitive exercises to the...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2009.02.008
更新日期:2009-08-01 00:00:00
abstract::To understand how healthcare technologies are used in practice and evaluate them, researchers have argued for adopting the theoretical framework of Distributed Cognition (DC). This paper describes the methods and results of a study in which a DC methodology, Distributed Cognition for Teamwork (DiCoT), was applied to s...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2012.02.003
更新日期:2012-06-01 00:00:00
abstract::Computer simulations have been used to model infectious diseases to examine the outcomes of alternative strategies for managing their spread. Methicillin resistant Staphylococcus aureus (MRSA) skin and soft tissue infections have become prominent in many communities and efforts are underway to reduce the spread of thi...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2010.03.013
更新日期:2010-08-01 00:00:00
abstract::We analyze the discriminatory power of k-nearest neighbors, logistic regression, artificial neural networks (ANNs), decision tress, and support vector machines (SVMs) on the task of classifying pigmented skin lesions as common nevi, dysplastic nevi, or melanoma. Three different classification tasks were used as benchm...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1006/jbin.2001.1004
更新日期:2001-02-01 00:00:00
abstract:OBJECTIVE:Most existing controlled terminologies can be characterized as collections of terms, wherein the terms are arranged in a simple list or organized in a hierarchy. These kinds of terminologies are considered useful for standardizing terms and encoding data and are currently used in many existing information sys...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2011.01.005
更新日期:2011-04-01 00:00:00
abstract:BACKGROUND:The realization of the potential benefits of health information exchange systems (HIEs) for emergency departments (EDs) depends on the way these systems are actually used. The attributes of volume of information and duration of information processing are important for the study of HIE use patterns in the ED,...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2017.05.007
更新日期:2017-07-01 00:00:00
abstract::In oncology, the reuse of data is confronted with the heterogeneity of terminologies. It is necessary to semantically integrate these distinct terminologies. The semantic integration by using a third terminology as a support is a conventional approach for the integration of two terminologies that are not very structur...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2017.08.013
更新日期:2017-10-01 00:00:00
abstract:BACKGROUND:Allowing patients to access their own electronic health record (EHR) notes through online patient portals has the potential to improve patient-centered care. However, EHR notes contain abundant medical jargon that can be difficult for patients to comprehend. One way to help patients is to reduce information ...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2017.02.016
更新日期:2017-04-01 00:00:00
abstract:BACKGROUND:One of the significant problems in the field of healthcare is the low survival rate of people who have experienced sudden cardiac arrest. Early prediction of cardiac arrest can provide the time required for intervening and preventing its onset in order to reduce mortality. Traditional statistical methods hav...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2018.10.008
更新日期:2018-12-01 00:00:00
abstract::One of the challenging problems in drug discovery is to identify the novel targets for drugs. Most of the traditional methods for drug targets optimization focused on identifying the particular families of "druggable targets", but ignored their topological properties based on the biological pathways. In this study, we...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2015.02.007
更新日期:2015-04-01 00:00:00
abstract::Neural embeddings are a popular set of methods for representing words, phrases or text as a low dimensional vector (typically 50-500 dimensions). However, it is difficult to interpret these dimensions in a meaningful manner, and creating neural embeddings requires extensive training and tuning of multiple parameters a...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2019.103096
更新日期:2019-02-01 00:00:00
abstract::The explosive growth of biomedical literature has created a rich source of knowledge, such as that on protein-protein interactions (PPIs) and drug-drug interactions (DDIs), locked in unstructured free text. Biomedical relation classification aims to automatically detect and classify biomedical relations, which has gre...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章,评审
doi:10.1016/j.jbi.2019.103294
更新日期:2019-11-01 00:00:00
abstract::This work investigates, whether openEHR with its reference model, archetypes and templates is suitable for the digital representation of demographic as well as clinical data. Moreover, it elaborates openEHR as a tool for modelling Hospital Information Systems on a regional level based on a national logical infrastruct...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2015.04.004
更新日期:2015-06-01 00:00:00
abstract::Data analytics is routinely used to support biomedical research in all areas, with particular focus on the most relevant clinical conditions, such as cancer. Bioinformatics approaches, in particular, have been used to characterize the molecular aspects of diseases. In recent years, numerous studies have been performed...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章,评审
doi:10.1016/j.jbi.2020.103466
更新日期:2020-07-01 00:00:00
abstract:BACKGROUND:The association of genotyping information with common traits is not satisfactorily solved. One of the most complex traits is pain and association studies have failed so far to provide reproducible predictions of pain phenotypes from genotypes in the general population despite a well-established genetic basis...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2013.07.010
更新日期:2013-10-01 00:00:00
abstract::Identification of co-referent entity mentions inside text has significant importance for other natural language processing (NLP) tasks (e.g. event linking). However, this task, known as co-reference resolution, remains a complex problem, partly because of the confusion over different evaluation metrics and partly beca...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2013.03.007
更新日期:2013-06-01 00:00:00
abstract::Modern biomedical data mining requires feature selection methods that can (1) be applied to large scale feature spaces (e.g. 'omics' data), (2) function in noisy problems, (3) detect complex patterns of association (e.g. gene-gene interactions), (4) be flexibly adapted to various problem domains and data types (e.g. g...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2018.07.015
更新日期:2018-09-01 00:00:00
abstract::In the past, algorithms exploiting varying semantics in interactions between biological objects such as genes and diseases have been used in bioinformatics to uncover latent relationships within biological datasets. In this paper, we consider the algorithm Medusa in parallel with binary classification in order to find...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2018.07.005
更新日期:2018-08-01 00:00:00
abstract:BACKGROUND:Correlation of data within electronic health records is necessary for implementation of various clinical decision support functions, including patient summarization. A key type of correlation is linking medications to clinical problems; while some databases of problem-medication links are available, they are...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2013.11.010
更新日期:2014-04-01 00:00:00
abstract::Information extraction is the process of scanning text for information relevant to some interest, including extracting entities, relations, and events. It requires deeper analysis than key word searches, but its aims fall short of the very hard and long-term problem of full text understanding. Information extraction r...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/s1532-0464(03)00015-7
更新日期:2002-08-01 00:00:00
abstract::Community pharmacists are critically placed in the patient care chain being an extended frontline within primary healthcare networks across Europe. They are trained to ensure safe and effective medication use, a crucial and responsible role, extending beyond the common misconception limited to just providing timely ac...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2019.103336
更新日期:2019-12-01 00:00:00
abstract::Medication errors are common and cause serious health issues during care transitions, particularly for older adults with multiple chronic conditions. In this paper, we discuss the design and evaluation of the Colorado Care Tablet, a Personal Health Application (PHA) that helps older adults and their lay caregivers man...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2010.05.007
更新日期:2010-10-01 00:00:00
abstract::The discovery of implicit connections between terms that do not occur together in any scientific document underlies the model of literature-based knowledge discovery first proposed by Swanson. Corpus-derived statistical models of semantic distance such as Latent Semantic Analysis (LSA) have been evaluated previously a...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2009.09.003
更新日期:2010-04-01 00:00:00
abstract:OBJECTIVE:To create an analytics platform for specifying and detecting clinical phenotypes and other derived variables in electronic health record (EHR) data for quality improvement investigations. MATERIALS AND METHODS:We have developed an architecture for an Analytic Information Warehouse (AIW). It supports transfor...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2013.01.005
更新日期:2013-06-01 00:00:00
abstract::Investigators in the translational research and systems medicine domains require highly usable, efficient and integrative tools and methods that allow for the navigation of and reasoning over emerging large-scale data sets. Such resources must cover a spectrum of granularity from bio-molecules to population phenotypes...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2011.07.006
更新日期:2011-12-01 00:00:00
abstract::The threat of bioterrorism has stimulated interest in enhancing public health surveillance to detect disease outbreaks more rapidly than is currently possible. To advance research on improving the timeliness of outbreak detection, the Defense Advanced Research Project Agency sponsored the Bio-event Advanced Leading In...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2004.11.007
更新日期:2005-04-01 00:00:00
abstract::Gene selection is an important task in bioinformatics studies, because the accuracy of cancer classification generally depends upon the genes that have biological relevance to the classifying problems. In this work, randomization test (RT) is used as a gene selection method for dealing with gene expression data. In th...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2013.03.009
更新日期:2013-08-01 00:00:00
abstract::Generation of entity coreference chains provides a means to extract linked narrative events from clinical notes, but despite being a well-researched topic in natural language processing, general-purpose coreference tools perform poorly on clinical texts. This paper presents a knowledge-centric and pattern-based approa...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章
doi:10.1016/j.jbi.2012.02.012
更新日期:2012-10-01 00:00:00
abstract:INTRODUCTION:The trend of an ageing and growing world population, particularly in developed countries, is expected to continue for decades to come causing an increase in demand for healthcare resources and services. Consequently, demand is growing faster than rises in funding. The UK government, in partnership with the...
journal_title:Journal of biomedical informatics
pub_type: 杂志文章,评审
doi:10.1016/j.jbi.2019.103102
更新日期:2019-02-01 00:00:00