Abstract:
BACKGROUND: Searching for similar compounds in a database is the most important process for in-silico drug screening. Because a query compound is an important starting point for a new drug, a query holder who fears that the query will be monitored by the database server usually downloads all the records in the database and uses them in a closed network. However, a serious dilemma arises when the database holder also wants to reveal no information except the search results, and this dilemma prevents the use of many important data resources.

RESULTS: To overcome this dilemma, we developed a novel cryptographic protocol that enables database searching while preserving the privacy of both the query holder and the database holder. In general, applying cryptographic techniques to practical problems is difficult because versatile techniques are computationally expensive, while computationally inexpensive techniques can perform only trivial computation tasks. In this study, our protocol is built solely from an additive-homomorphic cryptosystem, which allows only addition to be performed on encrypted values but is computationally efficient compared with versatile techniques such as general-purpose multi-party computation. In an experiment searching ChEMBL, which consists of more than 1,200,000 compounds, the proposed method was 36,900 times faster in CPU time and 12,000 times more efficient in communication size than general-purpose multi-party computation.

CONCLUSION: We proposed a novel privacy-preserving protocol for searching chemical compound databases. The proposed method, which scales easily to large databases, may help to accelerate drug discovery research by making full use of unused but valuable data that includes sensitive information.
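The additive-homomorphic property mentioned in the abstract can be illustrated with a small sketch. The code below is not the authors' protocol: it assumes a toy Paillier cryptosystem with tiny fixed primes (insecure, for demonstration only) and hypothetical 8-bit binary fingerprints. It shows only the core idea that a server can add encrypted values, here counting the bits shared by an encrypted query fingerprint and a plaintext database fingerprint, without ever decrypting the query.

```python
import math
import random


def keygen(p=1009, q=1013):
    """Toy Paillier key generation with tiny fixed primes (assumption: demo only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse; valid because the generator is g = n + 1
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt integer m (0 <= m < n) under public key n with fresh randomness."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, priv, c):
    """Recover the plaintext from ciphertext c using the private key."""
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Homomorphic addition: multiplying ciphertexts adds the plaintexts."""
    return (c1 * c2) % (n * n)


def encrypted_overlap(n, enc_query_bits, db_bits):
    """Server side: sum the encrypted query bits at positions where the
    database fingerprint has a 1. The server never sees the query bits."""
    acc = encrypt(n, 0)
    for c, bit in zip(enc_query_bits, db_bits):
        if bit:
            acc = add_encrypted(n, acc, c)
    return acc


if __name__ == "__main__":
    n, priv = keygen()
    query = [1, 0, 1, 1, 0, 1, 0, 0]      # hypothetical query fingerprint
    db_entry = [1, 1, 1, 0, 0, 1, 0, 1]   # hypothetical database fingerprint
    enc_query = [encrypt(n, b) for b in query]           # query holder
    enc_count = encrypted_overlap(n, enc_query, db_entry)  # database server
    print(decrypt(n, priv, enc_count))                   # query holder
```

In this simplified sketch the query holder learns the exact overlap count for the entry, which leaks more about the database than a search result alone; a protocol that also protects the database holder, as the abstract describes, needs further steps that are not shown here.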
journal_name: BMC Bioinformatics
journal_title: BMC bioinformatics
authors: Shimizu K, Nuida K, Arai H, Mitsunari S, Attrapadung N, Hamada M, Tsuda K, Hirokawa T, Sakuma J, Hanaoka G, Asai K
doi: 10.1186/1471-2105-16-S18-S6
subject: Has Abstract
pub_date: 2015-01-01 00:00:00
pages: S6
issn: 1471-2105
pii: 1471-2105-16-S18-S6
journal_volume: 16 Suppl 18
pub_type: Journal Article

abstract:BACKGROUND:Protein remote homology detection is one of the central problems in bioinformatics, which is important for both basic research and practical application. Currently, discriminative methods based on Support Vector Machines (SVMs) achieve the state-of-the-art performance. Exploring feature vectors incorporating...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-15-S2-S3
update_date:2014-01-01 00:00:00
abstract:BACKGROUND:Integrative network methods are commonly used for interpretation of high-throughput experimental biological data: transcriptomics, proteomics, metabolomics and others. One of the common approaches is finding a connected subnetwork of a global interaction network that best encompasses significant individual c...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-020-03572-9
update_date:2020-11-18 00:00:00
abstract:BACKGROUND:High-throughput "omics" based data analysis plays an emerging role in life sciences and molecular diagnostics. This emphasizes the urgent need for user-friendly windows-based software interfaces that could process the diversity of large tab-delimited raw data files generated by these methods. Depending on the s...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-10-235
update_date:2009-07-29 00:00:00
abstract:BACKGROUND:Recently, DNA methylation has drawn great attention due to its strong correlation with abnormal gene activities and informative representation of the cancer status. As a number of studies focus on DNA methylation signatures in cancer, demand for utilizing publicly available methylome dataset has been increas...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-020-3516-8
update_date:2020-05-11 00:00:00
abstract:BACKGROUND:Next Generation Sequencing techniques are producing enormous amounts of biological sequence data and analysis becomes a major computational problem. Currently, most analysis, especially the identification of conserved regions, relies heavily on Multiple Sequence Alignment and its various heuristics such as p...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-14-S11-S2
update_date:2013-01-01 00:00:00
abstract:BACKGROUND:Gas chromatography coupled with mass spectrometry (GC-MS) is one of the technologies widely used for qualitative and quantitative analysis of small molecules. In particular, GC coupled to single quadrupole MS can be utilized for targeted analysis by selected ion monitoring (SIM). However, to our knowledge, t...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-015-0681-2
update_date:2015-08-19 00:00:00
abstract:BACKGROUND:REX1 and REX2 are protein components of the RNA editing complex (the editosome) and function as exouridylylases. The exact roles of REX1 and REX2 in the editosome are unclear and the consequences of the presence of two related proteins are not fully understood. Here, a variety of computational studies were p...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-7-305
update_date:2006-06-16 00:00:00
abstract:BACKGROUND:The adaptive immune response intrinsically depends on hypervariable human leukocyte antigen (HLA) genes. Concomitantly, correct HLA phenotyping is crucial for successful donor-patient matching in organ transplantation. The cost and technical limitations of current laboratory techniques, together with advance...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-018-2239-6
update_date:2018-06-25 00:00:00
abstract:BACKGROUND:It has been proposed that future reference genomes should be graph structures in order to better represent the sequence diversity present in a species. However, there is currently no standard method to represent genomic intervals, such as the positions of genes or transcription factor binding sites, on graph...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-017-1678-9
update_date:2017-05-18 00:00:00
abstract:BACKGROUND:Genes occurring co-localized in multiple genomes can be strong indicators for either functional constraints on the genome organization or remnant ancestral gene order. The computational detection of these patterns, which are usually referred to as gene clusters, has become increasingly sensitive over the pas...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-14-S15-S14
update_date:2013-01-01 00:00:00
abstract:BACKGROUND:The spatial configuration of chromosomes is essential to various cellular processes, notably gene regulation, while architecture related alterations, such as translocations and gene fusions, are often cancer drivers. Thus, eliciting chromatin conformation is important, yet challenging due to compaction, dyna...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-020-3424-y
update_date:2020-02-24 00:00:00
abstract:BACKGROUND:Genome-wide transcriptional profiling of patient blood samples offers a powerful tool to investigate underlying disease mechanisms and personalized treatment decisions. Most studies are based on analysis of total peripheral blood mononuclear cells (PBMCs), a mixed population. In this case, accuracy is inhere...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-12-258
update_date:2011-06-24 00:00:00
abstract:BACKGROUND:Internal ribosomal entry sites (IRESs) provide alternative, cap-independent translation initiation sites in eukaryotic cells. IRES elements are important factors in viral genomes and are also useful tools for bi-cistronic expression vectors. Most existing RNA structure prediction programs are unable to deal ...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-10-160
update_date:2009-05-27 00:00:00
abstract:BACKGROUND:Cancer is caused through a multistep process, in which a succession of genetic changes, each conferring a competitive advantage for growth and proliferation, leads to the progressive conversion of normal human cells into malignant cancer cells. Interrogation of cancer genomes holds the promise of understandi...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-11-189
update_date:2010-04-14 00:00:00
abstract:BACKGROUND:The toxic effects of many simple organic compounds stem from their biotransformation to chemically reactive metabolites which bind covalently to cellular proteins. To understand the mechanisms of cytotoxic responses it may be important to know which proteins become adducted and whether some may be common tar...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-8-95
update_date:2007-03-16 00:00:00
abstract:BACKGROUND:The explosive growth of biological data provides opportunities for new statistical and comparative analyses of large information sets, such as alignments comprising tens of thousands of sequences. In such studies, sequence annotations frequently play an essential role, and reliable results depend on metadata...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-9-S1-S7
update_date:2008-01-01 00:00:00
abstract:BACKGROUND:The uncovering of genes linked to human diseases is a pressing challenge in molecular biology and precision medicine. This task is often hindered by the large number of candidate genes and by the heterogeneity of the available information. Computational methods for the prioritization of candidate genes can h...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-018-2025-5
update_date:2018-01-25 00:00:00
abstract:BACKGROUND:SNP genotyping microarrays have revolutionized the study of complex disease. The current range of commercially available genotyping products contain extensive catalogues of low frequency and rare variants. Existing SNP calling algorithms have difficulty dealing with these low frequency variants, as the under...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-15-158
update_date:2014-05-23 00:00:00
abstract:BACKGROUND:Reverse transcription followed by real-time PCR is widely used for quantification of specific mRNA, and with the use of double-stranded DNA binding dyes it is becoming a standard for microarray data validation. Despite the kinetic information generated by real-time PCR, most popular analysis methods assume c...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-8-85
update_date:2007-03-09 00:00:00
abstract:BACKGROUND:Bioinformatics software quality assurance is essential in genomic medicine. Systematic verification and validation of bioinformatics software is difficult because it is often not possible to obtain a realistic "gold standard" for systematic evaluation. Here we apply a technique that originates from the softw...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-15-S16-S15
update_date:2014-01-01 00:00:00
abstract:BACKGROUND:The statistical modeling of biomedical corpora could yield integrated, coarse-to-fine views of biological phenomena that complement discoveries made from analysis of molecular sequence and profiling data. Here, the potential of such modeling is demonstrated by examining the 5,225 free-text items in the Caeno...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-7-250
update_date:2006-05-08 00:00:00
abstract:BACKGROUND:We carried out an analysis of intron length conservation across a diverse group of nineteen mammalian species. Motivated by recent research suggesting a role for time delays associated with intron transcription in gene expression oscillations required for early embryonic patterning, we searched for examples ...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-12-S9-S16
update_date:2011-10-05 00:00:00
abstract:BACKGROUND:Protein subcellular localization is crucial for genome annotation, protein function prediction, and drug discovery. Determination of subcellular localization using experimental approaches is time-consuming; thus, computational approaches become highly desirable. Extensive studies of localization prediction h...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-8-330
update_date:2007-09-08 00:00:00
abstract:BACKGROUND:Improvements in high-throughput technologies have produced large numbers of multi-omics experimental datasets. Initial analysis of these data has revealed many concurrent gene alterations within a single dataset and/or among multiple omics datasets. Although powerful bioinformatics pipelines have been devel...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-019-3171-0
update_date:2019-11-08 00:00:00
abstract:BACKGROUND:In the bioinformatics community, many tasks involve matching a set of protein query sequences against large sequence datasets. To conduct multiple queries in the database, a commonly used method is to run BLAST on each original query or on the concatenated queries. This is inefficient since it doesn't exploit the...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-017-1938-8
update_date:2017-11-21 00:00:00
abstract:BACKGROUND:Conservation and variation scores are used when evaluating sites in a multiple sequence alignment, in order to identify residues critical for structure or function. A variety of scores are available today but it is not clear how different scores relate to each other. RESULTS:We applied 25 conservation and v...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-11-388
update_date:2010-07-21 00:00:00
abstract:BACKGROUND:Protein-protein interactions (PPIs) play important roles in various cellular processes. However, the low quality of current PPI data detected from high-throughput screening techniques has diminished the potential usefulness of the data. We need to develop a method to address the high data noise and incomplet...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-11-S7-S8
update_date:2010-10-15 00:00:00
abstract:BACKGROUND:G-DOC Plus is a data integration and bioinformatics platform that uses cloud computing and other advanced computational tools to handle a variety of biomedical BIG DATA including gene expression arrays, NGS and medical images so that they can be analyzed in the full context of other omics and clinical inform...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-016-1010-0
update_date:2016-04-30 00:00:00
abstract:BACKGROUND:Sequence mutations represent a driving force of adaptive evolution in bacterial pathogens. It is especially evident in reductive genome evolution where bacteria underwent lifestyles shifting from a free-living to a strictly intracellular or host-depending life. It resulted in loss-of-function mutations and/o...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-10-S1-S3
update_date:2009-01-30 00:00:00
abstract:BACKGROUND:Parametric feature selection methods for machine learning and association studies based on genetic data are not robust with respect to outliers or influential observations. While rank-based, distribution-free statistics offer a robust alternative to parametric methods, their practical utility can be limited,...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-019-2869-3
update_date:2019-06-13 00:00:00