Abstract:
BACKGROUND:Integrative network methods are commonly used for interpretation of high-throughput experimental biological data: transcriptomics, proteomics, metabolomics and others. One of the common approaches is finding a connected subnetwork of a global interaction network that best encompasses significant individual changes in the data and represents a so-called active module. Usually methods implementing this approach find a single subnetwork and thus solve a hard classification problem for vertices. This subnetwork inherently contains erroneous vertices, while no instrument is provided to estimate the confidence level of any particular vertex inclusion. To address this issue, in the current study we consider the active module problem as a soft classification problem. RESULTS:We propose a method to estimate probabilities of each vertex to belong to the active module based on Markov chain Monte Carlo (MCMC) subnetwork sampling. As an example of the performance of our method on real data, we run it on two gene expression datasets. For the first many-replicate expression dataset we show that the proposed approach is consistent with an existing resampling-based method. On the second dataset the jackknife resampling method is inapplicable due to the small number of biological replicates, but the MCMC method can be run and shows high classification performance. CONCLUSIONS:The proposed method allows to estimate the probability that an individual vertex belongs to the active module as well as the false discovery rate (FDR) for a given set of vertices. Given the estimated probabilities, it becomes possible to provide a connected subgraph in a consistent manner for any given FDR level: no vertex can disappear when the FDR level is relaxed. We show, on both simulated and real datasets, that the proposed method has good computational performance and high classification accuracy.
journal_name
BMC Bioinformaticsjournal_title
BMC bioinformaticsauthors
Alexeev N,Isomurodov J,Sukhov V,Korotkevich G,Sergushichev Adoi
10.1186/s12859-020-03572-9subject
Has Abstractpub_date
2020-11-18 00:00:00pages
261issue
Suppl 6issn
1471-2105pii
10.1186/s12859-020-03572-9journal_volume
21pub_type
杂志文章abstract:BACKGROUND:Functional genomics involves the parallel experimentation with large sets of proteins. This requires management of large sets of open reading frames as a prerequisite of the cloning and recombinant expression of these proteins. RESULTS:A Java program was developed for retrieval of protein and nucleic acid s...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-3-40
更新日期:2002-12-19 00:00:00
abstract:BACKGROUND:For the development of genome assembly tools, some comprehensive and efficiently computable validation measures are required to assess the quality of the assembly. The mostly used N50 measure summarizes the assembly results by the length of the scaffold (or contig) overlapping the midpoint of the length-orde...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-13-255
更新日期:2012-10-03 00:00:00
abstract:BACKGROUND:Transposable elements (TEs) are DNA sequences that are able to move from their location in the genome by cutting or copying themselves to another locus. As such, they are increasingly recognized as impacting all aspects of genome function. With the dramatic reduction in cost of DNA sequencing, it is now poss...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-014-0377-z
更新日期:2014-11-19 00:00:00
abstract:BACKGROUND:Pattern recognition receptors of the immune system have key roles in the regulation of pathways after the recognition of microbial- and danger-associated molecular patterns in vertebrates. Members of NOD-like receptor (NLR) family typically function intracellularly. The NOD-like receptor family CARD domain c...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-14-275
更新日期:2013-09-17 00:00:00
abstract:BACKGROUND:Protein function prediction is an important problem in the post-genomic era. Recent advances in experimental biology have enabled the production of vast amounts of protein-protein interaction (PPI) data. Thus, using PPI data to functionally annotate proteins has been extensively studied. However, most existi...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-14-S12-S4
更新日期:2013-01-01 00:00:00
abstract:BACKGROUND:Profile-based comparison of multiple sequence alignments is a powerful methodology for the detection remote protein sequence similarity, which is essential for the inference and analysis of protein structure, function, and evolution. Accurate estimation of statistical significance of detected profile similar...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-10-399
更新日期:2009-12-04 00:00:00
abstract:BACKGROUND:Inferring molecular pathway activity is an important step towards reducing the complexity of genomic data, understanding the heterogeneity in clinical outcome, and obtaining molecular correlates of cancer imaging traits. Increasingly, approaches towards pathway activity inference combine molecular profiles (...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-12-403
更新日期:2011-10-19 00:00:00
abstract:BACKGROUND:The immune system is multifaceted, structured by diverse components that interconnect using multilayered dynamic cellular processes. Genomic technologies provide a means for investigating, at the molecular level, the adaptations of the immune system in host defense and its dysregulation in pathological condi...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-016-1012-y
更新日期:2016-04-18 00:00:00
abstract:BACKGROUND:The aberrant expression of microRNAs is closely connected to the occurrence and development of a great deal of human diseases. To study human diseases, numerous effective computational models that are valuable and meaningful have been presented by researchers. RESULTS:Here, we present a computational framew...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-020-3409-x
更新日期:2020-02-18 00:00:00
abstract:BACKGROUND:DNA copy number variation (CNV) has been recognized as an important source of genetic variation. Array comparative genomic hybridization (aCGH) is commonly used for CNV detection, but the microarray platform has a number of inherent limitations. RESULTS:Here, we describe a method to detect copy number varia...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-10-80
更新日期:2009-03-06 00:00:00
abstract:BACKGROUND:The landscape of biological and biomedical research is being changed rapidly with the invention of microarrays which enables simultaneous view on the transcription levels of a huge number of genes across different experimental conditions or time points. Using microarray data sets, clustering algorithms have ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-10-27
更新日期:2009-01-20 00:00:00
abstract:BACKGROUND:Drug resistance testing is mandatory in antiretroviral therapy in human immunodeficiency virus (HIV) infected patients for successful treatment. The emergence of resistances against antiretroviral agents remains the major obstacle in inhibition of viral replication and thus to control infection. Due to the h...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-016-1179-2
更新日期:2016-08-22 00:00:00
abstract:BACKGROUND:MicroRNAs (miRNAs) are a kind of small noncoding RNA molecules that are direct posttranscriptional regulations of mRNA targets. Studies have indicated that miRNAs play key roles in complex diseases by taking part in many biological processes, such as cell growth, cell death and so on. Therefore, in order to ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-020-3426-9
更新日期:2020-03-18 00:00:00
abstract:BACKGROUND:The efficient and robust statistical analysis of the shape of plant organs of different cultivars is an important investigation issue in plant breeding and enables a robust cultivar description within the breeding progress. Laserscanning is a highly accurate and high resolution technique to acquire the 3D sh...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-020-03654-8
更新日期:2020-07-29 00:00:00
abstract:BACKGROUND:We present the algorithm PFClust (Parameter Free Clustering), which is able automatically to cluster data and identify a suitable number of clusters to group them into without requiring any parameters to be specified by the user. The algorithm partitions a dataset into a number of clusters that share some co...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-14-213
更新日期:2013-07-03 00:00:00
abstract:BACKGROUND:Ribosomal 16S DNA sequences are an essential tool for identifying and classifying microbes. High-throughput DNA sequencing now makes it economically possible to produce very large datasets of 16S rDNA sequences in short time periods, necessitating new computer tools for analyses. Here we describe FastGroup, ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-2-9
更新日期:2001-01-01 00:00:00
abstract:BACKGROUND:Relative isotope abundance quantification, which can be used for peptide identification and differential peptide quantification, plays an important role in liquid chromatography-mass spectrometry (LC-MS)-based proteomics. However, several major issues exist in the relative isotopic quantification of peptides...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-12-74
更新日期:2011-03-15 00:00:00
abstract:BACKGROUND:Many statistical methods have been proposed to identify disease biomarkers from gene expression profiles. However, from gene expression profile data alone, statistical methods often fail to identify biologically meaningful biomarkers related to a specific disease under study. In this paper, we develop a nove...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-9-416
更新日期:2008-10-06 00:00:00
abstract:BACKGROUND:Mass spectrometry (MS) coupled with online separation methods is commonly applied for differential and quantitative profiling of biological samples in metabolomic as well as proteomic research. Such approaches are used for systems biology, functional genomics, and biomarker discovery, among others. An ongoin...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-11-395
更新日期:2010-07-23 00:00:00
abstract:BACKGROUND:Differences in cell-type composition across subjects and conditions often carry biological significance. Recent advancements in single cell sequencing technologies enable cell-types to be identified at the single cell level, and as a result, cell-type composition of tissues can now be studied in exquisite de...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-019-3211-9
更新日期:2019-12-24 00:00:00
abstract:BACKGROUND:In current comparative proteomics studies, the large number of images generated by 2D gels is currently compared using spot matching algorithms. Unfortunately, differences in gel migration and sample variability make efficient spot alignment very difficult to obtain, and, as consequence most of the software ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-9-460
更新日期:2008-10-28 00:00:00
abstract:BACKGROUND:The Bioinformatics Resource Manager (BRM) is a web-based tool developed to facilitate identifier conversion and data integration for Homo sapiens (human), Mus musculus (mouse), Rattus norvegicus (rat), Danio rerio (zebrafish), and Macaca mulatta (macaque), as well as perform orthologous conversions among the...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-019-2805-6
更新日期:2019-05-17 00:00:00
abstract:BACKGROUND:Protein-protein interactions (PPIs) are of great importance in cellular systems of organisms, since they are the basis of cellular structure and function and many essential cellular processes are related to that. Most proteins perform their functions by interacting with other proteins, so predicting PPIs acc...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-020-03896-6
更新日期:2020-12-16 00:00:00
abstract:BACKGROUND:Positron Emission Tomography (PET) is increasingly utilized in radiomics studies for treatment evaluation purposes. Nevertheless, lesion volume identification in PET images is a critical and still challenging step in the process of radiomics, due to the low spatial resolution and high noise level of PET imag...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-020-03647-7
更新日期:2020-09-16 00:00:00
abstract:BACKGROUND:Genome scale data on protein interactions are generally represented as large networks, or graphs, where hundreds or thousands of proteins are linked to one another. Since proteins tend to function in groups, or complexes, an important goal has been to reliably identify protein complexes from these graphs. Th...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-10-99
更新日期:2009-03-30 00:00:00
abstract:BACKGROUND:Proteins are comprised of one or several building blocks, known as domains. Such domains can be classified into families according to their evolutionary origin. Whereas sequencing technologies have advanced immensely in recent years, there are no matching computational methodologies for large-scale determina...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-7-277
更新日期:2006-06-02 00:00:00
abstract:BACKGROUND:Human triosephosphate isomerase (HsTIM) deficiency is a genetic disease caused often by the pathogenic mutation E104D. This mutation, located at the side of an abnormally large cluster of water in the inter-subunit interface, reduces the thermostability of the enzyme. Why and how these water molecules are di...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-14-S16-S11
更新日期:2013-01-01 00:00:00
abstract:BACKGROUND:Although it is not difficult for state-of-the-art gene finders to identify coding regions in prokaryotic genomes, exact prediction of the corresponding translation initiation sites (TIS) is still a challenging problem. Recently a number of post-processing tools have been proposed for improving the annotation...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-7-121
更新日期:2006-03-09 00:00:00
abstract:BACKGROUND:Scaffold proteins are known for being crucial regulators of various cellular functions by assembling multiple proteins involved in signaling and metabolic pathways. Identification of scaffold proteins and the study of their molecular mechanisms can open a new aspect of cellular systemic regulation and the re...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-016-1079-5
更新日期:2016-07-28 00:00:00
abstract:BACKGROUND:The analysis of sequence-structure relations of RNA is based on a specific notion and folding of RNA structure. The notion of coarse grained structure employed here is that of canonical RNA pseudoknot contact-structures with at most two mutually crossing bonds (3-noncrossing). These structures are folded by ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-10-S1-S39
更新日期:2009-01-30 00:00:00