Abstract:
BACKGROUND: In recent years, combining multiple genomic and proteomic data sources has become a popular way to investigate challenging bioinformatics problems. One such problem is feature (gene) selection: identifying relevant and non-redundant marker genes from high-dimensional gene expression data sets. In this context, an efficient feature selection algorithm that exploits knowledge from multiple biological resources may be an effective way to understand the spectrum of cancer or other diseases, with applications in population-specific epidemiology. RESULTS: In this article, we formulate feature selection and marker gene detection as a multi-view multi-objective clustering problem. We propose an Unsupervised Multi-View Multi-Objective clustering-based gene selection approach called UMVMO-select. Three biological data resources (gene ontology, protein interaction data, and protein sequence), together with gene expression values, are used to construct two different views. UMVMO-select aims to reduce the gene space with little or no loss of sample classification efficiency, and it identifies relevant and non-redundant gene markers in three cancer gene expression benchmark data sets. CONCLUSION: UMVMO-select was compared with five clustering methods and nine existing feature selection methods with respect to several internal and external validity metrics. The results indicate that the proposed method outperforms the compared methods. The reported results are also validated through a biological significance test and heatmap plotting.
journal_name: BMC Bioinformatics
journal_title: BMC bioinformatics
authors: Acharya S, Cui L, Pan Y
doi: 10.1186/s12859-020-03810-0
subject: Has Abstract
pub_date: 2020-12-30 00:00:00
pages: 483
issue: Suppl 18
issn: 1471-2105
pii: 10.1186/s12859-020-03810-0
journal_volume: 21
pub_type: Journal article
abstract:BACKGROUND:Interest in de novo genome assembly has been renewed in the past decade due to rapid advances in high-throughput sequencing (HTS) technologies which generate relatively short reads resulting in highly fragmented assemblies consisting of contigs. Additional long-range linkage information is typically used to ...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-15-S9-S9
update_date:2014-01-01 00:00:00
abstract:BACKGROUND:Peptidases are proteolytic enzymes responsible for fundamental cellular activities in all organisms. About 2-5% of genes appear to encode peptidases, irrespective of the source organism. The basic peptidase function is "protein digestion" and this can be potentially dangerous in living organisms w...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-8-S1-S3
update_date:2007-03-08 00:00:00
abstract:BACKGROUND:The analysis of DNA copy number variants (CNV) has increasing impact in the field of genetic diagnostics and research. However, the interpretation of CNV data derived from high resolution array CGH or NGS platforms is complicated by the considerable variability of the human genome. Therefore, tools for multi...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-016-1430-x
update_date:2017-01-06 00:00:00
abstract:BACKGROUND:De Bruijn graphs are key data structures for the analysis of next-generation sequencing data. They efficiently represent the overlap between reads and hence, also the underlying genome sequence. However, sequencing errors and repeated subsequences render the identification of the true underlying sequence dif...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-020-03740-x
update_date:2020-09-14 00:00:00
abstract:BACKGROUND:T-cells are key players in regulating a specific immune response. Activation of cytotoxic T-cells requires recognition of specific peptides bound to Major Histocompatibility Complex (MHC) class I molecules. MHC-peptide complexes are potential tools for diagnosis and treatment of pathogens and cancer, as well...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-3-25
update_date:2002-09-11 00:00:00
abstract:BACKGROUND:Modules of interacting components arranged in specific network topologies have evolved to perform a diverse array of cellular functions. For a network with a constant topological structure, its function within a cell may still be tuned by changing the number of instances of a particular component (e.g., gene...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-019-2866-6
update_date:2019-05-14 00:00:00
abstract:BACKGROUND:The Triplex cell vaccine is a cancer cellular vaccine that can prevent almost completely the mammary tumor onset in HER-2/neu transgenic mice. In a translational perspective, the activity of the Triplex vaccine was also investigated against lung metastases showing that the vaccine is an effective treatment a...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-11-S7-S13
update_date:2010-10-15 00:00:00
abstract:BACKGROUND:The biomedical literature continues to grow at a rapid pace, making the challenge of knowledge retrieval and extraction ever greater. Tools that provide a means to search and mine the full text of literature thus represent an important way by which the efficiency of these processes can be improved. RESULTS:...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-018-2103-8
update_date:2018-03-09 00:00:00
abstract:BACKGROUND:Feed-forward loops (FFLs), consisting of miRNAs, transcription factors (TFs) and their common target genes, have been validated to be important for the initialization and development of complex diseases, including cancer. Esophageal Carcinoma (ESCA) and Stomach Adenocarcinoma (STAD) are two types of malignan...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-019-3230-6
update_date:2019-12-30 00:00:00
abstract:BACKGROUND:Long short-term memory (LSTM) is one of the most attractive deep learning methods to learn time series or contexts of input data. An increasing number of studies, including biological sequence analyses in bioinformatics, use this architecture. Amino acid sequence profiles are widely used for bioinformatics studies, s...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-018-2284-1
update_date:2018-07-18 00:00:00
abstract:BACKGROUND:In omics data integration studies, it is common, for a variety of reasons, for some individuals to not be present in all data tables. Missing row values are challenging to deal with because most statistical methods cannot be directly applied to incomplete datasets. To overcome this issue, we propose a multip...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-016-1273-5
update_date:2016-10-03 00:00:00
abstract:BACKGROUND:Identifying the interactions between proteins and long non-coding RNAs (lncRNAs) is of great importance to decipher the functional mechanisms of lncRNAs. However, current experimental techniques for detection of lncRNA-protein interactions are limited and inefficient. Many methods have been proposed to predi...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-018-2390-0
update_date:2018-10-11 00:00:00
abstract:BACKGROUND:Systematic mutagenesis studies have shown that only a few interface residues, termed hot spots, contribute significantly to the binding free energy of protein-protein interactions. Therefore, hot spot prediction is becoming increasingly important for understanding the essence of protein interactions and hel...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-12-311
update_date:2011-07-29 00:00:00
abstract:BACKGROUND:Bioinformatics research for finding biological mechanisms can be done by analysis of transcriptome data with pathway based interpretation. Therefore, researchers have tried to develop tools to analyze transcriptome data with pathway based interpretation. Over the years, the amount of omics data has become hu...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-018-2016-6
update_date:2018-02-19 00:00:00
abstract:BACKGROUND:Copy number alterations (CNAs), due to their large impact on the genome, have been an important contributing factor to oncogenesis and metastasis. Detecting genomic alterations from the shallow-sequencing data of a low-purity tumor sample remains a challenging task. RESULTS:We introduce Accucopy, a method t...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-020-03924-5
update_date:2021-01-15 00:00:00
abstract:BACKGROUND:Isocitrate Dehydrogenases (IDHs) are important enzymes present in all living cells. Three subfamilies of functionally dimeric IDHs (subfamilies I, II, III) are known. Subfamily I are well-studied bacterial IDHs, like that of Escherichia coli. Subfamily II has predominantly eukaryotic members, but it also ha...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-13-S17-S2
update_date:2012-01-01 00:00:00
abstract:BACKGROUND:Polychromatic flow cytometry is a popular technique that has wide usage in the medical sciences, especially for studying phenotypic properties of cells. The high-dimensionality of data generated by flow cytometry usually makes it difficult to visualize. The naive solution of simply plotting two-dimensional g...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-017-1662-4
update_date:2017-06-07 00:00:00
abstract:BACKGROUND:Managing and organizing biological knowledge remains a major challenge, due to the complexity of living systems. Recently, systemic representations have been promising in tackling such a challenge at the whole-cell scale. In such representations, the cell is considered as a system composed of interlocked sub...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-020-03637-9
update_date:2020-07-23 00:00:00
abstract:BACKGROUND:Emerging and re-emerging infectious diseases such as Zika, SARS, COVID-19 and pertussis pose a compelling challenge for epidemiologists due to their significant impact on global public health. In this context, computational models and computer simulations are one of the available research tools that epidemi...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-020-03648-6
update_date:2020-09-16 00:00:00
abstract:BACKGROUND:Statistical models and methods that associate changes in the physicochemical properties of amino acids with natural selection at the molecular level typically do not take into account the correlations between such properties. We propose a Bayesian hierarchical regression model with a generalization of the Di...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-13-278
update_date:2012-10-30 00:00:00
abstract:BACKGROUND:We introduce Approximate Entropy as a mathematical method of analysis for microarray data. Approximate entropy is applied here as a method to classify the complex gene expression patterns resultant of a clinical sample set. Since Entropy is a measure of disorder in a system, we believe that by choosing genes...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-10-66
update_date:2009-02-20 00:00:00
abstract:BACKGROUND:The signal peptide plays an important role in protein targeting and protein translocation in both prokaryotic and eukaryotic cells. This transient, short peptide sequence functions like a postal address on an envelope by targeting proteins for secretion or for transfer to specific organelles for further proc...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-6-249
update_date:2005-10-13 00:00:00
abstract:BACKGROUND:Non-negative matrix factorisation (NMF), a machine learning algorithm, has been applied to the analysis of microarray data. A key feature of NMF is the ability to identify patterns that together explain the data as a linear combination of expression signatures. Microarray data generally includes individual e...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-7-175
update_date:2006-03-28 00:00:00
abstract:BACKGROUND:The double-cut-and-join (DCJ) is a model that is able to efficiently sort a genome into another, generalizing the typical mutations (inversions, fusions, fissions, translocations) to which genomes are subject, but allowing the existence of circular chromosomes at the intermediate steps. In the general model ...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-13-S19-S14
update_date:2012-01-01 00:00:00
abstract:BACKGROUND:A professional recognition mechanism is required to encourage expedited publishing of an adequate volume of 'fit-for-use' biodiversity data. As a component of such a recognition mechanism, we propose the development of the Data Usage Index (DUI) to demonstrate to data publishers that their efforts of creatin...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-12-S15-S3
update_date:2011-01-01 00:00:00
abstract:BACKGROUND:Protein domains coordinate to perform multifaceted cellular functions, and domain combinations serve as the functional building blocks of the cell. The available methods to identify functional domain combinations are limited in their scope, e.g. to the identification of combinations falling within individual...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-8-390
update_date:2007-10-16 00:00:00
abstract:BACKGROUND:Prognosis is of critical interest in breast cancer research. Biomedical studies suggest that genomic measurements may have independent predictive power for prognosis. Gene profiling studies have been conducted to search for predictive genomic measurements. Genes have the inherent pathway structure, where pat...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-11-1
update_date:2010-01-01 00:00:00
abstract:BACKGROUND:Next-generation sequencing allows the analysis of an unprecedented number of viral sequence variants from infected patients, presenting a novel opportunity for understanding virus evolution, drug resistance and immune escape. However, sequencing in bulk is error prone. Thus, the generated data require error ...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-13-S10-S6
update_date:2012-06-25 00:00:00
abstract:BACKGROUND:HIV/AIDS is a serious threat to public health. The emergence of drug resistance mutations diminishes the effectiveness of drug therapy for HIV/AIDS. Developing a computational prediction of drug resistance phenotype will enable efficient and timely selection of the best treatment regimens. RESULTS:A unified...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-016-1114-6
update_date:2016-08-31 00:00:00
abstract:BACKGROUND:Innovations in biological and biomedical imaging produce complex high-content and multivariate image data. For decision-making and generation of hypotheses, scientists need novel information technology tools that enable them to visually explore and analyze the data and to discuss and communicate results or f...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-12-297
update_date:2011-07-21 00:00:00