BicPAMS: software for biological data analysis with pattern-based biclustering.

Abstract:

BACKGROUND: Biclustering has been widely applied to the unsupervised analysis of biological data and is recognised today as a key technique for discovering putative modules in both expression data (subsets of genes correlated in subsets of conditions) and network data (groups of coherently interconnected biological entities). However, given its computational complexity, only recent breakthroughs in pattern-based biclustering have enabled efficient searches without the restrictions that state-of-the-art biclustering algorithms place on the structure and homogeneity of biclusters. As a result, pattern-based biclustering provides an unprecedented opportunity to discover non-trivial yet meaningful biological modules with putative functions, whose coherency and tolerance to noise can be tuned and made problem-specific.

METHODS: To enable the effective use of pattern-based biclustering by the scientific community, we developed BicPAMS (Biclustering based on PAttern Mining Software), a software tool that: 1) makes available state-of-the-art pattern-based biclustering algorithms (BicPAM (Henriques and Madeira, Alg Mol Biol 9:27, 2014), BicNET (Henriques and Madeira, Alg Mol Biol 11:23, 2016), BicSPAM (Henriques and Madeira, BMC Bioinforma 15:130, 2014), BiC2PAM (Henriques and Madeira, Alg Mol Biol 11:1-30, 2016), BiP (Henriques and Madeira, IEEE/ACM Trans Comput Biol Bioinforma, 2015), DeBi (Serin and Vingron, AMB 6:1-12, 2011) and BiModule (Okada et al., IPSJ Trans Bioinf 48(SIG5):39-48, 2007)); 2) consistently integrates their dispersed contributions; 3) further explores additional accuracy and efficiency gains; and 4) makes available graphical and application programming interfaces.

RESULTS: Results on both synthetic and real data confirm the relevance of BicPAMS for biological data analysis, highlighting its essential role in the discovery of putative modules with non-trivial yet biologically significant functions from expression and network data.

CONCLUSIONS: BicPAMS is the first biclustering tool offering the possibility to: 1) parametrically customize the structure, coherency and quality of biclusters; 2) analyze large-scale biological networks; and 3) tackle the restrictive assumptions placed by state-of-the-art biclustering algorithms. These contributions are shown to be key for an adequate, complete and user-assisted unsupervised analysis of biological data.

SOFTWARE: BicPAMS and its tutorial are available at http://www.bicpams.com.

journal_name

BMC Bioinformatics

journal_title

BMC bioinformatics

authors

Henriques R,Ferreira FL,Madeira SC

doi

10.1186/s12859-017-1493-3

pub_date

2017-02-02 00:00:00

pages

82

issue

1

issn

1471-2105

pii

10.1186/s12859-017-1493-3

journal_volume

18

pub_type

Journal Article
  • Finding motif pairs in the interactions between heterogeneous proteins via bootstrapping and boosting.

    abstract:BACKGROUND:Supervised learning and many stochastic methods for predicting protein-protein interactions require both negative and positive interactions in the training data set. Unlike positive interactions, negative interactions cannot be readily obtained from interaction data, so these must be generated. In protein-pr...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-10-S1-S57

    authors: Kim J,Huang DS,Han K

    update_date:2009-01-30 00:00:00

  • PreBIND and Textomy--mining the biomedical literature for protein-protein interactions using a support vector machine.

    abstract:BACKGROUND:The majority of experimentally verified molecular interaction and biological pathway data are present in the unstructured text of biomedical journal articles where they are inaccessible to computational methods. The Biomolecular interaction network database (BIND) seeks to capture these data in a machine-rea...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-4-11

    authors: Donaldson I,Martin J,de Bruijn B,Wolting C,Lay V,Tuekam B,Zhang S,Baskin B,Bader GD,Michalickova K,Pawson T,Hogue CW

    update_date:2003-03-27 00:00:00

  • Predicting anatomic therapeutic chemical classification codes using tiered learning.

    abstract:BACKGROUND:The low success rate and high cost of drug discovery requires the development of new paradigms to identify molecules of therapeutic value. The Anatomical Therapeutic Chemical (ATC) Code System is a World Health Organization (WHO) proposed classification that assigns multi-level codes to compounds based on th...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-017-1660-6

    authors: Olson T,Singh R

    update_date:2017-06-07 00:00:00

  • Reranking candidate gene models with cross-species comparison for improved gene prediction.

    abstract:BACKGROUND:Most gene finders score candidate gene models with state-based methods, typically HMMs, by combining local properties (coding potential, splice donor and acceptor patterns, etc). Competing models with similar state-based scores may be distinguishable with additional information. In particular, functional and...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-433

    authors: Liu Q,Crammer K,Pereira FC,Roos DS

    update_date:2008-10-14 00:00:00

  • Automatic design of decision-tree induction algorithms tailored to flexible-receptor docking data.

    abstract:BACKGROUND:This paper addresses the prediction of the free energy of binding of a drug candidate with enzyme InhA associated with Mycobacterium tuberculosis. This problem is found within rational drug design, where interactions between drug candidates and target proteins are verified through molecular docking simulatio...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-13-310

    authors: Barros RC,Winck AT,Machado KS,Basgalupp MP,de Carvalho AC,Ruiz DD,de Souza ON

    update_date:2012-11-21 00:00:00

  • Recovering rearranged cancer chromosomes from karyotype graphs.

    abstract:BACKGROUND:Many cancer genomes are extensively rearranged with highly aberrant chromosomal karyotypes. Structural and copy number variations in cancer genomes can be determined via abnormal mapping of sequenced reads to the reference genome. Recently it became possible to reconcile both of these types of large-scale va...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-3208-4

    authors: Aganezov S,Zban I,Aksenov V,Alexeev N,Schatz MC

    update_date:2019-12-17 00:00:00

  • An improved classification of G-protein-coupled receptors using sequence-derived features.

    abstract:BACKGROUND:G-protein-coupled receptors (GPCRs) play a key role in diverse physiological processes and are the targets of almost two-thirds of the marketed drugs. The 3D structures of GPCRs are largely unavailable; however, a large number of GPCR primary sequences are known. To facilitate the identification and charact...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-420

    authors: Peng ZL,Yang JY,Chen X

    update_date:2010-08-09 00:00:00

  • Detecting lateral gene transfers by statistical reconciliation of phylogenetic forests.

    abstract:BACKGROUND:To understand the evolutionary role of Lateral Gene Transfer (LGT), accurate methods are needed to identify transferred genes and infer their timing of acquisition. Phylogenetic methods are particularly promising for this purpose, but the reconciliation of a gene tree with a reference (species) tree is compu...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-324

    authors: Abby SS,Tannier E,Gouy M,Daubin V

    update_date:2010-06-15 00:00:00

  • Meta-analysis of breast cancer microarray studies in conjunction with conserved cis-elements suggest patterns for coordinate regulation.

    abstract:BACKGROUND:Gene expression measurements from breast cancer (BrCa) tumors are established clinical predictive tools to identify tumor subtypes, identify patients showing poor/good prognosis, and identify patients likely to have disease recurrence. However, diverse breast cancer datasets in conjunction with diagnostic cl...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-63

    authors: Smith DD,Saetrom P,Snøve O Jr,Lundberg C,Rivas GE,Glackin C,Larson GP

    update_date:2008-01-28 00:00:00

  • GraphCrunch: a tool for large network analyses.

    abstract:BACKGROUND:The recent explosion in biological and other real-world network data has created the need for improved tools for large network analyses. In addition to well established global network properties, several new mathematical techniques for analyzing local structural properties of large networks have been develop...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-70

    authors: Milenković T,Lai J,Przulj N

    update_date:2008-01-30 00:00:00

  • Investigating the concordance of Gene Ontology terms reveals the intra- and inter-platform reproducibility of enrichment analysis.

    abstract:BACKGROUND:Reliability and Reproducibility of differentially expressed genes (DEGs) are essential for the biological interpretation of microarray data. The microarray quality control (MAQC) project launched by US Food and Drug Administration (FDA) elucidated that the lists of DEGs generated by intra- and inter-platform...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-14-143

    authors: Zhang L,Zhang J,Yang G,Wu D,Jiang L,Wen Z,Li M

    update_date:2013-04-29 00:00:00

  • Semantically linking molecular entities in literature through entity relationships.

    abstract:BACKGROUND:Text mining tools have gained popularity to process the vast amount of available research articles in the biomedical literature. It is crucial that such tools extract information with a sufficient level of detail to be applicable in real life scenarios. Studies of mining non-causal molecular relations attrib...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-13-S11-S6

    authors: Van Landeghem S,Björne J,Abeel T,De Baets B,Salakoski T,Van de Peer Y

    update_date:2012-06-26 00:00:00

  • JNets: exploring networks by integrating annotation.

    abstract:BACKGROUND:A common method for presenting and studying biological interaction networks is visualization. Software tools can enhance our ability to explore network visualizations and improve our understanding of biological systems, particularly when these tools offer analysis capabilities. However, most published networ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-10-95

    authors: Macpherson JI,Pinney JW,Robertson DL

    update_date:2009-03-26 00:00:00

  • ImmunoGlobe: enabling systems immunology with a manually curated intercellular immune interaction network.

    abstract:BACKGROUND:While technological advances have made it possible to profile the immune system at high resolution, translating high-throughput data into knowledge of immune mechanisms has been challenged by the complexity of the interactions underlying immune processes. Tools to explore the immune network are critical for ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03702-3

    authors: Atallah MB,Tandon V,Hiam KJ,Boyce H,Hori M,Atallah W,Spitzer MH,Engleman E,Mallick P

    update_date:2020-08-10 00:00:00

  • Clinical phenotype-based gene prioritization: an initial study using semantic similarity and the human phenotype ontology.

    abstract:BACKGROUND:Exome sequencing is a promising method for diagnosing patients with a complex phenotype. However, variant interpretation relative to patient phenotype can be challenging in some scenarios, particularly clinical assessment of rare complex phenotypes. Each patient's sequence reveals many possibly damaging vari...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-15-248

    authors: Masino AJ,Dechene ET,Dulik MC,Wilkens A,Spinner NB,Krantz ID,Pennington JW,Robinson PN,White PS

    update_date:2014-07-21 00:00:00

  • Analysis and prediction of antibacterial peptides.

    abstract:BACKGROUND:Antibacterial peptides are important components of the innate immune system, used by the host to protect itself from different types of pathogenic bacteria. Over the last few decades, the search for new drugs and drug targets has prompted an interest in these antibacterial peptides. We analyzed 486 antibacte...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-8-263

    authors: Lata S,Sharma BK,Raghava GP

    update_date:2007-07-23 00:00:00

  • Systematic integration of experimental data and models in systems biology.

    abstract:BACKGROUND:The behaviour of biological systems can be deduced from their mathematical models. However, multiple sources of data in diverse forms are required in the construction of a model in order to define its components and their biochemical reactions, and corresponding parameters. Automating the assembly and use of...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-582

    authors: Li P,Dada JO,Jameson D,Spasic I,Swainston N,Carroll K,Dunn W,Khan F,Malys N,Messiha HL,Simeonidis E,Weichart D,Winder C,Wishart J,Broomhead DS,Goble CA,Gaskell SJ,Kell DB,Westerhoff HV,Mendes P,Paton NW

    update_date:2010-11-29 00:00:00

  • Reverse engineering gene regulatory networks: coupling an optimization algorithm with a parameter identification technique.

    abstract:BACKGROUND:To infer gene regulatory networks from time series gene profiles, two important tasks that are related to biological systems must be undertaken. One task is to determine a valid network structure that has topological properties that can influence the network dynamics profoundly. The other task is to optimize...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-15-S15-S8

    authors: Hsiao YT,Lee WP

    update_date:2014-01-01 00:00:00

  • Predicting nucleosome positioning using a duration Hidden Markov Model.

    abstract:BACKGROUND:The nucleosome is the fundamental packing unit of DNAs in eukaryotic cells. Its detailed positioning on the genome is closely related to chromosome functions. Increasing evidence has shown that genomic DNA sequence itself is highly predictive of nucleosome positioning genome-wide. Therefore a fast software t...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-346

    authors: Xi L,Fondufe-Mittendorf Y,Xia L,Flatow J,Widom J,Wang JP

    update_date:2010-06-24 00:00:00

  • Machine learning for discovering missing or wrong protein function annotations : A comparison using updated benchmark datasets.

    abstract:BACKGROUND:A massive amount of proteomic data is generated on a daily basis, nonetheless annotating all sequences is costly and often unfeasible. As a countermeasure, machine learning methods have been used to automatically annotate new protein functions. More specifically, many studies have investigated hierarchical m...

    journal_title:BMC bioinformatics

    pub_type: Journal Article, Review

    doi:10.1186/s12859-019-3060-6

    authors: Nakano FK,Lietaert M,Vens C

    update_date:2019-09-23 00:00:00

  • Machine-learning scoring functions for identifying native poses of ligands docked to known and novel proteins.

    abstract:BACKGROUND:Molecular docking is a widely-employed method in structure-based drug design. An essential component of molecular docking programs is a scoring function (SF) that can be used to identify the most stable binding pose of a ligand, when bound to a receptor protein, from among a large set of candidate poses. Des...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-16-S6-S3

    authors: Ashtawy HM,Mahapatra NR

    update_date:2015-01-01 00:00:00

  • MGOGP: a gene module-based heuristic algorithm for cancer-related gene prioritization.

    abstract:BACKGROUND:Prioritizing genes according to their associations with a cancer allows researchers to explore genes in more informed ways. By far, Gene-centric or network-centric gene prioritization methods are predominated. Genes and their protein products carry out cellular processes in the context of functional modules....

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-018-2216-0

    authors: Su L,Liu G,Bai T,Meng X,Ma Q

    update_date:2018-06-05 00:00:00

  • A unifying model of genome evolution under parsimony.

    abstract:BACKGROUND:Parsimony and maximum likelihood methods of phylogenetic tree estimation and parsimony methods for genome rearrangements are central to the study of genome evolution yet to date they have largely been pursued in isolation. RESULTS:We present a data structure called a history graph that offers a practical ba...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-15-206

    authors: Paten B,Zerbino DR,Hickey G,Haussler D

    update_date:2014-06-19 00:00:00

  • Predicting peptide presentation by major histocompatibility complex class I: an improved machine learning approach to the immunopeptidome.

    abstract:BACKGROUND:To further our understanding of immunopeptidomics, improved tools are needed to identify peptides presented by major histocompatibility complex class I (MHC-I). Many existing tools are limited by their reliance upon chemical affinity data, which is less biologically relevant than sampling by mass spectrometr...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-018-2561-z

    authors: Boehm KM,Bhinder B,Raja VJ,Dephoure N,Elemento O

    update_date:2019-01-05 00:00:00

  • Predicting blood pressure from physiological index data using the SVR algorithm.

    abstract:BACKGROUND:Blood pressure diseases have increasingly been identified as among the main factors threatening human health. How to accurately and conveniently measure blood pressure is the key to the implementation of effective prevention and control measures for blood pressure diseases. Traditional blood pressure measure...

    journal_title:BMC bioinformatics

    pub_type: Clinical Trial, Journal Article

    doi:10.1186/s12859-019-2667-y

    authors: Zhang B,Ren H,Huang G,Cheng Y,Hu C

    update_date:2019-02-28 00:00:00

  • Predicting substrates of the human breast cancer resistance protein using a support vector machine method.

    abstract:BACKGROUND:Human breast cancer resistance protein (BCRP) is an ATP-binding cassette (ABC) efflux transporter that confers multidrug resistance in cancers and also plays an important role in the absorption, distribution and elimination of drugs. Prediction as to if drugs or new molecular entities are BCRP substrates sho...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-14-130

    authors: Hazai E,Hazai I,Ragueneau-Majlessi I,Chung SP,Bikadi Z,Mao Q

    update_date:2013-04-15 00:00:00

  • Inferring topology from clustering coefficients in protein-protein interaction networks.

    abstract:BACKGROUND:Although protein-protein interaction networks determined with high-throughput methods are incomplete, they are commonly used to infer the topology of the complete interactome. These partial networks often show a scale-free behavior with only a few proteins having many and the majority having only a few conne...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-7-519

    authors: Friedel CC,Zimmer R

    update_date:2006-11-30 00:00:00

  • Metabolite signal identification in accurate mass metabolomics data with MZedDB, an interactive m/z annotation tool utilising predicted ionisation behaviour 'rules'.

    abstract:BACKGROUND:Metabolomics experiments using Mass Spectrometry (MS) technology measure the mass to charge ratio (m/z) and intensity of ionised molecules in crude extracts of complex biological samples to generate high dimensional metabolite 'fingerprint' or metabolite 'profile' data. High resolution MS instruments perform...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-10-227

    authors: Draper J,Enot DP,Parker D,Beckmann M,Snowdon S,Lin W,Zubair H

    update_date:2009-07-21 00:00:00

  • Structural analysis on mutation residues and interfacial water molecules for human TIM disease understanding.

    abstract:BACKGROUND:Human triosephosphate isomerase (HsTIM) deficiency is a genetic disease caused often by the pathogenic mutation E104D. This mutation, located at the side of an abnormally large cluster of water in the inter-subunit interface, reduces the thermostability of the enzyme. Why and how these water molecules are di...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-14-S16-S11

    authors: Li Z,He Y,Liu Q,Zhao L,Wong L,Kwoh CK,Nguyen H,Li J

    update_date:2013-01-01 00:00:00

  • Hierarchical modularity of nested bow-ties in metabolic networks.

    abstract:BACKGROUND:The exploration of the structural topology and the organizing principles of genome-based large-scale metabolic networks is essential for studying possible relations between structure and functionality of metabolic networks. Topological analysis of graph models has often been applied to study the structural c...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-7-386

    authors: Zhao J,Yu H,Luo JH,Cao ZW,Li YX

    update_date:2006-08-18 00:00:00