Markovian encoding models in human splice site recognition using SVM.

Abstract:

Splice site recognition is among the most significant and challenging tasks in bioinformatics because of its key role in gene annotation. Effective splice site prediction requires nucleotide encoding methods that expose the characteristics of DNA sequences and supply suitable features as input to machine learning classifiers. Markovian models are the most influential encoding methods and are widely used for pattern recognition in biological data. However, the performance of these methods has not yet been compared directly in the splice site domain. This study addresses that gap by comparing various Markovian encoding models for splice site prediction with a support vector machine (SVM), the most prominent learning method in the domain. Moreover, a novel sequence encoding approach based on a third-order Markov model (MM3) is proposed. The experimental results show that the proposed method, MM3-SVM, performs significantly better than thirteen of the best-known state-of-the-art algorithms when tested on the HS3D dataset under several performance criteria. Further, it achieved higher prediction accuracy than several well-known tools, such as NNsplice, MEM, MM1, WMM, and GeneID, on an independent test set of 50 genes. We also developed MMSVM, a web tool that predicts splice sites in any human sequence using the proposed approach. The MMSVM web server can be accessed at https://pashaei.shinyapps.io/mmsvm.
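The abstract does not spell out how a sequence is turned into features, so the following is only a minimal sketch of one common way a third-order Markov encoding can be built. It assumes aligned, equal-length sequences over A, C, G, T, position-specific conditional probabilities, and add-one smoothing. The names `train_mm3` and `encode_mm3` are hypothetical and are not taken from the paper or from MMSVM.

```python
from collections import defaultdict
from itertools import product

ALPHABET = "ACGT"
ORDER = 3  # third-order Markov model: each base is conditioned on the 3 before it


def train_mm3(sequences, pseudocount=1.0):
    """Estimate position-specific P(base | previous 3 bases) from aligned sequences.

    Assumption: all sequences have the same length and contain only A, C, G, T.
    """
    length = len(sequences[0])
    # counts[i][context][base] = times `base` was seen at position i after `context`
    counts = [defaultdict(lambda: defaultdict(float)) for _ in range(length)]
    for seq in sequences:
        for i in range(ORDER, length):
            counts[i][seq[i - ORDER:i]][seq[i]] += 1.0

    model = []
    for i in range(length):
        table = {}
        if i >= ORDER:  # the first ORDER positions have no full context
            for ctx in map("".join, product(ALPHABET, repeat=ORDER)):
                total = sum(counts[i][ctx][b] for b in ALPHABET)
                total += pseudocount * len(ALPHABET)
                table[ctx] = {
                    b: (counts[i][ctx][b] + pseudocount) / total for b in ALPHABET
                }
        model.append(table)
    return model


def encode_mm3(seq, model):
    """Return one conditional probability per position, starting at index ORDER."""
    return [model[i][seq[i - ORDER:i]][seq[i]] for i in range(ORDER, len(seq))]


# Toy, made-up training windows (not real HS3D data).
positives = ["AAGGTAAGT", "CAGGTGAGT", "AAGGTAAGA"]
model = train_mm3(positives)
features = encode_mm3("AAGGTAAGT", model)
```

Under these assumptions, each window of length n becomes a vector of n - 3 probabilities, which could then be passed to an SVM implementation such as scikit-learn's `SVC`. Training a second model on false sites and using log-odds between the two would be a natural variant, but whether the paper does so is not stated in the abstract.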

journal_name

Comput Biol Chem

authors

Pashaei E,Aydin N

doi

10.1016/j.compbiolchem.2018.02.005

subject

Has Abstract

pub_date

2018-04-01 00:00:00

pages

159-170

eissn

1476-928X

issn

1476-9271

pii

S1476-9271(17)30363-8

journal_volume

73

pub_type

Journal Article
  • Interaction of zervamicin IIB with lipid bilayers. Molecular dynamics study.

    abstract::In this work we have studied the interaction of zervamicin IIB (ZrvIIB) with the model membranes of eukaryotes and prokaryotes using all-atom molecular dynamics. In all our simulations the zervamicin molecule interacted only with lipid headgroups but did not penetrate the hydrophobic core of the bilayers. During the inter...

    journal_title:Computational biology and chemistry

    pub_type: Journal Article

    doi:10.1016/j.compbiolchem.2010.12.005

    authors: Levtsova OV,Antonov MY,Naumenkova TV,Sokolova OS

    update_date:2011-02-01 00:00:00

  • QSAR study of pyrazolo[4,3-e][1,2,4]triazine sulfonamides against tumor-associated human carbonic anhydrase isoforms IX and XII.

    abstract::The QSAR models for a set of pyrazolo[4,3-e][1,2,4]triazines incorporating benzenesulfonamide moiety combined directly with the heterocyclic ring or by NH linkage were generated. The inhibitory potency of compounds against human carbonic anhydrase isoforms IX and XII and antiproliferative activity against human MCF-7 ...

    journal_title:Computational biology and chemistry

    pub_type: Journal Article

    doi:10.1016/j.compbiolchem.2017.09.006

    authors: Matysiak J,Skrzypek A,Tarasiuk P,Mojzych M

    update_date:2017-12-01 00:00:00

  • Disruption of murine Tcte3-3 induces tissue specific apoptosis via co-expression of Anxa5 and Pebp1.

    abstract::Programmed cell death or apoptosis plays a vital physiological role in the development and homeostasis. Any discrepancy in apoptosis may trigger testicular and neurodegenerative diseases, ischemic damage, autoimmune disorders and many types of cancer. Tcte3 (T-complex testis expressed 3) is an accessory component of a...

    journal_title:Computational biology and chemistry

    pub_type: Journal Article

    doi:10.1016/j.compbiolchem.2014.10.005

    authors: Parveen Z,Bibi Z,Bibi N,Neesen J,Rashid S

    update_date:2014-12-01 00:00:00

  • Protein subcellular location prediction using optimally weighted fuzzy k-NN algorithm.

    abstract::In this paper, the optimally weighted fuzzy k-nearest neighbors (OWFKNN) algorithm is used to predict proteins' subcellular locations based on their amino acid composition. The datasets used consist of 997 prokaryotic and 2427 eukaryotic protein sequences. The overall prediction accuracy achie...

    journal_title:Computational biology and chemistry

    pub_type: Journal Article

    doi:10.1016/j.compbiolchem.2008.07.011

    authors: Nasibov E,Kandemir-Cavas C

    update_date:2008-12-01 00:00:00

  • DNA strand break: structural and electrostatic properties studied by molecular dynamics simulation.

    abstract::Due to their lethal consequences and a relatively high probability of introduction of repair errors and mutations, single and double strand breaks are among the most important and dangerous DNA lesions. However, the mechanisms of their recognition and repair processes are only poorly known at present. This work define...

    journal_title:Computational biology and chemistry

    pub_type: Journal Article

    doi:10.1016/j.compbiolchem.2005.12.001

    authors: Bunta JK,Laaksonen A,Pinak M,Nemoto T

    update_date:2006-04-01 00:00:00

  • A Computational workflow for the identification of the potent inhibitor of type II secretion system traffic ATPase of Pseudomonas aeruginosa.

    abstract::Bacterial type II secretion system has now become an attractive target for antivirulence drug development. The aim of the present study was to characterize the binding site of the type II secretion system traffic ATPase GspER of Pseudomonas aeruginosa, and identify potent inhibitors using extensive computational and v...

    journal_title:Computational biology and chemistry

    pub_type: Journal Article

    doi:10.1016/j.compbiolchem.2018.07.012

    authors: Arifuzzaman M,Mitra S,Jahan SI,Jakaria M,Abeda T,Absar N,Dash R

    update_date:2018-10-01 00:00:00

  • A highly accurate protein structural class prediction approach using auto cross covariance transformation and recursive feature elimination.

    abstract::Structural class characterizes the overall folding type of a protein or its domain. Many methods have been proposed to improve the prediction accuracy of protein structural class in recent years, but it is still a challenge for the low-similarity sequences. In this study, we introduce a feature extraction technique ba...

    journal_title:Computational biology and chemistry

    pub_type: Journal Article

    doi:10.1016/j.compbiolchem.2015.08.012

    authors: Li X,Liu T,Tao P,Wang C,Chen L

    update_date:2015-12-01 00:00:00

  • Functional and structural insights into novel DREB1A transcription factors in common wheat (Triticum aestivum L.): A molecular modeling approach.

    abstract::Triticum aestivum L. known as common wheat is one of the most important cereal crops feeding a large and growing population. Various environmental stress factors including drought, high salinity and heat etc. adversely affect wheat production in a significant manner. Dehydration-responsive element-binding (DREB1A) fac...

    journal_title:Computational biology and chemistry

    pub_type: Journal Article

    doi:10.1016/j.compbiolchem.2016.07.008

    authors: Kumar A,Kumar S,Kumar U,Suravajhala P,Gajula MN

    update_date:2016-10-01 00:00:00

  • A chaotic approach to maintain the population diversity of genetic algorithm in network training.

    abstract::The concept of chaos being radically different from statistical randomness is introduced into chemometrics research. The chaotic system that is deterministic with underlying patterns and inherent ability in searching the space of interest has been employed to improve the performance of chemometric algorithms. In this ...

    journal_title:Computational biology and chemistry

    pub_type: Journal Article

    doi:10.1016/s1476-9271(02)00083-x

    authors: Lü Q,Shen G,Yu R

    update_date:2003-07-01 00:00:00

  • Understanding the role of the topology in protein folding by computational inverse folding experiments.

    abstract::Recent studies suggest that protein folding should be revisited as the emergent property of a complex system and that the nature allows only a very limited number of folds that seem to be strongly influenced by geometrical properties. In this work we explore the principles underlying this new view and show how helical...

    journal_title:Computational biology and chemistry

    pub_type: Journal Article

    doi:10.1016/j.compbiolchem.2008.03.015

    authors: Mucherino A,Costantini S,di Serafino D,D'Apuzzo M,Facchiano A,Colonna G

    update_date:2008-08-01 00:00:00

  • C3: An R package for cross-species compendium-based cell-type identification.

    abstract::Cell type identification from an unknown sample can often be done by comparing its gene expression profile against a gene expression database containing profiles of a large number of cell-types. This type of compendium-based cell-type identification strategy is particularly successful for human and mouse samples becau...

    journal_title:Computational biology and chemistry

    pub_type: Journal Article

    doi:10.1016/j.compbiolchem.2018.10.003

    authors: Kabir MH,Djordjevic D,O'Connor MD,Ho JWK

    update_date:2018-12-01 00:00:00

  • Computer evaluation of VirE2 protein complexes for ssDNA transfer ability.

    abstract::The single-stranded transfer DNA from the Ti plasmid of the soil bacteria Agrobacterium nonspecifically integrates into the plant chromosome and is inherited at subsequent cell divisions. How it is transferred across host membranes is unknown, but it is believed that VirE2 proteins form a membrane-spanning pore or cha...

    journal_title:Computational biology and chemistry

    pub_type: Journal Article

    doi:10.1016/j.compbiolchem.2017.01.016

    authors: Volokhina I,Gusev Y,Mazilov S,Moiseeva Y,Chumakov M

    update_date:2017-06-01 00:00:00

  • Self-organizing map of gene regulatory networks for cell phenotypes during reprogramming.

    abstract::The induced pluripotent cells (iPSCs) are derived from somatic cells by reprogramming their genetic profiles. Such a process requires coordinated dynamic expression of hundreds of genes and proteins. As both deterministic and stochastic elements control the reprogramming process, it is not easy to have a way to reflec...

    journal_title:Computational biology and chemistry

    pub_type: Journal Article

    doi:10.1016/j.compbiolchem.2011.05.002

    authors: Zhang L,Zheng Y,Li D,Zhong Y

    update_date:2011-08-10 00:00:00

  • Automated prediction of three-way junction topological families in RNA secondary structures.

    abstract::We present an algorithm for automatically predicting the topological family of any RNA three-way junction, given only the information from the secondary structure: the sequence and the Watson-Crick pairings. The parameters of the algorithm have been determined on a data set of 33 three-way junctions whose 3D conformat...

    journal_title:Computational biology and chemistry

    pub_type: Journal Article

    doi:10.1016/j.compbiolchem.2011.11.001

    authors: Lamiable A,Barth D,Denise A,Quessette F,Vial S,Westhof E

    update_date:2012-04-01 00:00:00

  • Prediction and verification of microRNAs related to proline accumulation under drought stress in potato.

    abstract::Proline is an important osmotic adjustment substance that accumulates substantially under drought stress and can help plants adapt to osmotic stress. MicroRNAs (miRNAs) are small, endogenous RNAs that play important regulatory roles in plant development and stress response by negatively affecting gene expression at post-transcri...

    journal_title:Computational biology and chemistry

    pub_type: Journal Article

    doi:10.1016/j.compbiolchem.2013.04.006

    authors: Yang J,Zhang N,Ma C,Qu Y,Si H,Wang D

    update_date:2013-10-01 00:00:00

  • Identification of effective DNA barcodes for Triticum plants through chloroplast genome-wide analysis.

    abstract::The Egyptian flora is rich with a large number of Triticum plants, which are very difficult to discriminate between in the early developmental stages. This study assesses the significance of using two DNA Barcoding loci (matK and rbcL) in distinguishing between 18 different Triticum accessions in Egypt. We isolated an...

    journal_title:Computational biology and chemistry

    pub_type: Journal Article

    doi:10.1016/j.compbiolchem.2017.09.003

    authors: Awad M,Fahmy RM,Mosa KA,Helmy M,El-Feky FA

    update_date:2017-12-01 00:00:00

  • In silico analyses of a new group of fungal and plant RecQ4-homologous proteins.

    abstract::Bacterial and eukaryotic RecQ helicases comprise a family of homologous proteins necessary for maintaining genomic integrity during the cell cycle and DNA repair. There is one known bacterial RecQ helicase, and five eukaryotic RecQ helicases that have been described: RecQ1p, RecQ4p, RecQ5p, Bloom, and Werner. While th...

    journal_title:Computational biology and chemistry

    pub_type: Journal Article

    doi:10.1016/j.compbiolchem.2008.07.005

    authors: Barea F,Tessaro S,Bonatto D

    update_date:2008-10-01 00:00:00

  • Generating SNP barcode to evaluate SNP-SNP interaction of disease by particle swarm optimization.

    abstract::Genome-wide association analysis involving data on many single-nucleotide polymorphisms (SNPs) is mathematically and computationally challenging. Hence, we propose the odds ratio-based discrete binary particle swarm optimization (OR-DBPSO) method that uses the OR as a new quantitative measure of disease risk among many SN...

    journal_title:Computational biology and chemistry

    pub_type: Journal Article

    doi:10.1016/j.compbiolchem.2008.07.029

    authors: Chang HW,Yang CH,Ho CH,Wen CH,Chuang LY

    update_date:2009-02-01 00:00:00

  • Genome-wide identification and expression analysis of StTCP transcription factors of potato (Solanum tuberosum L.).

    abstract::The plant-specific TCP transcription factors, which play critical roles in diverse aspects of biological processes, have been identified and analyzed in various plant species. However, no systematical study of TCP family genes in potato (Solanum tuberosum L.) has been undertaken. In this study, a total of 31 non-redun...

    journal_title:Computational biology and chemistry

    pub_type: Journal Article

    doi:10.1016/j.compbiolchem.2018.11.009

    authors: Wang Y,Zhang N,Li T,Yang J,Zhu X,Fang C,Li S,Si H

    update_date:2019-02-01 00:00:00

  • Conformational difference between two subunits in flavin mononucleotide binding protein dimers from Desulfovibrio vulgaris (MF): molecular dynamics simulation.

    abstract::The structural and dynamical properties of five FMN binding protein (FBP) dimers, WT (wild type), E13K (Glu13 replaced by Lys), E13R (Glu13 replaced by Arg), E13T (Glu13 replaced by Thr) and E13Q (Glu13 replaced by Gln), were investigated using a method of molecular dynamics simulation (MDS). In crystal structures, su...

    journal_title:Computational biology and chemistry

    pub_type: Journal Article

    doi:10.1016/j.compbiolchem.2016.05.007

    authors: Nunthaboot N,Lugsanangarm K,Pianwanit S,Kokpol S,Tanaka F,Nakanishi T,Kitamura M

    update_date:2016-10-01 00:00:00

  • Reprint of "Abstraction for data integration: Fusing mammalian molecular, cellular and phenotype big datasets for better knowledge extraction".

    abstract::With advances in genomics, transcriptomics, metabolomics and proteomics, and more expansive electronic clinical record monitoring, as well as advances in computation, we have entered the Big Data era in biomedical research. Data gathering is growing rapidly while only a small fraction of this data is converted to usef...

    journal_title:Computational biology and chemistry

    pub_type: Journal Article, Review

    doi:10.1016/j.compbiolchem.2015.08.005

    authors: Rouillard AD,Wang Z,Ma'ayan A

    update_date:2015-12-01 00:00:00

  • Exploiting the performance of dictionary-based bio-entity name recognition in biomedical literature.

    abstract::Bio-entity name recognition is the key step for information extraction from biomedical literature. This paper presents a dictionary-based bio-entity name recognition approach. The approach expands the bio-entity name dictionary via the Abbreviation Definitions identifying algorithm, improves the recall rate through th...

    journal_title:Computational biology and chemistry

    pub_type: Journal Article

    doi:10.1016/j.compbiolchem.2008.03.008

    authors: Yang Z,Lin H,Li Y

    update_date:2008-08-01 00:00:00

  • CAMWI: Detecting protein complexes using weighted clustering coefficient and weighted density.

    abstract::Detection of protein complexes is very important to understand the principles of cellular organization and function. Recently, large protein-protein interactions (PPIs) networks have become available using high-throughput experimental techniques. These networks make it possible to develop computational methods for pro...

    journal_title:Computational biology and chemistry

    pub_type: Journal Article

    doi:10.1016/j.compbiolchem.2015.07.012

    authors: Lakizadeh A,Jalili S,Marashi SA

    update_date:2015-10-01 00:00:00

  • Domain boundary prediction based on profile domain linker propensity index.

    abstract::Successful prediction of protein domain boundaries provides valuable information not only for the computational structure prediction of multi-domain proteins but also for the experimental structure determination. In this work, a novel index at the profile level is presented, namely, the profile domain linker propensit...

    journal_title:Computational biology and chemistry

    pub_type: Journal Article

    doi:10.1016/j.compbiolchem.2006.01.001

    authors: Dong Q,Wang X,Lin L,Xu Z

    update_date:2006-04-01 00:00:00

  • Identification and characterization of differentially expressed genes in Type 2 Diabetes using in silico approach.

    abstract::Diabetes mellitus is clinically characterized by hyperglycemia. Although many studies have sought to understand the mechanism of Type 2 Diabetes (T2D), the complete network of diabetes and its associated disorders through polygenic involvement is still under debate. The present study was designed to re-analyze p...

    journal_title:Computational biology and chemistry

    pub_type: Journal Article

    doi:10.1016/j.compbiolchem.2019.01.010

    authors: Gupta MK,Vadde R

    update_date:2019-04-01 00:00:00

  • WITHDRAWN: Identification of microRNA precursor based on gapped n-tuple structure status composition kernel.

    abstract::This article has been withdrawn at the request of the author(s) and/or editor. The Publisher apologizes for any inconvenience this may cause. The full Elsevier Policy on Article Withdrawal can be found at http://www.elsevier.com/locate/withdrawalpolicy. ...

    journal_title:Computational biology and chemistry

    pub_type: Retracted Publication

    doi:10.1016/j.compbiolchem.2016.02.010

    authors: Liu B,Fang L

    update_date:2016-02-17 00:00:00

  • Protein function prediction using neighbor relativity in protein-protein interaction network.

    abstract::There is a large gap between the number of discovered proteins and the number of functionally annotated ones. Due to the high cost of determining protein function by wet-lab research, function prediction has become a major task for computational biology and bioinformatics. Some studies utilize the proteins interact...

    journal_title:Computational biology and chemistry

    pub_type: Journal Article

    doi:10.1016/j.compbiolchem.2012.12.003

    authors: Moosavi S,Rahgozar M,Rahimi A

    update_date:2013-04-01 00:00:00

  • The interactome of CCT complex - A computational analysis.

    abstract::The eukaryotic chaperonin, CCT (Chaperonin Containing TCP1 or TriC-TCP-1 Ring Complex) has been subjected to physical and genetic analyses in S. cerevisiae which can be extrapolated to human CCT (hCCT), owing to its structural and functional similarities with yeast CCT (yCCT). Studies on hCCT and its interactome acqui...

    journal_title:Computational biology and chemistry

    pub_type: Journal Article

    doi:10.1016/j.compbiolchem.2016.09.002

    authors: Narayanan A,Pullepu D,Kabir MA

    update_date:2016-10-01 00:00:00

  • Biocomputational identification and validation of novel microRNAs predicted from bubaline whole genome shotgun sequences.

    abstract::MicroRNAs (miRNAs) are small (19-25 base long), non-coding RNAs that regulate post-transcriptional gene expression by cleaving targeted mRNAs in several eukaryotes. The miRNAs play vital roles in multiple biological and metabolic processes, including developmental timing, signal transduction, cell maintenance and diff...

    journal_title:Computational biology and chemistry

    pub_type: Journal Article

    doi:10.1016/j.compbiolchem.2017.08.005

    authors: Manku HK,Dhanoa JK,Kaur S,Arora JS,Mukhopadhyay CS

    update_date:2017-10-01 00:00:00

  • AROHap: An effective algorithm for single individual haplotype reconstruction based on asexual reproduction optimization.

    abstract::In this paper, a method for single individual haplotype (SIH) reconstruction using Asexual reproduction optimization (ARO) is proposed. Haplotypes, as a set of genetic variations in each chromosome, contain vital information such as the relationship between human genome and diseases. Finding haplotypes in diploid orga...

    journal_title:Computational biology and chemistry

    pub_type: Journal Article

    doi:10.1016/j.compbiolchem.2017.12.005

    authors: Olyaee MH,Khanteymoori A

    update_date:2018-02-01 00:00:00