Abstract:
In many sequence data mining applications, the goal is to find frequent substrings. Some of these applications, such as extracting motifs from protein and DNA sequences, look for frequently occurring approximate contiguous substrings called simple motifs. By approximate we mean that some mismatches are allowed when substrings are tested for similarity, which helps to discover unknown patterns. Structured motifs in DNA sequences are frequent structured contiguous substrings that contain two or more simple motifs. Existing methods for finding simple motifs suffer from problems such as low scalability, high execution time, no guarantee of finding all patterns, and low flexibility in adapting to other applications. Flame is the only algorithm that can find all unknown structured patterns in a dataset, and it has solved most of these problems, but its scalability for very large sequences is still weak. In this research, a new approach named Next-Symbol-Array based Motif Discovery (NSAMD) is presented to improve scalability in extracting all unknown simple and structured patterns. To reach this goal, a new data structure called Next-Symbol-Array is introduced. This data structure changes how NSAMD finds patterns compared with Flame and helps to find structured motifs faster. The proposed algorithm is as accurate as Flame and extracts all existing patterns in a dataset. Performance comparisons show that NSAMD outperforms Flame in extracting structured motifs in both execution time (51% faster) and memory usage (an improvement of more than 99%). The proposed algorithm is slower in extracting simple motifs, but the considerable improvement in memory usage (more than 99%) makes NSAMD more scalable than Flame. This advantage is very important in biological applications that involve very large sequences.
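The notion of an approximate simple motif described in the abstract can be illustrated with a small brute-force sketch. This is not NSAMD or Flame, and it does not use the Next-Symbol-Array; it only shows what "frequent substring with a bounded number of mismatches" means. The function names, the use of Hamming distance as the mismatch measure, and the DNA alphabet default are assumptions made for this illustration.

```python
from itertools import product


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def approx_support(sequence: str, motif: str, max_mismatches: int) -> int:
    """Count the windows of `sequence` within `max_mismatches` of `motif`."""
    k = len(motif)
    return sum(
        hamming(sequence[i:i + k], motif) <= max_mismatches
        for i in range(len(sequence) - k + 1)
    )


def frequent_motifs(sequence, length, max_mismatches, min_support, alphabet="ACGT"):
    """Test every candidate of the given length (hypothetical helper).

    The search is exhaustive and exponential in `length`, so it is only
    usable on toy inputs; it is meant to define the problem, not solve it.
    """
    result = {}
    for candidate in map("".join, product(alphabet, repeat=length)):
        support = approx_support(sequence, candidate, max_mismatches)
        if support >= min_support:
            result[candidate] = support
    return result
```

For example, `approx_support("ACGTACGAACGT", "ACGT", 1)` counts the two exact occurrences of `ACGT` plus the window `ACGA`, which differs in one position. The exhaustive candidate loop is the kind of cost that specialised index structures are designed to avoid.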
journal_name:Comput Biol Chem
journal_title:Computational biology and chemistry
authors:Pari A, Baraani A, Parseh S
doi:10.1016/j.compbiolchem.2016.09.001
subject:Has Abstract
pub_date:2016-10-01 00:00:00
pages:384-395
eissn:1476-9271
issn:1476-928X
pii:S1476-9271(15)30073-6
journal_volume:64
pub_type: journal article
abstract::In process of creating genetic maps different labs/research groups obtain overlapping parts of the map. Merging these parts into one integrative map is based on looking for maximum shared marker orders among the maps. Really, not all shared markers of such maps have consensus order that obstructs building of the integ...
journal_title:Computational biology and chemistry
pub_type: journal article
doi:10.1016/j.compbiolchem.2005.09.007
update_date:2006-02-01 00:00:00
abstract::Inhibition of poly(ADP-ribose) polymerase-1 (PARP-1) has turned out to be an innovative approach for cancer therapy due to its involvement in DNA repair pathways. Although several potent PARP-1 inhibitors have been identified, they exhibit high toxicity, resistance and a diverse pharmacological profile in clinical trials, wh...
journal_title:Computational biology and chemistry
pub_type: journal article
doi:10.1016/j.compbiolchem.2019.04.018
update_date:2019-06-01 00:00:00
abstract::Tissue softening accompanies the ripening of many fruits and initiates the processes of irreversible deterioration. Expansins are plant cell wall proteins that have been proposed to disrupt hydrogen bonds within the cell wall polymer matrix. Several authors have shown that FaEXPA2 is a key gene that shows an increased...
journal_title:Computational biology and chemistry
pub_type: journal article
doi:10.1016/j.compbiolchem.2020.107279
update_date:2020-05-30 00:00:00
abstract::The plant-specific TCP transcription factors, which play critical roles in diverse aspects of biological processes, have been identified and analyzed in various plant species. However, no systematical study of TCP family genes in potato (Solanum tuberosum L.) has been undertaken. In this study, a total of 31 non-redun...
journal_title:Computational biology and chemistry
pub_type: journal article
doi:10.1016/j.compbiolchem.2018.11.009
update_date:2019-02-01 00:00:00
abstract::Proteins physically interact with each other and form protein complexes to perform their biological functions. The prediction of protein complexes from protein-protein interaction (PPI) network is usually difficult when the complexes are overlapping with each other in a dense region of the network. To address the prob...
journal_title:Computational biology and chemistry
pub_type: journal article
doi:10.1016/j.compbiolchem.2018.03.012
update_date:2018-06-01 00:00:00
abstract::Prefoldin is a molecular chaperone and acts as a nano-actuator in cargo carriage and drug delivery for disease treatment. Investigating the mechanical properties of nano-actuator helps predict its behavior and measure its performance under various environmental conditions, like external forces that are applied. Accord...
journal_title:Computational biology and chemistry
pub_type: journal article
doi:10.1016/j.compbiolchem.2019.107133
update_date:2019-12-01 00:00:00
abstract::Systemic lupus erythematosus (SLE) is a heterogeneous autoimmune disorder, and its pathogenesis in males and in cases without accompanying lupus nephritis (LN-) is not fully understood. In this study, we identified 90 (82 up- and 8 downregulated) differentially expressed genes (DEGs) common to female LN-, female LN+ a...
journal_title:Computational biology and chemistry
pub_type: journal article
doi:10.1016/j.compbiolchem.2019.107135
update_date:2019-12-01 00:00:00
abstract::Previous joint experimental and theoretical work demonstrates that typically soluble peptides will be rendered insoluble in the presence of saturated sodium ions in aqueous solution due to disruption of cation-π interactions between Trp and Lys. The present work utilizes quantum chemical methods including density func...
journal_title:Computational biology and chemistry
pub_type: journal article
doi:10.1016/j.compbiolchem.2017.12.009
update_date:2018-02-01 00:00:00
abstract::The P-type ATPases (P-ATPases) are present in all living cells, where they mediate ion transport across membranes at the expense of ATP hydrolysis. The ions transported by these pumps include protons, calcium, sodium, potassium, and heavy metals such as manganese, iron, copper, and zinc. Maintenance of ...
journal_title:Computational biology and chemistry
pub_type: journal article
doi:10.1016/j.compbiolchem.2017.04.006
update_date:2017-06-01 00:00:00
abstract:BACKGROUND:Metastasis is the main cause of breast cancer (BC) lethality, especially in early stages, led to improvements in therapeutic procedures. Lately, by improvements in our perception of biological processes and immune system new classes of vaccines are emerged that grant us the opportunity of designing resolute ...
journal_title:Computational biology and chemistry
pub_type: journal article
doi:10.1016/j.compbiolchem.2020.107231
update_date:2020-04-01 00:00:00
abstract::The electrostatic (ES) energy of each residue was for the first time quantitatively evaluated in a flavin mononucleotide binding protein (FBP). A residue electrostatic energy (RES) was obtained as the sum of the ES energies between atoms in each residue and all other atoms in the FBP dimer using atomic coordinates obt...
journal_title:Computational biology and chemistry
pub_type: journal article
doi:10.1016/j.compbiolchem.2018.01.001
update_date:2018-02-01 00:00:00
abstract::Cyclin-Dependent Kinases (CDKs) are known to play crucial roles in controlling cell cycle progression of eukaryotic cell and inhibition of their activity has long been considered as potential strategy in anti-cancer drug research. In the present work, a series of porphyrin-anthraquinone hybrids bearing meso-substituen...
journal_title:Computational biology and chemistry
pub_type: journal article
doi:10.1016/j.compbiolchem.2016.12.005
update_date:2017-04-01 00:00:00
abstract::Recent advances in high-throughput genome sequencing technologies have enabled the systematic study of various genomes by making whole genome sequencing affordable. Modern sequencers generate a huge number of small sequence fragments called reads, where the read length and the per-base sequencing cost depend on the te...
journal_title:Computational biology and chemistry
pub_type: journal article
doi:10.1016/j.compbiolchem.2017.03.016
update_date:2017-08-01 00:00:00
abstract::The Protein Structure Prediction (PSP) problem comprises, among other issues, forecasting the three-dimensional native structure of proteins using only their primary structure information. Most computational studies in this area use synthetic data instead of real biological data. However, the closer to the real-world,...
journal_title:Computational biology and chemistry
pub_type: journal article
doi:10.1016/j.compbiolchem.2019.107192
update_date:2020-02-01 00:00:00
abstract::Shortest common supersequence (SCS) is a classical NP-hard problem in which a string must be constructed that is a supersequence of every string in a given set. The SCS problem has numerous applications in data compression, query optimization in databases, and various bioinformatics tasks. Due to NP-hardness, the exac...
journal_title:Computational biology and chemistry
pub_type: journal article
doi:10.1016/j.compbiolchem.2016.05.004
update_date:2016-10-01 00:00:00
abstract::Natural products as well as their derivatives play a significant role in the discovery of new biologically active compounds in the different areas of our life especially in the field of medicine. The synthesis of compounds produced from natural products including cytisine is one approach for the wider use of natural s...
journal_title:Computational biology and chemistry
pub_type: journal article
doi:10.1016/j.compbiolchem.2020.107407
update_date:2020-11-05 00:00:00
abstract::The DNA binding protein, TDP43 is a major protein involved in amyotrophic lateral sclerosis and other neurological disorders such as frontotemporal dementia, Alzheimer disease, etc. In the present study, we have designed possible siRNAs for the glycine rich region of tardbp mutants causing ALS disorder based on a syst...
journal_title:Computational biology and chemistry
pub_type: journal article
doi:10.1016/j.compbiolchem.2016.01.001
update_date:2016-04-01 00:00:00
abstract::Functional classification of genes represents one of the most basic problems in genome analysis and annotation. Our analysis of some of the popular methods for functional classification of genes shows that these methods are not always consistent with each other and may not be specific enough for high-resolution gene f...
journal_title:Computational biology and chemistry
pub_type: journal article
doi:10.1016/j.compbiolchem.2008.02.007
update_date:2008-06-01 00:00:00
abstract::DNA microarray data has been widely used in cancer research due to the significant advantage helped to successfully distinguish between tumor classes. However, typical gene expression data usually presents a high-dimensional imbalanced characteristic, which poses severe challenge for traditional machine learning metho...
journal_title:Computational biology and chemistry
pub_type: journal article
doi:10.1016/j.compbiolchem.2019.03.017
update_date:2019-06-01 00:00:00
abstract::With an increasing number of publicly available microarray datasets, it becomes attractive to borrow information from other relevant studies to have more reliable and powerful analysis of a given dataset. We do not assume that subjects in the current study and other relevant studies are drawn from the same population ...
journal_title:Computational biology and chemistry
pub_type: journal article
doi:10.1016/j.compbiolchem.2005.04.002
update_date:2005-06-01 00:00:00
abstract::This article has been withdrawn at the request of the author(s) and/or editor. The Publisher apologizes for any inconvenience this may cause. The full Elsevier Policy on Article Withdrawal can be found at http://www.elsevier.com/locate/withdrawalpolicy. ...
journal_title:Computational biology and chemistry
pub_type: retracted publication
doi:10.1016/j.compbiolchem.2016.02.010
update_date:2016-02-17 00:00:00
abstract::Identifying essential proteins is very important for understanding the minimal requirements of cellular survival and development. Fast growth in the amount of available protein-protein interactions has produced unprecedented opportunities for detecting protein essentiality from the network level. Essential proteins ha...
journal_title:Computational biology and chemistry
pub_type: journal article
doi:10.1016/j.compbiolchem.2011.04.002
update_date:2011-06-01 00:00:00
abstract::Microarray technology has been widely applied in study of measuring gene expression levels for thousands of genes simultaneously. Gene cluster analysis is found useful for discovering the function of gene because co-expressed genes are likely to share the same biological function. K-means is one of well-known clusteri...
journal_title:Computational biology and chemistry
pub_type: journal article
doi:10.1016/j.compbiolchem.2008.03.020
update_date:2008-08-01 00:00:00
abstract::Many Traditional Chinese Medicines (TCMs) are effective to relieve complicated diseases such as type II diabetes mellitus (T2DM). In this work, molecular docking and network analysis were employed to elucidate the action mechanism of a medical composition which had clinical efficacy for T2DM. We found that multiple ac...
journal_title:Computational biology and chemistry
pub_type: journal article
doi:10.1016/j.compbiolchem.2011.07.003
update_date:2011-10-12 00:00:00
abstract::Our ability to detect differentially expressed genes in a microarray experiment can be hampered when the number of biological samples of interest is limited. In this situation, we propose the use of information from self-self hybridizations to acuminate our inference of differential expression. A unified modelling str...
journal_title:Computational biology and chemistry
pub_type: journal article
doi:10.1016/j.compbiolchem.2007.03.005
update_date:2007-06-01 00:00:00
abstract::The resistances of matrix protein 2 (M2) protein inhibitors and neuraminidase inhibitors for influenza virus have attracted much attention and there is an urgent need for new drug. The antiviral drugs that selectively act on RNA polymerase are less prone to resistance and possess fewer side effects on the patient. The...
journal_title:Computational biology and chemistry
pub_type: journal article
doi:10.1016/j.compbiolchem.2020.107241
update_date:2020-04-01 00:00:00
abstract::We present an algorithm for automatically predicting the topological family of any RNA three-way junction, given only the information from the secondary structure: the sequence and the Watson-Crick pairings. The parameters of the algorithm have been determined on a data set of 33 three-way junctions whose 3D conformat...
journal_title:Computational biology and chemistry
pub_type: journal article
doi:10.1016/j.compbiolchem.2011.11.001
update_date:2012-04-01 00:00:00
abstract::Annular structures have been observed experimentally in aggregates of polyglutamine-containing proteins and other proteins associated with diseases of the brain. Here we report the observation of annular structures in molecular-level simulations of large systems of model polyglutamine peptides. A system of 24 polyglut...
journal_title:Computational biology and chemistry
pub_type: journal article
doi:10.1016/j.compbiolchem.2006.01.003
update_date:2006-06-01 00:00:00
abstract::Bacterial MocR transcriptional regulators possess an N-terminal DNA-binding domain containing a conserved helix-turn-helix module and an effector-binding and/or oligomerization domain at the C-terminus, homologous to fold type-I pyridoxal 5'-phosphate (PLP) enzymes. Since a comprehensive structural analysis of the Moc...
journal_title:Computational biology and chemistry
pub_type: journal article
doi:10.1016/j.compbiolchem.2015.05.003
update_date:2015-10-01 00:00:00
abstract::Interleukin-1β is a drug target in rheumatoid arthritis and several auto-immune disorders. In this study, a set of 48 compounds with the determined IC50 values were used for QSAR analysis by MOE. The QSAR model was developed by using training set of 41 compounds, based on 12 unique descriptors. Model was validated by ...
journal_title:Computational biology and chemistry
pub_type: journal article
doi:10.1016/j.compbiolchem.2015.06.004
update_date:2015-10-01 00:00:00