CFSP: a collaborative frequent sequence pattern discovery algorithm for nucleic acid sequence classification.

Abstract:

Background: Conserved nucleic acid sequences play an essential role in transcriptional regulation. Motifs or templates derived from nucleic acid sequence datasets are commonly used as biomarkers to predict biochemical properties, such as protein binding sites, or to identify specific non-coding RNAs. In many cases, template-based nucleic acid sequence classification performs better than some feature extraction methods, such as N-gram and k-spaced pairs classification. The availability of large-scale experimental data provides an unprecedented opportunity to improve motif extraction methods, and extracting patterns from such data is crucial for building predictive models.

Methods: This article proposes CFSP, a Teiresias-like feature extraction algorithm for discovering frequent sub-sequences. Some motif discovery algorithms allow gaps, but they limit the distance and number of gaps. The proposed algorithm can find frequent sequence pairs separated by a larger gap. Combinations of frequent sub-sequences within a long sequence capture long-distance correlations, which imply a specific molecular biological property; the proposed algorithm is therefore designed to discover these combinations. An ordered set of frequent sub-sequences derived from nucleic acid sequences serves as a base frequent sub-sequence array. Mutation information is attached to each sub-sequence array to implement fuzzy matching: each mutation record encodes a single nucleotide variant or a nucleotide insertion/deletion (indel), capturing a slight difference between a frequent sequence and the matched subsequence of the sequence under investigation.

Conclusions: The proposed algorithm has been validated in several nucleic acid sequence prediction case studies. On experimental data sets such as miRNA, piRNA, and Sigma 54 promoters, it gives better results than recently available methods based on feature descriptors.
CFSP is implemented in C++ and shell script; the source code and related data are available at https://github.com/HePeng2016/CFSP.
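
The ideas in the Methods paragraph can be illustrated with a small sketch. It is not the CFSP implementation (which is written in C++ and is not reproduced here); it is a simplified Python illustration under stated assumptions. The function names (`frequent_kmers`, `frequent_ordered_pairs`, `fuzzy_find`), the fixed sub-sequence length `k`, and the `min_support` threshold are all hypothetical, and the fuzzy matcher records substitutions only, not indels.

```python
from collections import Counter


def frequent_kmers(seqs, k, min_support):
    """Return the k-mers that occur in at least min_support sequences."""
    support = Counter()
    for s in seqs:
        # A set counts each k-mer at most once per sequence.
        support.update({s[i:i + k] for i in range(len(s) - k + 1)})
    return {kmer for kmer, n in support.items() if n >= min_support}


def frequent_ordered_pairs(seqs, k, min_support):
    """Return ordered pairs (a, b) of frequent k-mers where a ends before
    b starts, with a gap of any length, in at least min_support sequences."""
    kmers = frequent_kmers(seqs, k, min_support)
    support = Counter()
    for s in seqs:
        hits = [(i, s[i:i + k]) for i in range(len(s) - k + 1)
                if s[i:i + k] in kmers]
        # Non-overlapping, order-preserving pairs; the gap is unbounded.
        pairs = {(a, b) for i, a in hits for j, b in hits if j >= i + k}
        support.update(pairs)
    return {p for p, n in support.items() if n >= min_support}


def fuzzy_find(pattern, seq, max_mismatches=1):
    """Find the first window of seq that matches pattern with at most
    max_mismatches substitutions (indels are NOT handled in this sketch).
    Returns (position, [(offset, expected, observed), ...]) or None."""
    m = len(pattern)
    for i in range(len(seq) - m + 1):
        diffs = [(j, pattern[j], seq[i + j])
                 for j in range(m) if pattern[j] != seq[i + j]]
        if len(diffs) <= max_mismatches:
            return i, diffs
    return None


seqs = ["ACGTTTTTGGA", "ACGAAAAAAGGA", "TTTTTTTT"]
pairs = frequent_ordered_pairs(seqs, k=3, min_support=2)
match = fuzzy_find("ACGT", "TTACCTGG", max_mismatches=1)
```

Here `pairs` contains `("ACG", "GGA")`, which co-occurs in the first two sequences at different gap lengths, and `match` reports a hit at position 2 together with the single substitution that separates the pattern from the matched window. Recording the differences alongside the match loosely mirrors the mutation records described above.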

journal_name

PeerJ

journal_title

PeerJ

authors

Peng H

doi

10.7717/peerj.8965

pub_date

2020-04-20 00:00:00

pages

e8965

issn

2167-8359

pii

8965

journal_volume

8

pub_type

Journal article

Related literature

PeerJ literature collection
  • Regional drivers of clutch loss reveal important trade-offs for beach-nesting birds.

    abstract::Coastal birds are critical ecosystem constituents on sandy shores, yet are threatened by depressed reproductive success resulting from direct and indirect anthropogenic and natural pressures. Few studies examine clutch fate across the wide range of environments experienced by birds; instead, most focus at the small si...

    journal_title:PeerJ

    pub_type: journal article

    doi:10.7717/peerj.2460

    authors: Maslo B,Schlacher TA,Weston MA,Huijbers CM,Anderson C,Gilby BL,Olds AD,Connolly RM,Schoeman DS

    updated:2016-09-13 00:00:00

  • Termite mound architecture regulates nest temperature and correlates with species identities of symbiotic fungi.

    abstract:Background:Large and complex mounds built by termites of the genus Macrotermes characterize many dry African landscapes, including the savannas, bushlands, and dry forests of the Tsavo Ecosystem in southern Kenya. The termites live in obligate symbiosis with filamentous fungi of the genus Termitomyces. The insects coll...

    journal_title:PeerJ

    pub_type: journal article

    doi:10.7717/peerj.6237

    authors: Vesala R,Harjuntausta A,Hakkarainen A,Rönnholm P,Pellikka P,Rikkinen J

    updated:2019-01-16 00:00:00

  • Evaluation of the estimate bias magnitude of the Rao's quadratic diversity index.

    abstract::Rao's quadratic diversity index is one of the most widely applied diversity indices in functional and phylogenetic ecology. The standard way of computing Rao's quadratic diversity index for an ecological assemblage with a group of species with varying abundances is to sum the functional or phylogenetic distances betwe...

    journal_title:PeerJ

    pub_type: journal article

    doi:10.7717/peerj.5211

    authors: Chen Y,Wu Y,Shen TJ

    updated:2018-07-06 00:00:00

  • Phyllosphere bacterial assembly in citrus crop under conventional and ecological management.

    abstract::Divergences between agricultural management can result in different types of biological interactions between plants and microorganisms, which may affect food quality and productivity. Conventional practices are well-established in the agroindustry as very efficient and lucrative; however, the increasing demand for sus...

    journal_title:PeerJ

    pub_type: journal article

    doi:10.7717/peerj.9152

    authors: Carvalho CR,Dias AC,Homma SK,Cardoso EJ

    updated:2020-06-02 00:00:00

  • Parameter estimation in tree graph metabolic networks.

    abstract::We study the glycosylation processes that convert initially toxic substrates to nutritionally valuable metabolites in the flavonoid biosynthesis pathway of tomato (Solanum lycopersicum) seedlings. To estimate the reaction rates we use ordinary differential equations (ODEs) to model the enzyme kinetics. A popular choic...

    journal_title:PeerJ

    pub_type: journal article

    doi:10.7717/peerj.2417

    authors: Astola L,Stigter H,Gomez Roldan MV,van Eeuwijk F,Hall RD,Groenenboom M,Molenaar JJ

    updated:2016-09-20 00:00:00

  • A randomized clinical trial of vitamin D3 (cholecalciferol) in ulcerative colitis patients with hypovitaminosis D3.

    abstract:AIM:To prospectively evaluate the effects of vitamin D3 on disease activity and quality of life in ulcerative colitis (UC) patients with hypovitaminosis D. METHODS:The study was a prospective double-blinded, randomized trial conducted at Community Regional Medical Center, Fresno, CA from 2012-2013. Patients with UC an...

    journal_title:PeerJ

    pub_type: journal article

    doi:10.7717/peerj.3654

    authors: Mathur J,Naing S,Mills P,Limsui D

    updated:2017-08-03 00:00:00

  • Properties of a cryptic lysyl oxidase from haloarchaeon Haloterrigena turkmenica.

    abstract:Background:Lysyl oxidases (LOX) have been extensively studied in mammals, whereas properties and functions of recently found homologues in prokaryotic genomes remain enigmatic. Methods:LOX open reading frame was cloned from Haloterrigena turkmenica in an E. coli expression vector. Recombinant Haloterrigena turkmenica ...

    journal_title:PeerJ

    pub_type: journal article

    doi:10.7717/peerj.6691

    authors: Pestov NB,Kalinovsky DV,Larionova TD,Zakirova AZ,Modyanov NN,Okkelman IA,Korneenko TV

    updated:2019-04-05 00:00:00

  • Ixodes scapularis microbiome correlates with life stage, not the presence of human pathogens, in ticks submitted for diagnostic testing.

    abstract::Ticks are globally distributed arthropods and a public health concern due to the many human pathogens they carry and transmit, including the causative agent of Lyme disease, Borrelia burgdorferi. As tick species' ranges increase, so do the number of reported tick related illnesses. The microbiome is a critical part of...

    journal_title:PeerJ

    pub_type: journal article

    doi:10.7717/peerj.10424

    authors: Gil JC,Helal ZH,Risatti G,Hird SM

    updated:2020-12-02 00:00:00

  • Gastric Helicobacter pylori infection perturbs human oral microbiota.

    abstract:Background:We investigated the effects of gastric Helicobacter pylori infection on the daytime and overnight human oral microbiota. Methods:Twenty four volunteers were recruited. Ten tested positive for H. pylori infection by the Carbon-14 Urea Breath Test, and the rest were negative. Two oral swabs were collected: on...

    journal_title:PeerJ

    pub_type: journal article

    doi:10.7717/peerj.6336

    authors: Chua EG,Chong JY,Lamichhane B,Webberley KM,Marshall BJ,Wise MJ,Tay CY

    updated:2019-01-28 00:00:00

  • The synergistic effect of concatenation in phylogenomics: the case in Pantoea.

    abstract::With the increased availability of genome sequences for bacteria, it has become routine practice to construct genome-based phylogenies. These phylogenies have formed the basis for various taxonomic decisions, especially for resolving problematic relationships between taxa. Despite the popularity of concatenating share...

    journal_title:PeerJ

    pub_type: journal article

    doi:10.7717/peerj.6698

    authors: Palmer M,Venter SN,McTaggart AR,Coetzee MPA,Van Wyk S,Avontuur JR,Beukes CW,Fourie G,Santana QC,Van Der Nest MA,Blom J,Steenkamp ET

    updated:2019-04-16 00:00:00

  • Genome-wide identification and characterization of heat shock protein family 70 provides insight into its divergent functions on immune response and development of Paralichthys olivaceus.

    abstract::Flatfish undergo extreme morphological development and settle to a benthic in the adult stage, and are likely to be more susceptible to environmental stress. Heat shock proteins 70 (hsp70) are involved in embryonic development and stress response in metazoan animals. However, the evolutionary history and functions of ...

    journal_title:PeerJ

    pub_type: journal article

    doi:10.7717/peerj.7781

    authors: Liu K,Hao X,Wang Q,Hou J,Lai X,Dong Z,Shao C

    updated:2019-11-11 00:00:00

  • Characterizing abnormal behavior in a large population of zoo-housed chimpanzees: prevalence and potential influencing factors.

    abstract::Abnormal behaviors in captive animals are generally defined as behaviors that are atypical for the species and are often considered to be indicators of poor welfare. Although some abnormal behaviors have been empirically linked to conditions related to elevated stress and compromised welfare in primates, others have l...

    journal_title:PeerJ

    pub_type: journal article

    doi:10.7717/peerj.2225

    authors: Jacobson SL,Ross SR,Bloomsmith MA

    updated:2016-07-13 00:00:00

  • Temporal and spatial strategies in an active place avoidance task on Carousel: a study of effects of stability of arena rotation speed in rats.

    abstract::The active place avoidance task is a dry-arena task used to assess spatial navigation and memory in rodents. In this task, a subject is put on a rotating circular arena and avoids an invisible sector that is stable in relation to the room. Rotation of the arena means that the subject's avoidance must be active, otherw...

    journal_title:PeerJ

    pub_type: journal article

    doi:10.7717/peerj.1257

    authors: Bahník Š,Stuchlík A

    updated:2015-09-22 00:00:00

  • Nuclear microsatellites reveal population genetic structuring and fine-scale pattern of hybridization in the Japanese mantis shrimp Oratosquilla oratoria.

    abstract::The interplay between historical and contemporary processes can produce complex patterns of genetic differentiation in the marine realm. Recent mitochondrial and nuclear sequence analyses revealed cryptic speciation in the Japanese mantis shrimp Oratosquilla oratoria. Herein, we applied nuclear microsatellite markers ...

    journal_title:PeerJ

    pub_type: journal article

    doi:10.7717/peerj.10270

    authors: Cheng J,Zhang N,Sha Z

    updated:2020-11-05 00:00:00

  • Integrative taxonomy of the ornamental 'peppermint' shrimp public market and population genetics of Lysmata boggessi, the most heavily traded species worldwide.

    abstract::The ornamental trade is a worldwide industry worth >15 billion USD with a problem of rampant product misidentification. Minimizing misidentification is critical in the face of overexploitation of species in the trade. We surveyed the peppermint shrimp ornamental marketplace in the southeastern USA, the most intense ma...

    journal_title:PeerJ

    pub_type: journal article

    doi:10.7717/peerj.3786

    authors: Baeza JA,Behringer DC

    updated:2017-09-18 00:00:00

  • Caprellid amphipods (Caprella spp.) are vulnerable to both physiological and habitat-mediated effects of ocean acidification.

    abstract::Ocean acidification (OA) is one of the most significant threats to marine life, and is predicted to drive important changes in marine communities. Although OA impacts will be the sum of direct effects mediated by alterations of physiological rates and indirect effects mediated by shifts in species interactions and bio...

    journal_title:PeerJ

    pub_type: journal article

    doi:10.7717/peerj.5327

    authors: Lim EG,Harley CDG

    updated:2018-07-31 00:00:00

  • Quantile-dependent expressivity of plasma adiponectin concentrations may explain its sex-specific heritability, gene-environment interactions, and genotype-specific response to postprandial lipemia.

    abstract:Background:"Quantile-dependent expressivity" occurs when the effect size of a genetic variant depends upon whether the phenotype (e.g. adiponectin) is high or low relative to its distribution. We have previously shown that the heritability (h2 ) of adiposity, lipoproteins, postprandial lipemia, pulmonary function, and ...

    journal_title:PeerJ

    pub_type: journal article

    doi:10.7717/peerj.10099

    authors: Williams PT

    updated:2020-10-14 00:00:00

  • A horizon scan of future threats and opportunities for pollinators and pollination.

    abstract::Background. Pollinators, which provide the agriculturally and ecologically essential service of pollination, are under threat at a global scale. Habitat loss and homogenisation, pesticides, parasites and pathogens, invasive species, and climate change have been identified as past and current threats to pollinators. Ac...

    journal_title:PeerJ

    pub_type: journal article

    doi:10.7717/peerj.2249

    authors: Brown MJ,Dicks LV,Paxton RJ,Baldock KC,Barron AB,Chauzat MP,Freitas BM,Goulson D,Jepsen S,Kremen C,Li J,Neumann P,Pattemore DE,Potts SG,Schweiger O,Seymour CL,Stout JC

    updated:2016-08-09 00:00:00

  • Analysis of the in planta transcriptome expressed by the corn pathogen Pantoea stewartii subsp. stewartii via RNA-Seq.

    abstract::Pantoea stewartii subsp. stewartii is a bacterial phytopathogen that causes Stewart's wilt disease in corn. It uses quorum sensing to regulate expression of some genes involved in virulence in a cell density-dependent manner as the bacterial population grows from small numbers at the initial infection site in the leaf...

    journal_title:PeerJ

    pub_type: journal article

    doi:10.7717/peerj.3237

    authors: Packard H,Kernell Burke A,Jensen RV,Stevens AM

    updated:2017-04-27 00:00:00

  • Agrichemicals and antibiotics in combination increase antibiotic resistance evolution.

    abstract::Antibiotic resistance in our pathogens is medicine's climate change: caused by human activity, and resulting in more extreme outcomes. Resistance emerges in microbial populations when antibiotics act on phenotypic variance within the population. This can arise from either genotypic diversity (resulting from a mutation...

    journal_title:PeerJ

    pub_type: journal article

    doi:10.7717/peerj.5801

    authors: Kurenbach B,Hill AM,Godsoe W,van Hamelsveld S,Heinemann JA

    updated:2018-10-12 00:00:00

  • Sample entropy analysis for the estimating depth of anaesthesia through human EEG signal at different levels of unconsciousness during surgeries.

    abstract::Estimating the depth of anaesthesia (DoA) in operations has always been a challenging issue due to the underlying complexity of the brain mechanisms. Electroencephalogram (EEG) signals are undoubtedly the most widely used signals for measuring DoA. In this paper, a novel EEG-based index is proposed to evaluate DoA for...

    journal_title:PeerJ

    pub_type: journal article

    doi:10.7717/peerj.4817

    authors: Liu Q,Ma L,Fan SZ,Abbod MF,Shieh JS

    updated:2018-05-23 00:00:00

  • Associations of ADL and IADL disability with physical and mental dimensions of quality of life in people aged 75 years and older.

    abstract:Background:Quality of life is an important health outcome for older persons. It predicts the adverse outcomes of institutionalization and premature death. The aim of this cross-sectional study was to determine the influence of both disability in activities of daily living (ADL) and instrumental activities of daily livi...

    journal_title:PeerJ

    pub_type: journal article

    doi:10.7717/peerj.5425

    authors: Gobbens RJ

    updated:2018-08-09 00:00:00

  • Blue hypertext is a good design decision: no perceptual disadvantage in reading and successful highlighting of relevant information.

    abstract:BACKGROUND:Highlighted text in the Internet (i.e., hypertext) is predominantly blue and underlined. The perceptibility of these hypertext characteristics was heavily questioned by applied research and empirical tests resulted in inconclusive results. The ability to recognize blue text in foveal and parafoveal vision wa...

    journal_title:PeerJ

    pub_type: journal article

    doi:10.7717/peerj.2467

    authors: Gagl B

    updated:2016-09-20 00:00:00

  • Importance of biotic predictors in estimation of potential invasive areas: the example of the tortoise beetle Eurypedus nigrosignatus, in Hispaniola.

    abstract::Climatic variables have been the main predictors employed in ecological niche modeling and species distribution modeling, although biotic interactions are known to affect species' spatial distributions via mechanisms such as predation, competition, and mutualism. Biotic interactions can affect species' responses to ab...

    journal_title:PeerJ

    pub_type: journal article

    doi:10.7717/peerj.6052

    authors: Simões MVP,Peterson AT

    updated:2018-12-05 00:00:00

  • Diversity, host-specificity and stability of sponge-associated fungal communities of co-occurring sponges.

    abstract::Fungi play a critical role in a range of ecosystems; however, their interactions and functions in marine hosts, and particular sponges, is poorly understood. Here we assess the fungal community composition of three co-occurring sponges (Cymbastela concentrica, Scopalina sp., Tedania anhelans) and the surrounding seawa...

    journal_title:PeerJ

    pub_type: journal article

    doi:10.7717/peerj.4965

    authors: Nguyen MTHD,Thomas T

    updated:2018-06-04 00:00:00

  • Social network community structure and the contact-mediated sharing of commensal E. coli among captive rhesus macaques (Macaca mulatta).

    abstract::In group-living animals, heterogeneity in individuals' social connections may mediate the sharing of microbial infectious agents. In this regard, the genetic relatedness of individuals' commensal gut bacterium Escherichia coli may be ideal to assess the potential for pathogen transmission through animal social network...

    journal_title:PeerJ

    pub_type: journal article

    doi:10.7717/peerj.4271

    authors: Balasubramaniam K,Beisner B,Guan J,Vandeleest J,Fushing H,Atwill E,McCowan B

    updated:2018-01-17 00:00:00

  • Move it or lose it: interspecific variation in risk response of pond-breeding anurans.

    abstract::Changes in behavior are often the proximate response of animals to human disturbance, with variability in tolerance levels leading some species to exhibit striking shifts in life history, fitness, and/or survival. Thus, elucidating the effects of disturbance on animal behavior, and how this varies among taxonomically ...

    journal_title:PeerJ

    pub_type: journal article

    doi:10.7717/peerj.6956

    authors: Matich P,Schalk CM

    updated:2019-06-07 00:00:00

  • Optimizing the use of a sensor resource for opponent polarization coding.

    abstract::Flies use specialized photoreceptors R7 and R8 in the dorsal rim area (DRA) to detect skylight polarization. R7 and R8 form a tiered waveguide (central rhabdomere pair, CRP) with R7 on top, filtering light delivered to R8. We examine how the division of a given resource, CRP length, between R7 and R8 affects their abi...

    journal_title:PeerJ

    pub_type: journal article

    doi:10.7717/peerj.2772

    authors: Heras FJ,Laughlin SB

    updated:2017-01-12 00:00:00

  • Plasma proteomic analysis of systemic lupus erythematosus patients using liquid chromatography/tandem mass spectrometry with label-free quantification.

    abstract:Context:Systemic lupus erythematosus (SLE) is a chronic inflammatory autoimmune disease with unknown etiology. Objective:Human plasma is comprised of over 10 orders of magnitude concentration of proteins and tissue leakages. The changes in the abundance of these proteins have played an important role in various human ...

    journal_title:PeerJ

    pub_type: journal article

    doi:10.7717/peerj.4730

    authors: Madda R,Lin SC,Sun WH,Huang SL

    updated:2018-05-08 00:00:00

  • The use of artificial substrate units to improve inventories of cryptic crustacean species on Caribbean coral reefs.

    abstract::Motile cryptofauna inhabiting coral reefs are complex assemblages that utilize the space available among dead coral stands and the surrounding coral rubble substrate. They comprise a group of organisms largely overlooked in biodiversity estimates because they are hard to collect and identify, and their collection caus...

    journal_title:PeerJ

    pub_type: journal article

    doi:10.7717/peerj.10389

    authors: Monroy-Velázquez LV,Rodríguez-Martínez RE,Blanchon P,Alvarez F

    updated:2020-11-23 00:00:00