CLARK: fast and accurate classification of metagenomic and genomic sequences using discriminative k-mers.

Abstract:

BACKGROUND:The problem of supervised DNA sequence classification arises in several fields of computational molecular biology. Although this problem has been extensively studied, it is still computationally challenging due to the size of the datasets that modern sequencing technologies can produce. RESULTS:We introduce CLARK, a novel approach to classify metagenomic reads at the species or genus level with high accuracy and high speed. Extensive experimental results on various metagenomic samples show that the classification accuracy of CLARK is better than or comparable to that of the best state-of-the-art tools, and that it is significantly faster than any of its competitors. In its fastest single-threaded mode, CLARK classifies, with high accuracy, about 32 million metagenomic short reads per minute. CLARK can also classify BAC clones or transcripts to chromosome arms and centromeric regions. CONCLUSIONS:CLARK is a versatile, fast and accurate sequence classification method, especially useful for metagenomics and genomics applications. It is freely available at http://clark.cs.ucr.edu/.
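
The abstract does not describe the algorithm itself. As a rough illustration of what "discriminative k-mers" in the title may refer to, the sketch below assumes they are k-mers that occur in the reference sequence of exactly one target, and that a read is assigned to the target with the most such hits. The function names, the toy sequences and the choice of k=5 are hypothetical and are not taken from CLARK; reverse complements and tie-breaking are ignored.

```python
from collections import Counter, defaultdict


def kmers(seq, k):
    """Yield every length-k substring of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_discriminative_index(references, k):
    """Map each k-mer found in exactly one target to that target's label."""
    owners = defaultdict(set)
    for label, seq in references.items():
        for kmer in kmers(seq, k):
            owners[kmer].add(label)
    # Drop k-mers shared by two or more targets: they cannot discriminate.
    return {kmer: next(iter(labels))
            for kmer, labels in owners.items() if len(labels) == 1}


def classify(read, index, k):
    """Return the target with the most discriminative k-mer hits, or None."""
    hits = Counter(index[kmer] for kmer in kmers(read, k) if kmer in index)
    if not hits:
        return None
    return hits.most_common(1)[0][0]


# Toy reference sequences (invented for illustration, not real genomes).
references = {
    "species_A": "ACGTACGTGGCCTTAA",
    "species_B": "TTGACCATGGCCTTAA",
}
index = build_discriminative_index(references, k=5)
print(classify("ACGTACGTGG", index, k=5))  # read drawn from species_A only
print(classify("GGCCTTAA", index, k=5))    # read drawn from the shared region
```

In this toy setup a read taken entirely from the region shared by both references gets no hits and is left unclassified, which is the point of discarding non-discriminative k-mers.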

journal_name

BMC Genomics

authors

Ounit R,Wanamaker S,Close TJ,Lonardi S

doi

10.1186/s12864-015-1419-2

pub_date

2015-03-25 00:00:00

pages

236

issn

1471-2164

pii

10.1186/s12864-015-1419-2

journal_volume

16

pub_type

Journal Article
  • Orthonome - a new pipeline for predicting high quality orthologue gene sets applicable to complete and draft genomes.

    abstract:BACKGROUND:Distinguishing orthologous and paralogous relationships between genes across multiple species is essential for comparative genomic analyses. Various computational approaches have been developed to resolve these evolutionary relationships, but strong trade-offs between precision and recall of orthologue predi...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-017-4079-6

    authors: Rane RV,Oakeshott JG,Nguyen T,Hoffmann AA,Lee SF

    update_date:2017-08-31 00:00:00

  • Genome-wide mapping of Hif-1α binding sites in zebrafish.

    abstract:BACKGROUND:Hypoxia Inducible Factor (HIF) regulates a cascade of transcriptional events in response to decreased oxygenation, acting from the cellular to the physiological level. This response is evolutionarily conserved, allowing the use of zebrafish (Danio rerio) as a model for studying the hypoxic response. Activati...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-015-2169-x

    authors: Greenald D,Jeyakani J,Pelster B,Sealy I,Mathavan S,van Eeden FJ

    update_date:2015-11-11 00:00:00

  • Unique aspects of fiber degradation by the ruminal ethanologen Ruminococcus albus 7 revealed by physiological and transcriptomic analysis.

    abstract:BACKGROUND:Bacteria in the genus Ruminococcus are ubiquitous members of the mammalian gastrointestinal tract. In particular, they are important in ruminants where they digest a wide range of plant cell wall polysaccharides. For example, Ruminococcus albus 7 is a primary cellulose degrader that produces acetate usable b...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-15-1066

    authors: Christopherson MR,Dawson JA,Stevenson DM,Cunningham AC,Bramhacharya S,Weimer PJ,Kendziorski C,Suen G

    update_date:2014-12-04 00:00:00

  • Global gene expression profiling of brown to white adipose tissue transformation in sheep reveals novel transcriptional components linked to adipose remodeling.

    abstract:BACKGROUND:Large mammals are capable of thermoregulation shortly after birth due to the presence of brown adipose tissue (BAT). The majority of BAT disappears after birth and is replaced by white adipose tissue (WAT). RESULTS:We analyzed the postnatal transformation of adipose in sheep with a time course study of the ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-015-1405-8

    authors: Basse AL,Dixen K,Yadav R,Tygesen MP,Qvortrup K,Kristiansen K,Quistorff B,Gupta R,Wang J,Hansen JB

    update_date:2015-03-19 00:00:00

  • Leaps and lulls in the developmental transcriptome of Dictyostelium discoideum.

    abstract:BACKGROUND:Development of the soil amoeba Dictyostelium discoideum is triggered by starvation. When placed on a solid substrate, the starving solitary amoebae cease growth, communicate via extracellular cAMP, aggregate by tens of thousands and develop into multicellular organisms. Early phases of the developmental prog...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-015-1491-7

    authors: Rosengarten RD,Santhanam B,Fuller D,Katoh-Kurasawa M,Loomis WF,Zupan B,Shaulsky G

    update_date:2015-04-13 00:00:00

  • Sequence diversity and differential expression of major phenylpropanoid-flavonoid biosynthetic genes among three mango varieties.

    abstract:BACKGROUND:Mango fruits contain a broad spectrum of phenolic compounds which impart potential health benefits; their biosynthesis is catalysed by enzymes in the phenylpropanoid-flavonoid (PF) pathway. The aim of this study was to reveal the variability in genes involved in the PF pathway in three different mango variet...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-015-1784-x

    authors: Hoang VL,Innes DJ,Shaw PN,Monteith GR,Gidley MJ,Dietzgen RG

    update_date:2015-07-30 00:00:00

  • AP-SKAT: highly-efficient genome-wide rare variant association test.

    abstract:BACKGROUND:Genome-wide association studies have revealed associations between single-nucleotide polymorphisms (SNPs) and phenotypes such as disease symptoms and drug tolerance. To address the small sample size for rare variants, association studies tend to group gene or pathway level variants and evaluate the effect on...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-016-3094-3

    authors: Hasegawa T,Kojima K,Kawai Y,Misawa K,Mimori T,Nagasaki M

    update_date:2016-09-21 00:00:00

  • Papain-like cysteine proteases in Carica papaya: lineage-specific gene duplication and expansion.

    abstract:BACKGROUND:Papain-like cysteine proteases (PLCPs), a large group of cysteine proteases structurally related to papain, play important roles in plant development, senescence, and defense responses. Papain, the first cysteine protease whose structure was determined by X-ray crystallography, plays a crucial role in protec...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-017-4394-y

    authors: Liu J,Sharma A,Niewiara MJ,Singh R,Ming R,Yu Q

    update_date:2018-01-06 00:00:00

  • Positive correlation between gene coexpression and positional clustering in the zebrafish genome.

    abstract:BACKGROUND:Co-expressing genes tend to cluster in eukaryotic genomes. This paper analyzes correlation between the proximity of eukaryotic genes and their transcriptional expression pattern in the zebrafish (Danio rerio) genome using available microarray data and gene annotation. RESULTS:The analyses show that neighbou...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-10-42

    authors: Ng YK,Wu W,Zhang L

    update_date:2009-01-22 00:00:00

  • Small RNA-based prediction of hybrid performance in maize.

    abstract:BACKGROUND:Small RNA (sRNA) sequences are known to have a broad impact on gene regulation by various mechanisms. Their performance for the prediction of hybrid traits has not yet been analyzed. Our objective was to analyze the relation of parental sRNA expression with the performance of their hybrids, to develop a sRNA...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-018-4708-8

    authors: Seifert F,Thiemann A,Schrag TA,Rybka D,Melchinger AE,Frisch M,Scholten S

    update_date:2018-05-21 00:00:00

  • High-utility conserved avian microsatellite markers enable parentage and population studies across a wide range of species.

    abstract:BACKGROUND:Microsatellites are widely used for many genetic studies. In contrast to single nucleotide polymorphism (SNP) and genotyping-by-sequencing methods, they are readily typed in samples of low DNA quality/concentration (e.g. museum/non-invasive samples), and enable the quick, cheap identification of species, hyb...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-14-176

    authors: Dawson DA,Ball AD,Spurgin LG,Martín-Gálvez D,Stewart IR,Horsburgh GJ,Potter J,Molina-Morales M,Bicknell AW,Preston SA,Ekblom R,Slate J,Burke T

    update_date:2013-03-15 00:00:00

  • Short-term genome evolution of Listeria monocytogenes in a non-controlled environment.

    abstract:BACKGROUND:While increasing data on bacterial evolution in controlled environments are available, our understanding of bacterial genome evolution in natural environments is limited. We thus performed full genome analyses on four Listeria monocytogenes, including human and food isolates from both a 1988 case of sporadic...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-9-539

    authors: Orsi RH,Borowsky ML,Lauer P,Young SK,Nusbaum C,Galagan JE,Birren BW,Ivy RA,Sun Q,Graves LM,Swaminathan B,Wiedmann M

    update_date:2008-11-13 00:00:00

  • Screening populations for copy number variation using genotyping-by-sequencing: a proof of concept using soybean fast neutron mutants.

    abstract:BACKGROUND:The effective use of mutant populations for reverse genetic screens relies on the population-wide characterization of the induced mutations. Genome- and population-wide characterization of the mutations found in fast neutron populations has been hindered, however, by the wide range of mutations generated and...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-019-5998-1

    authors: Lemay MA,Torkamaneh D,Rigaill G,Boyle B,Stec AO,Stupar RM,Belzile F

    update_date:2019-08-06 00:00:00

  • atBioNet--an integrated network analysis tool for genomics and biomarker discovery.

    abstract:BACKGROUND:Large amounts of mammalian protein-protein interaction (PPI) data have been generated and are available for public use. From a systems biology perspective, Proteins/genes interactions encode the key mechanisms distinguishing disease and health, and such mechanisms can be uncovered through network analysis. A...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-13-325

    authors: Ding Y,Chen M,Liu Z,Ding D,Ye Y,Zhang M,Kelly R,Guo L,Su Z,Harris SC,Qian F,Ge W,Fang H,Xu X,Tong W

    update_date:2012-07-20 00:00:00

  • Delineation of condition specific Cis- and Trans-acting elements in plant promoters under various Endo- and exogenous stimuli.

    abstract:BACKGROUND:Transcription factors (TFs) play essential roles during plant development and response to environmental stresses. However, the relationships among transcription factors, cis-acting elements and target gene expression under endo- and exogenous stimuli have not been systematically characterized. RESULTS:Here,...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-018-4469-4

    authors: Chow CN,Chiang-Hsieh YF,Chien CH,Zheng HQ,Lee TY,Wu NY,Tseng KC,Hou PF,Chang WC

    update_date:2018-05-09 00:00:00

  • Stress-mediated convergence of splicing landscapes in male and female rock doves.

    abstract:BACKGROUND:The process of alternative splicing provides a unique mechanism by which eukaryotes are able to produce numerous protein products from the same gene. Heightened variability in the proteome has been thought to potentiate increased behavioral complexity and response flexibility to environmental stimuli, thus c...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-020-6600-6

    authors: Lang AS,Austin SH,Harris RM,Calisi RM,MacManes MD

    update_date:2020-03-23 00:00:00

  • Identification and transcriptomic profiling of genes involved in increasing sugar content during salt stress in sweet sorghum leaves.

    abstract:BACKGROUND:Sweet sorghum is an annual C4 crop considered to be one of the most promising bio-energy crops due to its high sugar content in stem, yet it is poorly understood how this plant increases its sugar content in response to salt stress. In response to high NaCl, many of its major processes, such as photosynthesi...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-015-1760-5

    authors: Sui N,Yang Z,Liu M,Wang B

    update_date:2015-07-19 00:00:00

  • Unsupervised genome-wide recognition of local relationship patterns.

    abstract:BACKGROUND:Phenomena such as incomplete lineage sorting, horizontal gene transfer, gene duplication and subsequent sub- and neo-functionalisation can result in distinct local phylogenetic relationships that are discordant with species phylogeny. In order to assess the possible biological roles for these subdivisions, t...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-14-347

    authors: Zamani N,Russell P,Lantz H,Hoeppner MP,Meadows JR,Vijay N,Mauceli E,di Palma F,Lindblad-Toh K,Jern P,Grabherr MG

    update_date:2013-05-24 00:00:00

  • Transcriptome of the adult female malaria mosquito vector Anopheles albimanus.

    abstract:BACKGROUND:Human Malaria is transmitted by mosquitoes of the genus Anopheles. Transmission is a complex phenomenon involving biological and environmental factors of humans, parasites and mosquitoes. Among more than 500 anopheline species, only a few species from different branches of the mosquito evolutionary tree tran...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-13-207

    authors: Martínez-Barnetche J,Gómez-Barreto RE,Ovilla-Muñoz M,Téllez-Sosa J,García López DE,Dinglasan RR,Ubaida Mohien C,MacCallum RM,Redmond SN,Gibbons JG,Rokas A,Machado CA,Cazares-Raga FE,González-Cerón L,Hernández-Martínez S,Rod

    update_date:2012-05-30 00:00:00

  • Directional RNA-seq reveals highly complex condition-dependent transcriptomes in E. coli K12 through accurate full-length transcripts assembling.

    abstract:BACKGROUND:Although prokaryotic gene transcription has been studied over decades, many aspects of the process remain poorly understood. Particularly, recent studies have revealed that transcriptomes in many prokaryotes are far more complex than previously thought. Genes in an operon are often alternatively and dynamica...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-14-520

    authors: Li S,Dong X,Su Z

    update_date:2013-07-30 00:00:00

  • Identification of a quantitative trait loci (QTL) associated with ammonia tolerance in the Pacific white shrimp (Litopenaeus vannamei).

    abstract:BACKGROUND:Ammonia is one of the most common toxicological environment factors affecting shrimp health. Although ammonia tolerance in shrimp is closely related to successful industrial production, few genetic studies of this trait are available. RESULTS:In this study, we constructed a high-density genetic map of the P...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-020-07254-x

    authors: Zeng D,Yang C,Li Q,Zhu W,Chen X,Peng M,Chen X,Lin Y,Wang H,Liu H,Liang J,Liu Q,Zhao Y

    update_date:2020-12-02 00:00:00

  • Analysis of the transcriptome of Panax notoginseng root uncovers putative triterpene saponin-biosynthetic genes and genetic markers.

    abstract:BACKGROUND:Panax notoginseng (Burk) F.H. Chen is an important medicinal plant of the Araliaceae family. Triterpene saponins are the bioactive constituents in P. notoginseng. However, available genomic information regarding this plant is limited. Moreover, details of triterpene saponin biosynthesis in the Panax species ar...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-12-S5-S5

    authors: Luo H,Sun C,Sun Y,Wu Q,Li Y,Song J,Niu Y,Cheng X,Xu H,Li C,Liu J,Steinmetz A,Chen S

    update_date:2011-12-23 00:00:00

  • Expansion of CORE-SINEs in the genome of the Tasmanian devil.

    abstract:BACKGROUND:The genome of the carnivorous marsupial, the Tasmanian devil (Sarcophilus harrisii, Order: Dasyuromorphia), was sequenced in the hopes of finding a cure for or gaining a better understanding of the contagious devil facial tumor disease that is threatening the species' survival. To better understand the Tasma...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-13-172

    authors: Nilsson MA,Janke A,Murchison EP,Ning Z,Hallström BM

    update_date:2012-05-06 00:00:00

  • Clustered regulatory elements at nucleosome-depleted regions punctuate a constant nucleosomal landscape in Schizosaccharomyces pombe.

    abstract:BACKGROUND:Nucleosomes facilitate the packaging of the eukaryotic genome and modulate the access of regulators to DNA. A detailed description of the nucleosomal organization under different transcriptional programmes is essential to understand their contribution to genomic regulation. RESULTS:To visualize the dynamics...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-14-813

    authors: Soriano I,Quintales L,Antequera F

    update_date:2013-11-21 00:00:00

  • Genome-wide identification of novel intergenic enhancer-like elements: implications in the regulation of transcription in Plasmodium falciparum.

    abstract:BACKGROUND:The molecular mechanisms of transcriptional regulation are poorly understood in Plasmodium falciparum. In addition, most of the genes in Plasmodium falciparum are transcriptionally poised and only a handful of cis-regulatory elements are known to operate in transcriptional regulation. Here, we employed an ep...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-017-4052-4

    authors: Ubhe S,Rawat M,Verma S,Anamika K,Karmodiya K

    update_date:2017-08-23 00:00:00

  • Genome-wide association study using family-based cohorts identifies the WLS and CCDC170/ESR1 loci as associated with bone mineral density.

    abstract:BACKGROUND:Osteoporosis is a common and debilitating bone disease that is characterised by a low bone mineral density (BMD), a highly heritable trait. Genome-wide association studies (GWAS) have proven to be very successful in identifying common genetic variants associated with BMD adjusted for age, gender and weight, ...

    journal_title:BMC genomics

    pub_type: Journal Article, Meta-Analysis

    doi:10.1186/s12864-016-2481-0

    authors: Mullin BH,Walsh JP,Zheng HF,Brown SJ,Surdulescu GL,Curtis C,Breen G,Dudbridge F,Richards JB,Spector TD,Wilson SG

    update_date:2016-02-25 00:00:00

  • Comparative analysis of protein-protein interactions in the defense response of rice and wheat.

    abstract:BACKGROUND:Despite the importance of wheat as a major staple crop and the negative impact of diseases on its production worldwide, the genetic mechanisms and gene interactions involved in the resistance response in wheat are still poorly understood. The complete sequence of the rice genome has provided an extremely use...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-14-166

    authors: Cantu D,Yang B,Ruan R,Li K,Menzo V,Fu D,Chern M,Ronald PC,Dubcovsky J

    update_date:2013-03-12 00:00:00

  • Assessing runs of Homozygosity: a comparison of SNP Array and whole genome sequence low coverage data.

    abstract:BACKGROUND:Runs of Homozygosity (ROH) are genomic regions where identical haplotypes are inherited from each parent. Since their first detection due to technological advances in the late 1990s, ROHs have been shedding light on human population history and deciphering the genetic basis of monogenic and complex traits an...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-018-4489-0

    authors: Ceballos FC,Hazelhurst S,Ramsay M

    update_date:2018-01-30 00:00:00

  • Functional Annotation of All Salmonid Genomes (FAASG): an international initiative supporting future salmonid research, conservation and aquaculture.

    abstract:We describe an emerging initiative - the 'Functional Annotation of All Salmonid Genomes' (FAASG), which will leverage the extensive trait diversity that has evolved since a whole genome duplication event in the salmonid ancestor, to develop an integrative understanding of the functional genomic basis of phenotypic var...

    journal_title:BMC genomics

    pub_type: Editorial

    doi:10.1186/s12864-017-3862-8

    authors: Macqueen DJ,Primmer CR,Houston RD,Nowak BF,Bernatchez L,Bergseth S,Davidson WS,Gallardo-Escárate C,Goldammer T,Guiguen Y,Iturra P,Kijas JW,Koop BF,Lien S,Maass A,Martin SAM,McGinnity P,Montecino M,Naish KA,Nichols K

    update_date:2017-06-27 00:00:00

  • Spir2; a novel QTL on chromosome 4 contributes to susceptibility to pneumococcal infection in mice.

    abstract:BACKGROUND:Streptococcus pneumoniae causes over one million deaths worldwide annually, despite recent developments in vaccine and antibiotic therapy. Host susceptibility to pneumococcal infection and disease is controlled by a combination of genetic and environmental influences, but current knowledge remains limited. ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-14-242

    authors: Wisby L,Fernandes VE,Neill DR,Kadioglu A,Andrew PW,Denny P

    update_date:2013-04-11 00:00:00