A kingdom-specific protein domain HMM library for improved annotation of fungal genomes.

Abstract:

BACKGROUND: Pfam is a general-purpose database of protein domain alignments and profile hidden Markov models (HMMs) that is widely used to annotate sequence data from genome sequencing projects. Pfam models are often very general in terms of the taxa they cover, and it has previously been suggested that such general models may lack some of the specificity or selectivity that kingdom-specific models would provide.

RESULTS: Here we present a general approach to create domain libraries of HMMs for sub-taxa of a kingdom. Taking fungal species as an example, we construct a domain library of HMMs (called Fungal Pfam or FPfam) using sequences from 30 genomes, consisting of 24 species from the ascomycetes group and two basidiomycetes, Ustilago maydis, a fungal pathogen of maize, and the white rot fungus Phanerochaete chrysosporium. In addition, we include the microsporidian Encephalitozoon cuniculi, an obligate intracellular parasite, and two non-fungal species, the oomycetes Phytophthora sojae and Phytophthora ramorum, both plant pathogens. We evaluate performance in terms of coverage against the original 30 genomes used in training FPfam and against five more recently sequenced fungal genomes that can be considered an independent test set. We show that kingdom-specific models such as FPfam can find instances of both novel and well-characterized domains, increase overall coverage, and detect more domains per sequence, typically with higher bit scores than Pfam for the same domain families. An evaluation of the effect of changing E-values on coverage shows that the performance of FPfam is consistent over the range of E-values applied.

CONCLUSION: Kingdom-specific models are shown to provide improved coverage. However, as the models become more specific, some sequences found by Pfam may be missed by the models in FPfam, and some of the families represented in the test set are not present in FPfam. Therefore, we recommend that general and specific libraries be used together for annotation; we find that a significant improvement in coverage is achieved by using both Pfam and FPfam.
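
The coverage comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the hit lists, sequence identifiers, E-value cutoffs, and function names below are all hypothetical, and coverage is assumed here to mean the fraction of sequences in a proteome with at least one domain hit at or below a given E-value cutoff. The sketch shows how coverage from a general library, a specific library, and their union could be tabulated at different cutoffs.

```python
from typing import Dict, Iterable, Set, Tuple

# A hit is (sequence id, E-value). All data below are made up for illustration.
Hit = Tuple[str, float]


def covered(hits: Iterable[Hit], max_evalue: float) -> Set[str]:
    """Return ids of sequences with at least one hit at or below the cutoff."""
    return {seq_id for seq_id, evalue in hits if evalue <= max_evalue}


def coverage(covered_ids: Set[str], proteome: Set[str]) -> float:
    """Fraction of the proteome that has at least one domain hit."""
    if not proteome:
        return 0.0
    return len(covered_ids & proteome) / len(proteome)


# Hypothetical proteome and hit lists standing in for the two libraries.
PROTEOME = {"p1", "p2", "p3", "p4", "p5"}
GENERAL_HITS = [("p1", 1e-20), ("p2", 1e-3)]
SPECIFIC_HITS = [("p2", 1e-30), ("p3", 1e-8), ("p4", 0.5)]


def report(max_evalue: float) -> Dict[str, float]:
    """Coverage of each library alone and of both combined at one cutoff."""
    general = covered(GENERAL_HITS, max_evalue)
    specific = covered(SPECIFIC_HITS, max_evalue)
    return {
        "general": coverage(general, PROTEOME),
        "specific": coverage(specific, PROTEOME),
        "combined": coverage(general | specific, PROTEOME),
    }


if __name__ == "__main__":
    for cutoff in (1e-5, 1.0):
        print(cutoff, report(cutoff))
```

Taking the union of covered sequences is one simple way to model using both libraries together; a real annotation run would also need to resolve overlapping domain assignments, which this sketch ignores.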

journal_name

BMC Genomics

journal_title

BMC genomics

authors

Alam I,Hubbard SJ,Oliver SG,Rattray M

doi

10.1186/1471-2164-8-97

subject

Has Abstract

pub_date

2007-04-10 00:00:00

pages

97

issn

1471-2164

pii

1471-2164-8-97

journal_volume

8

pub_type

Journal Article
  • Zooplankton diversity analysis through single-gene sequencing of a community sample.

    abstract:BACKGROUND:Oceans cover more than 70% of the earth's surface and are critical for the homeostasis of the environment. Among the components of the ocean ecosystem, zooplankton play vital roles in energy and matter transfer through the system. Despite their importance, understanding of zooplankton biodiversity is limited...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-10-438

    authors: Machida RJ,Hashiguchi Y,Nishida M,Nishida S

    update_date:2009-09-17 00:00:00

  • Bcheck: a wrapper tool for detecting RNase P RNA genes.

    abstract:BACKGROUND:Effective bioinformatics solutions are needed to tackle challenges posed by industrial-scale genome annotation. We present Bcheck, a wrapper tool which predicts RNase P RNA genes by combining the speed of pattern matching and sensitivity of covariance models. The core of Bcheck is a library of subfamily spec...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-11-432

    authors: Yusuf D,Marz M,Stadler PF,Hofacker IL

    update_date:2010-07-13 00:00:00

  • Genome-wide expression profiling shows transcriptional reprogramming in Fusarium graminearum by Fusarium graminearum virus 1-DK21 infection.

    abstract:BACKGROUND:Fusarium graminearum virus 1 strain-DK21 (FgV1-DK21) is a mycovirus that confers hypovirulence to F. graminearum, which is the primary phytopathogenic fungus that causes Fusarium head blight (FHB) disease in many cereals. Understanding the interaction between mycoviruses and plant pathogenic fungi is necessa...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-13-173

    authors: Cho WK,Yu J,Lee KM,Son M,Min K,Lee YW,Kim KH

    update_date:2012-05-06 00:00:00

  • Activation of metabolic and stress responses during subtoxic expression of the type I toxin hok in Erwinia amylovora.

    abstract:BACKGROUND:Toxin-antitoxin (TA) systems, abundant in prokaryotes, are composed of a toxin gene and its cognate antitoxin. Several toxins are implied to affect the physiological state and stress tolerance of bacteria in a population. We previously identified a chromosomally encoded hok-sok type I TA system in Erwinia am...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-021-07376-w

    authors: Peng J,Triplett LR,Sundin GW

    update_date:2021-01-22 00:00:00

  • Metabolic modeling and analysis of the metabolic switch in Streptomyces coelicolor.

    abstract:BACKGROUND:The transition from exponential to stationary phase in Streptomyces coelicolor is accompanied by a major metabolic switch and results in a strong activation of secondary metabolism. Here we have explored the underlying reorganization of the metabolome by combining computational predictions based on constrain...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-11-202

    authors: Alam MT,Merlo ME,STREAM Consortium.,Hodgson DA,Wellington EM,Takano E,Breitling R

    update_date:2010-03-26 00:00:00

  • DNA methylation regulates discrimination of enhancers from promoters through a H3K4me1-H3K4me3 seesaw mechanism.

    abstract:BACKGROUND:DNA methylation at promoters is largely correlated with inhibition of gene expression. However, the role of DNA methylation at enhancers is not fully understood, although a crosstalk with chromatin marks is expected. Actually, there exist contradictory reports about positive and negative correlations between...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-017-4353-7

    authors: Sharifi-Zarchi A,Gerovska D,Adachi K,Totonchi M,Pezeshk H,Taft RJ,Schöler HR,Chitsaz H,Sadeghi M,Baharvand H,Araúzo-Bravo MJ

    update_date:2017-12-12 00:00:00

  • Exposure to maternal obesity alters gene expression in the preimplantation ovine conceptus.

    abstract:BACKGROUND:Embryonic and fetal exposure to maternal obesity causes several maladaptive morphological and epigenetic changes in exposed offspring. The timing of these events is unclear, but changes can be observed even after a short exposure to maternal obesity around the time of conception. The hypothesis of this work ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-018-5120-0

    authors: McCoski SR,Vailes MT,Owens CE,Cockrum RR,Ealy AD

    update_date:2018-10-11 00:00:00

  • Drosophila melanogaster retrotransposon and inverted repeat-derived endogenous siRNAs are differentially processed in distinct cellular locations.

    abstract:BACKGROUND:Endogenous small interfering (esi)RNAs repress mRNA levels and retrotransposon mobility in Drosophila somatic cells by poorly understood mechanisms. 21 nucleotide esiRNAs are primarily generated from retrotransposons and two inverted repeat (hairpin) loci in Drosophila culture cells in a Dicer2 dependent man...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-017-3692-8

    authors: Harrington AW,McKain MR,Michalski D,Bauer KM,Daugherty JM,Steiniger M

    update_date:2017-04-17 00:00:00

  • Comparative genomic analysis of eutherian fibroblast growth factor genes.

    abstract:BACKGROUND:The eutherian fibroblast growth factors were implicated as key regulators in developmental processes. However, there were major disagreements in descriptions of comprehensive eutherian fibroblast growth factors gene data sets including either 18 or 22 homologues. The present analysis attempted to revise and ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-020-06958-4

    authors: Premzl M

    update_date:2020-08-05 00:00:00

  • Comprehensive SNP array study of frequently used neuroblastoma cell lines; copy neutral loss of heterozygosity is common in the cell lines but uncommon in primary tumors.

    abstract:BACKGROUND:Copy neutral loss of heterozygosity (CN-LOH) refers to a special case of LOH occurring without any resulting loss in copy number. These alterations are sometimes seen in tumors as a way to inactivate a tumor suppressor gene and have been found to be important in several types of cancer. RESULTS:We have used ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-12-443

    authors: Kryh H,Carén H,Erichsen J,Sjöberg RM,Abrahamsson J,Kogner P,Martinsson T

    update_date:2011-09-07 00:00:00

  • Comparative genomics and concerted evolution of beta-tubulin paralogs in Leishmania spp.

    abstract:BACKGROUND:Tubulin isotypes and expression patterns are highly regulated in diverse organisms. The genome sequence of the protozoan parasite Leishmania major contains three distinct beta-tubulin loci. To investigate the diversity of beta-tubulin genes, we have compared the published genome sequence to draft genome sequ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-7-137

    authors: Jackson AP,Vaughan S,Gull K

    update_date:2006-06-06 00:00:00

  • Evidence of uneven selective pressure on different subsets of the conserved human genome; implications for the significance of intronic and intergenic DNA.

    abstract:BACKGROUND:Human genetic variation produces the wide range of phenotypic differences that make us individual. However, little is known about the distribution of variation in the most conserved functional regions of the human genome. We examined whether different subsets of the conserved human genome have been subjected...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-10-614

    authors: Davidson S,Starkey A,MacKenzie A

    update_date:2009-12-16 00:00:00

  • Rapid quantification of sequence repeats to resolve the size, structure and contents of bacterial genomes.

    abstract:BACKGROUND:The numerous classes of repeats often impede the assembly of genome sequences from the short reads provided by new sequencing technologies. We demonstrate a simple and rapid means to ascertain the repeat structure and total size of a bacterial or archaeal genome without the need for assembly by directly anal...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-14-537

    authors: Williams D,Trimble WL,Shilts M,Meyer F,Ochman H

    update_date:2013-08-08 00:00:00

  • Novel microRNA discovery using small RNA sequencing in post-mortem human brain.

    abstract:BACKGROUND:MicroRNAs (miRNAs) are short, non-coding RNAs that regulate gene expression mainly through translational repression of target mRNA molecules. More than 2700 human miRNAs have been identified and some are known to be associated with disease phenotypes and to display tissue-specific patterns of expression. ME...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-016-3114-3

    authors: Wake C,Labadorf A,Dumitriu A,Hoss AG,Bregu J,Albrecht KH,DeStefano AL,Myers RH

    update_date:2016-10-04 00:00:00

  • Unlocking the mystery of the hard-to-sequence phage genome: PaP1 methylome and bacterial immunity.

    abstract:BACKGROUND:Whole-genome sequencing is an important method to understand the genetic information, gene function, biological characteristics and survival mechanisms of organisms. Sequencing large genomes is very simple at present. However, we encountered a hard-to-sequence genome of Pseudomonas aeruginosa phage PaP1. Sho...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-15-803

    authors: Lu S,Le S,Tan Y,Li M,Liu C,Zhang K,Huang J,Chen H,Rao X,Zhu J,Zou L,Ni Q,Li S,Wang J,Jin X,Hu Q,Yao X,Zhao X,Zhang L,Huang G,Hu F

    update_date:2014-09-19 00:00:00

  • Gene expression correlated with delay in shell formation in larval Pacific oysters (Crassostrea gigas) exposed to experimental ocean acidification provides insights into shell formation mechanisms.

    abstract:BACKGROUND:Despite recent work to characterize gene expression changes associated with larval development in oysters, the mechanism by which the larval shell is first formed is still largely unknown. In Crassostrea gigas, this shell forms within the first 24 h post fertilization, and it has been demonstrated that chang...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-018-4519-y

    authors: De Wit P,Durland E,Ventura A,Langdon CJ

    update_date:2018-02-22 00:00:00

  • Dual species transcript profiling during the interaction between banana (Musa acuminata) and the fungal pathogen Fusarium oxysporum f. sp. cubense.

    abstract:BACKGROUND:Banana wilt disease, caused by Fusarium oxysporum f. sp. cubense Tropical Race 4 (Foc TR4), is one of the most devastating diseases in banana (Musa spp.). Foc is a soil borne pathogen that causes rot of the roots or wilt of leaves by colonizing the xylem vessels. The dual RNA sequencing is used to simultaneo...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-019-5902-z

    authors: Li W,Wang X,Li C,Sun J,Li S,Peng M

    update_date:2019-06-24 00:00:00

  • HLA-VBSeq: accurate HLA typing at full resolution from whole-genome sequencing data.

    abstract:BACKGROUND:Human leucocyte antigen (HLA) genes play an important role in determining the outcome of organ transplantation and are linked to many human diseases. Because of the diversity and polymorphisms of HLA loci, HLA typing at high resolution is challenging even with whole-genome sequencing data. RESULTS:We have d...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-16-S2-S7

    authors: Nariai N,Kojima K,Saito S,Mimori T,Sato Y,Kawai Y,Yamaguchi-Kabata Y,Yasuda J,Nagasaki M

    update_date:2015-01-01 00:00:00

  • Blood-based epigenetic estimators of chronological age in human adults using DNA methylation data from the Illumina MethylationEPIC array.

    abstract:BACKGROUND:Epigenetic clocks have been recognized for their precise prediction of chronological age, age-related diseases, and all-cause mortality. Existing epigenetic clocks are based on CpGs from the Illumina HumanMethylation450 BeadChip (450 K) which has now been replaced by the latest platform, Illumina Methylation...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-020-07168-8

    authors: Lee Y,Haftorn KL,Denault WRP,Nustad HE,Page CM,Lyle R,Lee-Ødegård S,Moen GH,Prasad RB,Groop LC,Sletner L,Sommer C,Magnus MC,Gjessing HK,Harris JR,Magnus P,Håberg SE,Jugessur A,Bohlin J

    update_date:2020-10-27 00:00:00

  • Effect of sample stratification on dairy GWAS results.

    abstract:BACKGROUND:Artificial insemination and genetic selection are major factors contributing to population stratification in dairy cattle. In this study, we analyzed the effect of sample stratification and the effect of stratification correction on results of a dairy genome-wide association study (GWAS). Three methods for s...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-13-536

    authors: Ma L,Wiggans GR,Wang S,Sonstegard TS,Yang J,Crooker BA,Cole JB,Van Tassell CP,Lawlor TJ,Da Y

    update_date:2012-10-06 00:00:00

  • Mitochondrial dysregulation and oxidative stress in patients with chronic kidney disease.

    abstract:BACKGROUND:Chronic renal disease (CKD) is characterized by complex changes in cell metabolism leading to an increased production of oxygen radicals, which, in turn, has been suggested to play a key role in numerous clinical complications of this pathological condition. Several reports have focused on the identification o...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-10-388

    authors: Granata S,Zaza G,Simone S,Villani G,Latorre D,Pontrelli P,Carella M,Schena FP,Grandaliano G,Pertosa G

    update_date:2009-08-21 00:00:00

  • Steps toward broad-spectrum therapeutics: discovering virulence-associated genes present in diverse human pathogens.

    abstract:BACKGROUND:New and improved antimicrobial countermeasures are urgently needed to counteract increased resistance to existing antimicrobial treatments and to combat currently untreatable or new emerging infectious diseases. We demonstrate that computational comparative genomics, together with experimental screening, can...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-10-501

    authors: Stubben CJ,Duffield ML,Cooper IA,Ford DC,Gans JD,Karlyshev AV,Lingard B,Oyston PC,de Rochefort A,Song J,Wren BW,Titball RW,Wolinsky M

    update_date:2009-10-29 00:00:00

  • SV-AUTOPILOT: optimized, automated construction of structural variation discovery and benchmarking pipelines.

    abstract:BACKGROUND:Many tools exist to predict structural variants (SVs), utilizing a variety of algorithms. However, they have largely been developed and tested on human germline or somatic (e.g. cancer) variation. It seems appropriate to exploit this wealth of technology available for humans also for other species. Objective...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-015-1376-9

    authors: Leung WY,Marschall T,Paudel Y,Falquet L,Mei H,Schönhuth A,Maoz Moss TY

    update_date:2015-03-25 00:00:00

  • Population structure, demographic history and local adaptation of the grass carp.

    abstract:BACKGROUND:Genetic diversity within a species reflects population evolution, ecology, and ability to adapt. Genome-wide population surveys of both natural and introduced populations provide insights into genetic diversity, the evolutionary processes and the genetic basis underlying local adaptation. Grass carp is the m...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-019-5872-1

    authors: Shen Y,Wang L,Fu J,Xu X,Yue GH,Li J

    update_date:2019-06-07 00:00:00

  • Anthocyanin biosynthetic genes in Brassica rapa.

    abstract:BACKGROUND:Anthocyanins are a group of flavonoid compounds. As a group of important secondary metabolites, they perform several key biological functions in plants. Anthocyanins also play beneficial health roles as potentially protective factors against cancer and heart disease. To elucidate the anthocyanin biosynthetic...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-15-426

    authors: Guo N,Cheng F,Wu J,Liu B,Zheng S,Liang J,Wang X

    update_date:2014-06-04 00:00:00

  • 2009 pandemic H1N1 influenza virus elicits similar clinical course but differential host transcriptional response in mouse, macaque, and swine infection models.

    abstract:BACKGROUND:The 2009 pandemic H1N1 influenza virus emerged in swine and quickly became a major global health threat. In mouse, non human primate, and swine infection models, the pH1N1 virus efficiently replicates in the lung and induces pro-inflammatory host responses; however, whether similar or different cellular path...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-13-627

    authors: Go JT,Belisle SE,Tchitchek N,Tumpey TM,Ma W,Richt JA,Safronetz D,Feldmann H,Katze MG

    update_date:2012-11-15 00:00:00

  • Functional innovations of three chronological mesohexaploid Brassica rapa genomes.

    abstract:BACKGROUND:The Brassicaceae family is an exemplary model for studying plant polyploidy. The Brassicaceae knowledge-base includes the well-annotated Arabidopsis thaliana reference sequence; well-established evidence for three rounds of whole genome duplication (WGD); and the conservation of genomic structure, with 24 co...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-15-606

    authors: Kim J,Lee J,Choi JP,Park I,Yang K,Kim MK,Lee YH,Nou IS,Kim DS,Min SR,Park SU,Kim H

    update_date:2014-07-18 00:00:00

  • De novo transcriptome reconstruction and annotation of the Egyptian rousette bat.

    abstract:BACKGROUND:The Egyptian Rousette bat (Rousettus aegyptiacus), a common fruit bat species found throughout Africa and the Middle East, was recently identified as a natural reservoir host of Marburg virus. With Ebola virus, Marburg virus is a member of the family Filoviridae that causes severe hemorrhagic fever disease i...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-015-2124-x

    authors: Lee AK,Kulcsar KA,Elliott O,Khiabanian H,Nagle ER,Jones ME,Amman BR,Sanchez-Lockhart M,Towner JS,Palacios G,Rabadan R

    update_date:2015-12-07 00:00:00

  • Whole genome sequencing analysis of multiple Salmonella serovars provides insights into phylogenetic relatedness, antimicrobial resistance, and virulence markers across humans, food animals and agriculture environmental sources.

    abstract:BACKGROUND:Salmonella enterica is a significant foodborne pathogen, which can be transmitted via several distinct routes, and reports on acquisition of antimicrobial resistance (AMR) are increasing. To better understand the association between human Salmonella clinical isolates and the potential environmental/animal re...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-018-5137-4

    authors: Pornsukarom S,van Vliet AHM,Thakur S

    update_date:2018-11-06 00:00:00

  • A hybrid expectation maximisation and MCMC sampling algorithm to implement Bayesian mixture model based genomic prediction and QTL mapping.

    abstract:BACKGROUND:Bayesian mixture models in which the effects of SNP are assumed to come from normal distributions with different variances are attractive for simultaneous genomic prediction and QTL mapping. These models are usually implemented with Monte Carlo Markov Chain (MCMC) sampling, which requires long compute times ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-016-3082-7

    authors: Wang T,Chen YP,Bowman PJ,Goddard ME,Hayes BJ

    update_date:2016-09-21 00:00:00