SEPATH: benchmarking the search for pathogens in human tissue whole genome sequence data leads to template pipelines.

Abstract:

BACKGROUND: Human tissue is increasingly being whole genome sequenced as we transition into an era of genomic medicine. With this arises the potential to detect sequences originating from microorganisms, including pathogens amid the plethora of human sequencing reads. In cancer research, the tumorigenic ability of pathogens is being recognized, for example, Helicobacter pylori and human papillomavirus in the cases of gastric non-cardia and cervical carcinomas, respectively. As of yet, no benchmark has been carried out on the performance of computational approaches for bacterial and viral detection within host-dominated sequence data.

RESULTS: We present the results of benchmarking over 70 distinct combinations of tools and parameters on 100 simulated cancer datasets spiked with realistic proportions of bacteria. mOTUs2 and Kraken are the highest performing individual tools achieving median genus-level F1 scores of 0.90 and 0.91, respectively. mOTUs2 demonstrates a high performance in estimating bacterial proportions. Employing Kraken on unassembled sequencing reads produces a good but variable performance depending on post-classification filtering parameters. These approaches are investigated on a selection of cervical and gastric cancer whole genome sequences where Alphapapillomavirus and Helicobacter are detected in addition to a variety of other interesting genera.

CONCLUSIONS: We provide the top-performing pipelines from this benchmark in a unifying tool called SEPATH, which is amenable to high throughput sequencing studies across a range of high-performance computing clusters. SEPATH provides a benchmarked and convenient approach to detect pathogens in tissue sequence data helping to determine the relationship between metagenomics and disease.

journal_name

Genome Biol

journal_title

Genome biology

authors

Gihawi A,Rallapalli G,Hurst R,Cooper CS,Leggett RM,Brewer DS

doi

10.1186/s13059-019-1819-8

pub_date

2019-10-22 00:00:00

pages

208

issue

1

eissn

1474-7596

issn

1474-760X

pii

10.1186/s13059-019-1819-8

journal_volume

20

pub_type

Journal Article
  • Genetic analysis of the human infective trypanosome Trypanosoma brucei gambiense: chromosomal segregation, crossing over, and the construction of a genetic map.

    abstract: BACKGROUND: Trypanosoma brucei is the causative agent of human sleeping sickness and animal trypanosomiasis in sub-Saharan Africa, and it has been subdivided into three subspecies: Trypanosoma brucei gambiense and Trypanosoma brucei rhodesiense, which cause sleeping sickness in humans, and the nonhuman infective Trypano...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/gb-2008-9-6-r103

    authors: Cooper A,Tait A,Sweeney L,Tweedie A,Morrison L,Turner CM,MacLeod A

    update_date: 2008-01-01 00:00:00

  • Homologous recombination: from model organisms to human disease.

    abstract: Recent experiments show that properly controlled recombination between homologous DNA molecules is essential for the maintenance of genome stability and for the prevention of tumorigenesis. ...

    journal_title: Genome biology

    pub_type: Journal Article, Review

    doi: 10.1186/gb-2001-2-5-reviews1014

    authors: Modesti M,Kanaar R

    update_date: 2001-01-01 00:00:00

  • Whole genome DNA sequencing provides an atlas of somatic mutagenesis in healthy human cells and identifies a tumor-prone cell type.

    abstract: BACKGROUND: The lifelong accumulation of somatic mutations underlies age-related phenotypes and cancer. Mutagenic forces are thought to shape the genome of aging cells in a tissue-specific way. Whole genome analyses of somatic mutation patterns, based on both types and genomic distribution of variants, can shed light on...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/s13059-019-1892-z

    authors: Franco I,Helgadottir HT,Moggio A,Larsson M,Vrtačnik P,Johansson A,Norgren N,Lundin P,Mas-Ponte D,Nordström J,Lundgren T,Stenvinkel P,Wennberg L,Supek F,Eriksson M

    update_date: 2019-12-18 00:00:00

  • Ancient DNA and the rewriting of human history: be sparing with Occam's razor.

    abstract: Ancient DNA research is revealing a human history far more complex than that inferred from parsimonious models based on modern DNA. Here, we review some of the key events in the peopling of the world in the light of the findings of work on ancient DNA. ...

    journal_title: Genome biology

    pub_type: Historical Article, Journal Article, Review

    doi: 10.1186/s13059-015-0866-z

    authors: Haber M,Mezzavilla M,Xue Y,Tyler-Smith C

    update_date: 2016-01-11 00:00:00

  • Observation of intermittency in gene expression on cDNA microarrays.

    abstract: We used scaled factorial moments to search for intermittency in the log expression ratios (LERs) for thousands of genes spotted on cDNA microarrays (gene chips). Results indicate varying levels of intermittency in gene expression. The observation of intermittency in the data analyzed provides a complimentary handle on...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/gb-2002-3-7-preprint0005

    authors: Peterson LE,Lau K

    update_date: 2002-05-29 00:00:00

  • Old soldiers never die ....

    abstract: An ancestral supersoldier phenotype of Pheidole ants can be recovered when selection for supersoldiers re-emerges, indicating that the developmental potential for caste pathways is retained. ...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/gb4000

    authors: Boomsma JJ,Nygaard S

    update_date: 2012-02-22 00:00:00

  • What does biologically meaningful mean? A perspective on gene regulatory network validation.

    abstract: Gene regulatory networks (GRNs) are rapidly being delineated, but their quality and biological meaning are often questioned. Here, I argue that biological meaning is challenging to define and discuss reasons why GRN validation should be interpreted cautiously. ...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/gb-2011-12-4-109

    authors: Walhout AJ

    update_date: 2011-01-01 00:00:00

  • Discovery and functional prioritization of Parkinson's disease candidate genes from large-scale whole exome sequencing.

    abstract: BACKGROUND: Whole-exome sequencing (WES) has been successful in identifying genes that cause familial Parkinson's disease (PD). However, until now this approach has not been deployed to study large cohorts of unrelated participants. To discover rare PD susceptibility variants, we performed WES in 1148 unrelated cases an...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/s13059-017-1147-9

    authors: Jansen IE,Ye H,Heetveld S,Lechler MC,Michels H,Seinstra RI,Lubbe SJ,Drouet V,Lesage S,Majounie E,Gibbs JR,Nalls MA,Ryten M,Botia JA,Vandrovcova J,Simon-Sanchez J,Castillo-Lizardo M,Rizzu P,Blauwendraat C,Chouhan AK

    update_date: 2017-01-30 00:00:00

  • Chromatin Central: towards the comparative proteome by accurate mapping of the yeast proteomic environment.

    abstract: BACKGROUND: Understanding the design logic of living systems requires the understanding and comparison of proteomes. Proteomes define the commonalities between organisms more precisely than genomic sequences. Because uncertainties remain regarding the accuracy of proteomic data, several issues need to be resolved before...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/gb-2008-9-11-r167

    authors: Shevchenko A,Roguev A,Schaft D,Buchanan L,Habermann B,Sakalar C,Thomas H,Krogan NJ,Shevchenko A,Stewart AF

    update_date: 2008-01-01 00:00:00

  • The Dictyostelium genome encodes numerous RasGEFs with multiple biological roles.

    abstract: BACKGROUND: Dictyostelium discoideum is a eukaryote with a simple lifestyle and a relatively small genome whose sequence has been fully determined. It is widely used for studies on cell signaling, movement and multicellular development. Ras guanine-nucleotide exchange factors (RasGEFs) are the proteins that activate Ras...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/gb-2005-6-8-r68

    authors: Wilkins A,Szafranski K,Fraser DJ,Bakthavatsalam D,Müller R,Fisher PR,Glöckner G,Eichinger L,Noegel AA,Insall RH

    update_date: 2005-01-01 00:00:00

  • SAGE profiling of the forelimb and hindlimb.

    abstract: A recent study has used serial analysis of gene expression to compare mouse forelimb and hindlimb gene-expression profiles. The method successfully identified known regulators of limb identity and has generated a candidate set of differentially expressed genes that may regulate limb identity. ...

    journal_title: Genome biology

    pub_type: Journal Article, Review

    doi: 10.1186/gb-2002-3-3-reviews1007

    authors: Logan M

    update_date: 2002-01-01 00:00:00

  • Global orchestration of gene expression by the biological clock of cyanobacteria.

    abstract: Prokaryotic cyanobacteria express robust circadian (daily) rhythms under the control of a central clock. Recent studies shed light on the mechanisms governing circadian rhythms in cyanobacteria and highlight key differences between prokaryotic and eukaryotic clocks. ...

    journal_title: Genome biology

    pub_type: Journal Article, Review

    doi: 10.1186/gb-2004-5-4-217

    authors: Johnson CH

    update_date: 2004-01-01 00:00:00

  • DNA methylation and epigenomics: new technologies and emerging concepts.

    abstract: A report of the Keystone Symposia joint meetings on DNA Methylation and Epigenomics held in Keystone, Colorado, USA, 29 March to 3 April, 2015. ...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/s13059-015-0674-5

    authors: Chatterjee A,Eccles MR

    update_date: 2015-05-21 00:00:00

  • From biological clock to biological rhythms.

    abstract: The genetic and molecular analysis of circadian timekeeping mechanisms has accelerated as a result of the increasing volume of genomic markers and nucleotide sequence information. Completion of whole genome sequences and the use of differential gene expression technology will hasten the discovery of the clock output p...

    journal_title: Genome biology

    pub_type: Journal Article, Review

    doi: 10.1186/gb-2000-1-4-reviews1023

    authors: Hardin PE

    update_date: 2000-01-01 00:00:00

  • Altered retinal microRNA expression profile in a mouse model of retinitis pigmentosa.

    abstract: BACKGROUND: The role played by microRNAs (miRs) as common regulators in physiologic processes such as development and various disease states was recently highlighted. Retinitis pigmentosa (RP) linked to RHO (which encodes rhodopsin) is the most frequent form of inherited retinal degeneration that leads to blindness, for...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/gb-2007-8-11-r248

    authors: Loscher CJ,Hokamp K,Kenna PF,Ivens AC,Humphries P,Palfi A,Farrar GJ

    update_date: 2007-01-01 00:00:00

  • Evidence from comparative genomics for a complete sexual cycle in the 'asexual' pathogenic yeast Candida glabrata.

    abstract: BACKGROUND: Candida glabrata is a pathogenic yeast of increasing medical concern. It has been regarded as asexual since it was first described in 1917, yet phylogenetic analyses have revealed that it is more closely related to sexual yeasts than other Candida species. We show here that the C. glabrata genome contains ma...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/gb-2003-4-2-r10

    authors: Wong S,Fares MA,Zimmermann W,Butler G,Wolfe KH

    update_date: 2003-01-01 00:00:00

  • Changes in the organization of the genome during the mammalian cell cycle.

    abstract: By using chromosome conformation capture technology, a recent study has revealed two alternative three-dimensional folding states of the human genome during the cell cycle. ...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/gb4147

    authors: Giorgetti L,Servant N,Heard E

    update_date: 2013-12-24 00:00:00

  • Integrating systems biology data to yield functional genomics insights.

    abstract: A report of the recent EMBO Conference 'From Functional Genomics to Systems Biology' held at the EMBL Advanced Training Centre, Heidelberg, Germany, 13-16 November 2010. ...

    journal_title: Genome biology

    pub_type:

    doi: 10.1186/gb-2011-12-1-302

    authors: Fordyce P,Ingolia N

    update_date: 2011-01-01 00:00:00

  • The PRC-barrel: a widespread, conserved domain shared by photosynthetic reaction center subunits and proteins of RNA metabolism.

    abstract: BACKGROUND: The H subunit of the purple bacterial photosynthetic reaction center (PRC-H) is important for the assembly of the photosynthetic reaction center and appears to regulate electron transfer during the reduction of the secondary quinone. It contains a distinct cytoplasmic beta-barrel domain whose fold has no clo...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/gb-2002-3-11-research0061

    authors: Anantharaman V,Aravind L

    update_date: 2002-10-14 00:00:00

  • CircAtlas: an integrated resource of one million highly accurate circular RNAs from 1070 vertebrate transcriptomes.

    abstract: Existing circular RNA (circRNA) databases have become essential for transcriptomics. However, most are unsuitable for mining in-depth information for candidate circRNA prioritization. To address this, we integrate circular transcript collections to develop the circAtlas database based on 1070 RNA-seq samples collected...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/s13059-020-02018-y

    authors: Wu W,Ji P,Zhao F

    update_date: 2020-04-28 00:00:00

  • A novel mode of chromosomal evolution peculiar to filamentous Ascomycete fungi.

    abstract: BACKGROUND: Gene loss, inversions, translocations, and other chromosomal rearrangements vary among species, resulting in different rates of structural genome evolution. Major chromosomal rearrangements are rare in most eukaryotes, giving large regions with the same genes in the same order and orientation across species....

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/gb-2011-12-5-r45

    authors: Hane JK,Rouxel T,Howlett BJ,Kema GH,Goodwin SB,Oliver RP

    update_date: 2011-01-01 00:00:00

  • Accelerated exon evolution within primate segmental duplications.

    abstract: BACKGROUND: The identification of signatures of natural selection has long been used as an approach to understanding the unique features of any given species. Genes within segmental duplications are overlooked in most studies of selection due to the limitations of draft nonhuman genome assemblies and to the methodologic...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/gb-2013-14-1-r9

    authors: Lorente-Galdos B,Bleyhl J,Santpere G,Vives L,Ramírez O,Hernandez J,Anglada R,Cooper GM,Navarro A,Eichler EE,Marques-Bonet T

    update_date: 2013-01-29 00:00:00

  • Dynamic reprogramming of chromatin accessibility during Drosophila embryo development.

    abstract: BACKGROUND: The development of complex organisms is believed to involve progressive restrictions in cellular fate. Understanding the scope and features of chromatin dynamics during embryogenesis, and identifying regulatory elements important for directing developmental processes remain key goals of developmental biology...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/gb-2011-12-5-r43

    authors: Thomas S,Li XY,Sabo PJ,Sandstrom R,Thurman RE,Canfield TK,Giste E,Fisher W,Hammonds A,Celniker SE,Biggin MD,Stamatoyannopoulos JA

    update_date: 2011-01-01 00:00:00

  • Copy number variation goes clinical.

    abstract: A report of the First Golden Helix Symposium 'Copy Number Variation (CNV) and Genomic Alterations in Health and Disease', Athens, Greece, 28-29 November 2008. ...

    journal_title: Genome biology

    pub_type:

    doi: 10.1186/gb-2009-10-1-301

    authors: Le Caignec C,Redon R

    update_date: 2009-01-01 00:00:00

  • Genome-wide signatures of differential DNA methylation in pediatric acute lymphoblastic leukemia.

    abstract: BACKGROUND: Although aberrant DNA methylation has been observed previously in acute lymphoblastic leukemia (ALL), the patterns of differential methylation have not been comprehensively determined in all subtypes of ALL on a genome-wide scale. The relationship between DNA methylation, cytogenetic background, drug resista...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/gb-2013-14-9-r105

    authors: Nordlund J,Bäcklin CL,Wahlberg P,Busche S,Berglund EC,Eloranta ML,Flaegstad T,Forestier E,Frost BM,Harila-Saari A,Heyman M,Jónsson OG,Larsson R,Palle J,Rönnblom L,Schmiegelow K,Sinnett D,Söderhäll S,Pastinen T,Gusta

    update_date: 2013-09-24 00:00:00

  • Species-wide distribution of highly polymorphic minisatellite markers suggests past and present genetic exchanges among house mouse subspecies.

    abstract: BACKGROUND: Four hypervariable minisatellite loci were scored on a panel of 116 individuals of various geographical origins representing a large part of the diversity present in house mouse subspecies. Internal structures of alleles were determined by minisatellite variant repeat mapping PCR to produce maps of interming...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/gb-2007-8-5-r80

    authors: Bonhomme F,Rivals E,Orth A,Grant GR,Jeffreys AJ,Bois PR

    update_date: 2007-01-01 00:00:00

  • Individual mRNA expression profiles reveal the effects of specific microRNAs.

    abstract: BACKGROUND: MicroRNAs (miRNAs) are oligoribonucleotides with an important role in regulation of gene expression at the level of translation. Despite imperfect target complementarity, they can also significantly reduce mRNA levels. The validity of miRNA target gene predictions is difficult to assess at the protein level....

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/gb-2008-9-5-r82

    authors: Arora A,Simpson DA

    update_date: 2008-01-01 00:00:00

  • Preferential binding of HIF-1 to transcriptionally active loci determines cell-type specific response to hypoxia.

    abstract: BACKGROUND: Hypoxia-inducible factor 1 (HIF-1) plays a key role in cellular adaptation to hypoxia. To better understand the determinants of HIF-1 binding and transactivation, we used ChIP-chip and gene expression profiling to define the relationship between the epigenetic landscape, sites of HIF-1 binding, and genes tra...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/gb-2009-10-10-r113

    authors: Xia X,Kung AL

    update_date: 2009-01-01 00:00:00

  • Anticipatory evolution and DNA shuffling.

    abstract: DNA shuffling has proven to be a powerful technique for the directed evolution of proteins. A mix of theoretical and applied research has now provided insights into how recombination can be guided to more efficiently generate proteins and even organisms with altered functions. ...

    journal_title: Genome biology

    pub_type: Journal Article, Review

    doi: 10.1186/gb-2002-3-8-reviews1021

    authors: Bacher JM,Reiss BD,Ellington AD

    update_date: 2002-07-31 00:00:00

  • bin3C: exploiting Hi-C sequencing data to accurately resolve metagenome-assembled genomes.

    abstract: Most microbes cannot be easily cultured, and metagenomics provides a means to study them. Current techniques aim to resolve individual genomes from metagenomes, so-called metagenome-assembled genomes (MAGs). Leading approaches depend upon time series or transect studies, the efficacy of which is a function of communit...

    journal_title: Genome biology

    pub_type: Journal Article

    doi: 10.1186/s13059-019-1643-1

    authors: DeMaere MZ,Darling AE

    update_date: 2019-02-26 00:00:00