MicroPro: using metagenomic unmapped reads to provide insights into human microbiota and disease associations.

Abstract:

We develop a metagenomic data analysis pipeline, MicroPro, that takes into account all reads from known and unknown microbial organisms and associates viruses with complex diseases. We use MicroPro to analyze four metagenomic datasets relating to colorectal cancer, type 2 diabetes, and liver cirrhosis, and show that including reads from unknown organisms significantly increases the prediction accuracy of disease status for three of the four datasets. We identify new microbial organisms associated with these diseases and show that viruses play important roles in prediction for colorectal cancer and liver cirrhosis, but not for type 2 diabetes. MicroPro is freely available at https://github.com/zifanzhu/MicroPro.

journal_name

Genome Biol

journal_title

Genome biology

authors

Zhu Z,Ren J,Michail S,Sun F

doi

10.1186/s13059-019-1773-5

pub_date

2019-08-06 00:00:00

pages

154

issue

1

eissn

1474-760X

issn

1474-7596

pii

10.1186/s13059-019-1773-5

journal_volume

20

pub_type

Journal Article
  • The nuclear receptor ERβ engages AGO2 in regulation of gene transcription, RNA splicing and RISC loading.

    abstract:BACKGROUND:The RNA-binding protein Argonaute 2 (AGO2) is a key effector of RNA-silencing pathways. It exerts a pivotal role in microRNA maturation and activity and can modulate chromatin remodeling, transcriptional gene regulation and RNA splicing. Estrogen receptor beta (ERβ) is endowed with oncosuppressive activities,...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-017-1321-0

    authors: Tarallo R,Giurato G,Bruno G,Ravo M,Rizzo F,Salvati A,Ricciardi L,Marchese G,Cordella A,Rocco T,Gigantino V,Pierri B,Cimmino G,Milanesi L,Ambrosino C,Nyman TA,Nassa G,Weisz A

    update date:2017-10-06 00:00:00

  • The Dictyostelium genome encodes numerous RasGEFs with multiple biological roles.

    abstract:BACKGROUND:Dictyostelium discoideum is a eukaryote with a simple lifestyle and a relatively small genome whose sequence has been fully determined. It is widely used for studies on cell signaling, movement and multicellular development. Ras guanine-nucleotide exchange factors (RasGEFs) are the proteins that activate Ras...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2005-6-8-r68

    authors: Wilkins A,Szafranski K,Fraser DJ,Bakthavatsalam D,Müller R,Fisher PR,Glöckner G,Eichinger L,Noegel AA,Insall RH

    update date:2005-01-01 00:00:00

  • Clustered CTCF binding is an evolutionary mechanism to maintain topologically associating domains.

    abstract:BACKGROUND:CTCF binding contributes to the establishment of a higher-order genome structure by demarcating the boundaries of large-scale topologically associating domains (TADs). However, despite the importance and conservation of TADs, the role of CTCF binding in their evolution and stability remains elusive. RESULTS...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-019-1894-x

    authors: Kentepozidou E,Aitken SJ,Feig C,Stefflova K,Ibarra-Soria X,Odom DT,Roller M,Flicek P

    update date:2020-01-07 00:00:00

  • The Adult Mouse Anatomical Dictionary: a tool for annotating and integrating data.

    abstract:We have developed an ontology to provide standardized nomenclature for anatomical terms in the postnatal mouse. The Adult Mouse Anatomical Dictionary is structured as a directed acyclic graph, and is organized hierarchically both spatially and functionally. The ontology will be used to annotate and integrate different...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2005-6-3-r29

    authors: Hayamizu TF,Mangan M,Corradi JP,Kadin JA,Ringwald M

    update date:2005-01-01 00:00:00

  • Functional constraint and small insertions and deletions in the ENCODE regions of the human genome.

    abstract:BACKGROUND:We describe the distribution of indels in the 44 Encyclopedia of DNA Elements (ENCODE) regions (about 1% of the human genome) and evaluate the potential contributions of small insertion and deletion polymorphisms (indels) to human genetic variation. We relate indels to known genomic annotation features and m...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2007-8-9-r180

    authors: Clark TG,Andrew T,Cooper GM,Margulies EH,Mullikin JC,Balding DJ

    update date:2007-01-01 00:00:00

  • Activity map of the tammar X chromosome shows that marsupial X inactivation is incomplete and escape is stochastic.

    abstract:BACKGROUND:X chromosome inactivation is a spectacular example of epigenetic silencing. In order to deduce how this complex system evolved, we examined X inactivation in a model marsupial, the tammar wallaby (Macropus eugenii). In marsupials, X inactivation is known to be paternal, incomplete and tissue-specific, and oc...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2010-11-12-r122

    authors: Al Nadaf S,Waters PD,Koina E,Deakin JE,Jordan KS,Graves JA

    update date:2010-01-01 00:00:00

  • Decreasing miRNA sequencing bias using a single adapter and circularization approach.

    abstract:The ability to accurately quantify all the microRNAs (miRNAs) in a sample is important for understanding miRNA biology and for development of new biomarkers and therapeutic targets. We develop a new method for preparing miRNA sequencing libraries, RealSeq®-AC, that involves ligating the miRNAs with a single adapter an...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-018-1488-z

    authors: Barberán-Soler S,Vo JM,Hogans RE,Dallas A,Johnston BH,Kazakov SA

    update date:2018-09-03 00:00:00

  • Protein recoding by ADAR1-mediated RNA editing is not essential for normal development and homeostasis.

    abstract:BACKGROUND:Adenosine-to-inosine (A-to-I) editing of dsRNA by ADAR proteins is a pervasive epitranscriptome feature. Tens of thousands of A-to-I editing events are defined in the mouse, yet the functional impact of most is unknown. Editing causing protein recoding is the essential function of ADAR2, but an essential rol...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-017-1301-4

    authors: Heraud-Farlow JE,Chalk AM,Linder SE,Li Q,Taylor S,White JM,Pang L,Liddicoat BJ,Gupte A,Li JB,Walkley CR

    update date:2017-09-05 00:00:00

  • Detection and analysis of alternative splicing in Yarrowia lipolytica reveal structural constraints facilitating nonsense-mediated decay of intron-retaining transcripts.

    abstract:BACKGROUND:Hemiascomycetous yeasts have intron-poor genomes with very few cases of alternative splicing. Most of the reported examples result from intron retention in Saccharomyces cerevisiae and some have been shown to be functionally significant. Here we used transcriptome-wide approaches to evaluate the mechanisms u...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2010-11-6-r65

    authors: Mekouar M,Blanc-Lenfle I,Ozanne C,Da Silva C,Cruaud C,Wincker P,Gaillardin C,Neuvéglise C

    update date:2010-01-01 00:00:00

  • The continuum of causality in human genetic disorders.

    abstract:Studies of human genetic disorders have traditionally followed a reductionist paradigm. Traits are defined as Mendelian or complex based on family pedigree and population data, whereas alleles are deemed rare, common, benign, or deleterious based on their population frequencies. The availability of exome and genome da...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-016-1107-9

    authors: Katsanis N

    update date:2016-11-17 00:00:00

  • The phytochrome red/far-red photoreceptor superfamily.

    abstract:Proteins of the phytochrome superfamily of red/far-red light receptors have a variety of biological roles in plants, algae, bacteria and fungi and demonstrate a diversity of spectral sensitivities and output signaling mechanisms. Over the past few years the first three-dimensional structures of phytochrome light-sensi...

    journal_title:Genome biology

    pub_type: Journal Article, Review

    doi:10.1186/gb-2008-9-8-230

    authors: Sharrock RA

    update date:2008-01-01 00:00:00

  • Whole-genome screening indicates a possible burst of formation of processed pseudogenes and Alu repeats by particular L1 subfamilies in ancestral primates.

    abstract:BACKGROUND:Abundant pseudogenes are a feature of mammalian genomes. Processed pseudogenes (PPs) are reverse transcribed from mRNAs. Recent molecular biological studies show that mammalian long interspersed element 1 (L1)-encoded proteins may have been involved in PP reverse transcription. Here, we present the first com...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2003-4-11-r74

    authors: Ohshima K,Hattori M,Yada T,Gojobori T,Sakaki Y,Okada N

    update date:2003-01-01 00:00:00

  • Large-scale and high-confidence proteomic analysis of human seminal plasma.

    abstract:BACKGROUND:The development of mass spectrometric (MS) techniques now allows the investigation of very complex protein mixtures ranging from subcellular structures to tissues. Body fluids are also popular targets of proteomic analysis because of their potential for biomarker discovery. Seminal plasma has not yet receive...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2006-7-5-r40

    authors: Pilch B,Mann M

    update date:2006-01-01 00:00:00

  • Genome-wide identification of functionally distinct subsets of cellular mRNAs associated with two nucleocytoplasmic-shuttling mammalian splicing factors.

    abstract:BACKGROUND:Pre-mRNA splicing is an essential step in gene expression that occurs co-transcriptionally in the cell nucleus, involving a large number of RNA binding protein splicing factors, in addition to core spliceosome components. Several of these proteins are required for the recognition of intronic sequence element...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2006-7-11-r113

    authors: Gama-Carvalho M,Barbosa-Morais NL,Brodsky AS,Silver PA,Carmo-Fonseca M

    update date:2006-01-01 00:00:00

  • Archaeal phylogeny based on proteins of the transcription and translation machineries: tackling the Methanopyrus kandleri paradox.

    abstract:BACKGROUND:Phylogenetic analysis of the Archaea has been mainly established by 16S rRNA sequence comparison. With the accumulation of completely sequenced genomes, it is now possible to test alternative approaches by using large sequence datasets. We analyzed archaeal phylogeny using two concatenated datasets consistin...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2004-5-3-r17

    authors: Brochier C,Forterre P,Gribaldo S

    update date:2004-01-01 00:00:00

  • Divergence in cis-regulatory networks: taking the 'species' out of cross-species analysis.

    abstract:Many essential transcription factors have conserved roles in regulating biological programs, yet their genomic occupancy can diverge significantly. A new study demonstrates that such variations are primarily due to cis-regulatory sequences, rather than differences between the regulators or nuclear environments. ...

    journal_title:Genome biology

    pub_type: Journal Article, Review

    doi:10.1186/gb-2008-9-11-240

    authors: Zinzen RP,Furlong EE

    update date:2008-01-01 00:00:00

  • External signals shape the epigenome.

    abstract:A new study shows how a single cytokine, interleukin-4, regulates hematopoietic lineage choice by activating the JAK3-STAT6 pathway, which causes dendritic-cell-specific DNA demethylation. ...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-016-0884-5

    authors: Lennartsson A

    update date:2016-02-01 00:00:00

  • An ontology for cell types.

    abstract:We describe an ontology for cell types that covers the prokaryotic, fungal, animal and plant worlds. It includes over 680 cell types. These cell types are classified under several generic categories and are organized as a directed acyclic graph. The ontology is available in the formats adopted by the Open Biological O...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2005-6-2-r21

    authors: Bard J,Rhee SY,Ashburner M

    update date:2005-01-01 00:00:00

  • Butterfly gene flow goes berserk.

    abstract:A new study shows that genomic introgression between two Heliconius butterfly species is not solely confined to color pattern loci. ...

    journal_title:Genome biology

    pub_type: Comment, Journal Article

    doi:10.1186/s13059-016-0898-z

    authors: Ffrench-Constant RH

    update date:2016-02-27 00:00:00

  • FAN-C: a feature-rich framework for the analysis and visualisation of chromosome conformation capture data.

    abstract:Chromosome conformation capture data, particularly from high-throughput approaches such as Hi-C, are typically very complex to analyse. Existing analysis tools are often single-purpose, or limited in compatibility to a small number of data formats, frequently making Hi-C analyses tedious and time-consuming. Here, we p...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-020-02215-9

    authors: Kruse K,Hug CB,Vaquerizas JM

    update date:2020-12-17 00:00:00

  • Whole genome DNA sequencing provides an atlas of somatic mutagenesis in healthy human cells and identifies a tumor-prone cell type.

    abstract:BACKGROUND:The lifelong accumulation of somatic mutations underlies age-related phenotypes and cancer. Mutagenic forces are thought to shape the genome of aging cells in a tissue-specific way. Whole genome analyses of somatic mutation patterns, based on both types and genomic distribution of variants, can shed light on...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-019-1892-z

    authors: Franco I,Helgadottir HT,Moggio A,Larsson M,Vrtačnik P,Johansson A,Norgren N,Lundin P,Mas-Ponte D,Nordström J,Lundgren T,Stenvinkel P,Wennberg L,Supek F,Eriksson M

    update date:2019-12-18 00:00:00

  • Obstacles to detecting isoforms using full-length scRNA-seq data.

    abstract:BACKGROUND:Early single-cell RNA-seq (scRNA-seq) studies suggested that it was unusual to see more than one isoform being produced from a gene in a single cell, even when multiple isoforms were detected in matched bulk RNA-seq samples. However, these studies generally did not consider the impact of dropouts or isoform ...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-020-01981-w

    authors: Westoby J,Artemov P,Hemberg M,Ferguson-Smith A

    update date:2020-03-23 00:00:00

  • Direct measurement of transcription rates reveals multiple mechanisms for configuration of the Arabidopsis ambient temperature response.

    abstract:BACKGROUND:Sensing and responding to ambient temperature is important for controlling growth and development of many organisms, in part by regulating mRNA levels. mRNA abundance can change with temperature, but it is unclear whether this results from changes in transcription or decay rates, and whether passive or activ...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2014-15-3-r45

    authors: Sidaway-Lee K,Costa MJ,Rand DA,Finkenstadt B,Penfield S

    update date:2014-03-03 00:00:00

  • Assessing taxonomic metagenome profilers with OPAL.

    abstract:The explosive growth in taxonomic metagenome profiling methods over the past years has created a need for systematic comparisons using relevant performance criteria. The Open-community Profiling Assessment tooL (OPAL) implements commonly used performance metrics, including those of the first challenge of the initiativ...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-019-1646-y

    authors: Meyer F,Bremges A,Belmann P,Janssen S,McHardy AC,Koslicki D

    update date:2019-03-04 00:00:00

  • Can sequence determine function?

    abstract:The functional annotation of proteins identified in genome sequencing projects is based on similarities to homologs in the databases. As a result of the possible strategies for divergent evolution, homologous enzymes frequently do not catalyze the same reaction, and we conclude that assignment of function from sequenc...

    journal_title:Genome biology

    pub_type: Journal Article, Review

    doi:10.1186/gb-2000-1-5-reviews0005

    authors: Gerlt JA,Babbitt PC

    update date:2000-01-01 00:00:00

  • Accelerated exon evolution within primate segmental duplications.

    abstract:BACKGROUND:The identification of signatures of natural selection has long been used as an approach to understanding the unique features of any given species. Genes within segmental duplications are overlooked in most studies of selection due to the limitations of draft nonhuman genome assemblies and to the methodologic...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2013-14-1-r9

    authors: Lorente-Galdos B,Bleyhl J,Santpere G,Vives L,Ramírez O,Hernandez J,Anglada R,Cooper GM,Navarro A,Eichler EE,Marques-Bonet T

    update date:2013-01-29 00:00:00

  • A new recruit for the army of the men of death.

    abstract:The army of the men of death, in John Bunyan's memorable phrase, has a new recruit, and fear has a new face: a face wearing a surgical mask. ...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2003-4-7-113

    authors: Petsko GA

    update date:2003-01-01 00:00:00

  • Characterization of background noise in capture-based targeted sequencing data.

    abstract:BACKGROUND:Targeted deep sequencing is increasingly used to detect low-allelic fraction variants; it is therefore essential that errors that constitute baseline noise and impose a practical limit on detection are characterized. In the present study, we systematically evaluate the extent to which errors are incurred dur...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-017-1275-2

    authors: Park G,Park JK,Shin SH,Jeon HJ,Kim NKD,Kim YJ,Shin HT,Lee E,Lee KH,Son DS,Park WY,Park D

    update date:2017-07-21 00:00:00

  • Vex-seq: high-throughput identification of the impact of genetic variation on pre-mRNA splicing efficiency.

    abstract:Understanding the functional impact of genomic variants is a major goal of modern genetics and personalized medicine. Although many synonymous and non-coding variants act through altering the efficiency of pre-mRNA splicing, it is difficult to predict how these variants impact pre-mRNA splicing. Here, we describe a ma...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-018-1437-x

    authors: Adamson SI,Zhan L,Graveley BR

    update date:2018-06-01 00:00:00

  • Retrozymes are a unique family of non-autonomous retrotransposons with hammerhead ribozymes that propagate in plants through circular RNAs.

    abstract:BACKGROUND:Catalytic RNAs, or ribozymes, are regarded as fossils of a prebiotic RNA world that have remained in the genomes of modern organisms. The simplest ribozymes are the small self-cleaving RNAs, like the hammerhead ribozyme, which have been historically considered biological oddities restricted to some RNA patho...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-016-1002-4

    authors: Cervera A,Urbina D,de la Peña M

    update date:2016-06-23 00:00:00