A geographically-diverse collection of 418 human gut microbiome pathway genome databases.

Abstract:

Advances in high-throughput sequencing are reshaping how we perceive microbial communities inhabiting the human body, with implications for therapeutic interventions. Several large-scale datasets derived from hundreds of human microbiome samples sourced from multiple studies are now publicly available. However, idiosyncratic data processing methods between studies introduce systematic differences that confound comparative analyses. To overcome these challenges, we developed GutCyc, a compendium of environmental pathway genome databases (ePGDBs) constructed from 418 assembled human microbiome datasets using MetaPathways, enabling reproducible functional metagenomic annotation. We also generated metabolic network reconstructions for each metagenome using the Pathway Tools software, empowering researchers and clinicians interested in visualizing and interpreting metabolic pathways encoded by the human gut microbiome. For the first time, GutCyc provides consistent annotations and metabolic pathway predictions, making possible comparative community analyses between health and disease states in inflammatory bowel disease, Crohn's disease, and type 2 diabetes. GutCyc data products are searchable online, or may be downloaded and explored locally using MetaPathways and Pathway Tools.

journal_name

Sci Data

journal_title

Scientific data

authors

Hahn AS,Altman T,Konwar KM,Hanson NW,Kim D,Relman DA,Dill DL,Hallam SJ

doi

10.1038/sdata.2017.35

subject

Has Abstract

pub_date

2017-04-11 00:00:00

pages

170035

issn

2052-4463

pii

sdata201735

journal_volume

4

pub_type

Journal Article
  • A collection of rumen bacteriome data from 334 mid-lactation dairy cows.

    abstract::With the help of the bacteria in the rumen, ruminants can effectively convert human inedible plant fiber to edible food (meat and milk). However, the understanding of rumen bacteriome in dairy cows is still limited, especially in a large population under the same diet, breed, and milking period. Here we described the ...

    journal_title:Scientific data

    pub_type:

    doi:10.1038/sdata.2018.301

    authors: Sun HZ,Xue M,Guan LL,Liu J

    update date:2019-01-22 00:00:00

  • If these data could talk.

    abstract::In the last few decades, data-driven methods have come to dominate many fields of scientific inquiry. Open data and open-source software have enabled the rapid implementation of novel methods to manage and analyze the growing flood of data. However, it has become apparent that many scientific fields exhibit distressin...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2017.114

    authors: Pasquier T,Lau MK,Trisovic A,Boose ER,Couturier B,Crosas M,Ellison AM,Gibson V,Jones CR,Seltzer M

    update date:2017-09-05 00:00:00

  • Genome sequencing of a single tardigrade Hypsibius dujardini individual.

    abstract::Tardigrades are ubiquitous microscopic animals that play an important role in the study of metazoan phylogeny. Most terrestrial tardigrades can withstand extreme environments by entering an ametabolic desiccated state termed anhydrobiosis. Due to their small size and the non-axenic nature of laboratory cultures, molec...

    journal_title:Scientific data

    pub_type: Comment,Journal Article

    doi:10.1038/sdata.2016.63

    authors: Arakawa K,Yoshida Y,Tomita M

    update date:2016-08-16 00:00:00

  • Transcriptome data of temporal and cingulate cortex in the Rett syndrome brain.

    abstract::Rett syndrome is an X-linked neurodevelopmental disorder caused by mutation in the methyl-CpG-binding protein 2 gene (MECP2) in the majority of cases. We describe an RNA sequencing dataset of postmortem brain tissue samples from four females clinically diagnosed with Rett syndrome and four age-matched female donors. T...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-0527-2

    authors: Aldinger KA,Timms AE,MacDonald JW,McNamara HK,Herstein JS,Bammler TK,Evgrafov OV,Knowles JA,Levitt P

    update date:2020-06-19 00:00:00

  • Computational workflow to study the seasonal variation of secondary metabolites in nine different bryophytes.

    abstract::In Eco-Metabolomics interactions are studied of non-model organisms in their natural environment and relations are made between biochemistry and ecological function. Current challenges when processing such metabolomics data involve complex experiment designs which are often carried out in large field campaigns involvi...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2018.179

    authors: Peters K,Gorzolka K,Bruelheide H,Neumann S

    update date:2018-08-28 00:00:00

  • Multivariate time series dataset for space weather data analytics.

    abstract::We introduce and make openly accessible a comprehensive, multivariate time series (MVTS) dataset extracted from solar photospheric vector magnetograms in Spaceweather HMI Active Region Patch (SHARP) series. Our dataset also includes a cross-checked NOAA solar flare catalog that immediately facilitates solar flare pred...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-0548-x

    authors: Angryk RA,Martens PC,Aydin B,Kempton D,Mahajan SS,Basodi S,Ahmadzadeh A,Cai X,Filali Boubrahimi S,Hamdi SM,Schuh MA,Georgoulis MK

    update date:2020-07-10 00:00:00

  • VLUIS, a land use data product for Victoria, Australia, covering 2006 to 2013.

    abstract::Land Use Information is a key dataset required to enable an understanding of the changing nature of our landscapes and the associated influences on natural resources and regional communities. The Victorian Land Use Information System (VLUIS) data product has been created within the State Government of Victoria to supp...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2015.70

    authors: Morse-McNabb E,Sheffield K,Clark R,Lewis H,Robson S,Cherry D,Williams S

    update date:2015-11-24 00:00:00

  • Time series of heat demand and heat pump efficiency for energy system modeling.

    abstract::With electric heat pumps substituting for fossil-fueled alternatives, the temporal variability of their power consumption becomes increasingly important to the electricity system. To easily include this variability in energy system analyses, this paper introduces the "When2Heat" dataset comprising synthetic national t...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0199-y

    authors: Ruhnau O,Hirth L,Praktiknjo A

    update date:2019-10-01 00:00:00

  • A dataset of distribution and diversity of ticks in China.

    abstract::While tick-borne zoonoses, such as Lyme disease and tick-borne encephalitis, present an increasing global concern, knowledge of their vectors' distribution remains limited, especially for China. In this paper, we present the first comprehensive dataset of known tick species and their distributions in China, derived fr...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0115-5

    authors: Zhang G,Zheng D,Tian Y,Li S

    update date:2019-07-01 00:00:00

  • Spatial data of Ixodes ricinus instar abundance and nymph pathogen prevalence, Scandinavia, 2016-2017.

    abstract::Ticks carry pathogens that can cause disease in both animals and humans, and there is a need to monitor the distribution and abundance of ticks and the pathogens they carry to pinpoint potential high risk areas for tick-borne disease transmission. In a joint Scandinavian study, we measured Ixodes ricinus instar abunda...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-00579-y

    authors: Kjær LJ,Klitgaard K,Soleng A,Edgar KS,Lindstedt HEH,Paulsen KM,Andreassen ÅK,Korslund L,Kjelland V,Slettan A,Stuen S,Kjellander P,Christensson M,Teräväinen M,Baum A,Jensen LM,Bødker R

    update date:2020-07-16 00:00:00

  • Tesco Grocery 1.0, a large-scale dataset of grocery purchases in London.

    abstract::We present the Tesco Grocery 1.0 dataset: a record of 420 M food items purchased by 1.6 M fidelity card owners who shopped at the 411 Tesco stores in Greater London over the course of the entire year of 2015, aggregated at the level of census areas to preserve anonymity. For each area, we report the number of transact...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-0397-7

    authors: Aiello LM,Quercia D,Schifanella R,Del Prete L

    update date:2020-02-18 00:00:00

  • Draft genome of the big-headed turtle Platysternon megacephalum.

    abstract::The big-headed turtle, Platysternon megacephalum, as the sole member of the monotypic family Platysternidae, has a number of distinct characteristics including an extra-large head, long tail, flat carapace, and a preference for low water temperature environments. We performed whole genome sequencing, assembly, and gen...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0067-9

    authors: Cao D,Wang M,Ge Y,Gong S

    update date:2019-05-16 00:00:00

  • Experimental flows through an array of emerged or slightly submerged square cylinders over a rough bed.

    abstract::The experimental dataset presented was collected in an 18 m long and 1 m wide laboratory flume. Low to high flood flows through an urbanized floodplain were modelled. The floodplain bed is rough, modelled with dense artificial grass. A square cylinder array, representing house models, was set on the rough bed. The cyl...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-00791-w

    authors: Oukacine M,Proust S,Larrarte F,Goutal N

    update date:2021-01-11 00:00:00

  • Multiple-data-based monthly geopotential model set LDCmgm90.

    abstract::While the GRACE (Gravity Recovery and Climate Experiment) satellite mission is of great significance in understanding various branches of Earth sciences, the quality of GRACE monthly products can be unsatisfactory due to strong longitudinal stripe-pattern errors and other flaws. Based on corrected GRACE Mascon (mass c...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0239-7

    authors: Chen W,Luo J,Ray J,Yu N,Li JC

    update date:2019-10-23 00:00:00

  • Construction, complete sequence, and annotation of a BAC contig covering the silkworm chorion locus.

    abstract::The silkmoth chorion was studied extensively by F.C. Kafatos' group for almost 40 years. However, the complete structure of the chorion locus was not obtained in the genome sequence of Bombyx mori published in 2008 due to repetitive sequences, resulting in gaps and an incomplete view of the locus. To obtain the comple...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2015.62

    authors: Chen Z,Nohata J,Guo H,Li S,Liu J,Guo Y,Yamamoto K,Kadono-Okuda K,Liu C,Arunkumar KP,Nagaraju J,Zhang Y,Liu S,Labropoulou V,Swevers L,Tsitoura P,Iatrou K,Gopinathan KP,Goldsmith MR,Xia Q,Mita K

    update date:2015-11-10 00:00:00

  • Quantitative mapping of RNA-mediated nuclear estrogen receptor β interactome in human breast cancer cells.

    abstract::The nuclear receptor estrogen receptor 2 (ESR2, ERβ) modulates cancer cell proliferation and tumor growth, exerting an oncosuppressive role in breast cancer (BC). Interaction proteomics by tandem affinity purification coupled to mass spectrometry was previously applied in BC cells to identify proteins acting in concer...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2018.31

    authors: Giurato G,Nassa G,Salvati A,Alexandrova E,Rizzo F,Nyman TA,Weisz A,Tarallo R

    update date:2018-03-06 00:00:00

  • The Dat Project, an open and decentralized research data tool.

    abstract::Today's scientific data are primarily stored and accessed via centralized Web-based infrastructure. Centralization has advantages but also carries risks such as link rot and content drift, which can hinder scientific progress. It is time to ask whether traditional, centralized Web architecture aligns with scholarly pr...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2018.221

    authors: Robinson DC,Hand JA,Madsen MB,McKelvey KR

    update date:2018-10-23 00:00:00

  • PPDIST, global 0.1° daily and 3-hourly precipitation probability distribution climatologies for 1979-2018.

    abstract::We introduce the Precipitation Probability DISTribution (PPDIST) dataset, a collection of global high-resolution (0.1°) observation-based climatologies (1979-2018) of the occurrence and peak intensity of precipitation (P) at daily and 3-hourly time-scales. The climatologies were produced using neural networks trained ...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-00631-x

    authors: Beck HE,Westra S,Tan J,Pappenberger F,Huffman GJ,McVicar TR,Gründemann GJ,Vergopolan N,Fowler HJ,Lewis E,Verbist K,Wood EF

    update date:2020-09-11 00:00:00

  • Sample descriptors linked to metagenomic sequencing data from human and animal enteric samples from Vietnam.

    abstract::There is still limited information on the diversity of viruses co-circulating in humans and animals. Here, we report data obtained from a large field collection of enteric samples taken from humans, pigs, rodents and other mammal hosts in Vietnam between 2012 and 2016. Each of 2100 stool or rectal swab samples was sub...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0215-2

    authors: Woolhouse M,Ashworth J,Bogaardt C,Tue NT,Baker S,Thwaites G,Phuc TM

    update date:2019-10-15 00:00:00

  • Data for training and testing radiation detection algorithms in an urban environment.

    abstract::The detection, identification, and localization of illicit nuclear materials in urban environments is of utmost importance for national security. Most often, the process of performing these operations consists of a team of trained individuals equipped with radiation detection devices that have built-in algorithms to a...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-00672-2

    authors: Ghawaly JM Jr,Nicholson AD,Peplow DE,Anderson-Cook CM,Myers KL,Archer DE,Willis MJ,Quiter BJ

    update date:2020-10-05 00:00:00

  • Phase contrast time-lapse microscopy datasets with automated and manual cell tracking annotations.

    abstract::Phase contrast time-lapse microscopy is a non-destructive technique that generates large volumes of image-based information to quantify the behaviour of individual cells or cell populations. To guide the development of algorithms for computer-aided cell tracking and analysis, 48 time-lapse image sequences, each spanni...

    journal_title:Scientific data

    pub_type:

    doi:10.1038/sdata.2018.237

    authors: Ker DFE,Eom S,Sanami S,Bise R,Pascale C,Yin Z,Huh SI,Osuna-Highley E,Junkers SN,Helfrich CJ,Liang PY,Pan J,Jeong S,Kang SS,Liu J,Nicholson R,Sandbothe MF,Van PT,Liu A,Chen M,Kanade T,Weiss LE,Campbell PG

    update date:2018-11-13 00:00:00

  • Multidisciplinary database of permeability of fault zones and surrounding protolith rocks at world-wide sites.

    abstract::Brittle faults and fault zones are important fluid flow conduits through the upper part of Earth's crust that are involved in many well-known phenomena (e.g. earthquakes, thermal water and gas transport, or water leakage to underground tunnels). The permeability property, or the ability of porous materials to conduct ...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-0435-5

    authors: Scibek J

    update date:2020-03-19 00:00:00

  • Nationwide registry of sepsis patients in Japan focused on disseminated intravascular coagulation 2011-2013.

    abstract::Sepsis is a syndrome with physiologic, pathologic, and biochemical abnormalities induced by infection. Sepsis can induce the dysregulation of systemic coagulation and fibrinolytic systems, resulting in disseminated intravascular coagulation (DIC), which is associated with a high mortality rate. Although there is no in...

    journal_title:Scientific data

    pub_type:

    doi:10.1038/sdata.2018.243

    authors: Hayakawa M,Yamakawa K,Saito S,Uchino S,Kudo D,Iizuka Y,Sanui M,Takimoto K,Mayumi T

    update date:2018-12-11 00:00:00

  • Machine learning for the detection of early immunological markers as predictors of multi-organ dysfunction.

    abstract::The immune response to major trauma has been analysed mainly within post-hospital admission settings where the inflammatory response is already underway and the early drivers of clinical outcome cannot be readily determined. Thus, there is a need to better understand the immediate immune response to injury and how thi...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0337-6

    authors: Bravo-Merodio L,Acharjee A,Hazeldine J,Bentley C,Foster M,Gkoutos GV,Lord JM

    update date:2019-12-19 00:00:00

  • Publisher Correction: The Scales Project, a cross-national dataset on the interpretation of thermal perception scales.

    abstract::An amendment to this paper has been published and can be accessed via a link at the top of the paper. ...

    journal_title:Scientific data

    pub_type: Journal Article,Published Erratum

    doi:10.1038/s41597-019-0348-3

    authors: Schweiker M,Abdul-Zahra A,André M,Al-Atrash F,Al-Khatri H,Alprianti RR,Alsaad H,Amin R,Ampatzi E,Arsano AY,Azadeh M,Azar E,Bahareh B,Batagarawa A,Becker S,Buonocore C,Cao B,Choi JH,Chun C,Daanen H,Damiati SA,Dan

    update date:2020-01-06 00:00:00

  • The Centennial Trends Greater Horn of Africa precipitation dataset.

    abstract::East Africa is a drought prone, food and water insecure region with a highly variable climate. This complexity makes rainfall estimation challenging, and this challenge is compounded by low rain gauge densities and inhomogeneous monitoring networks. The dearth of observations is particularly problematic over the past ...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2015.50

    authors: Funk C,Nicholson SE,Landsfeld M,Klotter D,Peterson P,Harrison L

    update date:2015-09-29 00:00:00

  • Small-wedge synchrotron and serial XFEL datasets for Cysteinyl leukotriene GPCRs.

    abstract::Structural studies of challenging targets such as G protein-coupled receptors (GPCRs) have accelerated during the last several years due to the development of new approaches, including small-wedge and serial crystallography. Here, we describe the deposition of seven datasets consisting of X-ray diffraction images acqu...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-00729-2

    authors: Marin E,Luginina A,Gusach A,Kovalev K,Bukhdruker S,Khorn P,Polovinkin V,Lyapina E,Rogachev A,Gordeliy V,Mishin A,Cherezov V,Borshchevskiy V

    update date:2020-11-12 00:00:00

  • MIMIC-CXR, a de-identified publicly available database of chest radiographs with free-text reports.

    abstract::Chest radiography is an extremely powerful imaging modality, allowing for a detailed inspection of a patient's chest, but requires specialized training for proper interpretation. With the advent of high performance general purpose computer vision algorithms, the accurate automated analysis of chest radiographs is beco...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0322-0

    authors: Johnson AEW,Pollard TJ,Berkowitz SJ,Greenbaum NR,Lungren MP,Deng CY,Mark RG,Horng S

    update date:2019-12-12 00:00:00

  • One millennium of historical freshwater fish occurrence data for Portuguese rivers and streams.

    abstract::The insights that historical evidence of human presence and man-made documents provide are unique. For example, using historical data may be critical to adequately understand the ecological requirements of species. However, historical information about freshwater species distribution remains largely a knowledge gap. I...

    journal_title:Scientific data

    pub_type: Historical Article,Journal Article

    doi:10.1038/sdata.2018.163

    authors: Duarte G,Moreira M,Branco P,da Costa L,Ferreira MT,Segurado P

    update date:2018-08-14 00:00:00

  • A database seed for a community-driven material intensity research platform.

    abstract::The data record contains Material Intensity data for buildings (MI). MI coefficients are often used for different types of analysis of socio-economic systems and in particular for environmental assessments. Until now, MI values were compiled and reported ad-hoc with few cross-study comparisons. We extracted and conver...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0021-x

    authors: Heeren N,Fishman T

    update date:2019-04-09 00:00:00