Tesco Grocery 1.0, a large-scale dataset of grocery purchases in London.

Abstract:

We present the Tesco Grocery 1.0 dataset: a record of 420 M food items purchased by 1.6 M fidelity card owners who shopped at the 411 Tesco stores in Greater London over the course of the entire year of 2015, aggregated at the level of census areas to preserve anonymity. For each area, we report the number of transactions and nutritional properties of the typical food item bought including the average caloric intake and the composition of nutrients. The set of global trade international numbers (barcodes) for each food type is also included. To establish data validity we: i) compare food purchase volumes to population from census to assess representativeness, and ii) match nutrient and energy intake to official statistics of food-related illnesses to appraise the extent to which the dataset is ecologically valid. Given its unprecedented scale and geographic granularity, the data can be used to link food purchases to a number of geographically-salient indicators, which enables studies on health outcomes, cultural aspects, and economic factors.

journal_name

Sci Data

journal_title

Scientific data

authors

Aiello LM,Quercia D,Schifanella R,Del Prete L

doi

10.1038/s41597-020-0397-7

subject

Has Abstract

pub_date

2020-02-18 00:00:00

pages

57

issue

1

issn

2052-4463

pii

10.1038/s41597-020-0397-7

journal_volume

7

pub_type

Journal Article
  • Outlier analyses of the Protein Data Bank archive using a probability-density-ranking approach.

    abstract::Outlier analyses are central to scientific data assessments. Conventional outlier identification methods do not work effectively for Protein Data Bank (PDB) data, which are characterized by heavy skewness and the presence of bounds and/or long tails. We have developed a data-driven nonparametric method to identify out...

    journal_title:Scientific data

    pub_type:

    doi:10.1038/sdata.2018.293

    authors: Shao C,Liu Z,Yang H,Wang S,Burley SK

    update_date:2018-12-11 00:00:00

  • The systematic identification of cytoskeletal genes required for Drosophila melanogaster muscle maintenance.

    abstract::Animal muscles must maintain their function and structure while bearing substantial mechanical loads. How muscles withstand persistent mechanical strain is presently not well understood. Understanding the mechanisms by which tissues maintain their complex architecture is a key goal of cell biology. This dataset repres...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2014.2

    authors: Perkins AD,Lee MJ,Tanentzapf G

    update_date:2014-03-11 00:00:00

  • Transcriptome data of temporal and cingulate cortex in the Rett syndrome brain.

    abstract::Rett syndrome is an X-linked neurodevelopmental disorder caused by mutation in the methyl-CpG-binding protein 2 gene (MECP2) in the majority of cases. We describe an RNA sequencing dataset of postmortem brain tissue samples from four females clinically diagnosed with Rett syndrome and four age-matched female donors. T...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-0527-2

    authors: Aldinger KA,Timms AE,MacDonald JW,McNamara HK,Herstein JS,Bammler TK,Evgrafov OV,Knowles JA,Levitt P

    update_date:2020-06-19 00:00:00

  • A collection of rumen bacteriome data from 334 mid-lactation dairy cows.

    abstract::With the help of the bacteria in the rumen, ruminants can effectively convert human inedible plant fiber to edible food (meat and milk). However, the understanding of rumen bacteriome in dairy cows is still limited, especially in a large population under the same diet, breed, and milking period. Here we described the ...

    journal_title:Scientific data

    pub_type:

    doi:10.1038/sdata.2018.301

    authors: Sun HZ,Xue M,Guan LL,Liu J

    update_date:2019-01-22 00:00:00

  • Unbalanced historical phenotypic data from seed regeneration of a barley ex situ collection.

    abstract::The scarce knowledge on phenotypic characterization restricts the usage of genetic diversity of plant genetic resources in research and breeding. We describe original and ready-to-use processed data for approximately 60% of ~22,000 barley accessions hosted at the Federal ex situ Genebank for Agricultural and Horticult...

    journal_title:Scientific data

    pub_type:

    doi:10.1038/sdata.2018.278

    authors: Gonzalez MY,Weise S,Zhao Y,Philipp N,Arend D,Börner A,Oppermann M,Graner A,Reif JC,Schulthess AW

    update_date:2018-12-04 00:00:00

  • RE-Europe, a large-scale dataset for modeling a highly renewable European electricity system.

    abstract::Future highly renewable energy systems will couple to complex weather and climate dynamics. This coupling is generally not captured in detail by the open models developed in the power and energy system communities, where such open models exist. To enable modeling such a future energy system, we describe a dedicated la...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2017.175

    authors: Jensen TV,Pinson P

    update_date:2017-11-28 00:00:00

  • Creating a surrogate commuter network from Australian Bureau of Statistics census data.

    abstract::Between the 2011 and 2016 national censuses, the Australian Bureau of Statistics changed its anonymity policy compliance system for the distribution of census data. The new method has resulted in dramatic inconsistencies when comparing low-resolution data to aggregated high-resolution data. Hence, aggregated totals do...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0137-z

    authors: Fair KM,Zachreson C,Prokopenko M

    update_date:2019-08-16 00:00:00

  • Long-term surveys of age structure in 13 ungulate and one ostrich species in the Serengeti, 1926-2018.

    abstract::The Serengeti ecosystem spans an extensive network of protected areas in Tanzania, eastern Africa, and a UNESCO World Heritage Site. It is home to some of the largest animal migrations on the planet. Here, we describe a dataset consisting of the sample counts of three age classes (infant, juvenile and adult) of 13 ungu...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-00701-0

    authors: Rogy P,Sinclair ARE

    update_date:2020-10-21 00:00:00

  • Thermodynamic and transport properties of hydrogen containing streams.

    abstract::The use of hydrogen (H2) as a substitute for fossil fuel, which accounts for the majority of the world's energy, is environmentally the most benign option for the reduction of CO2 emissions. This will require gigawatt-scale storage systems and as such, H2 storage in porous rocks in the subsurface will be required. Acc...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-0568-6

    authors: Hassanpouryouzband A,Joonaki E,Edlmann K,Heinemann N,Yang J

    update_date:2020-07-09 00:00:00

  • Ground reference data for sugarcane biomass estimation in São Paulo state, Brazil.

    abstract::In order to make effective decisions on sustainable development, it is essential for sugarcane-producing countries to take into account sugarcane acreage and sugarcane production dynamics. The availability of sugarcane biophysical data along the growth season is key to an effective mapping of such dynamics, especially...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2018.150

    authors: Molijn RA,Iannini L,Rocha JV,Hanssen RF

    update_date:2018-08-07 00:00:00

  • PPDIST, global 0.1° daily and 3-hourly precipitation probability distribution climatologies for 1979-2018.

    abstract::We introduce the Precipitation Probability DISTribution (PPDIST) dataset, a collection of global high-resolution (0.1°) observation-based climatologies (1979-2018) of the occurrence and peak intensity of precipitation (P) at daily and 3-hourly time-scales. The climatologies were produced using neural networks trained ...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-00631-x

    authors: Beck HE,Westra S,Tan J,Pappenberger F,Huffman GJ,McVicar TR,Gründemann GJ,Vergopolan N,Fowler HJ,Lewis E,Verbist K,Wood EF

    update_date:2020-09-11 00:00:00

  • Genome-wide barcoded transposon screen for cancer drug sensitivity in haploid mouse embryonic stem cells.

    abstract::We describe a screen for cellular response to drugs that makes use of haploid embryonic stem cells. We generated ten libraries of mutants with piggyBac gene trap transposon integrations, totalling approximately 100,000 mutant clones. Random barcode sequences were inserted into the transposon vector to allow the number...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2017.20

    authors: Pettitt SJ,Krastev DB,Pemberton HN,Fontebasso Y,Frankum J,Rehman FL,Brough R,Song F,Bajrami I,Rafiq R,Wallberg F,Kozarewa I,Fenwick K,Armisen-Garrido J,Swain A,Gulati A,Campbell J,Ashworth A,Lord CJ

    update_date:2017-03-01 00:00:00

  • A high-throughput drug combination screen of targeted small molecule inhibitors in cancer cell lines.

    abstract::While there is a high interest in drug combinations in cancer therapy, openly accessible datasets for drug combination responses are sparse. Here we present a dataset comprising 171 pairwise combinations of 19 individual drugs targeting signal transduction mechanisms across eight cancer cell lines, where the effect of...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0255-7

    authors: Flobak Å,Niederdorfer B,Nakstad VT,Thommesen L,Klinkenberg G,Lægreid A

    update_date:2019-10-29 00:00:00

  • Oral microbiota and dental caries data from monozygotic and dizygotic twin children.

    abstract::There are recent studies which aimed to detect the inheritance on the etiology of dental caries exploring oral composition. We present data on the oral microbiota and its relation with dental caries and other factors in monozygotic (MZ) and dizygotic (DZ) twin children. Following clinical investigation, DNA samples we...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-00691-z

    authors: Kasimoglu Y,Koruyucu M,Birant S,Karacan I,Topcuoglu N,Tuna EB,Gencay K,Seymen F

    update_date:2020-10-13 00:00:00

  • A global view on the effect of water uptake on aerosol particle light scattering.

    abstract::A reference dataset of multi-wavelength particle light scattering and hemispheric backscattering coefficients for different relative humidities (RH) between RH = 30 and 95% and wavelengths between λ = 450 nm and 700 nm is described in this work. Tandem-humidified nephelometer measurements from 26 ground-based sites ar...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0158-7

    authors: Burgos MA,Andrews E,Titos G,Alados-Arboledas L,Baltensperger U,Day D,Jefferson A,Kalivitis N,Mihalopoulos N,Sherman J,Sun J,Weingartner E,Zieger P

    update_date:2019-08-22 00:00:00

  • An annual time series of weekly size-resolved aerosol properties in the megacity of Metro Manila, Philippines.

    abstract::Size-resolved aerosol samples were collected in Metro Manila between July 2018 and October 2019. Two Micro-Orifice Uniform Deposit Impactors (MOUDI) were deployed at Manila Observatory in Quezon City, Metro Manila with samples collected on a weekly basis for water-soluble speciation and mass quantification. Additional...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-0466-y

    authors: Stahl C,Cruz MT,Bañaga PA,Betito G,Braun RA,Aghdam MA,Cambaliza MO,Lorenzo GR,MacDonald AB,Pabroa PC,Yee JR,Simpas JB,Sorooshian A

    update_date:2020-04-29 00:00:00

  • Comprehensive draft of the mouse embryonic fibroblast lysosomal proteome by mass spectrometry based proteomics.

    abstract::Lysosomes are the main degradative organelles of cells and involved in a variety of processes including the recycling of macromolecules, storage of compounds, and metabolic signaling. Despite an increasing interest in the proteomic analysis of lysosomes, no systematic study of sample preparation protocols for lysosome...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-0399-5

    authors: Ponnaiyan S,Akter F,Singh J,Winter D

    update_date:2020-02-26 00:00:00

  • The downed and dead wood inventory of forests in the United States.

    abstract::The quantity and condition of downed dead wood (DDW) is emerging as a major factor governing forest ecosystem processes such as carbon cycling, fire behavior, and tree regeneration. Despite this, systematic inventories of DDW are sparse if not absent across major forest biomes. The Forest Inventory and Analysis progra...

    journal_title:Scientific data

    pub_type:

    doi:10.1038/sdata.2018.303

    authors: Woodall CW,Monleon VJ,Fraver S,Russell MB,Hatfield MH,Campbell JL,Domke GM

    update_date:2019-01-08 00:00:00

  • An agricultural survey for more than 9,500 African households.

    abstract::Surveys for more than 9,500 households were conducted in the growing seasons 2002/2003 or 2003/2004 in eleven African countries: Burkina Faso, Cameroon, Ghana, Niger and Senegal in western Africa; Egypt in northern Africa; Ethiopia and Kenya in eastern Africa; South Africa, Zambia and Zimbabwe in southern Africa. Hous...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2016.20

    authors: Waha K,Zipf B,Kurukulasuriya P,Hassan RM

    update_date:2016-05-24 00:00:00

  • De novo transcriptome assembly databases for the butterfly orchid Phalaenopsis equestris.

    abstract::Orchids are renowned for their spectacular flowers and ecological adaptations. After the sequencing of the genome of the tropical epiphytic orchid Phalaenopsis equestris, we combined Illumina HiSeq2000 for RNA-Seq and Trinity for de novo assembly to characterize the transcriptomes for 11 diverse P. equestris tissues r...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2016.83

    authors: Niu SC,Xu Q,Zhang GQ,Zhang YQ,Tsai WC,Hsu JL,Liang CK,Luo YB,Liu ZJ

    update_date:2016-09-27 00:00:00

  • The Dat Project, an open and decentralized research data tool.

    abstract::Today's scientific data are primarily stored and accessed via centralized Web-based infrastructure. Centralization has advantages but also carries risks such as link rot and content drift, which can hinder scientific progress. It is time to ask whether traditional, centralized Web architecture aligns with scholarly pr...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2018.221

    authors: Robinson DC,Hand JA,Madsen MB,McKelvey KR

    update_date:2018-10-23 00:00:00

  • A synthesis of bacterial and archaeal phenotypic trait data.

    abstract::A synthesis of phenotypic and quantitative genomic traits is provided for bacteria and archaea, in the form of a scripted, reproducible workflow that standardizes and merges 26 sources. The resulting unified dataset covers 14 phenotypic traits, 5 quantitative genomic traits, and 4 environmental characteristics for app...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-0497-4

    authors: Madin JS,Nielsen DA,Brbic M,Corkrey R,Danko D,Edwards K,Engqvist MKM,Fierer N,Geoghegan JL,Gillings M,Kyrpides NC,Litchman E,Mason CE,Moore L,Nielsen SL,Paulsen IT,Price ND,Reddy TBK,Richards MA,Rocha EPC,Schmidt

    update_date:2020-06-05 00:00:00

  • Monitoring microbial responses to ocean deoxygenation in a model oxygen minimum zone.

    abstract::Today in Scientific Data, two compendia of geochemical and multi-omic sequence information (DNA, RNA, protein) generated over almost a decade of time series monitoring in a seasonally anoxic coastal marine setting are presented to the scientific community. These data descriptors introduce a model ecosystem for the stu...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2017.158

    authors: Hallam SJ,Torres-Beltrán M,Hawley AK

    update_date:2017-10-31 00:00:00

  • Multiple-data-based monthly geopotential model set LDCmgm90.

    abstract::While the GRACE (Gravity Recovery and Climate Experiment) satellite mission is of great significance in understanding various branches of Earth sciences, the quality of GRACE monthly products can be unsatisfactory due to strong longitudinal stripe-pattern errors and other flaws. Based on corrected GRACE Mascon (mass c...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0239-7

    authors: Chen W,Luo J,Ray J,Yu N,Li JC

    update_date:2019-10-23 00:00:00

  • A database of human gait performance on irregular and uneven surfaces collected by wearable sensors.

    abstract::Gait analysis has traditionally relied on laborious and lab-based methods. Data from wearable sensors, such as Inertial Measurement Units (IMU), can be analyzed with machine learning to perform gait analysis in real-world environments. This database provides data from thirty participants (fifteen males and fifteen fem...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-0563-y

    authors: Luo Y,Coppola SM,Dixon PC,Li S,Dennerlein JT,Hu B

    update_date:2020-07-08 00:00:00

  • A Mediterranean coastal database for assessing the impacts of sea-level rise and associated hazards.

    abstract::We have developed a new coastal database for the Mediterranean basin that is intended for coastal impact and adaptation assessment to sea-level rise and associated hazards on a regional scale. The data structure of the database relies on a linear representation of the coast with associated spatial assessment units. Us...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2018.44

    authors: Wolff C,Vafeidis AT,Muis S,Lincke D,Satta A,Lionello P,Jimenez JA,Conte D,Hinkel J

    update_date:2018-03-27 00:00:00

  • Building fault detection data to aid diagnostic algorithm creation and performance testing.

    abstract::It is estimated that approximately 4-5% of national energy consumption can be saved through corrections to existing commercial building controls infrastructure and resulting improvements to efficiency. Correspondingly, automated fault detection and diagnostics (FDD) algorithms are designed to identify the presence of ...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-0398-6

    authors: Granderson J,Lin G,Harding A,Im P,Chen Y

    update_date:2020-02-24 00:00:00

  • Comprehensive high-resolution multiple-reaction monitoring mass spectrometry for targeted eicosanoid assays.

    abstract::Eicosanoids comprise a class of bioactive lipids derived from a unique group of essential fatty acids that mediate a variety of important physiological functions. Owing to the structural diversity of these lipids, their analysis in biological samples is often a major challenge. Advancements in mass spectrometric have ...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2018.167

    authors: Sorgi CA,Peti APF,Petta T,Meirelles AFG,Fontanari C,Moraes LAB,Faccioli LH

    update_date:2018-08-21 00:00:00

  • Erratum: Genomes and phenomes of a population of outbred rats and its progenitors.

    abstract::[This corrects the article DOI: 10.1038/sdata.2014.11.]. ...

    journal_title:Scientific data

    pub_type: Published Erratum

    doi:10.1038/sdata.2014.16

    authors: Baud A,Guryev V,Hummel O,Johannesson M,Rat Genome Sequencing and Mapping Consortium.,Flint J

    update_date:2014-07-08 00:00:00

  • Daily transcriptomes of the copepod Calanus finmarchicus during the summer solstice at high Arctic latitudes.

    abstract::The zooplankter Calanus finmarchicus is a member of the so-called "Calanus Complex", a group of copepods that constitutes a key element of the Arctic polar marine ecosystem, providing a crucial link between primary production and higher trophic levels. Climate change induces the shift of C. finmarchicus to higher lati...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-00751-4

    authors: Payton L,Noirot C,Hoede C,Hüppe L,Last K,Wilcockson D,Ershova EA,Valière S,Meyer B

    update_date:2020-11-24 00:00:00