Tesco Grocery 1.0, a large-scale dataset of grocery purchases in London.


We present the Tesco Grocery 1.0 dataset: a record of 420 M food items purchased by 1.6 M fidelity card owners who shopped at the 411 Tesco stores in Greater London over the course of the entire year of 2015, aggregated at the level of census areas to preserve anonymity. For each area, we report the number of transactions and nutritional properties of the typical food item bought including the average caloric intake and the composition of nutrients. The set of global trade international numbers (barcodes) for each food type is also included. To establish data validity we: i) compare food purchase volumes to population from census to assess representativeness, and ii) match nutrient and energy intake to official statistics of food-related illnesses to appraise the extent to which the dataset is ecologically valid. Given its unprecedented scale and geographic granularity, the data can be used to link food purchases to a number of geographically-salient indicators, which enables studies on health outcomes, cultural aspects, and economic factors.


Sci Data


Scientific data


Aiello LM,Quercia D,Schifanella R,Del Prete L






2020-02-18 00:00:00












  • Paired rRNA-depleted and polyA-selected RNA sequencing data and supporting multi-omics data from human T cells.

    abstract::Both poly(A) enrichment and ribosomal RNA depletion are commonly used for RNA sequencing. Either has its advantages and disadvantages that may lead to biases in the downstream analyses. To better access these effects, we carried out both ribosomal RNA-depleted and poly(A)-selected RNA-seq for CD4+ T naive cells isolat...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Chen L,Yang R,Kwan T,Tang C,Watt S,Zhang Y,Bourque G,Ge B,Downes K,Frontini M,Ouwehand WH,Lin JW,Soranzo N,Pastinen T,Chen L

    updated: 2020-11-09 00:00:00

  • The downed and dead wood inventory of forests in the United States.

    abstract::The quantity and condition of downed dead wood (DDW) is emerging as a major factor governing forest ecosystem processes such as carbon cycling, fire behavior, and tree regeneration. Despite this, systematic inventories of DDW are sparse if not absent across major forest biomes. The Forest Inventory and Analysis progra...

    journal_title:Scientific data



    authors: Woodall CW,Monleon VJ,Fraver S,Russell MB,Hatfield MH,Campbell JL,Domke GM

    updated: 2019-01-08 00:00:00

  • A suite of global accessibility indicators.

    abstract::Good access to resources and opportunities is essential for sustainable development. Improving access, especially in rural areas, requires useful measures of current access to the locations where these resources and opportunities are found. Recent work has developed a global map of travel times to cities with more tha...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Nelson A,Weiss DJ,van Etten J,Cattaneo A,McMenomy TS,Koo J

    updated: 2019-11-07 00:00:00

  • Creating a surrogate commuter network from Australian Bureau of Statistics census data.

    abstract::Between the 2011 and 2016 national censuses, the Australian Bureau of Statistics changed its anonymity policy compliance system for the distribution of census data. The new method has resulted in dramatic inconsistencies when comparing low-resolution data to aggregated high-resolution data. Hence, aggregated totals do...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Fair KM,Zachreson C,Prokopenko M

    updated: 2019-08-16 00:00:00

  • Obstacles to the reuse of study metadata in ClinicalTrials.gov.

    abstract::Metadata that are structured using principled schemas and that use terms from ontologies are essential to making biomedical data findable and reusable for downstream analyses. The largest source of metadata that describes the experimental protocol, funding, and scientific leadership of clinical studies is ClinicalTria...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Miron L,Gonçalves RS,Musen MA

    updated: 2020-12-18 00:00:00

  • Synthetic skull bone defects for automatic patient-specific craniofacial implant design.

    abstract::Patient-specific craniofacial implants are used to repair skull bone defects after trauma or surgery. Currently, cranial implants are designed and produced by third-party suppliers, which is usually time-consuming and expensive. Recent advances in additive manufacturing made the in-hospital or in-operation-room fabric...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Li J,Gsaxner C,Pepe A,Morais A,Alves V,von Campe G,Wallner J,Egger J

    updated: 2021-01-29 00:00:00

  • The effect of 16S rRNA region choice on bacterial community metabarcoding results.

    abstract::In this work, we compare the resolution of V2-V3 and V3-V4 16S rRNA regions for the purposes of estimating microbial community diversity using paired-end Illumina MiSeq reads, and show that the fragment, including V2 and V3 regions, has higher resolution for lower-rank taxa (genera and species). It allows for a more p...

    journal_title:Scientific data



    authors: Bukin YS,Galachyants YP,Morozov IV,Bukin SV,Zakharenko AS,Zemskaya TI

    updated: 2019-02-05 00:00:00

  • Genotoype-by-sequencing of three geographically distinct populations of Olympia oysters, Ostrea lurida.

    abstract::Olympia oysters are found along the west coast of North America and as the only native oyster species in the region, receive considerable attention with regard to restoration and conservation. Knowledge of genetic structure of this species is essential for resource managers. Here we provide genetic data for three dist...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: White SJ,Vadopalas B,Silliman K,Roberts SB

    updated: 2017-09-12 00:00:00

  • Curated compendium of human transcriptional biomarker data.

    abstract::One important use of genome-wide transcriptional profiles is to identify relationships between transcription levels and patient outcomes. These translational insights can guide the development of biomarkers for clinical application. Data from thousands of translational-biomarker studies have been deposited in public r...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Golightly NP,Bell A,Bischoff AI,Hollingsworth PD,Piccolo SR

    updated: 2018-04-17 00:00:00

  • A Mediterranean coastal database for assessing the impacts of sea-level rise and associated hazards.

    abstract::We have developed a new coastal database for the Mediterranean basin that is intended for coastal impact and adaptation assessment to sea-level rise and associated hazards on a regional scale. The data structure of the database relies on a linear representation of the coast with associated spatial assessment units. Us...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Wolff C,Vafeidis AT,Muis S,Lincke D,Satta A,Lionello P,Jimenez JA,Conte D,Hinkel J

    updated: 2018-03-27 00:00:00

  • A high-content image-based drug screen of clinical compounds against cell transmission of adenovirus.

    abstract::Human adenoviruses (HAdVs) are fatal to immuno-suppressed individuals, but no effective anti-HAdV therapy is available. Here, we present a novel image-based high-throughput screening (HTS) platform, which scores the full viral replication cycle from virus entry to dissemination of progeny and second-round infections. ...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Georgi F,Kuttler F,Murer L,Andriasyan V,Witte R,Yakimovich A,Turcatti G,Greber UF

    updated: 2020-08-12 00:00:00

  • Publisher Correction: Tracking vegetation phenology across diverse biomes using Version 2.0 of the PhenoCam Dataset.

    abstract::An amendment to this paper has been published and can be accessed via a link at the top of the paper. ...

    journal_title:Scientific data

    pub_type: Journal Article, Published Erratum


    authors: Seyednasrollah B,Young AM,Hufkens K,Milliman T,Friedl MA,Frolking S,Richardson AD

    updated: 2019-11-01 00:00:00

  • Transcriptomic profiling of 39 commonly-used neuroblastoma cell lines.

    abstract::Neuroblastoma cell lines are an important and cost-effective model used to study oncogenic drivers of the disease. While many of these cell lines have been previously characterized with SNP, methylation, and/or mRNA expression microarrays, there has not been an effort to comprehensively sequence these cell lines. Here...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Harenza JL,Diamond MA,Adams RN,Song MM,Davidson HL,Hart LS,Dent MH,Fortina P,Reynolds CP,Maris JM

    updated: 2017-03-28 00:00:00

  • Enabling precision medicine in neonatology, an integrated repository for preterm birth research.

    abstract::Preterm birth, or the delivery of an infant prior to 37 weeks of gestation, is a significant cause of infant morbidity and mortality. In the last decade, the advent and continued development of molecular profiling technologies has enabled researchers to generate vast amount of 'omics' data, which together with integra...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Sirota M,Thomas CG,Liu R,Zuhl M,Banerjee P,Wong RJ,Quaintance CC,Leite R,Chubiz J,Anderson R,Chappell J,Kim M,Grobman W,Zhang G,Rokas A,England SK,Parry S,Shaw GM,Simpson JL,Thomson E,Butte AJ,March of Dimes Pre

    updated: 2018-11-06 00:00:00

  • The effects of sequencing platforms on phylogenetic resolution in 16 S rRNA gene profiling of human feces.

    abstract::High-quality and high-throughput sequencing technologies are required for therapeutic and diagnostic analyses of human gut microbiota. Here, we evaluated the advantages and disadvantages of the various commercial sequencing platforms for studying human gut microbiota. We generated fecal bacterial sequences from 170 Ko...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Whon TW,Chung WH,Lim MY,Song EJ,Kim PS,Hyun DW,Shin NR,Bae JW,Nam YD

    updated: 2018-04-24 00:00:00

  • dbPSP 2.0, an updated database of protein phosphorylation sites in prokaryotes.

    abstract::In prokaryotes, protein phosphorylation plays a critical role in regulating a broad spectrum of biological processes and occurs mainly on various amino acids, including serine (S), threonine (T), tyrosine (Y), arginine (R), aspartic acid (D), histidine (H) and cysteine (C) residues of protein substrates. Through liter...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Shi Y,Zhang Y,Lin S,Wang C,Zhou J,Peng D,Xue Y

    updated: 2020-05-29 00:00:00

  • Spatiotemporal dataset on Chinese population distribution and its driving factors from 1949 to 2013.

    abstract::Spatio-temporal data on human population and its driving factors is critical to understanding and responding to population problems. Unfortunately, such spatio-temporal data on a large scale and over the long term are often difficult to obtain. Here, we present a dataset on Chinese population distribution and its driv...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Wang L,Chen L

    updated: 2016-07-05 00:00:00

  • Spatial and temporal analysis of extreme sea level and storm surge events around the coastline of the UK.

    abstract::In this paper we analyse the spatial footprint and temporal clustering of extreme sea level and skew surge events around the UK coast over the last 100 years (1915-2014). The vast majority of the extreme sea level events are generated by moderate, rather than extreme skew surges, combined with spring astronomical high...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Haigh ID,Wadey MP,Wahl T,Ozsoy O,Nicholls RJ,Brown JM,Horsburgh K,Gouldby B

    updated: 2016-12-06 00:00:00

  • A microarray whole-genome gene expression dataset in a rat model of inflammatory corneal angiogenesis.

    abstract::In angiogenesis with concurrent inflammation, many pathways are activated, some linked to VEGF and others largely VEGF-independent. Pathways involving inflammatory mediators, chemokines, and micro-RNAs may play important roles in maintaining a pro-angiogenic environment or mediating angiogenic regression. Here, we des...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Mukwaya A,Lindvall JM,Xeroudaki M,Peebo B,Ali Z,Lennikov A,Jensen LD,Lagali N

    updated: 2016-11-22 00:00:00

  • A multi-omics digital research object for the genetics of sleep regulation.

    abstract::With the aim to uncover the molecular pathways underlying the regulation of sleep, we recently assembled an extensive and comprehensive systems genetics dataset interrogating a genetic reference population of mice at the levels of the genome, the brain and liver transcriptomes, the plasma metabolome, and the sleep-wak...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Jan M,Gobet N,Diessler S,Franken P,Xenarios I

    updated: 2019-10-31 00:00:00

  • Optical motion capture dataset of selected techniques in beginner and advanced Kyokushin karate athletes.

    abstract::Human motion capture is commonly used in various fields, including sport, to analyze, understand, and synthesize kinematic and kinetic data. Specialized computer vision and marker-based optical motion capture techniques constitute the gold-standard for accurate and robust human motion capture. The dataset presented co...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Szczęsna A,Błaszczyszyn M,Pawlyta M

    updated: 2021-01-18 00:00:00

  • Building fault detection data to aid diagnostic algorithm creation and performance testing.

    abstract::It is estimated that approximately 4-5% of national energy consumption can be saved through corrections to existing commercial building controls infrastructure and resulting improvements to efficiency. Correspondingly, automated fault detection and diagnostics (FDD) algorithms are designed to identify the presence of ...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Granderson J,Lin G,Harding A,Im P,Chen Y

    updated: 2020-02-24 00:00:00

  • Long term survey of the fish community and associated benthic fauna of the Seine estuary nursery grounds.

    abstract::Estuaries are crucial ecosystems where human activities deeply affect numerous ecological functions. Here we present a survey dataset based on the monitoring of fish nursery grounds of the Seine estuary and eastern bay of Seine collected once a year using a beam trawl during three distinct periods (1995-2002, 2008-201...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Cariou T,Dubroca L,Vogel C

    updated: 2020-07-13 00:00:00

  • Multi-year whole-blood transcriptome data for the study of onset and progression of Parkinson's Disease.

    abstract::Parkinson's disease (PD) is an age-related, chronic and progressive neurodegenerative disorder characterized by a loss of multifocal neurons, resulting in both non-motor and motor symptoms. While several genetic and environmental contributory risk factors have been identified, more exact methods for diagnosing and ass...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Valentine MNZ,Hashimoto K,Fukuhara T,Saiki S,Ishikawa KI,Hattori N,Carninci P

    updated: 2019-04-05 00:00:00

  • A high-throughput drug combination screen of targeted small molecule inhibitors in cancer cell lines.

    abstract::While there is a high interest in drug combinations in cancer therapy, openly accessible datasets for drug combination responses are sparse. Here we present a dataset comprising 171 pairwise combinations of 19 individual drugs targeting signal transduction mechanisms across eight cancer cell lines, where the effect of...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Flobak Å,Niederdorfer B,Nakstad VT,Thommesen L,Klinkenberg G,Lægreid A

    updated: 2019-10-29 00:00:00

  • Multidisciplinary database of permeability of fault zones and surrounding protolith rocks at world-wide sites.

    abstract::Brittle faults and fault zones are important fluid flow conduits through the upper part of Earth's crust that are involved in many well-known phenomena (e.g. earthquakes, thermal water and gas transport, or water leakage to underground tunnels). The permeability property, or the ability of porous materials to conduct ...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Scibek J

    updated: 2020-03-19 00:00:00

  • Flow cytometry analysis of adrenoceptors expression in human adipose-derived mesenchymal stem/stromal cells.

    abstract::Mesenchymal stem/stromal cells (MSCs) were identified in most tissues of an adult organism. MSCs mediate physiological renewal, as well as regulation of tissue homeostasis, reparation and regeneration. Functions of MSCs are regulated by endocrine and neuronal signals, and noradrenaline is one of the most important MSC...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Tyurin-Kuzmin PA,Dyikanov DT,Fadeeva JI,Sysoeva VY,Kalinina NI

    updated: 2018-10-02 00:00:00

  • De novo transcriptomes of 14 gammarid individuals for proteogenomic analysis of seven taxonomic groups.

    abstract::Gammarids are amphipods found worldwide distributed in fresh and marine waters. They play an important role in aquatic ecosystems and are well established sentinel species in ecotoxicology. In this study, we sequenced the transcriptomes of a male individual and a female individual for seven different taxonomic groups ...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Cogne Y,Degli-Esposti D,Pible O,Gouveia D,François A,Bouchez O,Eché C,Ford A,Geffard O,Armengaud J,Chaumot A,Almunia C

    updated: 2019-09-27 00:00:00

  • Multi-omics profile of the mouse dentate gyrus after kainic acid-induced status epilepticus.

    abstract::Temporal lobe epilepsy (TLE) can develop from alterations in hippocampal structure and circuit characteristics, and can be modeled in mice by administration of kainic acid (KA). Adult neurogenesis in the dentate gyrus (DG) contributes to hippocampal functions and has been reported to contribute to the development of T...

    journal_title:Scientific data

    pub_type: Comment, Journal Article


    authors: Schouten M,Bielefeld P,Fratantoni SA,Hubens CJ,Piersma SR,Pham TV,Voskuyl RA,Lucassen PJ,Jimenez CR,Fitzsimons CP

    updated: 2016-08-16 00:00:00

  • A systematic review and meta-analysis of seroprevalence surveys of ebolavirus infection.

    abstract::Asymptomatic ebolavirus infection could greatly influence transmission dynamics, but there is little consensus on how frequently it occurs or even if it exists. This paper summarises the available evidence on seroprevalence of Ebola, Sudan and Bundibugyo virus IgG in people without known ebolavirus disease. Through sy...

    journal_title:Scientific data

    pub_type: Journal Article, Meta-Analysis, Review


    authors: Bower H,Glynn JR

    updated: 2017-01-31 00:00:00