Construction of the REACHES climate database based on historical documents of China.


This paper describes the methodology of an ongoing project to construct REACHES, an East Asian climate database based on Chinese historical documents. The record source is the Compendium of Meteorological Records of China in the Last 3000 Years, which collects meteorological and climate-related records drawn mainly from official and local chronicles, along with a small number of other documents. We report the digitization of the records covering the period 1644-1795. An example of the original records is translated to illustrate their typical contents, which include the time, location, and type of each event. Chinese historical dates and place names are converted into Gregorian calendar dates and into latitudes and longitudes. A hierarchical database system is developed, consisting of domains, main categories, subcategories, and further details. Historical events are then digitized and categorized within this system. Code systems are developed at all levels so that the original descriptive entries are converted into digitized records suitable for computer processing. Statistics and characteristics of the digitized records in the database are described.
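As a rough illustration of the hierarchical coding idea in the abstract, the sketch below stores one event as a four-level code (domain, main category, subcategory, detail) together with a Gregorian date and coordinates. Every name, code value, and the lookup table here is a hypothetical assumption made for illustration; none of it is taken from the REACHES database itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EventCode:
    """Four-level code mirroring the hierarchy named in the abstract."""
    domain: int
    main_category: int
    subcategory: int
    detail: int = 0  # 0 means "no further detail recorded" (assumption)

    def as_string(self) -> str:
        # Hypothetical display format: D-MM-SS-TT
        return (f"{self.domain}-{self.main_category:02d}-"
                f"{self.subcategory:02d}-{self.detail:02d}")


@dataclass(frozen=True)
class Record:
    """One digitized event: when, where, and what."""
    gregorian_year: int
    gregorian_month: Optional[int]  # None when the source gives no month
    latitude: float
    longitude: float
    code: EventCode


# Hypothetical lookup from a descriptive phrase to a code; the real
# code tables are not given in the abstract.
HYPOTHETICAL_CODES = {
    "heavy rain": EventCode(domain=1, main_category=2, subcategory=3),
    "drought": EventCode(domain=1, main_category=4, subcategory=1),
}


def encode(description: str, year: int, month: Optional[int],
           lat: float, lon: float) -> Record:
    """Turn a descriptive entry into a coded record (illustrative only)."""
    code = HYPOTHETICAL_CODES[description]
    return Record(year, month, lat, lon, code)


record = encode("heavy rain", 1700, 6, 31.2, 121.5)
print(record.code.as_string())
```

Keeping each hierarchy level as a separate integer field, rather than a single string, would let records be filtered at any level (for example, all events in one domain) without parsing.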


Sci Data


Scientific data


Wang PK,Lin KE,Liao YC,Liao HM,Lin YS,Hsu CT,Hsu SM,Wan CW,Lee SY,Fan IC,Tan PH,Ting TT






2018-12-18 00:00:00










  • Comprehensive analysis of the venom gland transcriptome of the spider Dolomedes fimbriatus.

    abstract::A comprehensive transcriptome analysis of an expressed sequence tag (EST) database of the spider Dolomedes fimbriatus venom glands using single-residue distribution analysis (SRDA) identified 7,169 unique sequences. Mature chains of 163 different toxin-like polypeptides were predicted on the basis of well-established ...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Kozlov SA,Lazarev VN,Kostryukova ES,Selezneva OV,Ospanova EA,Alexeev DG,Govorun VM,Grishin EV

    update_date:2014-08-05 00:00:00

  • Construction, complete sequence, and annotation of a BAC contig covering the silkworm chorion locus.

    abstract::The silkmoth chorion was studied extensively by F.C. Kafatos' group for almost 40 years. However, the complete structure of the chorion locus was not obtained in the genome sequence of Bombyx mori published in 2008 due to repetitive sequences, resulting in gaps and an incomplete view of the locus. To obtain the comple...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Chen Z,Nohata J,Guo H,Li S,Liu J,Guo Y,Yamamoto K,Kadono-Okuda K,Liu C,Arunkumar KP,Nagaraju J,Zhang Y,Liu S,Labropoulou V,Swevers L,Tsitoura P,Iatrou K,Gopinathan KP,Goldsmith MR,Xia Q,Mita K

    update_date:2015-11-10 00:00:00

  • Publisher Correction: The Scales Project, a cross-national dataset on the interpretation of thermal perception scales.

    abstract::An amendment to this paper has been published and can be accessed via a link at the top of the paper. ...

    journal_title:Scientific data

    pub_type: Journal Article, Published Erratum


    authors: Schweiker M,Abdul-Zahra A,André M,Al-Atrash F,Al-Khatri H,Alprianti RR,Alsaad H,Amin R,Ampatzi E,Arsano AY,Azadeh M,Azar E,Bahareh B,Batagarawa A,Becker S,Buonocore C,Cao B,Choi JH,Chun C,Daanen H,Damiati SA,Dan

    update_date:2020-01-06 00:00:00

  • Tesco Grocery 1.0, a large-scale dataset of grocery purchases in London.

    abstract::We present the Tesco Grocery 1.0 dataset: a record of 420 M food items purchased by 1.6 M fidelity card owners who shopped at the 411 Tesco stores in Greater London over the course of the entire year of 2015, aggregated at the level of census areas to preserve anonymity. For each area, we report the number of transact...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Aiello LM,Quercia D,Schifanella R,Del Prete L

    update_date:2020-02-18 00:00:00

  • Flow and detailed 3D morphodynamic data from laboratory experiments of fluvial dike breaching.

    abstract::This paper presents a dataset obtained from fifty four laboratory experiments of the breaching of fluvial dikes due to flow overtopping. Data were collected on two complementary experimental setups, each consisting of a main channel representing the river, an erodible lateral dike and a floodplain. The dataset covers ...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Rifai I,El Kadi Abderrezzak K,Erpicum S,Archambeau P,Violeau D,Pirotton M,Dewals B

    update_date:2019-05-13 00:00:00

  • A test-retest dataset for assessing long-term reliability of brain morphology and resting-state brain activity.

    abstract::We present a test-retest dataset for evaluation of long-term reliability of measures from structural and resting-state functional magnetic resonance imaging (sMRI and rfMRI) scans. The repeated scan dataset was collected from 61 healthy adults in two sessions using highly similar imaging parameters at an interval of 1...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Huang L,Huang T,Zhen Z,Liu J

    update_date:2016-03-15 00:00:00

  • dbPSP 2.0, an updated database of protein phosphorylation sites in prokaryotes.

    abstract::In prokaryotes, protein phosphorylation plays a critical role in regulating a broad spectrum of biological processes and occurs mainly on various amino acids, including serine (S), threonine (T), tyrosine (Y), arginine (R), aspartic acid (D), histidine (H) and cysteine (C) residues of protein substrates. Through liter...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Shi Y,Zhang Y,Lin S,Wang C,Zhou J,Peng D,Xue Y

    update_date:2020-05-29 00:00:00

  • Multi-year whole-blood transcriptome data for the study of onset and progression of Parkinson's Disease.

    abstract::Parkinson's disease (PD) is an age-related, chronic and progressive neurodegenerative disorder characterized by a loss of multifocal neurons, resulting in both non-motor and motor symptoms. While several genetic and environmental contributory risk factors have been identified, more exact methods for diagnosing and ass...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Valentine MNZ,Hashimoto K,Fukuhara T,Saiki S,Ishikawa KI,Hattori N,Carninci P

    update_date:2019-04-05 00:00:00

  • The Dat Project, an open and decentralized research data tool.

    abstract::Today's scientific data are primarily stored and accessed via centralized Web-based infrastructure. Centralization has advantages but also carries risks such as link rot and content drift, which can hinder scientific progress. It is time to ask whether traditional, centralized Web architecture aligns with scholarly pr...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Robinson DC,Hand JA,Madsen MB,McKelvey KR

    update_date:2018-10-23 00:00:00

  • Arbovirus emergence in the temperate city of Córdoba, Argentina, 2009-2018.

    abstract::The distribution of arbovirus disease transmission is expanding from the tropics and subtropics into temperate regions worldwide. The temperate city of Córdoba, Argentina has been experiencing the emergence of dengue virus, transmitted by the mosquito Aedes aegypti, since 2009, when autochthonous transmission of the v...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Robert MA,Tinunin DT,Benitez EM,Ludueña-Almeida FF,Romero M,Stewart-Ibarra AM,Estallo EL

    update_date:2019-11-21 00:00:00

  • A high-content image-based drug screen of clinical compounds against cell transmission of adenovirus.

    abstract::Human adenoviruses (HAdVs) are fatal to immuno-suppressed individuals, but no effective anti-HAdV therapy is available. Here, we present a novel image-based high-throughput screening (HTS) platform, which scores the full viral replication cycle from virus entry to dissemination of progeny and second-round infections. ...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Georgi F,Kuttler F,Murer L,Andriasyan V,Witte R,Yakimovich A,Turcatti G,Greber UF

    update_date:2020-08-12 00:00:00

  • Multiscale dynamic human mobility flow dataset in the U.S. during the COVID-19 epidemic.

    abstract::Understanding dynamic human mobility changes and spatial interaction patterns at different geographic scales is crucial for assessing the impacts of non-pharmaceutical interventions (such as stay-at-home orders) during the COVID-19 pandemic. In this data descriptor, we introduce a regularly-updated multiscale dynamic ...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Kang Y,Gao S,Liang Y,Li M,Rao J,Kruse J

    update_date:2020-11-12 00:00:00

  • Long-term observation of amphibian populations inhabiting urban and forested areas in Yekaterinburg, Russia.

    abstract::This article presents data derived from a 36 year-long uninterrupted observational study of amphibian populations living in the city and vicinity of Yekaterinburg, Russia. This area is inhabited by six amphibian species. Based on a degree of anthropogenic transformation, the urban territory is divided into five highly...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Vershinin VL,Vershinina SD,Berzin DL,Zmeeva DV,Kinev AV

    update_date:2015-05-12 00:00:00

  • Epigenetic and transcriptional profiling of triple negative breast cancer.

    abstract::The human HCC1806 cell line is frequently used as a preclinical model for triple negative breast cancer (TNBC). Given that dysregulated epigenetic mechanisms are involved in cancer pathogenesis, emerging therapeutic strategies target chromatin regulators, such as histone deacetylases. A comprehensive understanding of ...

    journal_title:Scientific data



    authors: Perreault AA,Sprunger DM,Venters BJ

    update_date:2019-03-05 00:00:00

  • Experimental flows through an array of emerged or slightly submerged square cylinders over a rough bed.

    abstract::The experimental dataset presented was collected in an 18 m long and 1 m wide laboratory flume. Low to high flood flows through an urbanized floodplain were modelled. The floodplain bed is rough, modelled with dense artificial grass. A square cylinder array, representing house models, was set on the rough bed. The cyl...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Oukacine M,Proust S,Larrarte F,Goutal N

    update_date:2021-01-11 00:00:00

  • Comprehensive draft of the mouse embryonic fibroblast lysosomal proteome by mass spectrometry based proteomics.

    abstract::Lysosomes are the main degradative organelles of cells and involved in a variety of processes including the recycling of macromolecules, storage of compounds, and metabolic signaling. Despite an increasing interest in the proteomic analysis of lysosomes, no systematic study of sample preparation protocols for lysosome...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Ponnaiyan S,Akter F,Singh J,Winter D

    update_date:2020-02-26 00:00:00

  • A high-throughput drug combination screen of targeted small molecule inhibitors in cancer cell lines.

    abstract::While there is a high interest in drug combinations in cancer therapy, openly accessible datasets for drug combination responses are sparse. Here we present a dataset comprising 171 pairwise combinations of 19 individual drugs targeting signal transduction mechanisms across eight cancer cell lines, where the effect of...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Flobak Å,Niederdorfer B,Nakstad VT,Thommesen L,Klinkenberg G,Lægreid A

    update_date:2019-10-29 00:00:00

  • An annotated fluorescence image dataset for training nuclear segmentation methods.

    abstract::Fully-automated nuclear image segmentation is the prerequisite to ensure statistically significant, quantitative analyses of tissue preparations, applied in digital pathology or quantitative microscopy. The design of segmentation methods that work independently of the tissue type or preparation is complex, due to varia...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Kromp F,Bozsaky E,Rifatbegovic F,Fischer L,Ambros M,Berneder M,Weiss T,Lazic D,Dörr W,Hanbury A,Beiske K,Ambros PF,Ambros IM,Taschner-Mandl S

    update_date:2020-08-11 00:00:00

  • Draft genome of the big-headed turtle Platysternon megacephalum.

    abstract::The big-headed turtle, Platysternon megacephalum, as the sole member of the monotypic family Platysternidae, has a number of distinct characteristics including an extra-large head, long tail, flat carapace, and a preference for low water temperature environments. We performed whole genome sequencing, assembly, and gen...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Cao D,Wang M,Ge Y,Gong S

    update_date:2019-05-16 00:00:00

  • Optical motion capture dataset of selected techniques in beginner and advanced Kyokushin karate athletes.

    abstract::Human motion capture is commonly used in various fields, including sport, to analyze, understand, and synthesize kinematic and kinetic data. Specialized computer vision and marker-based optical motion capture techniques constitute the gold-standard for accurate and robust human motion capture. The dataset presented co...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Szczęsna A,Błaszczyszyn M,Pawlyta M

    update_date:2021-01-18 00:00:00

  • One millennium of historical freshwater fish occurrence data for Portuguese rivers and streams.

    abstract::The insights that historical evidence of human presence and man-made documents provide are unique. For example, using historical data may be critical to adequately understand the ecological requirements of species. However, historical information about freshwater species distribution remains largely a knowledge gap. I...

    journal_title:Scientific data

    pub_type: Historical Article, Journal Article


    authors: Duarte G,Moreira M,Branco P,da Costa L,Ferreira MT,Segurado P

    update_date:2018-08-14 00:00:00

  • Harmonised LUCAS in-situ land cover and use database for field surveys from 2006 to 2018 in the European Union.

    abstract::Accurately characterizing land surface changes with Earth Observation requires geo-located ground truth. In the European Union (EU), a tri-annual surveyed sample of land cover and land use has been collected since 2006 under the Land Use/Cover Area frame Survey (LUCAS). A total of 1351293 observations at 651780 unique...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: d'Andrimont R,Yordanov M,Martinez-Sanchez L,Eiselt B,Palmieri A,Dominici P,Gallego J,Reuter HI,Joebges C,Lemoine G,van der Velde M

    update_date:2020-10-16 00:00:00

  • Sample descriptors linked to metagenomic sequencing data from human and animal enteric samples from Vietnam.

    abstract::There is still limited information on the diversity of viruses co-circulating in humans and animals. Here, we report data obtained from a large field collection of enteric samples taken from humans, pigs, rodents and other mammal hosts in Vietnam between 2012 and 2016. Each of 2100 stool or rectal swab samples was sub...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Woolhouse M,Ashworth J,Bogaardt C,Tue NT,Baker S,Thwaites G,Phuc TM

    update_date:2019-10-15 00:00:00

  • A database of geopositioned Middle East Respiratory Syndrome Coronavirus occurrences.

    abstract::As a World Health Organization Research and Development Blueprint priority pathogen, there is a need to better understand the geographic distribution of Middle East Respiratory Syndrome Coronavirus (MERS-CoV) and its potential to infect mammals and humans. This database documents cases of MERS-CoV globally, with speci...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Ramshaw RE,Letourneau ID,Hong AY,Hon J,Morgan JD,Osborne JCP,Shirude S,Van Kerkhove MD,Hay SI,Pigott DM

    update_date:2019-12-13 00:00:00

  • A multi-omics dataset of heat-shock response in the yeast RNA binding protein Mip6.

    abstract::Gene expression is a biological process regulated at different molecular levels, including chromatin accessibility, transcription, and RNA maturation and transport. In addition, these regulatory mechanisms have strong links with cellular metabolism. Here we present a multi-omics dataset that captures different aspects...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Nuño-Cabanes C,Ugidos M,Tarazona S,Martín-Expósito M,Ferrer A,Rodríguez-Navarro S,Conesa A

    update_date:2020-02-27 00:00:00

  • Oral microbiota and dental caries data from monozygotic and dizygotic twin children.

    abstract::There are recent studies which aimed to detect the inheritance on the etiology of dental caries exploring oral composition. We present data on the oral microbiota and its relation with dental caries and other factors in monozygotic (MZ) and dizygotic (DZ) twin children. Following clinical investigation, DNA samples we...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Kasimoglu Y,Koruyucu M,Birant S,Karacan I,Topcuoglu N,Tuna EB,Gencay K,Seymen F

    update_date:2020-10-13 00:00:00

  • Obstacles to the reuse of study metadata in

    abstract::Metadata that are structured using principled schemas and that use terms from ontologies are essential to making biomedical data findable and reusable for downstream analyses. The largest source of metadata that describes the experimental protocol, funding, and scientific leadership of clinical studies is ClinicalTria...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Miron L,Gonçalves RS,Musen MA

    update_date:2020-12-18 00:00:00

  • The effects of sequencing platforms on phylogenetic resolution in 16 S rRNA gene profiling of human feces.

    abstract::High-quality and high-throughput sequencing technologies are required for therapeutic and diagnostic analyses of human gut microbiota. Here, we evaluated the advantages and disadvantages of the various commercial sequencing platforms for studying human gut microbiota. We generated fecal bacterial sequences from 170 Ko...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Whon TW,Chung WH,Lim MY,Song EJ,Kim PS,Hyun DW,Shin NR,Bae JW,Nam YD

    update_date:2018-04-24 00:00:00

  • Spatial and temporal dynamics of multidimensional well-being, livelihoods and ecosystem services in coastal Bangladesh.

    abstract::Populations in resource dependent economies gain well-being from the natural environment, in highly spatially and temporally variable patterns. To collect information on this, we designed and implemented a 1586-household quantitative survey in the southwest coastal zone of Bangladesh. Data were collected on material, ...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Adams H,Adger WN,Ahmad S,Ahmed A,Begum D,Lázár AN,Matthews Z,Rahman MM,Streatfield PK

    update_date:2016-11-08 00:00:00

  • A suite of global accessibility indicators.

    abstract::Good access to resources and opportunities is essential for sustainable development. Improving access, especially in rural areas, requires useful measures of current access to the locations where these resources and opportunities are found. Recent work has developed a global map of travel times to cities with more tha...

    journal_title:Scientific data

    pub_type: Journal Article


    authors: Nelson A,Weiss DJ,van Etten J,Cattaneo A,McMenomy TS,Koo J

    update_date:2019-11-07 00:00:00