If these data could talk.

Abstract:

In the last few decades, data-driven methods have come to dominate many fields of scientific inquiry. Open data and open-source software have enabled the rapid implementation of novel methods to manage and analyze the growing flood of data. However, it has become apparent that many scientific fields exhibit distressingly low rates of reproducibility. Although there are many dimensions to this issue, we believe that there is a lack of formalism used when describing end-to-end published results, from the data source to the analysis to the final published results. Even when authors do their best to make their research and data accessible, this lack of formalism reduces the clarity and efficiency of reporting, which contributes to issues of reproducibility. Data provenance aids reproducibility through systematic and formal records of the relationships among data sources, processes, datasets, publications and researchers.

journal_name

Sci Data

journal_title

Scientific data

authors

Pasquier T,Lau MK,Trisovic A,Boose ER,Couturier B,Crosas M,Ellison AM,Gibson V,Jones CR,Seltzer M

doi

10.1038/sdata.2017.114

subject

Has Abstract

pub_date

2017-09-05 00:00:00

pages

170114

issn

2052-4463

pii

sdata2017114

journal_volume

4

pub_type

Journal Article
  • Creating a surrogate commuter network from Australian Bureau of Statistics census data.

    abstract::Between the 2011 and 2016 national censuses, the Australian Bureau of Statistics changed its anonymity policy compliance system for the distribution of census data. The new method has resulted in dramatic inconsistencies when comparing low-resolution data to aggregated high-resolution data. Hence, aggregated totals do...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0137-z

    authors: Fair KM,Zachreson C,Prokopenko M

    update_date:2019-08-16 00:00:00

  • Map of physical interactions between extracellular domains of Arabidopsis leucine-rich repeat receptor kinases.

    abstract::Plants use surface receptors to perceive information about many aspects of their local environment. These receptors physically interact to form both steady state and signalling competent complexes. The signalling events downstream of receptor activation impact both plant developmental and immune responses. Here, we pr...

    journal_title:Scientific data

    pub_type:

    doi:10.1038/sdata.2019.25

    authors: Mott GA,Smakowska-Luzan E,Pasha A,Parys K,Howton TC,Neuhold J,Lehner A,Grünwald K,Stolt-Bergner P,Provart NJ,Mukhtar MS,Desveaux D,Guttman DS,Belkhadir Y

    update_date:2019-02-26 00:00:00

  • Systematic analysis of infectious disease outcomes by age shows lowest severity in school-age children.

    abstract::The COVID-19 pandemic has ignited interest in age-specific manifestations of infection but surprisingly little is known about relative severity of infectious disease between the extremes of age. In a systematic analysis we identified 142 datasets with information on severity of disease by age for 32 different infectio...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-00668-y

    authors: Glynn JR,Moss PAH

    update_date:2020-10-15 00:00:00

  • The effects of sequencing platforms on phylogenetic resolution in 16S rRNA gene profiling of human feces.

    abstract::High-quality and high-throughput sequencing technologies are required for therapeutic and diagnostic analyses of human gut microbiota. Here, we evaluated the advantages and disadvantages of the various commercial sequencing platforms for studying human gut microbiota. We generated fecal bacterial sequences from 170 Ko...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2018.68

    authors: Whon TW,Chung WH,Lim MY,Song EJ,Kim PS,Hyun DW,Shin NR,Bae JW,Nam YD

    update_date:2018-04-24 00:00:00

  • Multiscale dynamic human mobility flow dataset in the U.S. during the COVID-19 epidemic.

    abstract::Understanding dynamic human mobility changes and spatial interaction patterns at different geographic scales is crucial for assessing the impacts of non-pharmaceutical interventions (such as stay-at-home orders) during the COVID-19 pandemic. In this data descriptor, we introduce a regularly-updated multiscale dynamic ...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-00734-5

    authors: Kang Y,Gao S,Liang Y,Li M,Rao J,Kruse J

    update_date:2020-11-12 00:00:00

  • flEECe, an energy use and occupant behavior dataset for net-zero energy affordable senior residential buildings.

    abstract::The behaviors of building occupants have continued to perplex scholars for years in our attempts to develop models for energy efficient housing. Building simulations, project delivery approaches, policies, and more have fallen short of their optimistic goals due to the complexity of human behavior. As a part of a multip...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0275-3

    authors: Paige F,Agee P,Jazizadeh F

    update_date:2019-11-26 00:00:00

  • A database seed for a community-driven material intensity research platform.

    abstract::The data record contains Material Intensity data for buildings (MI). MI coefficients are often used for different types of analysis of socio-economic systems and in particular for environmental assessments. Until now, MI values were compiled and reported ad-hoc with few cross-study comparisons. We extracted and conver...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0021-x

    authors: Heeren N,Fishman T

    update_date:2019-04-09 00:00:00

  • An agricultural survey for more than 9,500 African households.

    abstract::Surveys for more than 9,500 households were conducted in the growing seasons 2002/2003 or 2003/2004 in eleven African countries: Burkina Faso, Cameroon, Ghana, Niger and Senegal in western Africa; Egypt in northern Africa; Ethiopia and Kenya in eastern Africa; South Africa, Zambia and Zimbabwe in southern Africa. Hous...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2016.20

    authors: Waha K,Zipf B,Kurukulasuriya P,Hassan RM

    update_date:2016-05-24 00:00:00

  • Genome-wide barcoded transposon screen for cancer drug sensitivity in haploid mouse embryonic stem cells.

    abstract::We describe a screen for cellular response to drugs that makes use of haploid embryonic stem cells. We generated ten libraries of mutants with piggyBac gene trap transposon integrations, totalling approximately 100,000 mutant clones. Random barcode sequences were inserted into the transposon vector to allow the number...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2017.20

    authors: Pettitt SJ,Krastev DB,Pemberton HN,Fontebasso Y,Frankum J,Rehman FL,Brough R,Song F,Bajrami I,Rafiq R,Wallberg F,Kozarewa I,Fenwick K,Armisen-Garrido J,Swain A,Gulati A,Campbell J,Ashworth A,Lord CJ

    update_date:2017-03-01 00:00:00

  • A data citation roadmap for scholarly data repositories.

    abstract::This article presents a practical roadmap for scholarly data repositories to implement data citation in accordance with the Joint Declaration of Data Citation Principles, a synopsis and harmonization of the recommendations of major science policy bodies. The roadmap was developed by the Repositories Expert Group, as p...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0031-8

    authors: Fenner M,Crosas M,Grethe JS,Kennedy D,Hermjakob H,Rocca-Serra P,Durand G,Berjon R,Karcher S,Martone M,Clark T

    update_date:2019-04-10 00:00:00

  • Building fault detection data to aid diagnostic algorithm creation and performance testing.

    abstract::It is estimated that approximately 4-5% of national energy consumption can be saved through corrections to existing commercial building controls infrastructure and resulting improvements to efficiency. Correspondingly, automated fault detection and diagnostics (FDD) algorithms are designed to identify the presence of ...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-0398-6

    authors: Granderson J,Lin G,Harding A,Im P,Chen Y

    update_date:2020-02-24 00:00:00

  • Comprehensive analysis of the venom gland transcriptome of the spider Dolomedes fimbriatus.

    abstract::A comprehensive transcriptome analysis of an expressed sequence tag (EST) database of the spider Dolomedes fimbriatus venom glands using single-residue distribution analysis (SRDA) identified 7,169 unique sequences. Mature chains of 163 different toxin-like polypeptides were predicted on the basis of well-established ...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2014.23

    authors: Kozlov SA,Lazarev VN,Kostryukova ES,Selezneva OV,Ospanova EA,Alexeev DG,Govorun VM,Grishin EV

    update_date:2014-08-05 00:00:00

  • A microarray whole-genome gene expression dataset in a rat model of inflammatory corneal angiogenesis.

    abstract::In angiogenesis with concurrent inflammation, many pathways are activated, some linked to VEGF and others largely VEGF-independent. Pathways involving inflammatory mediators, chemokines, and micro-RNAs may play important roles in maintaining a pro-angiogenic environment or mediating angiogenic regression. Here, we des...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2016.103

    authors: Mukwaya A,Lindvall JM,Xeroudaki M,Peebo B,Ali Z,Lennikov A,Jensen LD,Lagali N

    update_date:2016-11-22 00:00:00

  • Transcriptome sequencing, molecular markers, and transcription factor discovery of Platanus acerifolia in the presence of Corythucha ciliata.

    abstract::The London Planetree (Platanus acerifolia) is present throughout the world. The tree is considered a greening plant and is commonly planted in streets, parks, and courtyards. The Sycamore lace bug (Corythucha ciliata) is a serious pest of this tree. To determine the molecular mechanism behind the interaction between ...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0111-9

    authors: Li F,Wu C,Gao M,Jiao M,Qu C,Gonzalez-Uriarte A,Luo C

    update_date:2019-07-22 00:00:00

  • The effect of 16S rRNA region choice on bacterial community metabarcoding results.

    abstract::In this work, we compare the resolution of V2-V3 and V3-V4 16S rRNA regions for the purposes of estimating microbial community diversity using paired-end Illumina MiSeq reads, and show that the fragment, including V2 and V3 regions, has higher resolution for lower-rank taxa (genera and species). It allows for a more p...

    journal_title:Scientific data

    pub_type:

    doi:10.1038/sdata.2019.7

    authors: Bukin YS,Galachyants YP,Morozov IV,Bukin SV,Zakharenko AS,Zemskaya TI

    update_date:2019-02-05 00:00:00

  • Highly sampled measurements in a controlled atmosphere at the Biosphere 2 Landscape Evolution Observatory.

    abstract::Land-atmosphere interactions at different temporal and spatial scales are important for our understanding of the Earth system and its modeling. The Landscape Evolution Observatory (LEO) at Biosphere 2, managed by the University of Arizona, hosts three nearly identical artificial bare-soil hillslopes with dimensions of...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-00645-5

    authors: Arevalo J,Zeng X,Durcik M,Sibayan M,Pangle L,Abramson N,Bugaj A,Ng WR,Kim M,Barron-Gafford G,van Haren J,Niu GY,Adams J,Ruiz J,Troch PA

    update_date:2020-09-15 00:00:00

  • A database of geopositioned Middle East Respiratory Syndrome Coronavirus occurrences.

    abstract::As a World Health Organization Research and Development Blueprint priority pathogen, there is a need to better understand the geographic distribution of Middle East Respiratory Syndrome Coronavirus (MERS-CoV) and its potential to infect mammals and humans. This database documents cases of MERS-CoV globally, with speci...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0330-0

    authors: Ramshaw RE,Letourneau ID,Hong AY,Hon J,Morgan JD,Osborne JCP,Shirude S,Van Kerkhove MD,Hay SI,Pigott DM

    update_date:2019-12-13 00:00:00

  • Daily transcriptomes of the copepod Calanus finmarchicus during the summer solstice at high Arctic latitudes.

    abstract::The zooplankter Calanus finmarchicus is a member of the so-called "Calanus Complex", a group of copepods that constitutes a key element of the Arctic polar marine ecosystem, providing a crucial link between primary production and higher trophic levels. Climate change induces the shift of C. finmarchicus to higher lati...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-00751-4

    authors: Payton L,Noirot C,Hoede C,Hüppe L,Last K,Wilcockson D,Ershova EA,Valière S,Meyer B

    update_date:2020-11-24 00:00:00

  • A systematic review and meta-analysis of seroprevalence surveys of ebolavirus infection.

    abstract::Asymptomatic ebolavirus infection could greatly influence transmission dynamics, but there is little consensus on how frequently it occurs or even if it exists. This paper summarises the available evidence on seroprevalence of Ebola, Sudan and Bundibugyo virus IgG in people without known ebolavirus disease. Through sy...

    journal_title:Scientific data

    pub_type: Journal Article, Meta-Analysis, Review

    doi:10.1038/sdata.2016.133

    authors: Bower H,Glynn JR

    update_date:2017-01-31 00:00:00

  • An archive of longitudinal recordings of the vocalizations of adult Gombe chimpanzees.

    abstract::Studies of chimpanzee vocal communication provide valuable insights into the evolution of communication in complex societies, and also comparative data for understanding the evolution of human language. One particularly valuable dataset of recordings from free-living chimpanzees was collected by Frans X. Plooij and th...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2015.27

    authors: Plooij FX,van de Rijt-Plooij H,Fischer M,Wilson ML,Pusey A

    update_date:2015-05-26 00:00:00

  • A kinematic and kinetic dataset of 18 above-knee amputees walking at various speeds.

    abstract::Motion capture is necessary to quantify gait deviations in individuals with lower-limb amputations. However, access to the patient population and the necessary equipment is limited. Here we present the first open biomechanics dataset for 18 individuals with unilateral above-knee amputations walking at different speeds...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-0494-7

    authors: Hood S,Ishmael MK,Gunnell A,Foreman KB,Lenzi T

    update_date:2020-05-21 00:00:00

  • The Coral Trait Database, a curated database of trait information for coral species from the global oceans.

    abstract::Trait-based approaches advance ecological and evolutionary research because traits provide a strong link to an organism's function and fitness. Trait-based research might lead to a deeper understanding of the functions of, and services provided by, ecosystems, thereby improving management, which is vital in the curren...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2016.17

    authors: Madin JS,Anderson KD,Andreasen MH,Bridge TC,Cairns SD,Connolly SR,Darling ES,Diaz M,Falster DS,Franklin EC,Gates RD,Harmer A,Hoogenboom MO,Huang D,Keith SA,Kosnik MA,Kuo CY,Lough JM,Lovelock CE,Luiz O,Martinelli J

    update_date:2016-03-29 00:00:00

  • A three-dimensional thalamocortical dataset for characterizing brain heterogeneity.

    abstract::Neural microarchitecture is heterogeneous, varying both across and within brain regions. The consistent identification of regions of interest is one of the most critical aspects in examining neurocircuitry, as these structures serve as the vital landmarks with which to map brain pathways. Access to continuous, three-d...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-00692-y

    authors: Prasad JA,Balwani AH,Johnson EC,Miano JD,Sampathkumar V,De Andrade V,Fezzaa K,Du M,Vescovi R,Jacobsen C,Kording KP,Gürsoy D,Gray Roncal W,Kasthuri N,Dyer EL

    update_date:2020-10-20 00:00:00

  • Fractionation of parietal function in bistable perception probed with concurrent TMS-EEG.

    abstract::When visual input has conflicting interpretations, conscious perception can alternate spontaneously between these possible interpretations. This is called bistable perception. Previous neuroimaging studies have indicated the involvement of two right parietal areas in resolving perceptual ambiguity (ant-SPLr and post-S...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2016.65

    authors: Schauer G,Chang A,Schwartzman D,Rae CL,Iriye H,Seth AK,Kanai R

    update_date:2016-08-16 00:00:00

  • Imaging and clinical data archive for head and neck squamous cell carcinoma patients treated with radiotherapy.

    abstract::Cross sectional imaging is essential for the patient-specific planning and delivery of radiotherapy, a primary determinant of head and neck cancer outcomes. Due to challenges ensuring data quality and patient de-identification, publicly available datasets including diagnostic and radiation treatment planning imaging a...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2018.173

    authors: Grossberg AJ,Mohamed ASR,Elhalawani H,Bennett WC,Smith KE,Nolan TS,Williams B,Chamchod S,Heukelom J,Kantor ME,Browne T,Hutcheson KA,Gunn GB,Garden AS,Morrison WH,Frank SJ,Rosenthal DI,Freymann JB,Fuller CD

    update_date:2018-09-04 00:00:00

  • Whole genome characterization of sequence diversity of 15,220 Icelanders.

    abstract::Understanding of sequence diversity is the cornerstone of analysis of genetic disorders, population genetics, and evolutionary biology. Here, we present an update of our sequencing set to 15,220 Icelanders who we sequenced to an average genome-wide coverage of 34X. We identified 39,020,168 autosomal variants passing G...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2017.115

    authors: Jónsson H,Sulem P,Kehr B,Kristmundsdottir S,Zink F,Hjartarson E,Hardarson MT,Hjorleifsson KE,Eggertsson HP,Gudjonsson SA,Ward LD,Arnadottir GA,Helgason EA,Helgason H,Gylfason A,Jonasdottir A,Jonasdottir A,Rafnar T,Bes

    update_date:2017-09-21 00:00:00

  • Direct infusion mass spectrometry metabolomics dataset: a benchmark for data processing and quality control.

    abstract::Direct-infusion mass spectrometry (DIMS) metabolomics is an important approach for characterising molecular responses of organisms to disease, drugs and the environment. Increasingly large-scale metabolomics studies are being conducted, necessitating improvements in both bioanalytical and computational workflows to ma...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2014.12

    authors: Kirwan JA,Weber RJ,Broadhurst DI,Viant MR

    update_date:2014-06-10 00:00:00

  • Transcriptome profiling of interaction effects of soybean cyst nematodes and soybean aphids on soybean.

    abstract::Soybean aphid (Aphis glycines; SBA) and soybean cyst nematode (Heterodera glycines; SCN) are two major pests of soybean (Glycine max) in the United States of America. This study aims to characterize three-way interactions among soybean, SBA, and SCN using both demographic and genetic datasets. SCN-resistant and SCN-su...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0140-4

    authors: Neupane S,Mathew FM,Varenhorst AJ,Nepal MP

    update_date:2019-07-24 00:00:00

  • Obstacles to the reuse of study metadata in ClinicalTrials.gov.

    abstract::Metadata that are structured using principled schemas and that use terms from ontologies are essential to making biomedical data findable and reusable for downstream analyses. The largest source of metadata that describes the experimental protocol, funding, and scientific leadership of clinical studies is ClinicalTria...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-00780-z

    authors: Miron L,Gonçalves RS,Musen MA

    update_date:2020-12-18 00:00:00

  • Genome-wide polyadenylation site mapping datasets in the rice blast fungus Magnaporthe oryzae.

    abstract::Polyadenylation plays an important role in gene regulation, thus affecting a wide variety of biological processes. In the rice blast fungus Magnaporthe oryzae the cleavage factor I protein Rpb35 is required for pre-mRNA polyadenylation and fungal virulence. Here we present the bioinformatic approach and output data re...

    journal_title:Scientific data

    pub_type:

    doi:10.1038/sdata.2018.271

    authors: Marconi M,Sesma A,Rodríguez-Romero JL,González MLR,Wilkinson MD

    update_date:2018-11-27 00:00:00