Curated compendium of human transcriptional biomarker data.

Abstract:

One important use of genome-wide transcriptional profiles is to identify relationships between transcription levels and patient outcomes. These translational insights can guide the development of biomarkers for clinical application. Data from thousands of translational-biomarker studies have been deposited in public repositories, enabling reuse. However, data-reuse efforts require considerable time and expertise because transcriptional data are generated using heterogeneous profiling technologies, preprocessed using diverse normalization procedures, and annotated in non-standard ways. To address this problem, we curated 45 publicly available, translational-biomarker datasets from a variety of human diseases. To increase the data's utility, we reprocessed the raw expression data using a uniform computational pipeline, addressed quality-control problems, mapped the clinical annotations to a controlled vocabulary, and prepared consistently structured, analysis-ready data files. These data, along with scripts we used to prepare the data, are available in a public repository. We believe these data will be particularly useful to researchers seeking to perform benchmarking studies, for example, to compare and optimize machine-learning algorithms' ability to predict biomedical outcomes.

journal_name

Sci Data

journal_title

Scientific data

authors

Golightly NP,Bell A,Bischoff AI,Hollingsworth PD,Piccolo SR

doi

10.1038/sdata.2018.66

subject

Has Abstract

pub_date

2018-04-17 00:00:00

pages

180066

issn

2052-4463

pii

sdata201866

journal_volume

5

pub_type

Journal Article
  • Corrigendum: High-throughput RNAi screen for essential genes and drug synergistic combinations in colorectal cancer.

    abstract::This corrects the article DOI: 10.1038/sdata.2017.139. ...

    journal_title:Scientific data

    pub_type: Journal Article, Published Erratum

    doi:10.1038/sdata.2018.215

    authors: Williams SP,Barthorpe AS,Lightfoot H,Garnett MJ,McDermott U

    update_date:2018-10-09 00:00:00

  • Jaw biodynamic data for 24 patients with chronic unilateral temporomandibular disorder.

    abstract::This study assessed 24 adult patients, suffering from severe chronic unilateral pain diagnosed as temporomandibular joint (TMJ) disorder (TMD). The full dentate patients had normal occlusion and had never received an occlusal therapy, i.e., were with natural dental evolution/maturation. The following functional and dy...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2017.168

    authors: López-Cedrún J,Santana-Mora U,Pombo M,Pérez Del Palomar A,Alonso De la Peña V,Mora MJ,Santana U

    update_date:2017-11-07 00:00:00

  • Corrigendum: Metagenome sequencing and 98 microbial genomes from Juan de Fuca Ridge flank subsurface fluids.

    abstract::This corrects the article DOI: 10.1038/sdata.2017.37. ...

    journal_title:Scientific data

    pub_type: Journal Article, Published Erratum

    doi:10.1038/sdata.2017.80

    authors: Jungbluth SP,Amend JP,Rappé MS

    update_date:2017-07-04 00:00:00

  • Spatial and temporal dynamics of multidimensional well-being, livelihoods and ecosystem services in coastal Bangladesh.

    abstract::Populations in resource dependent economies gain well-being from the natural environment, in highly spatially and temporally variable patterns. To collect information on this, we designed and implemented a 1586-household quantitative survey in the southwest coastal zone of Bangladesh. Data were collected on material, ...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2016.94

    authors: Adams H,Adger WN,Ahmad S,Ahmed A,Begum D,Lázár AN,Matthews Z,Rahman MM,Streatfield PK

    update_date:2016-11-08 00:00:00

  • An agricultural survey for more than 9,500 African households.

    abstract::Surveys for more than 9,500 households were conducted in the growing seasons 2002/2003 or 2003/2004 in eleven African countries: Burkina Faso, Cameroon, Ghana, Niger and Senegal in western Africa; Egypt in northern Africa; Ethiopia and Kenya in eastern Africa; South Africa, Zambia and Zimbabwe in southern Africa. Hous...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2016.20

    authors: Waha K,Zipf B,Kurukulasuriya P,Hassan RM

    update_date:2016-05-24 00:00:00

  • Transcriptome data of temporal and cingulate cortex in the Rett syndrome brain.

    abstract::Rett syndrome is an X-linked neurodevelopmental disorder caused by mutation in the methyl-CpG-binding protein 2 gene (MECP2) in the majority of cases. We describe an RNA sequencing dataset of postmortem brain tissue samples from four females clinically diagnosed with Rett syndrome and four age-matched female donors. T...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-0527-2

    authors: Aldinger KA,Timms AE,MacDonald JW,McNamara HK,Herstein JS,Bammler TK,Evgrafov OV,Knowles JA,Levitt P

    update_date:2020-06-19 00:00:00

  • A data set of global river networks and corresponding water resources zones divisions.

    abstract::As basic data, river networks and water resources zones (WRZ) are critical for the planning, utilization, development, conservation and management of water resources. Currently, the river networks and WRZ of the world are mostly derived automatically from digital elevation model data, which are not accurate enough, espe...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0243-y

    authors: Yan D,Wang K,Qin T,Weng B,Wang H,Bi W,Li X,Li M,Lv Z,Liu F,He S,Ma J,Shen Z,Wang J,Bai H,Man Z,Sun C,Liu M,Shi X,Jing L,Sun R,Cao S,Hao C,Wang L,Pei M,Dorjsuren B,Gedefaw M,Girma A,Abiyu A

    update_date:2019-10-22 00:00:00

  • Direct infusion mass spectrometry metabolomics dataset: a benchmark for data processing and quality control.

    abstract::Direct-infusion mass spectrometry (DIMS) metabolomics is an important approach for characterising molecular responses of organisms to disease, drugs and the environment. Increasingly large-scale metabolomics studies are being conducted, necessitating improvements in both bioanalytical and computational workflows to ma...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2014.12

    authors: Kirwan JA,Weber RJ,Broadhurst DI,Viant MR

    update_date:2014-06-10 00:00:00

  • Ground reference data for sugarcane biomass estimation in São Paulo state, Brazil.

    abstract::In order to make effective decisions on sustainable development, it is essential for sugarcane-producing countries to take into account sugarcane acreage and sugarcane production dynamics. The availability of sugarcane biophysical data along the growth season is key to an effective mapping of such dynamics, especially...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2018.150

    authors: Molijn RA,Iannini L,Rocha JV,Hanssen RF

    update_date:2018-08-07 00:00:00

  • The Centennial Trends Greater Horn of Africa precipitation dataset.

    abstract::East Africa is a drought prone, food and water insecure region with a highly variable climate. This complexity makes rainfall estimation challenging, and this challenge is compounded by low rain gauge densities and inhomogeneous monitoring networks. The dearth of observations is particularly problematic over the past ...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2015.50

    authors: Funk C,Nicholson SE,Landsfeld M,Klotter D,Peterson P,Harrison L

    update_date:2015-09-29 00:00:00

  • A functional trait database for Mediterranean Basin plants.

    abstract::Functional trait databases are emerging as crucial tools for a wide range of ecological studies across the world. Here, we provide a database of functional traits for vascular plant species of the Mediterranean Basin. The database includes 25,764 individual records of 44 traits from 2,457 plant taxa distributed in 119...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2018.135

    authors: Tavşanoğlu Ç,Pausas JG

    update_date:2018-07-10 00:00:00

  • If these data could talk.

    abstract::In the last few decades, data-driven methods have come to dominate many fields of scientific inquiry. Open data and open-source software have enabled the rapid implementation of novel methods to manage and analyze the growing flood of data. However, it has become apparent that many scientific fields exhibit distressin...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2017.114

    authors: Pasquier T,Lau MK,Trisovic A,Boose ER,Couturier B,Crosas M,Ellison AM,Gibson V,Jones CR,Seltzer M

    update_date:2017-09-05 00:00:00

  • A high-content image-based drug screen of clinical compounds against cell transmission of adenovirus.

    abstract::Human adenoviruses (HAdVs) are fatal to immuno-suppressed individuals, but no effective anti-HAdV therapy is available. Here, we present a novel image-based high-throughput screening (HTS) platform, which scores the full viral replication cycle from virus entry to dissemination of progeny and second-round infections. ...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-00604-0

    authors: Georgi F,Kuttler F,Murer L,Andriasyan V,Witte R,Yakimovich A,Turcatti G,Greber UF

    update_date:2020-08-12 00:00:00

  • Construction, complete sequence, and annotation of a BAC contig covering the silkworm chorion locus.

    abstract::The silkmoth chorion was studied extensively by F.C. Kafatos' group for almost 40 years. However, the complete structure of the chorion locus was not obtained in the genome sequence of Bombyx mori published in 2008 due to repetitive sequences, resulting in gaps and an incomplete view of the locus. To obtain the comple...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2015.62

    authors: Chen Z,Nohata J,Guo H,Li S,Liu J,Guo Y,Yamamoto K,Kadono-Okuda K,Liu C,Arunkumar KP,Nagaraju J,Zhang Y,Liu S,Labropoulou V,Swevers L,Tsitoura P,Iatrou K,Gopinathan KP,Goldsmith MR,Xia Q,Mita K

    update_date:2015-11-10 00:00:00

  • Impacts of elevated atmospheric CO₂ on nutrient content of important food crops.

    abstract::One of the many ways that climate change may affect human health is by altering the nutrient content of food crops. However, previous attempts to study the effects of increased atmospheric CO2 on crop nutrition have been limited by small sample sizes and/or artificial growing conditions. Here we present data from a me...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2015.36

    authors: Dietterich LH,Zanobetti A,Kloog I,Huybers P,Leakey AD,Bloom AJ,Carlisle E,Fernando N,Fitzgerald G,Hasegawa T,Holbrook NM,Nelson RL,Norton R,Ottman MJ,Raboy V,Sakai H,Sartor KA,Schwartz J,Seneweera S,Usui Y,Yoshina

    update_date:2015-07-21 00:00:00

  • Linking in silico MS/MS spectra with chemistry data to improve identification of unknowns.

    abstract::Confident identification of unknown chemicals in high resolution mass spectrometry (HRMS) screening studies requires cohesive workflows and complementary data, tools, and software. Chemistry databases, screening libraries, and chemical metadata have become fixtures in identification workflows. To increase confidence i...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0145-z

    authors: McEachran AD,Balabin I,Cathey T,Transue TR,Al-Ghoul H,Grulke C,Sobus JR,Williams AJ

    update_date:2019-08-02 00:00:00

  • PPDIST, global 0.1° daily and 3-hourly precipitation probability distribution climatologies for 1979-2018.

    abstract::We introduce the Precipitation Probability DISTribution (PPDIST) dataset, a collection of global high-resolution (0.1°) observation-based climatologies (1979-2018) of the occurrence and peak intensity of precipitation (P) at daily and 3-hourly time-scales. The climatologies were produced using neural networks trained ...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-00631-x

    authors: Beck HE,Westra S,Tan J,Pappenberger F,Huffman GJ,McVicar TR,Gründemann GJ,Vergopolan N,Fowler HJ,Lewis E,Verbist K,Wood EF

    update_date:2020-09-11 00:00:00

  • Characterization of deep neural network features by decodability from human brain activity.

    abstract::Achievements of near human-level performance in object recognition by deep neural networks (DNNs) have triggered a flood of comparative studies between the brain and DNNs. Using a DNN as a proxy for hierarchical visual representations, our recent study found that human brain activity patterns measured by functional ma...

    journal_title:Scientific data

    pub_type:

    doi:10.1038/sdata.2019.12

    authors: Horikawa T,Aoki SC,Tsukamoto M,Kamitani Y

    update_date:2019-02-12 00:00:00

  • A dataset describing a suite of novel antibody reagents for the RAS signaling network.

    abstract::RAS genes are frequently mutated in cancer and have for decades eluded effective therapeutic attack. The National Cancer Institute's RAS Initiative has a focus on understanding pathways and discovering therapies for RAS-driven cancers. Part of these efforts is the generation of novel reagents to enable the quantificat...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0166-7

    authors: Schoenherr RM,Huang D,Voytovich UJ,Ivey RG,Kennedy JJ,Saul RG,Colantonio S,Roberts RR,Knotts JG,Kaczmarczyk JA,Perry C,Hewitt SM,Bocik W,Whiteley GR,Hiltke T,Boja ES,Rodriguez H,Whiteaker JR,Paulovich AG

    update_date:2019-08-29 00:00:00

  • Thermodynamic and transport properties of hydrogen containing streams.

    abstract::The use of hydrogen (H2) as a substitute for fossil fuel, which accounts for the majority of the world's energy, is environmentally the most benign option for the reduction of CO2 emissions. This will require gigawatt-scale storage systems and as such, H2 storage in porous rocks in the subsurface will be required. Acc...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-0568-6

    authors: Hassanpouryouzband A,Joonaki E,Edlmann K,Heinemann N,Yang J

    update_date:2020-07-09 00:00:00

  • A database seed for a community-driven material intensity research platform.

    abstract::The data record contains Material Intensity data for buildings (MI). MI coefficients are often used for different types of analysis of socio-economic systems and in particular for environmental assessments. Until now, MI values were compiled and reported ad-hoc with few cross-study comparisons. We extracted and conver...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0021-x

    authors: Heeren N,Fishman T

    update_date:2019-04-09 00:00:00

  • Transcriptome sequencing, molecular markers, and transcription factor discovery of Platanus acerifolia in the presence of Corythucha ciliata.

    abstract::The London Planetree (Platanus acerifolia) are present throughout the world. The tree is considered a greening plant and is commonly planted in streets, parks, and courtyards. The Sycamore lace bug (Corythucha ciliata) is a serious pest of this tree. To determine the molecular mechanism behind the interaction between ...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0111-9

    authors: Li F,Wu C,Gao M,Jiao M,Qu C,Gonzalez-Uriarte A,Luo C

    update_date:2019-07-22 00:00:00

  • Author Correction: Hybrid de novo whole-genome assembly and annotation of the model tapeworm Hymenolepis diminuta.

    abstract::An amendment to this paper has been published and can be accessed via a link at the top of the paper. ...

    journal_title:Scientific data

    pub_type: Journal Article, Published Erratum

    doi:10.1038/s41597-020-0394-x

    authors: Nowak RM,Jastrzębski JP,Kuśmirek W,Sałamatin R,Rydzanicz M,Sobczyk-Kopcioł A,Sulima-Celińska A,Paukszto Ł,Makowczenko KG,Płoski R,Tkach VV,Basałaj K,Młocicki D

    update_date:2020-02-10 00:00:00

  • Unbalanced historical phenotypic data from seed regeneration of a barley ex situ collection.

    abstract::The scarce knowledge on phenotypic characterization restricts the usage of genetic diversity of plant genetic resources in research and breeding. We describe original and ready-to-use processed data for approximately 60% of ~22,000 barley accessions hosted at the Federal ex situ Genebank for Agricultural and Horticult...

    journal_title:Scientific data

    pub_type:

    doi:10.1038/sdata.2018.278

    authors: Gonzalez MY,Weise S,Zhao Y,Philipp N,Arend D,Börner A,Oppermann M,Graner A,Reif JC,Schulthess AW

    update_date:2018-12-04 00:00:00

  • A kinematic and kinetic dataset of 18 above-knee amputees walking at various speeds.

    abstract::Motion capture is necessary to quantify gait deviations in individuals with lower-limb amputations. However, access to the patient population and the necessary equipment is limited. Here we present the first open biomechanics dataset for 18 individuals with unilateral above-knee amputations walking at different speeds...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-0494-7

    authors: Hood S,Ishmael MK,Gunnell A,Foreman KB,Lenzi T

    update_date:2020-05-21 00:00:00

  • A suite of global accessibility indicators.

    abstract::Good access to resources and opportunities is essential for sustainable development. Improving access, especially in rural areas, requires useful measures of current access to the locations where these resources and opportunities are found. Recent work has developed a global map of travel times to cities with more tha...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0265-5

    authors: Nelson A,Weiss DJ,van Etten J,Cattaneo A,McMenomy TS,Koo J

    update_date:2019-11-07 00:00:00

  • Data for training and testing radiation detection algorithms in an urban environment.

    abstract::The detection, identification, and localization of illicit nuclear materials in urban environments is of utmost importance for national security. Most often, the process of performing these operations consists of a team of trained individuals equipped with radiation detection devices that have built-in algorithms to a...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-00672-2

    authors: Ghawaly JM Jr,Nicholson AD,Peplow DE,Anderson-Cook CM,Myers KL,Archer DE,Willis MJ,Quiter BJ

    update_date:2020-10-05 00:00:00

  • MIMIC-CXR, a de-identified publicly available database of chest radiographs with free-text reports.

    abstract::Chest radiography is an extremely powerful imaging modality, allowing for a detailed inspection of a patient's chest, but requires specialized training for proper interpretation. With the advent of high performance general purpose computer vision algorithms, the accurate automated analysis of chest radiographs is beco...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0322-0

    authors: Johnson AEW,Pollard TJ,Berkowitz SJ,Greenbaum NR,Lungren MP,Deng CY,Mark RG,Horng S

    update_date:2019-12-12 00:00:00

  • Long term survey of the fish community and associated benthic fauna of the Seine estuary nursery grounds.

    abstract::Estuaries are crucial ecosystems where human activities deeply affect numerous ecological functions. Here we present a survey dataset based on the monitoring of fish nursery grounds of the Seine estuary and eastern bay of Seine collected once a year using a beam trawl during three distinct periods (1995-2002, 2008-201...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-0572-x

    authors: Cariou T,Dubroca L,Vogel C

    update_date:2020-07-13 00:00:00

  • De novo transcriptome assembly databases for the butterfly orchid Phalaenopsis equestris.

    abstract::Orchids are renowned for their spectacular flowers and ecological adaptations. After the sequencing of the genome of the tropical epiphytic orchid Phalaenopsis equestris, we combined Illumina HiSeq2000 for RNA-Seq and Trinity for de novo assembly to characterize the transcriptomes for 11 diverse P. equestris tissues r...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2016.83

    authors: Niu SC,Xu Q,Zhang GQ,Zhang YQ,Tsai WC,Hsu JL,Liang CK,Luo YB,Liu ZJ

    update_date:2016-09-27 00:00:00