Abstract:
One important use of genome-wide transcriptional profiles is to identify relationships between transcription levels and patient outcomes. These translational insights can guide the development of biomarkers for clinical application. Data from thousands of translational-biomarker studies have been deposited in public repositories, enabling reuse. However, data-reuse efforts require considerable time and expertise because transcriptional data are generated using heterogeneous profiling technologies, preprocessed using diverse normalization procedures, and annotated in non-standard ways. To address this problem, we curated 45 publicly available translational-biomarker datasets from a variety of human diseases. To increase the data's utility, we reprocessed the raw expression data using a uniform computational pipeline, addressed quality-control problems, mapped the clinical annotations to a controlled vocabulary, and prepared consistently structured, analysis-ready data files. These data, along with the scripts we used to prepare them, are available in a public repository. We believe these data will be particularly useful to researchers seeking to perform benchmarking studies, for example to compare and optimize machine-learning algorithms' ability to predict biomedical outcomes.
journal_name:Sci Data
journal_title:Scientific data
authors:Golightly NP, Bell A, Bischoff AI, Hollingsworth PD, Piccolo SR
doi:10.1038/sdata.2018.66
subject:Has Abstract
pub_date:2018-04-17 00:00:00
pages:180066
issn:2052-4463
pii:sdata201866
journal_volume:5
pub_type: Journal Article
Related literature
Scientific Data literature collection
abstract::This corrects the article DOI: 10.1038/sdata.2017.139. ...
journal_title:Scientific data
pub_type: Journal Article, Published Erratum
doi:10.1038/sdata.2018.215
update_date:2018-10-09 00:00:00
abstract::This study assessed 24 adult patients, suffering from severe chronic unilateral pain diagnosed as temporomandibular joint (TMJ) disorder (TMD). The full dentate patients had normal occlusion and had never received an occlusal therapy, i.e., were with natural dental evolution/maturation. The following functional and dy...
journal_title:Scientific data
pub_type: Journal Article
doi:10.1038/sdata.2017.168
update_date:2017-11-07 00:00:00
abstract::This corrects the article DOI: 10.1038/sdata.2017.37. ...
journal_title:Scientific data
pub_type: Journal Article, Published Erratum
doi:10.1038/sdata.2017.80
update_date:2017-07-04 00:00:00
abstract::Populations in resource dependent economies gain well-being from the natural environment, in highly spatially and temporally variable patterns. To collect information on this, we designed and implemented a 1586-household quantitative survey in the southwest coastal zone of Bangladesh. Data were collected on material, ...
journal_title:Scientific data
pub_type: Journal Article
doi:10.1038/sdata.2016.94
update_date:2016-11-08 00:00:00
abstract::Surveys for more than 9,500 households were conducted in the growing seasons 2002/2003 or 2003/2004 in eleven African countries: Burkina Faso, Cameroon, Ghana, Niger and Senegal in western Africa; Egypt in northern Africa; Ethiopia and Kenya in eastern Africa; South Africa, Zambia and Zimbabwe in southern Africa. Hous...
journal_title:Scientific data
pub_type: Journal Article
doi:10.1038/sdata.2016.20
update_date:2016-05-24 00:00:00
abstract::Rett syndrome is an X-linked neurodevelopmental disorder caused by mutation in the methyl-CpG-binding protein 2 gene (MECP2) in the majority of cases. We describe an RNA sequencing dataset of postmortem brain tissue samples from four females clinically diagnosed with Rett syndrome and four age-matched female donors. T...
journal_title:Scientific data
pub_type: Journal Article
doi:10.1038/s41597-020-0527-2
update_date:2020-06-19 00:00:00
abstract::As basic data, river networks and water resources zones (WRZ) are critical for the planning, utilization, development, conservation and management of water resources. Currently, the world's river networks and WRZ are mostly derived automatically from digital elevation model data, which are not accurate enough, espe...
journal_title:Scientific data
pub_type: Journal Article
doi:10.1038/s41597-019-0243-y
update_date:2019-10-22 00:00:00
abstract::Direct-infusion mass spectrometry (DIMS) metabolomics is an important approach for characterising molecular responses of organisms to disease, drugs and the environment. Increasingly large-scale metabolomics studies are being conducted, necessitating improvements in both bioanalytical and computational workflows to ma...
journal_title:Scientific data
pub_type: Journal Article
doi:10.1038/sdata.2014.12
update_date:2014-06-10 00:00:00
abstract::In order to make effective decisions on sustainable development, it is essential for sugarcane-producing countries to take into account sugarcane acreage and sugarcane production dynamics. The availability of sugarcane biophysical data along the growth season is key to an effective mapping of such dynamics, especially...
journal_title:Scientific data
pub_type: Journal Article
doi:10.1038/sdata.2018.150
update_date:2018-08-07 00:00:00
abstract::East Africa is a drought prone, food and water insecure region with a highly variable climate. This complexity makes rainfall estimation challenging, and this challenge is compounded by low rain gauge densities and inhomogeneous monitoring networks. The dearth of observations is particularly problematic over the past ...
journal_title:Scientific data
pub_type: Journal Article
doi:10.1038/sdata.2015.50
update_date:2015-09-29 00:00:00
abstract::Functional trait databases are emerging as crucial tools for a wide range of ecological studies across the world. Here, we provide a database of functional traits for vascular plant species of the Mediterranean Basin. The database includes 25,764 individual records of 44 traits from 2,457 plant taxa distributed in 119...
journal_title:Scientific data
pub_type: Journal Article
doi:10.1038/sdata.2018.135
update_date:2018-07-10 00:00:00
abstract::In the last few decades, data-driven methods have come to dominate many fields of scientific inquiry. Open data and open-source software have enabled the rapid implementation of novel methods to manage and analyze the growing flood of data. However, it has become apparent that many scientific fields exhibit distressin...
journal_title:Scientific data
pub_type: Journal Article
doi:10.1038/sdata.2017.114
update_date:2017-09-05 00:00:00
abstract::Human adenoviruses (HAdVs) are fatal to immuno-suppressed individuals, but no effective anti-HAdV therapy is available. Here, we present a novel image-based high-throughput screening (HTS) platform, which scores the full viral replication cycle from virus entry to dissemination of progeny and second-round infections. ...
journal_title:Scientific data
pub_type: Journal Article
doi:10.1038/s41597-020-00604-0
update_date:2020-08-12 00:00:00
abstract::The silkmoth chorion was studied extensively by F.C. Kafatos' group for almost 40 years. However, the complete structure of the chorion locus was not obtained in the genome sequence of Bombyx mori published in 2008 due to repetitive sequences, resulting in gaps and an incomplete view of the locus. To obtain the comple...
journal_title:Scientific data
pub_type: Journal Article
doi:10.1038/sdata.2015.62
update_date:2015-11-10 00:00:00
abstract::One of the many ways that climate change may affect human health is by altering the nutrient content of food crops. However, previous attempts to study the effects of increased atmospheric CO2 on crop nutrition have been limited by small sample sizes and/or artificial growing conditions. Here we present data from a me...
journal_title:Scientific data
pub_type: Journal Article
doi:10.1038/sdata.2015.36
update_date:2015-07-21 00:00:00
abstract::Confident identification of unknown chemicals in high resolution mass spectrometry (HRMS) screening studies requires cohesive workflows and complementary data, tools, and software. Chemistry databases, screening libraries, and chemical metadata have become fixtures in identification workflows. To increase confidence i...
journal_title:Scientific data
pub_type: Journal Article
doi:10.1038/s41597-019-0145-z
update_date:2019-08-02 00:00:00
abstract::We introduce the Precipitation Probability DISTribution (PPDIST) dataset, a collection of global high-resolution (0.1°) observation-based climatologies (1979-2018) of the occurrence and peak intensity of precipitation (P) at daily and 3-hourly time-scales. The climatologies were produced using neural networks trained ...
journal_title:Scientific data
pub_type: Journal Article
doi:10.1038/s41597-020-00631-x
update_date:2020-09-11 00:00:00
abstract::Achievements of near human-level performance in object recognition by deep neural networks (DNNs) have triggered a flood of comparative studies between the brain and DNNs. Using a DNN as a proxy for hierarchical visual representations, our recent study found that human brain activity patterns measured by functional ma...
journal_title:Scientific data
pub_type:
doi:10.1038/sdata.2019.12
update_date:2019-02-12 00:00:00
abstract::RAS genes are frequently mutated in cancer and have for decades eluded effective therapeutic attack. The National Cancer Institute's RAS Initiative has a focus on understanding pathways and discovering therapies for RAS-driven cancers. Part of these efforts is the generation of novel reagents to enable the quantificat...
journal_title:Scientific data
pub_type: Journal Article
doi:10.1038/s41597-019-0166-7
update_date:2019-08-29 00:00:00
abstract::The use of hydrogen (H2) as a substitute for fossil fuel, which accounts for the majority of the world's energy, is environmentally the most benign option for the reduction of CO2 emissions. This will require gigawatt-scale storage systems and as such, H2 storage in porous rocks in the subsurface will be required. Acc...
journal_title:Scientific data
pub_type: Journal Article
doi:10.1038/s41597-020-0568-6
update_date:2020-07-09 00:00:00
abstract::The data record contains Material Intensity data for buildings (MI). MI coefficients are often used for different types of analysis of socio-economic systems and in particular for environmental assessments. Until now, MI values were compiled and reported ad-hoc with few cross-study comparisons. We extracted and conver...
journal_title:Scientific data
pub_type: Journal Article
doi:10.1038/s41597-019-0021-x
update_date:2019-04-09 00:00:00
abstract::The London Planetree (Platanus acerifolia) is present throughout the world. The tree is considered a greening plant and is commonly planted in streets, parks, and courtyards. The Sycamore lace bug (Corythucha ciliata) is a serious pest of this tree. To determine the molecular mechanism behind the interaction between ...
journal_title:Scientific data
pub_type: Journal Article
doi:10.1038/s41597-019-0111-9
update_date:2019-07-22 00:00:00
abstract::An amendment to this paper has been published and can be accessed via a link at the top of the paper. ...
journal_title:Scientific data
pub_type: Journal Article, Published Erratum
doi:10.1038/s41597-020-0394-x
update_date:2020-02-10 00:00:00
abstract::The scarce knowledge on phenotypic characterization restricts the usage of genetic diversity of plant genetic resources in research and breeding. We describe original and ready-to-use processed data for approximately 60% of ~22,000 barley accessions hosted at the Federal ex situ Genebank for Agricultural and Horticult...
journal_title:Scientific data
pub_type:
doi:10.1038/sdata.2018.278
update_date:2018-12-04 00:00:00
abstract::Motion capture is necessary to quantify gait deviations in individuals with lower-limb amputations. However, access to the patient population and the necessary equipment is limited. Here we present the first open biomechanics dataset for 18 individuals with unilateral above-knee amputations walking at different speeds...
journal_title:Scientific data
pub_type: Journal Article
doi:10.1038/s41597-020-0494-7
update_date:2020-05-21 00:00:00
abstract::Good access to resources and opportunities is essential for sustainable development. Improving access, especially in rural areas, requires useful measures of current access to the locations where these resources and opportunities are found. Recent work has developed a global map of travel times to cities with more tha...
journal_title:Scientific data
pub_type: Journal Article
doi:10.1038/s41597-019-0265-5
update_date:2019-11-07 00:00:00
abstract::The detection, identification, and localization of illicit nuclear materials in urban environments is of utmost importance for national security. Most often, the process of performing these operations consists of a team of trained individuals equipped with radiation detection devices that have built-in algorithms to a...
journal_title:Scientific data
pub_type: Journal Article
doi:10.1038/s41597-020-00672-2
update_date:2020-10-05 00:00:00
abstract::Chest radiography is an extremely powerful imaging modality, allowing for a detailed inspection of a patient's chest, but requires specialized training for proper interpretation. With the advent of high performance general purpose computer vision algorithms, the accurate automated analysis of chest radiographs is beco...
journal_title:Scientific data
pub_type: Journal Article
doi:10.1038/s41597-019-0322-0
update_date:2019-12-12 00:00:00
abstract::Estuaries are crucial ecosystems where human activities deeply affect numerous ecological functions. Here we present a survey dataset based on the monitoring of fish nursery grounds of the Seine estuary and eastern bay of Seine collected once a year using a beam trawl during three distinct periods (1995-2002, 2008-201...
journal_title:Scientific data
pub_type: Journal Article
doi:10.1038/s41597-020-0572-x
update_date:2020-07-13 00:00:00
abstract::Orchids are renowned for their spectacular flowers and ecological adaptations. After the sequencing of the genome of the tropical epiphytic orchid Phalaenopsis equestris, we combined Illumina HiSeq2000 for RNA-Seq and Trinity for de novo assembly to characterize the transcriptomes for 11 diverse P. equestris tissues r...
journal_title:Scientific data
pub_type: Journal Article
doi:10.1038/sdata.2016.83
update_date:2016-09-27 00:00:00