Obstacles to the reuse of study metadata in ClinicalTrials.gov.

Abstract:

Metadata that are structured using principled schemas and that use terms from ontologies are essential to making biomedical data findable and reusable for downstream analyses. The largest source of metadata that describes the experimental protocol, funding, and scientific leadership of clinical studies is ClinicalTrials.gov. We evaluated whether values in 302,091 trial records adhere to expected data types and use terms from biomedical ontologies, whether records contain fields required by government regulations, and whether structured elements could replace free-text elements. Contact information, outcome measures, and study design are frequently missing or underspecified. Important fields for search, such as condition and intervention, are not restricted to ontologies, and almost half of the conditions are not denoted by MeSH terms, as recommended. Eligibility criteria are stored as semi-structured free text. Enforcing the presence of all required elements, requiring values for certain fields to be drawn from ontologies, and creating a structured eligibility criteria element would improve the reusability of data from ClinicalTrials.gov in systematic reviews, meta-analyses, and matching of eligible patients to trials.

journal_name

Sci Data

journal_title

Scientific data

authors

Miron L,Gonçalves RS,Musen MA

doi

10.1038/s41597-020-00780-z

subject

Has Abstract

pub_date

2020-12-18 00:00:00

pages

443

issue

1

issn

2052-4463

pii

10.1038/s41597-020-00780-z

journal_volume

7

pub_type

Journal Article
  • Enabling precision medicine in neonatology, an integrated repository for preterm birth research.

    abstract::Preterm birth, or the delivery of an infant prior to 37 weeks of gestation, is a significant cause of infant morbidity and mortality. In the last decade, the advent and continued development of molecular profiling technologies has enabled researchers to generate vast amount of 'omics' data, which together with integra...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2018.219

    authors: Sirota M,Thomas CG,Liu R,Zuhl M,Banerjee P,Wong RJ,Quaintance CC,Leite R,Chubiz J,Anderson R,Chappell J,Kim M,Grobman W,Zhang G,Rokas A,England SK,Parry S,Shaw GM,Simpson JL,Thomson E,Butte AJ,March of Dimes Pre

    update_date:2018-11-06 00:00:00

  • Spatial data of Ixodes ricinus instar abundance and nymph pathogen prevalence, Scandinavia, 2016-2017.

    abstract::Ticks carry pathogens that can cause disease in both animals and humans, and there is a need to monitor the distribution and abundance of ticks and the pathogens they carry to pinpoint potential high risk areas for tick-borne disease transmission. In a joint Scandinavian study, we measured Ixodes ricinus instar abunda...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-00579-y

    authors: Kjær LJ,Klitgaard K,Soleng A,Edgar KS,Lindstedt HEH,Paulsen KM,Andreassen ÅK,Korslund L,Kjelland V,Slettan A,Stuen S,Kjellander P,Christensson M,Teräväinen M,Baum A,Jensen LM,Bødker R

    update_date:2020-07-16 00:00:00

  • The effects of sequencing platforms on phylogenetic resolution in 16 S rRNA gene profiling of human feces.

    abstract::High-quality and high-throughput sequencing technologies are required for therapeutic and diagnostic analyses of human gut microbiota. Here, we evaluated the advantages and disadvantages of the various commercial sequencing platforms for studying human gut microbiota. We generated fecal bacterial sequences from 170 Ko...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2018.68

    authors: Whon TW,Chung WH,Lim MY,Song EJ,Kim PS,Hyun DW,Shin NR,Bae JW,Nam YD

    update_date:2018-04-24 00:00:00

  • Genome-wide siRNA screen of genes regulating the LPS-induced TNF-α response in human macrophages.

    abstract::The mammalian innate immune system senses many bacterial stimuli through the toll-like receptor (TLR) family. Activation of the TLR4 receptor by bacterial lipopolysaccharide (LPS) is the most widely studied TLR pathway due to its central role in host responses to gram-negative bacterial infection and its contribution ...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2017.7

    authors: Sun J,Katz S,Dutta B,Wang Z,Fraser ID

    update_date:2017-03-01 00:00:00

  • If these data could talk.

    abstract::In the last few decades, data-driven methods have come to dominate many fields of scientific inquiry. Open data and open-source software have enabled the rapid implementation of novel methods to manage and analyze the growing flood of data. However, it has become apparent that many scientific fields exhibit distressin...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2017.114

    authors: Pasquier T,Lau MK,Trisovic A,Boose ER,Couturier B,Crosas M,Ellison AM,Gibson V,Jones CR,Seltzer M

    update_date:2017-09-05 00:00:00

  • A three-dimensional thalamocortical dataset for characterizing brain heterogeneity.

    abstract::Neural microarchitecture is heterogeneous, varying both across and within brain regions. The consistent identification of regions of interest is one of the most critical aspects in examining neurocircuitry, as these structures serve as the vital landmarks with which to map brain pathways. Access to continuous, three-d...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-00692-y

    authors: Prasad JA,Balwani AH,Johnson EC,Miano JD,Sampathkumar V,De Andrade V,Fezzaa K,Du M,Vescovi R,Jacobsen C,Kording KP,Gürsoy D,Gray Roncal W,Kasthuri N,Dyer EL

    update_date:2020-10-20 00:00:00

  • Computational workflow to study the seasonal variation of secondary metabolites in nine different bryophytes.

    abstract::In Eco-Metabolomics interactions are studied of non-model organisms in their natural environment and relations are made between biochemistry and ecological function. Current challenges when processing such metabolomics data involve complex experiment designs which are often carried out in large field campaigns involvi...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2018.179

    authors: Peters K,Gorzolka K,Bruelheide H,Neumann S

    update_date:2018-08-28 00:00:00

  • A high-content image-based drug screen of clinical compounds against cell transmission of adenovirus.

    abstract::Human adenoviruses (HAdVs) are fatal to immuno-suppressed individuals, but no effective anti-HAdV therapy is available. Here, we present a novel image-based high-throughput screening (HTS) platform, which scores the full viral replication cycle from virus entry to dissemination of progeny and second-round infections. ...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-00604-0

    authors: Georgi F,Kuttler F,Murer L,Andriasyan V,Witte R,Yakimovich A,Turcatti G,Greber UF

    update_date:2020-08-12 00:00:00

  • A multi-omics dataset of heat-shock response in the yeast RNA binding protein Mip6.

    abstract::Gene expression is a biological process regulated at different molecular levels, including chromatin accessibility, transcription, and RNA maturation and transport. In addition, these regulatory mechanisms have strong links with cellular metabolism. Here we present a multi-omics dataset that captures different aspects...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-0412-z

    authors: Nuño-Cabanes C,Ugidos M,Tarazona S,Martín-Expósito M,Ferrer A,Rodríguez-Navarro S,Conesa A

    update_date:2020-02-27 00:00:00

  • A statistical atlas of cerebral arteries generated using multi-center MRA datasets from healthy subjects.

    abstract::Magnetic resonance angiography (MRA) can capture the variation of cerebral arteries with high spatial resolution. These measurements include valuable information about the morphology, geometry, and density of brain arteries, which may be useful to identify risk factors for cerebrovascular and neurological diseases at ...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0034-5

    authors: Mouches P,Forkert ND

    update_date:2019-04-11 00:00:00

  • The Centennial Trends Greater Horn of Africa precipitation dataset.

    abstract::East Africa is a drought prone, food and water insecure region with a highly variable climate. This complexity makes rainfall estimation challenging, and this challenge is compounded by low rain gauge densities and inhomogeneous monitoring networks. The dearth of observations is particularly problematic over the past ...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2015.50

    authors: Funk C,Nicholson SE,Landsfeld M,Klotter D,Peterson P,Harrison L

    update_date:2015-09-29 00:00:00

  • Daily transcriptomes of the copepod Calanus finmarchicus during the summer solstice at high Arctic latitudes.

    abstract::The zooplankter Calanus finmarchicus is a member of the so-called "Calanus Complex", a group of copepods that constitutes a key element of the Arctic polar marine ecosystem, providing a crucial link between primary production and higher trophic levels. Climate change induces the shift of C. finmarchicus to higher lati...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-00751-4

    authors: Payton L,Noirot C,Hoede C,Hüppe L,Last K,Wilcockson D,Ershova EA,Valière S,Meyer B

    update_date:2020-11-24 00:00:00

  • Oral microbiota and dental caries data from monozygotic and dizygotic twin children.

    abstract::There are recent studies which aimed to detect the inheritance on the etiology of dental caries exploring oral composition. We present data on the oral microbiota and its relation with dental caries and other factors in monozygotic (MZ) and dizygotic (DZ) twin children. Following clinical investigation, DNA samples we...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-00691-z

    authors: Kasimoglu Y,Koruyucu M,Birant S,Karacan I,Topcuoglu N,Tuna EB,Gencay K,Seymen F

    update_date:2020-10-13 00:00:00

  • Spatial and temporal dynamics of multidimensional well-being, livelihoods and ecosystem services in coastal Bangladesh.

    abstract::Populations in resource dependent economies gain well-being from the natural environment, in highly spatially and temporally variable patterns. To collect information on this, we designed and implemented a 1586-household quantitative survey in the southwest coastal zone of Bangladesh. Data were collected on material, ...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2016.94

    authors: Adams H,Adger WN,Ahmad S,Ahmed A,Begum D,Lázár AN,Matthews Z,Rahman MM,Streatfield PK

    update_date:2016-11-08 00:00:00

  • A database of geopositioned Middle East Respiratory Syndrome Coronavirus occurrences.

    abstract::As a World Health Organization Research and Development Blueprint priority pathogen, there is a need to better understand the geographic distribution of Middle East Respiratory Syndrome Coronavirus (MERS-CoV) and its potential to infect mammals and humans. This database documents cases of MERS-CoV globally, with speci...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0330-0

    authors: Ramshaw RE,Letourneau ID,Hong AY,Hon J,Morgan JD,Osborne JCP,Shirude S,Van Kerkhove MD,Hay SI,Pigott DM

    update_date:2019-12-13 00:00:00

  • A data set of global river networks and corresponding water resources zones divisions.

    abstract::As basic data, the river networks and water resources zones (WRZ) are critical for planning, utilization, development, conservation and management of water resources. Currently, the river network and WRZ of world are most obtained based on digital elevation model data automatically, which are not accuracy enough, espe...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0243-y

    authors: Yan D,Wang K,Qin T,Weng B,Wang H,Bi W,Li X,Li M,Lv Z,Liu F,He S,Ma J,Shen Z,Wang J,Bai H,Man Z,Sun C,Liu M,Shi X,Jing L,Sun R,Cao S,Hao C,Wang L,Pei M,Dorjsuren B,Gedefaw M,Girma A,Abiyu A

    update_date:2019-10-22 00:00:00

  • Ground reference data for sugarcane biomass estimation in São Paulo state, Brazil.

    abstract::In order to make effective decisions on sustainable development, it is essential for sugarcane-producing countries to take into account sugarcane acreage and sugarcane production dynamics. The availability of sugarcane biophysical data along the growth season is key to an effective mapping of such dynamics, especially...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2018.150

    authors: Molijn RA,Iannini L,Rocha JV,Hanssen RF

    update_date:2018-08-07 00:00:00

  • A synthesis of bacterial and archaeal phenotypic trait data.

    abstract::A synthesis of phenotypic and quantitative genomic traits is provided for bacteria and archaea, in the form of a scripted, reproducible workflow that standardizes and merges 26 sources. The resulting unified dataset covers 14 phenotypic traits, 5 quantitative genomic traits, and 4 environmental characteristics for app...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-0497-4

    authors: Madin JS,Nielsen DA,Brbic M,Corkrey R,Danko D,Edwards K,Engqvist MKM,Fierer N,Geoghegan JL,Gillings M,Kyrpides NC,Litchman E,Mason CE,Moore L,Nielsen SL,Paulsen IT,Price ND,Reddy TBK,Richards MA,Rocha EPC,Schmidt

    update_date:2020-06-05 00:00:00

  • MARES, a replicable pipeline and curated reference database for marine eukaryote metabarcoding.

    abstract::The use of DNA metabarcoding to characterise the biodiversity of environmental and community samples has exploded in recent years. However, taxonomic inferences from these studies are contingent on the quality and completeness of the sequence reference database used to characterise sample species-composition. In respo...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-0549-9

    authors: Arranz V,Pearman WS,Aguirre JD,Liggins L

    update_date:2020-07-03 00:00:00

  • 7 Tesla MRI of the ex vivo human brain at 100 micron resolution.

    abstract::We present an ultra-high resolution MRI dataset of an ex vivo human brain specimen. The brain specimen was donated by a 58-year-old woman who had no history of neurological disease and died of non-neurological causes. After fixation in 10% formalin, the specimen was imaged on a 7 Tesla MRI scanner at 100 µm isotropic ...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0254-8

    authors: Edlow BL,Mareyam A,Horn A,Polimeni JR,Witzel T,Tisdall MD,Augustinack JC,Stockmann JP,Diamond BR,Stevens A,Tirrell LS,Folkerth RD,Wald LL,Fischl B,van der Kouwe A

    update_date:2019-10-30 00:00:00

  • A multi-omics digital research object for the genetics of sleep regulation.

    abstract::With the aim to uncover the molecular pathways underlying the regulation of sleep, we recently assembled an extensive and comprehensive systems genetics dataset interrogating a genetic reference population of mice at the levels of the genome, the brain and liver transcriptomes, the plasma metabolome, and the sleep-wak...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0171-x

    authors: Jan M,Gobet N,Diessler S,Franken P,Xenarios I

    update_date:2019-10-31 00:00:00

  • A database of human gait performance on irregular and uneven surfaces collected by wearable sensors.

    abstract::Gait analysis has traditionally relied on laborious and lab-based methods. Data from wearable sensors, such as Inertial Measurement Units (IMU), can be analyzed with machine learning to perform gait analysis in real-world environments. This database provides data from thirty participants (fifteen males and fifteen fem...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-0563-y

    authors: Luo Y,Coppola SM,Dixon PC,Li S,Dennerlein JT,Hu B

    update_date:2020-07-08 00:00:00

  • Transcriptomic profiling of 39 commonly-used neuroblastoma cell lines.

    abstract::Neuroblastoma cell lines are an important and cost-effective model used to study oncogenic drivers of the disease. While many of these cell lines have been previously characterized with SNP, methylation, and/or mRNA expression microarrays, there has not been an effort to comprehensively sequence these cell lines. Here...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2017.33

    authors: Harenza JL,Diamond MA,Adams RN,Song MM,Davidson HL,Hart LS,Dent MH,Fortina P,Reynolds CP,Maris JM

    update_date:2017-03-28 00:00:00

  • A high-throughput drug combination screen of targeted small molecule inhibitors in cancer cell lines.

    abstract::While there is a high interest in drug combinations in cancer therapy, openly accessible datasets for drug combination responses are sparse. Here we present a dataset comprising 171 pairwise combinations of 19 individual drugs targeting signal transduction mechanisms across eight cancer cell lines, where the effect of...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0255-7

    authors: Flobak Å,Niederdorfer B,Nakstad VT,Thommesen L,Klinkenberg G,Lægreid A

    update_date:2019-10-29 00:00:00

  • Construction, complete sequence, and annotation of a BAC contig covering the silkworm chorion locus.

    abstract::The silkmoth chorion was studied extensively by F.C. Kafatos' group for almost 40 years. However, the complete structure of the chorion locus was not obtained in the genome sequence of Bombyx mori published in 2008 due to repetitive sequences, resulting in gaps and an incomplete view of the locus. To obtain the comple...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2015.62

    authors: Chen Z,Nohata J,Guo H,Li S,Liu J,Guo Y,Yamamoto K,Kadono-Okuda K,Liu C,Arunkumar KP,Nagaraju J,Zhang Y,Liu S,Labropoulou V,Swevers L,Tsitoura P,Iatrou K,Gopinathan KP,Goldsmith MR,Xia Q,Mita K

    update_date:2015-11-10 00:00:00

  • Tracking vegetation phenology across diverse biomes using Version 2.0 of the PhenoCam Dataset.

    abstract::Monitoring vegetation phenology is critical for quantifying climate change impacts on ecosystems. We present an extensive dataset of 1783 site-years of phenological data derived from PhenoCam network imagery from 393 digital cameras, situated from tropics to tundra across a wide range of plant functional types, biomes...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0229-9

    authors: Seyednasrollah B,Young AM,Hufkens K,Milliman T,Friedl MA,Frolking S,Richardson AD

    update_date:2019-10-22 00:00:00

  • High resolution multi-facies realizations of sedimentary reservoir and aquifer analogs.

    abstract::Geological structures are by nature inaccessible to direct observation. This can cause difficulties in applications where a spatially explicit representation of such structures is required, in particular when modelling fluid migration in geological formations. An increasing trend in recent years has been to use analog...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2015.33

    authors: Bayer P,Comunian A,Höyng D,Mariethoz G

    update_date:2015-07-07 00:00:00

  • Flow cytometry analysis of adrenoceptors expression in human adipose-derived mesenchymal stem/stromal cells.

    abstract::Mesenchymal stem/stromal cells (MSCs) were identified in most tissues of an adult organism. MSCs mediate physiological renewal, as well as regulation of tissue homeostasis, reparation and regeneration. Functions of MSCs are regulated by endocrine and neuronal signals, and noradrenaline is one of the most important MSC...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2018.196

    authors: Tyurin-Kuzmin PA,Dyikanov DT,Fadeeva JI,Sysoeva VY,Kalinina NI

    update_date:2018-10-02 00:00:00

  • Transcriptomic profiling for prolonged drought in Dendrobium catenatum.

    abstract::Orchid epiphytes, a group containing at least 18,000 species, thrive in habitats that often undergo periodic drought stress. However, few global gene expression profiling datasets have been published for studies addressing the drought-resistant mechanism of this special population. In this study, an experiment involvi...

    journal_title:Scientific data

    pub_type:

    doi:10.1038/sdata.2018.233

    authors: Wan X,Zou LH,Zheng BQ,Tian YQ,Wang Y

    update_date:2018-10-30 00:00:00

  • A global compendium of human Crimean-Congo haemorrhagic fever virus occurrence.

    abstract::In order to map global disease risk, a geographic database of human Crimean-Congo haemorrhagic fever virus (CCHFV) occurrence was produced by surveying peer-reviewed literature and case reports, as well as informal online sources. Here we present this database, comprising occurrence data linked to geographic point or ...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2015.16

    authors: Messina JP,Pigott DM,Duda KA,Brownstein JS,Myers MF,George DB,Hay SI

    update_date:2015-04-14 00:00:00