MARES, a replicable pipeline and curated reference database for marine eukaryote metabarcoding.

Abstract:

The use of DNA metabarcoding to characterise the biodiversity of environmental and community samples has exploded in recent years. However, taxonomic inferences from these studies are contingent on the quality and completeness of the sequence reference database used to characterise sample species-composition. In response, studies often develop custom reference databases to improve species assignment. The disadvantage of this approach is that it limits the potential for database re-use, and the transferability of inferences across studies. Here, we present the MARine Eukaryote Species (MARES) reference database for use in marine metabarcoding studies, created using a transparent and reproducible pipeline. MARES includes all COI sequences available in GenBank and BOLD for marine taxa, unified into a single taxonomy. Our pipeline facilitates the curation of sequences, synonymization of taxonomic identifiers used by different repositories, and formatting these data for use in taxonomic assignment tools. Overall, MARES provides a benchmark COI reference database for marine eukaryotes, and a standardised pipeline for (re)producing reference databases enabling integration and fair comparison of marine DNA metabarcoding results.
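
The three steps named in the abstract (curating sequences, synonymizing taxon names across repositories, and formatting for assignment tools) can be illustrated with a minimal sketch. This is not the MARES pipeline itself: the `Record` type, the `unify` and `to_fasta` functions, the synonym table, the accession strings, and the taxon names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, replace
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Record:
    source: str     # repository the record came from, e.g. "GenBank" or "BOLD"
    accession: str  # repository-specific identifier (made up here)
    taxon: str      # taxon name as given by the repository
    sequence: str   # COI nucleotide sequence


def unify(records: Iterable[Record], synonyms: Dict[str, str]) -> List[Record]:
    """Map each taxon name to its accepted name and drop duplicate sequences."""
    seen = set()
    unified = []
    for rec in records:
        # Synonymization: fall back to the original name if no synonym is known.
        accepted = synonyms.get(rec.taxon, rec.taxon)
        seq = rec.sequence.upper()
        # Curation: keep only the first record per (accepted name, sequence) pair.
        key = (accepted, seq)
        if key in seen:
            continue
        seen.add(key)
        unified.append(replace(rec, taxon=accepted, sequence=seq))
    return unified


def to_fasta(records: Iterable[Record]) -> str:
    """Formatting: write records as FASTA with the accepted name in the header."""
    return "".join(f">{r.accession} {r.taxon}\n{r.sequence}\n" for r in records)


# Hypothetical synonym table and records, for illustration only.
SYNONYMS = {"Exampleus vetus": "Exampleus novus"}
RECORDS = [
    Record("GenBank", "GB0001", "Exampleus vetus", "acgtacgt"),
    Record("BOLD", "BOLD0001", "Exampleus novus", "ACGTACGT"),
    Record("BOLD", "BOLD0002", "Exampleus novus", "ACGTTCGT"),
]

print(to_fasta(unify(RECORDS, SYNONYMS)), end="")
```

In this toy input the first two records describe the same sequence under two names, so only one of them survives once both names are mapped to a single accepted name.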

journal_name

Sci Data

journal_title

Scientific data

authors

Arranz V,Pearman WS,Aguirre JD,Liggins L

doi

10.1038/s41597-020-0549-9

subject

Has Abstract

pub_date

2020-07-03 00:00:00

pages

209

issue

1

issn

2052-4463

pii

10.1038/s41597-020-0549-9

journal_volume

7

pub_type

Journal Article
  • Ground reference data for sugarcane biomass estimation in São Paulo state, Brazil.

    abstract::In order to make effective decisions on sustainable development, it is essential for sugarcane-producing countries to take into account sugarcane acreage and sugarcane production dynamics. The availability of sugarcane biophysical data along the growth season is key to an effective mapping of such dynamics, especially...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2018.150

    authors: Molijn RA,Iannini L,Rocha JV,Hanssen RF

    update_date:2018-08-07 00:00:00

  • Linking in silico MS/MS spectra with chemistry data to improve identification of unknowns.

    abstract::Confident identification of unknown chemicals in high resolution mass spectrometry (HRMS) screening studies requires cohesive workflows and complementary data, tools, and software. Chemistry databases, screening libraries, and chemical metadata have become fixtures in identification workflows. To increase confidence i...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0145-z

    authors: McEachran AD,Balabin I,Cathey T,Transue TR,Al-Ghoul H,Grulke C,Sobus JR,Williams AJ

    update_date:2019-08-02 00:00:00

  • Transcriptomic profiling for prolonged drought in Dendrobium catenatum.

    abstract::Orchid epiphytes, a group containing at least 18,000 species, thrive in habitats that often undergo periodic drought stress. However, few global gene expression profiling datasets have been published for studies addressing the drought-resistant mechanism of this special population. In this study, an experiment involvi...

    journal_title:Scientific data

    pub_type:

    doi:10.1038/sdata.2018.233

    authors: Wan X,Zou LH,Zheng BQ,Tian YQ,Wang Y

    update_date:2018-10-30 00:00:00

  • An open science resource for establishing reliability and reproducibility in functional connectomics.

    abstract::Efforts to identify meaningful functional imaging-based biomarkers are limited by the ability to reliably characterize inter-individual differences in human brain function. Although a growing number of connectomics-based measures are reported to have moderate to high test-retest reliability, the variability in data ac...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2014.49

    authors: Zuo XN,Anderson JS,Bellec P,Birn RM,Biswal BB,Blautzik J,Breitner JC,Buckner RL,Calhoun VD,Castellanos FX,Chen A,Chen B,Chen J,Chen X,Colcombe SJ,Courtney W,Craddock RC,Di Martino A,Dong HM,Fu X,Gong Q,Gorgolews

    update_date:2014-12-09 00:00:00

  • Erratum: Genomes and phenomes of a population of outbred rats and its progenitors.

    abstract::[This corrects the article DOI: 10.1038/sdata.2014.11.]. ...

    journal_title:Scientific data

    pub_type: Published Erratum

    doi:10.1038/sdata.2014.16

    authors: Baud A,Guryev V,Hummel O,Johannesson M,Rat Genome Sequencing and Mapping Consortium.,Flint J

    update_date:2014-07-08 00:00:00

  • Two-colour serial femtosecond crystallography dataset from gadoteridol-derivatized lysozyme for MAD phasing.

    abstract::We provide a detailed description of a gadoteridol-derivatized lysozyme (gadolinium lysozyme) two-colour serial femtosecond crystallography (SFX) dataset for multiple wavelength anomalous dispersion (MAD) structure determination. The data was collected at the Spring-8 Angstrom Compact free-electron LAser (SACLA) facil...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2017.188

    authors: Gorel A,Motomura K,Fukuzawa H,Doak RB,Grünbein ML,Hilpert M,Inoue I,Kloos M,Nass Kovács G,Nango E,Nass K,Roome CM,Shoeman RL,Tanaka R,Tono K,Foucar L,Joti Y,Yabashi M,Iwata S,Ueda K,Barends TRM,Schlichting I

    update_date:2017-12-12 00:00:00

  • Facial model collection for medical augmented reality in oncologic cranio-maxillofacial surgery.

    abstract::Medical augmented reality (AR) is an increasingly important topic in many medical fields. AR enables x-ray vision to see through real world objects. In medicine, this offers pre-, intra- or post-interventional visualization of "hidden" structures. In contrast to a classical monitor view, AR applications provide visual...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0327-8

    authors: Gsaxner C,Wallner J,Chen X,Zemann W,Egger J

    update_date:2019-12-09 00:00:00

  • Long term survey of the fish community and associated benthic fauna of the Seine estuary nursery grounds.

    abstract::Estuaries are crucial ecosystems where human activities deeply affect numerous ecological functions. Here we present a survey dataset based on the monitoring of fish nursery grounds of the Seine estuary and eastern bay of Seine collected once a year using a beam trawl during three distinct periods (1995-2002, 2008-201...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-0572-x

    authors: Cariou T,Dubroca L,Vogel C

    update_date:2020-07-13 00:00:00

  • HYSOGs250m, global gridded hydrologic soil groups for curve-number-based runoff modeling.

    abstract::Hydrologic soil groups (HSGs) are a fundamental component of the USDA curve-number (CN) method for estimation of rainfall runoff; yet these data are not readily available in a format or spatial-resolution suitable for regional- and global-scale modeling applications. We developed a globally consistent, gridded dataset...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2018.91

    authors: Ross CW,Prihodko L,Anchang J,Kumar S,Ji W,Hanan NP

    update_date:2018-05-15 00:00:00

  • Flow and detailed 3D morphodynamic data from laboratory experiments of fluvial dike breaching.

    abstract::This paper presents a dataset obtained from fifty four laboratory experiments of the breaching of fluvial dikes due to flow overtopping. Data were collected on two complementary experimental setups, each consisting of a main channel representing the river, an erodible lateral dike and a floodplain. The dataset covers ...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0057-y

    authors: Rifai I,El Kadi Abderrezzak K,Erpicum S,Archambeau P,Violeau D,Pirotton M,Dewals B

    update_date:2019-05-13 00:00:00

  • A three-dimensional thalamocortical dataset for characterizing brain heterogeneity.

    abstract::Neural microarchitecture is heterogeneous, varying both across and within brain regions. The consistent identification of regions of interest is one of the most critical aspects in examining neurocircuitry, as these structures serve as the vital landmarks with which to map brain pathways. Access to continuous, three-d...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-00692-y

    authors: Prasad JA,Balwani AH,Johnson EC,Miano JD,Sampathkumar V,De Andrade V,Fezzaa K,Du M,Vescovi R,Jacobsen C,Kording KP,Gürsoy D,Gray Roncal W,Kasthuri N,Dyer EL

    update_date:2020-10-20 00:00:00

  • CU-BEMS, smart building electricity consumption and indoor environmental sensor datasets.

    abstract::This paper describes the release of the detailed building operation data, including electricity consumption and indoor environmental measurements, of the seven-story 11,700-m2 office building located in Bangkok, Thailand. The electricity consumption data (kW) are that of individual air conditioning units, lighting, an...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-00582-3

    authors: Pipattanasomporn M,Chitalia G,Songsiri J,Aswakul C,Pora W,Suwankawin S,Audomvongseree K,Hoonchareon N

    update_date:2020-07-20 00:00:00

  • The effect of 16S rRNA region choice on bacterial community metabarcoding results.

    abstract::In this work, we compare the resolution of V2-V3 and V3-V4 16S rRNA regions for the purposes of estimating microbial community diversity using paired-end Illumina MiSeq reads, and show that the fragment, including V2 and V3 regions, has higher resolution for lower-rank taxa (genera and species). It allows for a more p...

    journal_title:Scientific data

    pub_type:

    doi:10.1038/sdata.2019.7

    authors: Bukin YS,Galachyants YP,Morozov IV,Bukin SV,Zakharenko AS,Zemskaya TI

    update_date:2019-02-05 00:00:00

  • Imaging and clinical data archive for head and neck squamous cell carcinoma patients treated with radiotherapy.

    abstract::Cross sectional imaging is essential for the patient-specific planning and delivery of radiotherapy, a primary determinant of head and neck cancer outcomes. Due to challenges ensuring data quality and patient de-identification, publicly available datasets including diagnostic and radiation treatment planning imaging a...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2018.173

    authors: Grossberg AJ,Mohamed ASR,Elhalawani H,Bennett WC,Smith KE,Nolan TS,Williams B,Chamchod S,Heukelom J,Kantor ME,Browne T,Hutcheson KA,Gunn GB,Garden AS,Morrison WH,Frank SJ,Rosenthal DI,Freymann JB,Fuller CD

    update_date:2018-09-04 00:00:00

  • An accurate registration of the BigBrain dataset with the MNI PD25 and ICBM152 atlases.

    abstract::Brain atlases that encompass detailed anatomical or physiological features are instrumental in the research and surgical planning of various neurological conditions. Magnetic resonance imaging (MRI) has played important roles in neuro-image analysis while histological data remain crucial as a gold standard to guide an...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0217-0

    authors: Xiao Y,Lau JC,Anderson T,DeKraker J,Collins DL,Peters T,Khan AR

    update_date:2019-10-17 00:00:00

  • A data citation roadmap for scholarly data repositories.

    abstract::This article presents a practical roadmap for scholarly data repositories to implement data citation in accordance with the Joint Declaration of Data Citation Principles, a synopsis and harmonization of the recommendations of major science policy bodies. The roadmap was developed by the Repositories Expert Group, as p...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0031-8

    authors: Fenner M,Crosas M,Grethe JS,Kennedy D,Hermjakob H,Rocca-Serra P,Durand G,Berjon R,Karcher S,Martone M,Clark T

    update_date:2019-04-10 00:00:00

  • The Coral Trait Database, a curated database of trait information for coral species from the global oceans.

    abstract::Trait-based approaches advance ecological and evolutionary research because traits provide a strong link to an organism's function and fitness. Trait-based research might lead to a deeper understanding of the functions of, and services provided by, ecosystems, thereby improving management, which is vital in the curren...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2016.17

    authors: Madin JS,Anderson KD,Andreasen MH,Bridge TC,Cairns SD,Connolly SR,Darling ES,Diaz M,Falster DS,Franklin EC,Gates RD,Harmer A,Hoogenboom MO,Huang D,Keith SA,Kosnik MA,Kuo CY,Lough JM,Lovelock CE,Luiz O,Martinelli J

    update_date:2016-03-29 00:00:00

  • I-BLEND, a campus-scale commercial and residential buildings electrical energy dataset.

    abstract::Efficient energy consumption at the building level is vital for sustainability. Providing energy efficient systems and solutions requires an understanding of how energy gets consumed. However, there is a general lack of large-scale open datasets about the energy consumption of buildings, which hinders the research. Th...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2019.15

    authors: Rashid H,Singh P,Singh A

    update_date:2019-02-19 00:00:00

  • Viruses of the Nahant Collection, characterization of 251 marine Vibrionaceae viruses.

    abstract::Viruses are highly discriminating in their interactions with host cells and are thought to play a major role in maintaining diversity of environmental microbes. However, large-scale ecological and genomic studies of co-occurring virus-host pairs, required to characterize the mechanistic and genomic foundations of viru...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2018.114

    authors: Kauffman KM,Brown JM,Sharma RS,VanInsberghe D,Elsherbini J,Polz M,Kelly L

    update_date:2018-07-03 00:00:00

  • One millennium of historical freshwater fish occurrence data for Portuguese rivers and streams.

    abstract::The insights that historical evidence of human presence and man-made documents provide are unique. For example, using historical data may be critical to adequately understand the ecological requirements of species. However, historical information about freshwater species distribution remains largely a knowledge gap. I...

    journal_title:Scientific data

    pub_type: Historical Article, Journal Article

    doi:10.1038/sdata.2018.163

    authors: Duarte G,Moreira M,Branco P,da Costa L,Ferreira MT,Segurado P

    update_date:2018-08-14 00:00:00

  • Outlier analyses of the Protein Data Bank archive using a probability-density-ranking approach.

    abstract::Outlier analyses are central to scientific data assessments. Conventional outlier identification methods do not work effectively for Protein Data Bank (PDB) data, which are characterized by heavy skewness and the presence of bounds and/or long tails. We have developed a data-driven nonparametric method to identify out...

    journal_title:Scientific data

    pub_type:

    doi:10.1038/sdata.2018.293

    authors: Shao C,Liu Z,Yang H,Wang S,Burley SK

    update_date:2018-12-11 00:00:00

  • A database seed for a community-driven material intensity research platform.

    abstract::The data record contains Material Intensity data for buildings (MI). MI coefficients are often used for different types of analysis of socio-economic systems and in particular for environmental assessments. Until now, MI values were compiled and reported ad-hoc with few cross-study comparisons. We extracted and conver...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0021-x

    authors: Heeren N,Fishman T

    update_date:2019-04-09 00:00:00

  • A statistical atlas of cerebral arteries generated using multi-center MRA datasets from healthy subjects.

    abstract::Magnetic resonance angiography (MRA) can capture the variation of cerebral arteries with high spatial resolution. These measurements include valuable information about the morphology, geometry, and density of brain arteries, which may be useful to identify risk factors for cerebrovascular and neurological diseases at ...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0034-5

    authors: Mouches P,Forkert ND

    update_date:2019-04-11 00:00:00

  • Serial scanning electron microscopy of anti-PKHD1L1 immuno-gold labeled mouse hair cell stereocilia bundles.

    abstract::Serial electron microscopy techniques have proven to be a powerful tool in biology. Unfortunately, the data sets they generate lack robust and accurate automated segmentation algorithms. In this data descriptor publication, we introduce a serial focused ion beam scanning electron microscopy (FIB-SEM) dataset consistin...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-0509-4

    authors: Ivanchenko MV,Cicconet M,Jandal HA,Wu X,Corey DP,Indzhykulian AA

    update_date:2020-06-17 00:00:00

  • Human pluripotent stem cell derived HLC transcriptome data enables molecular dissection of hepatogenesis.

    abstract::Induced pluripotent stem cells (iPSCs) and human embryonic stem cells (hESCs) differentiated into hepatocyte-like cells (HLCs) provide a defined and renewable source of cells for drug screening, toxicology and regenerative medicine. We previously reprogrammed human fetal foreskin fibroblast cells (HFF1) into iPSCs emp...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2018.35

    authors: Wruck W,Adjaye J

    update_date:2018-03-13 00:00:00

  • Multi-omics profile of the mouse dentate gyrus after kainic acid-induced status epilepticus.

    abstract::Temporal lobe epilepsy (TLE) can develop from alterations in hippocampal structure and circuit characteristics, and can be modeled in mice by administration of kainic acid (KA). Adult neurogenesis in the dentate gyrus (DG) contributes to hippocampal functions and has been reported to contribute to the development of T...

    journal_title:Scientific data

    pub_type: Comment, Journal Article

    doi:10.1038/sdata.2016.68

    authors: Schouten M,Bielefeld P,Fratantoni SA,Hubens CJ,Piersma SR,Pham TV,Voskuyl RA,Lucassen PJ,Jimenez CR,Fitzsimons CP

    update_date:2016-08-16 00:00:00

  • Direct infusion mass spectrometry metabolomics dataset: a benchmark for data processing and quality control.

    abstract::Direct-infusion mass spectrometry (DIMS) metabolomics is an important approach for characterising molecular responses of organisms to disease, drugs and the environment. Increasingly large-scale metabolomics studies are being conducted, necessitating improvements in both bioanalytical and computational workflows to ma...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/sdata.2014.12

    authors: Kirwan JA,Weber RJ,Broadhurst DI,Viant MR

    update_date:2014-06-10 00:00:00

  • Multi-year whole-blood transcriptome data for the study of onset and progression of Parkinson's Disease.

    abstract::Parkinson's disease (PD) is an age-related, chronic and progressive neurodegenerative disorder characterized by a loss of multifocal neurons, resulting in both non-motor and motor symptoms. While several genetic and environmental contributory risk factors have been identified, more exact methods for diagnosing and ass...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-019-0022-9

    authors: Valentine MNZ,Hashimoto K,Fukuhara T,Saiki S,Ishikawa KI,Hattori N,Carninci P

    update_date:2019-04-05 00:00:00

  • Long-term surveys of age structure in 13 ungulate and one ostrich species in the Serengeti, 1926-2018.

    abstract::The Serengeti ecosystem spans an extensive network of protected areas in Tanzania, eastern Africa, and a UNESCO World Heritage Site. It is home to some of the largest animal migrations on the planet. Here, we describe a dataset consisting of the sample counts of three age classes (infant, juvenile and adult) of 13 ungu...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-00701-0

    authors: Rogy P,Sinclair ARE

    update_date:2020-10-21 00:00:00

  • Transcriptome dataset of human corneal endothelium based on ribosomal RNA-depleted RNA-Seq data.

    abstract::The corneal endothelium maintains corneal transparency; consequently, damage to this endothelium by a number of pathological conditions results in severe vision loss. Publicly available expression databases of human tissues are useful for investigating the pathogenesis of diseases and for developing new therapeutic mo...

    journal_title:Scientific data

    pub_type: Journal Article

    doi:10.1038/s41597-020-00754-1

    authors: Tokuda Y,Okumura N,Komori Y,Hanada N,Tashiro K,Koizumi N,Nakano M

    update_date:2020-11-20 00:00:00