Data Access
GloSED is open data. We provide raw sequences, processed OTU tables, and rich environmental metadata under CC-BY 4.0.
Included Data Products
- OTU sequences (FASTA): Quality-filtered full-length ITS sequences for all OTUs. 988k OTU sequences.
- Abundance matrix (TSV / BIOM / Parquet): Sample-by-OTU table with rarefied and raw counts. ~3.8M OTU-sample records.
- Taxonomy (TSV / Parquet): Curated annotations using EUKARYOME and UNITE SH identifiers.
- Sample metadata (XLSX): Geocoordinates, location, soil chemistry, land cover, and other sample descriptors. 4,147 samples.
Terms of Use
This dataset is licensed under a Creative Commons Attribution 4.0 International License. You are free to share and adapt the material for any purpose, provided you give appropriate credit.
How to cite
@article{GloSED2026,
  title = {Global dataset of soil eukaryotic communities created with a uniform protocol and long read sequencing},
  author = {Mikryukov et al.},
  journal = {Scientific Data},
  year = {2026},
  doi = {10.1038/s41597-026-07315-y}
}
Versions
- v1.0.0 (March 2026)
Contact
For questions about the dataset or potential collaboration, please contact: