GloSED

Data Access

GloSED is open data. We provide raw sequences, processed OTU tables, and rich environmental metadata under CC BY 4.0.

Processed Data

OTU tables, taxonomy, and metadata ready for analysis.

Download via Zenodo

Raw Sequences

Demultiplexed FASTQ files (PacBio circular consensus sequences).

View on ENA

Bioinformatics

NextITS pipeline code and containers for reproducibility.

Get the Code

Included Data Products

OTU sequences

FASTA

Quality-filtered full-length ITS sequences for all OTUs.

988k OTU sequences
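
The OTU FASTA can be read with nothing beyond the Python standard library. The sketch below is illustrative only: the helper name `read_fasta`, the record IDs, and the sequences are made up and are not taken from the GloSED release.

```python
import io
from typing import Dict, List, Optional, TextIO


def read_fasta(handle: TextIO) -> Dict[str, str]:
    """Parse FASTA text into {sequence_id: sequence} (hypothetical helper)."""
    records: Dict[str, str] = {}
    current_id: Optional[str] = None
    chunks: List[str] = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if current_id is not None:
                records[current_id] = "".join(chunks)
            # Use the first whitespace-delimited token of the header as the ID.
            parts = line[1:].split()
            current_id = parts[0] if parts else ""
            chunks = []
        elif current_id is not None:
            # Sequence lines may be wrapped, so collect and join them.
            chunks.append(line)
    if current_id is not None:
        records[current_id] = "".join(chunks)
    return records


# Made-up two-record example; real OTU IDs and sequences will differ.
example = io.StringIO(">OTU_1 demo\nACGT\nACGT\n>OTU_2\nTTGCA\n")
otus = read_fasta(example)
```

For a downloaded file, pass an open file handle instead of the in-memory example; a gzip-compressed file can be opened in text mode with `gzip.open(path, "rt")`.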

Abundance matrix

TSV / BIOM / Parquet

Sample-by-OTU table with rarefied and raw counts.

~3.8M OTU-sample records
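
If the TSV export stores one row per OTU-sample record (an assumption suggested by the record count, not confirmed here), it can be pivoted into a sample-by-OTU structure as sketched below. The column names `SampleID`, `OTU`, and `Abundance`, the helper name `long_to_matrix`, and the example rows are hypothetical; check the actual file header before use.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, TextIO


def long_to_matrix(
    handle: TextIO,
    sample_col: str = "SampleID",   # hypothetical column name
    otu_col: str = "OTU",           # hypothetical column name
    count_col: str = "Abundance",   # hypothetical column name
) -> Dict[str, Dict[str, int]]:
    """Pivot long-format records into {sample: {otu: count}}."""
    reader = csv.DictReader(handle, delimiter="\t")
    matrix: Dict[str, Dict[str, int]] = defaultdict(dict)
    for row in reader:
        matrix[row[sample_col]][row[otu_col]] = int(row[count_col])
    return dict(matrix)


# Made-up records for illustration only.
example = io.StringIO(
    "SampleID\tOTU\tAbundance\n"
    "S1\tOTU_1\t12\n"
    "S1\tOTU_2\t3\n"
    "S2\tOTU_1\t7\n"
)
matrix = long_to_matrix(example)

# Per-sample sequencing depth: the sum of counts across OTUs.
depth = {sample: sum(counts.values()) for sample, counts in matrix.items()}
```

OTUs absent from a sample are simply missing from that sample's dictionary, so `matrix[sample].get(otu, 0)` returns zero for them. This keeps memory use proportional to the number of non-zero records rather than to samples × OTUs.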

Taxonomy

TSV / Parquet

Curated annotations using EUKARYOME and UNITE SH identifiers.

Sample metadata

XLSX

Geocoordinates, location, soil chemistry, land cover, and other sample descriptors.

4,147 samples

Terms of Use

This dataset is licensed under a Creative Commons Attribution 4.0 International License. You are free to share and adapt the material for any purpose, provided you give appropriate credit.

How to cite
@article{GloSED2026,
  title = {Global dataset of soil eukaryotic communities created with a uniform protocol and long read sequencing},
  author = {Mikryukov and others},
  journal = {Scientific Data},
  year = {2026},
  doi = {10.1038/s41597-026-07315-y}
}

Versions

  • v1.0.0 (March 2026)

Contact

For questions about the dataset or potential collaboration, please contact: