Methods
A rigorous, standardised workflow designed for global comparability and reproducibility.
Standardised Sampling
To ensure data comparability across 4,000+ sites, all contributors followed an identical sampling protocol.
- Site selection: Representative natural vegetation with minimal anthropogenic disturbance.
- Plot design: 40 soil cores (5 cm diameter, 5 cm depth) collected from a 50×50 m plot.
- Pooling: Cores pooled into a composite sample to capture local heterogeneity.
- DNA extraction: 2 g homogenised dry soil per composite, a large mass that improves biodiversity recovery and detection of rare taxa.
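As a rough illustration of the sampling effort implied by the plot design above, the sketch below computes the soil volume that goes into one composite sample. It assumes each core is a perfect cylinder with the stated dimensions; the function name is hypothetical and is not part of the published protocol.

```python
import math

def composite_core_volume_cm3(n_cores: int = 40,
                              diameter_cm: float = 5.0,
                              depth_cm: float = 5.0) -> float:
    """Total volume of n cylindrical cores, in cubic centimetres."""
    radius_cm = diameter_cm / 2
    return n_cores * math.pi * radius_cm ** 2 * depth_cm

# Defaults mirror the protocol: 40 cores, 5 cm diameter, 5 cm depth.
volume = composite_core_volume_cm3()
print(f"{volume:.0f} cm^3 of soil per composite sample")  # 3927 cm^3
```

Under these assumptions, roughly 3.9 litres of soil are pooled per plot, of which 2 g of homogenised dry soil is used for DNA extraction.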
Molecular Workflow
Target Markers
Full-length ITS
Primary barcode marker for Fungi. Provides high taxonomic resolution and enables accurate identification.
18S and 28S rRNA
Additional regions used to detect long-read chimeras and further improve taxonomic identification.
DNA libraries were sequenced on the PacBio Sequel II and Revio platforms to generate high-quality long reads.
Bioinformatics & QC
Raw HiFi reads were processed using the NextITS pipeline, designed specifically for long-read metabarcoding.
Demultiplexing & Filtering
Strict quality filtering (Q30+), primer trimming, and homopolymer error correction.
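To make the Q30+ threshold concrete, here is a minimal sketch of a read-level quality filter. It assumes that read quality is summarised as the Phred-scaled mean of the per-base error probabilities, with P = 10^(-Q/10); the exact criterion used by NextITS may differ, and the function names are hypothetical.

```python
import math

def mean_error_probability(phred_scores):
    """Average per-base error probability, using P = 10^(-Q/10)."""
    return sum(10 ** (-q / 10) for q in phred_scores) / len(phred_scores)

def read_quality(phred_scores):
    """Phred-scaled mean error probability of a whole read."""
    return -10 * math.log10(mean_error_probability(phred_scores))

def passes_filter(phred_scores, min_q=30.0):
    """Keep a read only if its overall quality reaches min_q (assumed rule)."""
    if not phred_scores:
        return False  # empty reads are discarded
    return read_quality(phred_scores) >= min_q

# A uniformly high-quality read passes; a few poor bases can fail a read.
good_read = [40] * 100
noisy_read = [40] * 90 + [10] * 10
print(passes_filter(good_read), passes_filter(noisy_read))
```

Averaging error probabilities rather than raw Q scores means a handful of low-quality bases weighs heavily on the result, which is why the second read fails despite 90% of its bases being Q40.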
Chimera Removal
De novo chimera detection plus reference-based checking against EUKARYOME and UNITE.
Denoising
Reads were denoised using the UNOISE3 algorithm and clustered into operational taxonomic units (OTUs) at a 98% similarity threshold.
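The idea of clustering at a 98% similarity threshold can be sketched with a simple greedy, abundance-sorted procedure. This is an illustration only: it is not UNOISE3 and not the clustering implementation used in the pipeline, and it assumes identity is defined as 1 minus the edit distance divided by the longer sequence length. Production tools use alignment-based identity definitions and fast heuristics. All function names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # match or substitution
        prev = curr
    return prev[-1]

def identity(a: str, b: str) -> float:
    """Assumed identity definition: 1 - edit distance / longer length."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - edit_distance(a, b) / longest

def greedy_cluster(abundances: dict, threshold: float = 0.98) -> dict:
    """Map each centroid to its members, visiting sequences by abundance."""
    clusters = {}
    for seq in sorted(abundances, key=lambda s: (-abundances[s], s)):
        for centroid in clusters:
            if identity(seq, centroid) >= threshold:
                clusters[centroid].append(seq)
                break
        else:
            clusters[seq] = [seq]  # no centroid is close enough: new OTU
    return clusters

# Toy 100 bp sequences: one mismatch (99%) joins the OTU, five (95%) do not.
seq_a = "A" * 50 + "C" * 50
seq_b = "A" * 49 + "G" + "C" * 50
seq_c = "A" * 45 + "G" * 5 + "C" * 50
otus = greedy_cluster({seq_a: 10, seq_b: 3, seq_c: 5})
print(len(otus))  # 2
```

Processing sequences in order of decreasing abundance makes the most common sequence in each group the centroid, a common design choice in OTU clustering.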
Taxonomic Annotation
BLASTn searches against the EUKARYOME reference database, with manual curation of the best hits.
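Selecting the best hit per query before manual curation might look like the sketch below. It assumes the BLASTn results were written in the standard 12-column tabular format (`-outfmt 6`) and that "best" means the highest bit score; the source states neither, and the query and subject identifiers in the example are invented for illustration.

```python
import csv
import io

# Default column order of BLAST tabular output (-outfmt 6).
BLAST6_COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
                  "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

def best_hits(handle):
    """Return the highest-bitscore hit for each query sequence."""
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(BLAST6_COLUMNS, row))
        hit["pident"] = float(hit["pident"])
        hit["bitscore"] = float(hit["bitscore"])
        current = best.get(hit["qseqid"])
        if current is None or hit["bitscore"] > current["bitscore"]:
            best[hit["qseqid"]] = hit
    return best

# Hypothetical results: two hits for OTU_1 and one for OTU_2.
example = io.StringIO(
    "OTU_1\tref_A\t97.5\t600\t15\t0\t1\t600\t1\t600\t0.0\t1020\n"
    "OTU_1\tref_B\t99.2\t610\t5\t0\t1\t610\t1\t610\t0.0\t1100\n"
    "OTU_2\tref_C\t88.0\t550\t66\t0\t1\t550\t1\t550\t1e-180\t640\n"
)
hits = best_hits(example)
for query, hit in hits.items():
    print(query, hit["sseqid"], hit["pident"])
```

The resulting table of candidate assignments is only a starting point; per the workflow above, the best hits are then curated manually.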
Detailed Protocol
For the complete technical specifications, primer sequences, and bioinformatic parameters, please consult the methods section of the main paper.
Read Full Methods
Technical Summary
- Sequencing Platform: PacBio
- Bioinformatics Pipeline: NextITS
- Reference Databases: EUKARYOME