# README for Effaråsen Ectomycorrhizal Fungi and SCATA Data Publication

Project Title:
--------------
Analysis of Ectomycorrhizal Fungal Diversity, Relative Abundance, and Community Data in Effaråsen Scots Pine Forests: Effects of Tree Retention Level, Tree Distance, and Size.

Authors:
--------
Delphine Lariviere, Skogforsk

Description:
------------
This dataset contains environmental and taxonomic data collected from Scots pine forest stands in Effaråsen, Sweden, focusing on ectomycorrhizal (ECM) fungi. The accompanying R script performs statistical analysis and visualization of these data.

Folder Structure:
-----------------

   - ECM2017.csv : Soil sample data on ECM fungi collected in 2017, including relative abundance and diversity across 245 samples.
   - stand_leveldata.csv : Stand-level ecological data for 25 Scots pine stands, including retention levels and stand identifiers.
   - tax_2017.csv : Taxonomy reference data with taxonomic species names, SCATA groups, genus, and functional group classifications for the species detected in the 2017 study.
   - R_code_clean.R : R script performing data cleaning, statistical analysis, and visualization.


Data Details:
-------------

1. ECM2017.csv
   - Type: Environmental data on ectomycorrhizal fungi
   - Samples: 245 soil samples
   - Variables: Includes species abundance, retention level, and related measurements

Column descriptions:

prov_nr               = Tube identifier (unique sample ID)
sum_Area_SQUAREMETERS = Crown area within a 15-meter radius of the sample point (in square meters)
Polygon_Count         = Number of polygons within a 15-meter radius of the sample point
BeID1                 = Stand identifier that includes the stand ID and retention level (StandID-retention Level)
Pool                  = Pool sampling information from PCR
Tag                   = Laboratory tag
UMBLA                 = SLU Metabarcoding Laboratory number
year                  = Sampling year
BeID                  = Stand identifier without retention level
ret_lvl               = Retention level category
Total                 = Total DNA sequences per tube
prop_ECM              = Proportion of DNA sequences belonging to ectomycorrhiza
nb.species            = Number of species found per tube
SUM_DNAseq            = Sum of DNA sequences belonging exclusively to ectomycorrhiza
scata4818_1132 ...    = Individual scata numbers (unique identifiers for species or OTUs)
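
The summary columns above can be cross-checked against the raw columns. The sketch below is illustrative only: it assumes that prop_ECM equals SUM_DNAseq divided by Total, and that nb.species counts the scata columns with a non-zero value in each tube. Neither rule is confirmed by this README. The data frame values and the column name scata0000_0001 are invented for the example.

```r
# Toy data using the documented column names; all values are invented.
ecm_toy <- data.frame(
  prov_nr = c("T1", "T2", "T3"),
  Total = c(1000, 500, 200),
  SUM_DNAseq = c(250, 100, 0),
  scata4818_1132 = c(200, 0, 0),
  scata0000_0001 = c(50, 100, 0)  # hypothetical scata column name
)

# Hypothetical helper: recompute the per-tube summaries under the
# assumptions stated above, so they can be compared with the file.
recompute_ecm_summary <- function(df) {
  scata_cols <- grep("^scata", names(df), value = TRUE)
  counts <- as.matrix(df[, scata_cols, drop = FALSE])
  data.frame(
    prov_nr = df$prov_nr,
    # Assumed rule: ECM sequences divided by all sequences in the tube.
    prop_ECM_check = ifelse(df$Total > 0, df$SUM_DNAseq / df$Total, NA_real_),
    # Assumed rule: number of scata columns with a non-zero value.
    nb_species_check = rowSums(counts > 0)
  )
}

ecm_check <- recompute_ecm_summary(ecm_toy)
print(ecm_check)
```

With the real file, the same function could be applied to the result of read.csv("ECM2017.csv") and the two check columns compared with prop_ECM and nb.species.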


2. stand_leveldata.csv
   - Type: Stand-level ecological data
   - Observations: 25 forest stands
   - Variables: Includes retention level categories, tree IDs, and other stand characteristics

Column descriptions:

BeID1                 = Stand identifier that includes the stand ID and retention level (StandID-retention Level)
ret_lvl               = Retention level category
mean_Area             = Mean tree crown area (in square meters)
sum_Area              = Total tree crown area (in square meters)
Mean_near_tree_VIS    = Mean distance to the nearest trees (meters) observed visually
Mean_near_tree_LAS    = Mean distance to the nearest trees (meters) based on LiDAR data
Mean_Poly_count_LAS   = Mean polygon count based on LiDAR data
Mean_avst_tall        = Mean distance to the nearest pine tree (tall = pine) (meters)
Mean_avst_gran        = Mean distance to the nearest spruce tree (gran = spruce) (meters)
Mean_avst_bjork       = Mean distance to the nearest birch tree (björk = birch) (meters)
mean_propECM          = Mean proportion of ectomycorrhizal DNA sequences
SUM_tot_DNA_seq       = Sum of total DNA sequences per stand
nb_scata2             = Number of scata (species or OTUs) detected
scata4818_1132 ...    = Individual scata numbers (unique identifiers for species or OTUs)
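
The stand-level summaries can in principle be related to the sample-level file through BeID1. The sketch below assumes, without confirmation from this README, that mean_propECM is the mean of prop_ECM over the samples in a stand and that SUM_tot_DNA_seq is the sum of Total over the same samples. The BeID1 values and all numbers are invented for the example.

```r
# Toy sample-level data; BeID1 values and numbers are invented.
ecm_toy <- data.frame(
  BeID1 = c("A-low", "A-low", "B-high", "B-high"),
  prop_ECM = c(0.2, 0.4, 0.6, 0.8),
  Total = c(100, 300, 200, 200)
)

# Aggregate samples to one row per stand (base R only).
stand_check <- merge(
  aggregate(prop_ECM ~ BeID1, data = ecm_toy, FUN = mean),
  aggregate(Total ~ BeID1, data = ecm_toy, FUN = sum),
  by = "BeID1"
)
names(stand_check) <- c("BeID1", "mean_propECM_check", "SUM_tot_DNA_seq_check")

print(stand_check)
```

The resulting table could then be merged with stand_leveldata.csv on BeID1 to compare the recomputed values with the published ones.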

3. tax_2017.csv
   - Type: Taxonomic classification data
   - Includes species names, SCATA groupings, genus, and functional groups used in analysis

Column descriptions:

Species        = Taxonomic species name
scata_column   = SCATA group identifier
genus          = Genus name
group1         = Swedish general classification group
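
The taxonomy table can be used to attach species and genus names to the scata columns of the other two files. The sketch below assumes that the values in scata_column match the scata column names in ECM2017.csv exactly. This is not stated in this README. All species, genus, and group names, the column name scata0000_0001, and all numbers are placeholders.

```r
# Toy sample table and toy taxonomy table; all values are placeholders.
ecm_toy <- data.frame(
  prov_nr = c("T1", "T2"),
  scata4818_1132 = c(200, 0),
  scata0000_0001 = c(50, 100)  # hypothetical scata column name
)
tax_toy <- data.frame(
  Species = c("Species A", "Species B"),
  scata_column = c("scata4818_1132", "scata0000_0001"),
  genus = c("Genus A", "Genus B"),
  group1 = c("Group X", "Group X")
)

scata_cols <- grep("^scata", names(ecm_toy), value = TRUE)

# Reshape to long format: one row per sample and scata column.
ecm_long <- data.frame(
  prov_nr = rep(ecm_toy$prov_nr, times = length(scata_cols)),
  scata_column = rep(scata_cols, each = nrow(ecm_toy)),
  reads = unlist(ecm_toy[scata_cols], use.names = FALSE)
)

# Attach taxonomy; all.x = TRUE keeps scata columns without a match.
ecm_tax <- merge(ecm_long, tax_toy, by = "scata_column", all.x = TRUE)

# Example summary: total value per genus across all samples.
genus_totals <- aggregate(reads ~ genus, data = ecm_tax, FUN = sum)
print(genus_totals)
```

Rows with a missing Species after the merge would indicate scata columns that have no entry in the taxonomy table.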


4. R_code_clean.R
   - Purpose: Script to analyze and visualize ECM fungi and SCATA data
   - Dependencies: Uses a variety of R packages, listed on lines 38-56 of the script (install these before running)
   - Input files: Loads ECM2017.csv, stand_leveldata.csv, and tax_2017.csv 
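
A quick way to see which packages are missing before running the script is sketched below. The package vector is an assumption taken from the session info in this README (lme4 and lmerTest are attached there; ggplot2 and dplyr are loaded). The authoritative list is the one in the script itself, which may contain more packages.

```r
# Assumed package list, taken from the session info in this README.
# Replace it with the packages actually loaded in R_code_clean.R.
pkgs <- c("lme4", "lmerTest", "ggplot2", "dplyr")

# requireNamespace() returns FALSE instead of failing when a package
# is not installed, so it is safe to use as a check.
is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to install from CRAN
} else {
  message("All listed packages are installed.")
}
```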



License:
--------
Creative Commons Public Domain Dedication (CC0 1.0)

Contact:
--------
Delphine Lariviere
Email: [delphine.lariviere@skogforsk.se]
Institution: Skogforsk

Email: [delphine.lariviere@slu.se]
Institution: Southern Swedish Forest Research Centre


Instructions to Reproduce Analysis:
-----------------------------------
1. Set your working directory to the root folder where the data files are.
2. Ensure R and necessary packages are installed.
3. Run the R_code_clean.R script to reproduce all data processing, analyses, and figures.
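
The three steps above can be sketched as a small wrapper. The function name run_analysis and the path in the usage line are hypothetical. The file names are those listed in this README, and the sketch assumes the script reads its inputs relative to the working directory.

```r
# Hypothetical helper: checks that the files named in this README are
# present in data_dir, then runs the script from that directory.
run_analysis <- function(data_dir = ".") {
  inputs <- c("ECM2017.csv", "stand_leveldata.csv", "tax_2017.csv",
              "R_code_clean.R")
  missing_files <- inputs[!file.exists(file.path(data_dir, inputs))]
  if (length(missing_files) > 0) {
    stop("Missing files in ", data_dir, ": ",
         paste(missing_files, collapse = ", "))
  }
  # Step 1: set the working directory (restored when the function exits).
  old_wd <- setwd(data_dir)
  on.exit(setwd(old_wd), add = TRUE)
  # Step 3: run the script.
  source("R_code_clean.R", echo = FALSE)
  invisible(TRUE)
}

# Usage (placeholder path):
# run_analysis("path/to/data")
```

Restoring the working directory on exit keeps the rest of the R session unaffected if the script stops with an error.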

Additional Notes:
-----------------
- Data were collected and processed in 2017.
- Functional group classifications in tax_2017.csv are based on genus-level assignments.
- For questions or support, contact the author.



Session info:
-------------
R version 4.3.2 (2023-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 11 x64 (build 26100)

Matrix products: default

locale:
[1] LC_COLLATE=English_Sweden.utf8  LC_CTYPE=English_Sweden.utf8    LC_MONETARY=English_Sweden.utf8
[4] LC_NUMERIC=C                    LC_TIME=English_Sweden.utf8    

time zone: Europe/Stockholm
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] lmerTest_3.1-3 lme4_1.1-35.1  Matrix_1.6-4  

loaded via a namespace (and not attached):
 [1] vctrs_0.6.4         nlme_3.1-164        cli_3.6.1           rlang_1.1.4         generics_0.1.3     
 [6] minqa_1.2.6         glue_1.6.2          colorspace_2.1-0    scales_1.3.0        fansi_1.0.5        
[11] grid_4.3.2          munsell_0.5.0       tibble_3.2.1        MASS_7.3-60         numDeriv_2016.8-1.1
[16] lifecycle_1.0.4     compiler_4.3.2      dplyr_1.1.4         pkgconfig_2.0.3     Rcpp_1.0.11        
[21] rstudioapi_0.15.0   lattice_0.22-5      nloptr_2.0.3        R6_2.5.1            tidyselect_1.2.0   
[26] utf8_1.2.4          pillar_1.9.0        splines_4.3.2       magrittr_2.0.3      tools_4.3.2        
[31] gtable_0.3.4        boot_1.3-31         ggplot2_3.5.1      