cjabradshaw/random-forests-for-predicting-predator-prey-interactions-in-terrestrial-vertebrates

 
 


Predicting predator-prey interactions in terrestrial endotherms using random forest


Accompanies paper:
Llewelyn, J, G Strona, CR Dickman, A Greenville, G Wardle, MSY Lee, S Doherty, F Shabani, F Saltré, CJA Bradshaw. 2023. Predicting predator-prey interactions in terrestrial endotherms using random forest. Ecography doi:10.1111/ecog.06619

and (out-of-date) preprint:
Llewelyn, J, G Strona, CR Dickman, A Greenville, G Wardle, MSY Lee, S Doherty, F Shabani, F Saltré, CJA Bradshaw. 2022. Predicting predator-prey interactions in terrestrial endotherms using random forest. bioRχiv doi:10.1101/2022.09.02.506446

The files include a folder of data files (.RDS format) and a folder of R scripts. After downloading, you will need to adjust the file paths in the scripts; the lines to change can be found by searching the scripts for “###”.

DATA FOLDER

The data folder contains nine files:

  • GloBIplus_Int20EVs.RDS contains the global interaction records,
  • allNon_sameCont.RDS contains the global non-interaction records,
  • allperms_cut2_20EVs.RDS contains the interaction and non-interaction records for the seven focal predators from the Simpson Desert,
  • and six 'KeepVar' files that contain the variables used in the few-variable models (i.e., the most important variables as identified by the variable importance script).

Each of the interaction and non-interaction files includes all ecomorphological traits and phylogenetic eigenvectors used in the analyses.
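
As a minimal sketch of loading these files in R, assuming the data folder has been downloaded locally: the `data_dir` path and the `read_if_present` helper are hypothetical and not part of the repository, and no assumption is made here about the structure of the stored objects.

```r
# Hypothetical local path to the downloaded data folder; adjust to your machine.
data_dir <- file.path("path", "to", "data")

# Hypothetical helper: read an .RDS file if it exists, otherwise return NULL.
read_if_present <- function(dir, file_name) {
  full_path <- file.path(dir, file_name)
  if (!file.exists(full_path)) {
    message("File not found: ", full_path)
    return(NULL)
  }
  readRDS(full_path)
}

# File names as listed above.
interactions     <- read_if_present(data_dir, "GloBIplus_Int20EVs.RDS")
non_interactions <- read_if_present(data_dir, "allNon_sameCont.RDS")
simpson_desert   <- read_if_present(data_dir, "allperms_cut2_20EVs.RDS")

# Inspect the top-level structure of whatever was loaded.
if (!is.null(interactions)) str(interactions, max.level = 1)
```

Returning `NULL` for a missing file, rather than stopping, makes it easy to see which paths still need adjusting.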

FUNCTIONS FOLDER

The functions folder contains three files with functions required by the script files:

  • (all_functions_ranger.R),
  • (opt.functions.R),
  • (variable_importance_functions.R).
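
One way to make these functions available before running a script is to source the three files. This is only a sketch: the `functions_dir` path is a hypothetical placeholder, and the scripts in the repository may load the functions differently.

```r
# Hypothetical local path to the downloaded functions folder; adjust as needed.
functions_dir <- file.path("path", "to", "functions")

# File names as listed above.
function_files <- c("all_functions_ranger.R",
                    "opt.functions.R",
                    "variable_importance_functions.R")

# Source each file that is present and report any that are missing.
function_paths <- file.path(functions_dir, function_files)
found <- file.exists(function_paths)
for (p in function_paths[found]) source(p)
if (any(!found)) {
  message("Not found: ", paste(function_files[!found], collapse = ", "))
}
```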

SCRIPT FOLDER

The script folder contains three more folders.

  1. OPTIMISE ON GLOBAL_APPLY TO GLOBAL AND SIMPSON DESERT contains scripts that (i) identify optimal parameter values for 12 different random forest models (that differ in terms of the trait data used and the global training dataset) and (ii) apply the 12 optimised models (as determined in step i) to the global interaction/non-interaction data and the Simpson Desert data (for the seven focal predators).
  2. DATA QUALITY MANIPULATION AND MODEL PERFORMANCE contains two more folders of scripts that test the effect of modifying training data quality on model performance (when models are trained on the enhanced global data and applied to the Simpson Desert data).
      i. RECORDREMOVAL&REPLACE_MODELPERFORMANCE contains scripts that test the effect on model performance of removing records or switching interaction records to non-interactions (false negatives). These modifications to training data quality were made to different subsets of the data: the whole dataset, focal prey species only, focal predator species only, and non-focal (non-Simpson Desert) species only.
      ii. CORRELATION&CHANGE_PROBABILITY contains scripts that test the effects of modifying the focal-predator component of the training data (removing records or switching interactions to non-interactions) on (a) the relative suitability of different prey for each predator and (b) the mean probability assigned to potential prey for each predator.
  3. VARIABLE_IMPORTANCE contains 6 scripts for identifying the most important variables to retain in the few-variable models.
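
The two kinds of training-data manipulation described in folder 2 (removing records, and switching interactions to non-interactions) can be illustrated with a small sketch. It is not the repository's implementation: the `degrade_training_data` function, the `interaction` label column (assumed to be 1 for interactions and 0 for non-interactions), and the toy data are all hypothetical, and the repository's scripts additionally restrict the changes to particular subsets of species.

```r
# Hypothetical helper: degrade a training set either by removing a proportion
# of records or by relabelling a proportion of interactions as non-interactions.
# `label_col` is assumed to hold 1 for interactions and 0 for non-interactions.
degrade_training_data <- function(dat, prop, mode = c("remove", "switch"),
                                  label_col = "interaction") {
  mode <- match.arg(mode)
  stopifnot(prop >= 0, prop <= 1, label_col %in% names(dat))
  if (mode == "remove") {
    # Drop a random subset of rows, regardless of label.
    n_drop <- round(prop * nrow(dat))
    if (n_drop > 0) {
      drop_rows <- sample.int(nrow(dat), n_drop)
      dat <- dat[-drop_rows, , drop = FALSE]
    }
  } else {
    # Relabel a random subset of interactions as non-interactions
    # (i.e., introduce false negatives).
    pos_rows <- which(dat[[label_col]] == 1)
    n_flip <- round(prop * length(pos_rows))
    if (n_flip > 0) {
      flip_rows <- pos_rows[sample.int(length(pos_rows), n_flip)]
      dat[[label_col]][flip_rows] <- 0
    }
  }
  dat
}

# Toy data for illustration only (not from the repository).
set.seed(1)
toy <- data.frame(interaction = rep(c(1, 0), each = 10),
                  body_mass   = runif(20))

switched <- degrade_training_data(toy, prop = 0.3, mode = "switch")
removed  <- degrade_training_data(toy, prop = 0.3, mode = "remove")
```

In this sketch, `mode = "switch"` keeps the number of rows unchanged and only alters labels, whereas `mode = "remove"` shrinks the training set, so the two effects on model performance can be compared separately.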

SUPPORTING INFORMATION DATA FOLDER

The supporting information data folder contains three files:

  • interactions_between_Simpson_Desert_species.xlsx contains observed predator-prey interactions between the 7 focal predators and prey species in the Simpson Desert species assemblage (birds and mammals only).
  • Simpson_Desert_predators_with_nonSD_prey.xlsx contains observed predator-prey interactions between the 7 focal predators and non-Simpson Desert species (birds and mammals only).
  • Simpson_Desert_sp_traits.xlsx contains trait data for birds and mammals from the Simpson Desert species assemblage.