Public data and scripts for 2019 hake survey analyses

nwfsc-cb/eDNA-Hake-public

This repository supports the publication, "Environmental DNA provides quantitative estimates of Pacific hake abundance and distribution in the open ocean" by Shelton et al. It contains raw data, scripts for processing that data, code for estimating statistical models from that data, and code for post-processing those model fits.

There are two main datasets for this paper. Both were collected during the 2019 U.S.-Canada Integrated Ecosystem & Acoustic-Trawl Survey for Pacific hake aboard the NOAA Ship Bell M. Shimada from July 2 to August 19. The first consists of hake DNA concentrations measured by qPCR analysis of water samples collected during the survey. These data are found in the Data/qPCR folder. The folder includes a helper file, "_data_column_descriptions.xlsx", which provides detailed descriptions of the data contained in each column of the four .csv files:

"CTD_hake_data_10-2019.csv" contains information about station identities and CTD deployment information. Water samples were collected from Niskin bottles attached to the CTD.

"Hake eDNA 2019 qPCR results 2020-01-04 standards.csv" contains information on the qPCR analysis of standards with known hake DNA concentrations.

"Hake eDNA 2019 qPCR results 2021-01-04 results.csv" contains information on the qPCR analysis of field-collected samples with unknown DNA concentrations.

"Hake eDNA 2019 qPCR results 2021-07-15 sample details.csv" provides sample numbers and additional metadata about water samples that are useful during analysis.
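Before running any analysis, it can help to confirm that the four files listed above are present and to look at their column names. The following is a minimal sketch in Python using only the standard library. It is not part of the repository; the function names are hypothetical, and it assumes the working directory is the repository root and that the files are readable as UTF-8 text.

```python
import csv
from pathlib import Path

# File names as listed in this README; the folder location is an assumption.
QPCR_FILES = [
    "CTD_hake_data_10-2019.csv",
    "Hake eDNA 2019 qPCR results 2020-01-04 standards.csv",
    "Hake eDNA 2019 qPCR results 2021-01-04 results.csv",
    "Hake eDNA 2019 qPCR results 2021-07-15 sample details.csv",
]


def missing_qpcr_files(qpcr_dir):
    """Return the expected .csv names that are not present in qpcr_dir."""
    qpcr_dir = Path(qpcr_dir)
    return [name for name in QPCR_FILES if not (qpcr_dir / name).is_file()]


def read_header(csv_path):
    """Return the column names from the first row of a .csv file."""
    # utf-8-sig drops a byte-order mark if one is present (an assumption
    # about how the files were exported); undecodable bytes are replaced.
    with open(csv_path, newline="", encoding="utf-8-sig", errors="replace") as handle:
        return next(csv.reader(handle), [])


if __name__ == "__main__":
    # Assumes the repository root is the current working directory.
    qpcr_dir = Path("Data") / "qPCR"
    for name in missing_qpcr_files(qpcr_dir):
        print("missing:", name)
    for name in QPCR_FILES:
        path = qpcr_dir / name
        if path.is_file():
            print(name, "->", read_header(path))
```

The printed column names can then be compared against the entries in "_data_column_descriptions.xlsx".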



The second dataset consists of estimates of hake biomass derived from analysis of the acoustic-trawl information and can be found in the "Data/acoustics 2019" folder. As with the qPCR data, a file, "column descriptions.xlsx", describes the content of each data column of "EchoPro_un-kriged_output_TRIMMED-05-Dec-2019_0.csv".


SCRIPTS:


The primary scripts are found in the "Scripts" folder and begin with the numbers 01 to 05. We summarize what each script does below:

01_qPCR_analysis.R reads in qPCR data and estimates statistical models of hake qPCR.

02_acoustics_analysis.R reads in acoustic estimates of biomass density and estimates statistical models for hake biomass.

03_post_process_acoustics_Output.R processes the fitted MCMC object from the acoustics models, calculates summary statistics, and generates plots. The object needed to run this script is generated by the "02..." script.

04_post_process_qPCR_Output.R processes the fitted MCMC object from the eDNA models, calculates summary statistics, and generates plots. The object needed to run this script is generated by the "01..." script.

05_compare_acoustics_eDNA.R compares a range of model-derived quantities from the acoustics and eDNA analyses. The objects needed to run this script are generated by the "03..." and "04..." scripts.

The remaining .R and .stan files are called within the five main scripts described above.
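The dependencies among the numbered scripts can be written down explicitly, which makes it clear that the eDNA branch (01, then 04) and the acoustics branch (02, then 03) are independent until script 05. The sketch below is illustrative only and is not part of the repository: it encodes the dependencies as described in this README and derives a valid run order with Python's standard-library `graphlib` (Python 3.9+). The `Rscript Scripts/...` command it prints is an assumption about how the scripts would be invoked locally.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each script maps to the scripts whose output it needs,
# following the descriptions in this README.
DEPENDS_ON = {
    "01_qPCR_analysis.R": set(),
    "02_acoustics_analysis.R": set(),
    "03_post_process_acoustics_Output.R": {"02_acoustics_analysis.R"},
    "04_post_process_qPCR_Output.R": {"01_qPCR_analysis.R"},
    "05_compare_acoustics_eDNA.R": {
        "03_post_process_acoustics_Output.R",
        "04_post_process_qPCR_Output.R",
    },
}


def run_order(depends_on=DEPENDS_ON):
    """Return script names in an order that respects every dependency."""
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    for script in run_order():
        # Printing only; the Rscript invocation is an assumption
        # about the local setup, not a documented entry point.
        print("Rscript", f"Scripts/{script}")
```

Running the scripts in plain numerical order, 01 through 05, also satisfies these dependencies.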

Data used in the statistical models for eDNA and acoustics can be found in the "Data" folder, with subfolders for eDNA and acoustics data. Each subfolder contains an Excel file describing the .csv data files and the information contained in their columns.