BEpipeR: a user-friendly, flexible, scalable, and easily expanded pipeline for a streamlined processing of biotic and abiotic Biodiversity Exploratories data in R

Marcel Glück | Oliver Bossdorf | Henri A. Thomassen

Quick-start

What the pipeline can do for you: Features and functionalities
How to set up the pipeline on your system: Setting-up
How to operate the pipeline: Publication to BEpipeR v1.0.0 release
Found a bug? Report a bug

Motivation

The wealth of (a)biotic environmental data generated in the Biodiversity Exploratories (BE) continues to grow steadily, and so does the effort required to incorporate the newest data into our statistical frameworks. Unsurprisingly, many BE projects restrict their analyses to a handful of frequently used data sets, neglecting the wealth of information at their fingertips. This is often due to the stringent quality control and (pre-)processing that many environmental data sets still require. However, such a restriction may prevent us from obtaining a more complete understanding of our complex study systems. To remedy this issue, this project provides a comprehensive, user-friendly, flexible, scalable, reproducible, and easy-to-expand R pipeline for the streamlined processing of (a)biotic EP-level data generated by the Exploratories. We are convinced that such a framework will benefit many scientists in the Exploratories, as the data it produces can serve as input for many types of environmental association studies. Additionally, with modifications, the pipeline could be adapted to process plot-based data generated by other research consortia.

This project is a registered Biodiversity Exploratories synthesis project.

Features and functionalities

✔️ Flexibility: One pipeline, three modes. Switch between forest, grassland, and combined (forest & grassland) mode effortlessly.

✔️ Ease of use: Simply pass aggregation information to the pipeline through CSV parameter files.

✔️ Customizability: Easily adapt the pipeline to your own needs, e.g. by subsetting the template to your plots of interest.

✔️ Deployability: Effortlessly run this pipeline on your infrastructure thanks to a reproducible environment.

✔️ Participatory: Shape the future of this project by providing suggestions or by contributing code.

Processing performed

Data preparation and wrangling: Template creation, plot location harmonization, value correction, subsetting, fallbacks to more basal (taxonomic) levels, data reshaping, normalization by variable (e.g. by sampling effort)
Quality control: Multi-mode outlier detection
Data aggregation: Both within and across data sets (mean, median, SD, MAD); processing of yearly climate aggregates (incl. the removal of poorly supported data points)
Diversity indices: Normalization by (repeated) rarefaction; calculation of species richness and of the Simpson, Shannon-Wiener, Margalef, and Menhinick indices, among others
Post-processing: Data joining, quality control, variable selection by variance inflation factor (VIF) analyses
Data export and metadata compilation: Export of composite data sets and VIF-produced subsets; fetching metadata for the variables produced to assist in preparing the data for publication, submission to BExIS, etc.
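
The diversity indices named in the list above have standard textbook definitions. As a minimal sketch in base R, they could be computed as shown below. The abundance vector and the helper `diversity_indices()` are hypothetical and not part of BEpipeR. The pipeline itself may compute these indices differently, for instance after rarefaction or with a dedicated package. The Simpson index is assumed here to be the Gini-Simpson form, 1 - sum(p^2).

```r
# Hypothetical abundance counts per species for a single plot (not real BE data).
abund <- c(sp_a = 10, sp_b = 5, sp_c = 5, sp_d = 0)

# Hypothetical helper: textbook diversity indices from a vector of counts.
diversity_indices <- function(counts) {
  counts <- counts[counts > 0]  # species with zero counts do not contribute
  n <- sum(counts)              # total number of individuals
  s <- length(counts)           # species richness
  p <- counts / n               # relative abundances
  c(
    richness  = s,
    shannon   = -sum(p * log(p)),   # Shannon-Wiener index (natural log)
    simpson   = 1 - sum(p^2),       # Gini-Simpson form (an assumption)
    margalef  = (s - 1) / log(n),   # Margalef richness index
    menhinick = s / sqrt(n)         # Menhinick richness index
  )
}

res <- diversity_indices(abund)
print(round(res, 3))
```

Margalef and Menhinick both scale richness by the number of individuals, so they are only comparable across plots with similar sampling effort. This is one reason to normalize (e.g. by rarefaction) before calculating them.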

FAQ

How do I attribute this pipeline?

Please cite this pipeline as (replace X.X.X with the actual pipeline version used):

Glück M., O. Bossdorf and H. A. Thomassen (2024). BEpipeR: a user-friendly, flexible, scalable, and easily expanded pipeline for a streamlined processing of biotic and abiotic data in R. vX.X.X. Zenodo. https://zenodo.org/doi/10.5281/zenodo.10683384

Please do so if you use the pipeline or parts of it in your own work. If you use data produced through this pipeline, please cite both the data set and this pipeline.

Acknowledgements

People and/or institutions we are indebted to:

Founders and staff of the Biodiversity Exploratories: For the envisioning, setting up, and maintenance of the research platform.
Executive department on German Copyright Law, Tübingen University: For assisting in finding a suitable license for this pipeline.
Open Access Publishing Fund, Tübingen University: For covering publication fees.
