ecology-workshop

Overview of the ecology workshop

Data Carpentry's aim is to teach researchers basic concepts, skills, and tools for working with data so that they can get more done in less time, and with less pain. This workshop uses a tabular ecology dataset and teaches data cleaning, management, analysis, and visualization. There are no prerequisites, and the materials assume no prior knowledge of the tools.

The workshop uses a single tabular data set that contains observations about adorable small mammals over a long period of time in Arizona. See data.md for more information about this data set, including the download location.

The workshop can be taught using R or Python as the base language.

Overview of the lessons:

  1. Data organization in spreadsheets and data cleaning with OpenRefine
  2. Introduction to R or Python
  3. Data analysis and visualization in R or Python
  4. SQL for data management

An example of the ecology materials in the wild is this Data Carpentry workshop at Caltech in 2015.

Detailed structure

Day 1 morning: Data organization & cleaning

There are two lessons in this section. The first is a spreadsheet lesson that teaches good data organization, and some data cleaning and quality control checking in a spreadsheet program.

The second lesson uses a spreadsheet-like program called OpenRefine to teach data cleaning and filtering, and to introduce scripting, regular expressions and APIs (application programming interfaces).
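As a rough illustration of the kind of regular-expression cleaning this lesson introduces, here is a minimal Python sketch. OpenRefine has its own interface and expression language, so this is only an analogy and not the lesson's actual method. The sample values and the `clean_name` helper are hypothetical.

```python
import re

# Hypothetical messy entries of the kind a cleaning step might encounter.
raw_values = [
    "  Dipodomys merriami",
    "dipodomys  merriami ",
    "Dipodomys merriami?",
    "Neotoma albigula",
]


def clean_name(value):
    """Normalize one free-text name (hypothetical helper)."""
    # Drop anything that is not a letter or whitespace (e.g. a stray "?").
    value = re.sub(r"[^A-Za-z\s]", "", value)
    # Collapse runs of whitespace to a single space and trim the ends.
    value = re.sub(r"\s+", " ", value).strip()
    # Capitalize the first letter and lowercase the rest.
    return value.capitalize()


cleaned = [clean_name(v) for v in raw_values]

# Filtering to distinct values shows how many real categories remain.
unique_names = sorted(set(cleaned))
print(unique_names)
```

After cleaning, the three spelling variants of the first name collapse into one value, which is the same effect that clustering and transforming cells aims for in a spreadsheet-like tool.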

Day 1 afternoon and Day 2 morning: Data analysis & visualization

These lessons include a basic introduction to R or Python syntax, importing CSV data, and subsetting and merging data. They finish with calculating summary statistics and creating simple plots.
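The steps named above (import, subset, merge, summarize) can be sketched in Python as follows. This is a minimal sketch using only the standard library, and the lessons themselves may rely on other tools. The inline CSV text, the column names, and the `read_rows` helper are assumptions made for illustration; see data.md for the real data set.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical sample data standing in for CSV files on disk.
SURVEYS_CSV = """record_id,species_id,weight
1,DM,40
2,DM,48
3,PF,7
4,PF,9
5,NL,
"""

SPECIES_CSV = """species_id,genus
DM,Dipodomys
PF,Perognathus
NL,Neotoma
"""


def read_rows(text):
    """Import CSV text as a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


surveys = read_rows(SURVEYS_CSV)
species = read_rows(SPECIES_CSV)

# Subset: keep only the observations that have a recorded weight.
weighed = [row for row in surveys if row["weight"]]

# Merge: attach the genus to each observation via the shared species_id.
genus_by_id = {row["species_id"]: row["genus"] for row in species}
merged = [{**row, "genus": genus_by_id[row["species_id"]]} for row in weighed]

# Summarize: mean weight per genus.
weights_by_genus = defaultdict(list)
for row in merged:
    weights_by_genus[row["genus"]].append(float(row["weight"]))
mean_weight = {genus: mean(values) for genus, values in weights_by_genus.items()}

print(mean_weight)
```

The row with the missing weight is dropped at the subsetting step, so its genus does not appear in the summary.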

Day 2 afternoon: Data management with SQL

This lesson introduces the concept of a database using SQLite, how to structure data for easy database import, and how to import tabular data into SQLite. It then teaches basic queries, combining results, and querying across multiple tables.
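As a minimal sketch of that workflow, the following Python snippet uses the standard-library `sqlite3` module to create an in-memory SQLite database, import a few rows, and run a basic query and a query across two tables. The table names, column names, and rows are hypothetical and chosen only for illustration.

```python
import sqlite3

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: one table of observations, one of species.
conn.execute("CREATE TABLE species (species_id TEXT PRIMARY KEY, genus TEXT)")
conn.execute(
    "CREATE TABLE surveys ("
    "record_id INTEGER PRIMARY KEY, species_id TEXT, weight REAL)"
)

# Import tabular data as rows.
conn.executemany(
    "INSERT INTO species VALUES (?, ?)",
    [("DM", "Dipodomys"), ("PF", "Perognathus")],
)
conn.executemany(
    "INSERT INTO surveys VALUES (?, ?, ?)",
    [(1, "DM", 40.0), (2, "DM", 48.0), (3, "PF", 7.0), (4, "PF", 9.0)],
)

# Basic query: filter rows in a single table.
heavy = conn.execute(
    "SELECT record_id, weight FROM surveys WHERE weight > 10 ORDER BY record_id"
).fetchall()

# Query across tables: join on species_id and aggregate per genus.
avg_by_genus = conn.execute(
    """
    SELECT species.genus, AVG(surveys.weight)
    FROM surveys
    JOIN species ON surveys.species_id = species.species_id
    GROUP BY species.genus
    ORDER BY species.genus
    """
).fetchall()

conn.close()

print(heavy)
print(avg_by_genus)
```

Keeping species details in their own table and linking them by `species_id` is one way to structure data so that each fact is stored once and recombined with a join when needed.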

Other lessons

There are a number of other ecology lessons that are not part of the base workshop. Some of these are no longer taught, and some are only taught at extended workshops.