An open-source project dedicated to constructing robust data pipelines and scalable software infrastructure. We leverage industry-standard tools favored by developers to enhance efficiency and reliability. Uniquely, these pipelines are field-tested on farms across Sumatra, Indonesia, ensuring real-world applicability and resilience.

mikestack15/orangutan-stem

Python Django Apache Airflow Google Cloud AWS Raspberry Pi Ubuntu Apache Spark OpenCV Docker Postgres Google Drive

orangutan-stem

Welcome to the orangutan-stem GitHub repo! This repository contains the codebase and wiki for the pipelines we build in the YouTube series.

orangutan-stem YouTube Channel

Project Focus

This project introduces early-career data professionals to data and software engineering through real-world applications. The curriculum outlined in the wiki and on the YouTube channel demonstrates how to properly set up local and cloud-based infrastructure to collect, ingest, process, store, and deliver data with the modern tools that data engineers use on a daily basis.
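As a rough illustration of the collect, ingest, process, store, and deliver stages named above, here is a minimal sketch using only the Python standard library. It is not taken from the project's codebase: the function names, the sample sensor readings, and the `avg_temp` table are all hypothetical, and an in-memory SQLite database stands in for whatever storage a real activity uses.

```python
import json
import sqlite3
from statistics import mean


def collect():
    """Collect: stand-in for a sensor or API read (hypothetical sample values)."""
    return [
        '{"sensor": "plot_a", "temp_c": 27.5}',
        '{"sensor": "plot_a", "temp_c": 28.5}',
        '{"sensor": "plot_b", "temp_c": 30.0}',
    ]


def ingest(raw_lines):
    """Ingest: parse raw JSON strings into Python dictionaries."""
    return [json.loads(line) for line in raw_lines]


def process(records):
    """Process: average the temperature readings per sensor."""
    by_sensor = {}
    for record in records:
        by_sensor.setdefault(record["sensor"], []).append(record["temp_c"])
    return {sensor: mean(values) for sensor, values in by_sensor.items()}


def store(averages, conn):
    """Store: write the per-sensor averages to a table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS avg_temp (sensor TEXT PRIMARY KEY, temp_c REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO avg_temp VALUES (?, ?)", averages.items())
    conn.commit()


def deliver(conn):
    """Deliver: read the stored results back for a downstream consumer."""
    return conn.execute(
        "SELECT sensor, temp_c FROM avg_temp ORDER BY sensor"
    ).fetchall()


def run_pipeline():
    """Run all five stages against an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    store(process(ingest(collect())), conn)
    return deliver(conn)


if __name__ == "__main__":
    print(run_pipeline())
```

Keeping each stage as its own function means any one of them can later be swapped for a real source, scheduler task, or database without touching the others.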

orangutan-stem is an ongoing project, founded in 2019, that builds farmland data pipelines, provides education for interested data professionals, and showcases the latest technologies available in the field. Our farm, Orangutan Orchard, is located in Bukit Lawang, Northern Sumatra, Indonesia.

To read more about the origins of this project, or if you are interested in trekking the nearby Gunung Leuser National Park, check out the project origins wiki.

Be sure to read the requirements carefully, immerse yourself in the documentation, and work alongside the curriculum of YouTube videos. For advanced learners, the videos show some of the latest and greatest tools, and each activity is independent of the others (unless specified), so feel free to skip around as desired. I hope this repository, knowledge base, and YouTube series can be valuable to any data professional, no matter what background they come from!

Prerequisite Skills

  1. Intermediate Python
  2. Basic SQL
  3. Familiarity with the command line on macOS, Linux, or Windows
  4. No fear of failing! Realistically, it takes thousands of hours to become above average in this field, so practice like Michael Jordan and, like Kurt Warner, never give up!

Activities

  1. Activity One: Open Weather Map API Data Pipeline
  2. Activity Two: Optics Data Pipeline
  3. Activity Three: Data Collection and Processing with Edge Computing
  4. Activity Four: Drone Survey Data Pipelines for AI
  5. Activity Five: TBD
  6. Activity Six: TBD

Getting Started

  1. Table of Contents
  2. Curriculum
  3. Learning Resources Wiki
  4. Requirements
