#

dataengineering

Here are 519 public repositories matching this topic...

The Project Workspace provides a data set of real messages that were sent during disaster events. This project builds a machine learning pipeline that categorizes these messages so that each one can be routed to an appropriate disaster relief agency. The project will include a web app where an emergency worker can input a new message …

  • Updated Apr 26, 2020
  • Python

This project implements a data pipeline using Apache Airflow. It fetches real-time weather data from the OpenWeather API and performs ETL processing on it. Results are stored in AWS S3 buckets for further analysis, and Slack notifications send timely alerts to the author.

  • Updated Mar 16, 2024
  • Python

An ETL pipeline that extracts Sparkify's data from S3, stages it in Redshift, and transforms it into a set of dimensional tables so that the analytics team can continue finding insights into what songs users are listening to. The database and ETL pipeline are then tested by running queries provided by Sparkify's analytics team and compa…

  • Updated Jul 18, 2020
  • Python
