This meetup is co-organized with PyAmsterdam, and it's all about ETL/ELT in action in the cloud.
Talk 1 - Data Processing with Python: a Containerized, Scheduled, and Monitored Pipeline on the Google Cloud Platform
During the talk, you will learn how to containerize a Python pipeline for data processing and execute it as a scheduled job that sends email messages for critical errors. The presentation can be found here. The repo with the code is here.
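To give a flavour of the "email on critical errors" idea, here is a minimal sketch in Python. It is not taken from the talk or its repo. The function names (`build_alert`, `run_pipeline`), the `send_alert` callback, and the example.com addresses are all hypothetical, and the mail transport is left to the caller so the snippet runs without a mail server.

```python
import logging
from email.message import EmailMessage
from typing import Callable, Iterable, List

logger = logging.getLogger("pipeline")


def build_alert(error: Exception, sender: str, recipient: str) -> EmailMessage:
    """Build an email describing a critical pipeline error."""
    msg = EmailMessage()
    msg["Subject"] = f"[pipeline] critical error: {type(error).__name__}"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(str(error))
    return msg


def run_pipeline(
    rows: Iterable,
    transform: Callable,
    send_alert: Callable[[EmailMessage], None],
) -> List:
    """Apply `transform` to every row; on failure, send an alert and re-raise."""
    try:
        return [transform(row) for row in rows]
    except Exception as exc:
        logger.critical("pipeline failed: %s", exc)
        # Addresses are placeholders (assumption), not real recipients.
        send_alert(build_alert(exc, "pipeline@example.com", "oncall@example.com"))
        # Re-raise so the scheduler sees a non-zero exit and marks the run as failed.
        raise
```

In a container, `send_alert` could be a small function that opens an `smtplib.SMTP` connection and calls `send_message`; in tests it can simply be `list.append`.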
AWS Glue offers serverless ETL pipelines and workflows. While it is fairly simple to start using it via the console, deploying it through a CI/CD pipeline quickly becomes non-trivial: managing multiple environments, working with the pipeline state, and keeping your code DRY are all challenges. During the talk, I will walk you through the guts of AWS Glue and show my approach to a more streamlined deployment. The presentation can be found here. The repo with the code is here.
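As a rough illustration of keeping multi-environment job definitions DRY, the sketch below derives per-environment settings from one shared base. It is an assumption about how this could look, not the speaker's approach. The bucket names, role names, and the `job_definition` helper are hypothetical. The dictionary keys follow the shape of a Glue job definition (`Name`, `Role`, `Command`, `DefaultArguments`), but nothing here calls AWS.

```python
from copy import deepcopy

# Settings shared by every environment.
BASE_JOB = {
    "Command": {"Name": "glueetl", "PythonVersion": "3"},
    # Job bookmarks are how Glue tracks already-processed data between runs.
    "DefaultArguments": {"--job-bookmark-option": "job-bookmark-enable"},
}

# Only the values that differ per environment (hypothetical names).
ENVIRONMENTS = {
    "dev": {"bucket": "example-etl-dev", "role": "glue-role-dev"},
    "prod": {"bucket": "example-etl-prod", "role": "glue-role-prod"},
}


def job_definition(job_name: str, env: str) -> dict:
    """Return a job definition for `job_name` in the given environment."""
    settings = ENVIRONMENTS[env]
    job = deepcopy(BASE_JOB)  # never mutate the shared base
    job["Name"] = f"{job_name}-{env}"
    job["Role"] = settings["role"]
    job["Command"]["ScriptLocation"] = (
        f"s3://{settings['bucket']}/scripts/{job_name}.py"
    )
    return job
```

A CI/CD step could then call `job_definition("orders", "dev")` or `job_definition("orders", "prod")` and hand the result to whichever deployment tool is in use.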
This event was set up by @pyladiesams, @PyAmsterdam, @nancyirisarri and @1oglop1