cliffordEmmanuel/Exploring_Data_Engineering

This repository details my exploration into the data engineering process.

  • This was my first attempt at building a pipeline around an Amazon S3 bucket. The script downloads data from an S3 bucket, transforms it, and loads it back into the bucket.
  • Here I explore PySpark functions to complete a basic ETL job that reads data from an Amazon S3 bucket and performs some text transformations.
  • Here I build a full end-to-end pipeline that uses PySpark to extract data from a PostgreSQL database, transform it, and load the result back into a PostgreSQL database.
  • This script scrapes an e-commerce site that has no API and stores the results in a CSV file.
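
As a rough illustration of the first item, the download, transform, upload flow might look something like the sketch below. It is not the script from this repository: the function names `clean_rows` and `run_pipeline`, the CSV input format, and the lowercase-and-trim transform are all assumptions made for the example. The only S3 calls used are the boto3 client methods `get_object` and `put_object`, and the client is passed in so the flow can be tried without real credentials.

```python
import csv
import io


def clean_rows(rows):
    """Hypothetical transform: trim whitespace and lowercase every field."""
    return [
        {key: (value or "").strip().lower() for key, value in row.items()}
        for row in rows
    ]


def run_pipeline(s3_client, bucket, src_key, dst_key, transform=clean_rows):
    """Download a CSV object, transform its rows, and upload the result.

    s3_client is expected to behave like boto3.client("s3"); bucket and
    key names are placeholders supplied by the caller.
    """
    # Extract: read the whole object into memory and parse it as CSV.
    raw = s3_client.get_object(Bucket=bucket, Key=src_key)["Body"].read()
    rows = list(csv.DictReader(io.StringIO(raw.decode("utf-8"))))

    # Transform: apply the (assumed) cleaning step.
    cleaned = transform(rows)

    # Load: serialise back to CSV and write it under the destination key.
    buffer = io.StringIO()
    if cleaned:
        writer = csv.DictWriter(buffer, fieldnames=list(cleaned[0].keys()))
        writer.writeheader()
        writer.writerows(cleaned)
    s3_client.put_object(
        Bucket=bucket, Key=dst_key, Body=buffer.getvalue().encode("utf-8")
    )
    return len(cleaned)
```

With boto3 installed and AWS credentials configured, this could be called as `run_pipeline(boto3.client("s3"), "my-bucket", "raw/data.csv", "clean/data.csv")`, where the bucket and key names are placeholders. Reading the whole object into memory keeps the sketch short but only suits small files.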
