This repository contains the projects completed for the Udacity Data Engineering Nanodegree
Created an ETL pipeline to load data into a star schema
Modeled data with Apache Cassandra to satisfy specific analytics query requirements
Designed the destination table and loaded data from AWS S3 to AWS Redshift
Built the ETL process in the cloud with Spark and loaded the results back to AWS S3
Constructed data pipelines with Airflow to load data from AWS S3 to AWS Redshift
Created a data pipeline for index data across different geographical locations and sectors. It makes index data easily accessible for ETF research purposes.
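
To give a rough idea of what loading data into a star schema can look like, here is a minimal sketch using Python's built-in sqlite3 module. It is not the code from any of the projects above: the table names (dim_user, dim_song, fact_play), the sample records, and the helper functions are all hypothetical, and an in-memory SQLite database stands in for the real target database.

```python
import sqlite3

# Hypothetical raw event records; the field names are illustrative only.
RAW_EVENTS = [
    {"user_id": 1, "user_name": "Ana", "song": "Song A", "ts": 1000},
    {"user_id": 2, "user_name": "Ben", "song": "Song B", "ts": 1001},
    {"user_id": 1, "user_name": "Ana", "song": "Song B", "ts": 1002},
]


def run_etl(events):
    """Load raw events into one fact table and two dimension tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dim_user (user_id INTEGER PRIMARY KEY, user_name TEXT);
        CREATE TABLE dim_song (song_id INTEGER PRIMARY KEY, title TEXT UNIQUE);
        CREATE TABLE fact_play (
            play_id INTEGER PRIMARY KEY,
            user_id INTEGER REFERENCES dim_user(user_id),
            song_id INTEGER REFERENCES dim_song(song_id),
            ts INTEGER
        );
        """
    )
    for e in events:
        # Dimension rows are inserted once; duplicates are ignored.
        conn.execute(
            "INSERT OR IGNORE INTO dim_user VALUES (?, ?)",
            (e["user_id"], e["user_name"]),
        )
        conn.execute(
            "INSERT OR IGNORE INTO dim_song (title) VALUES (?)", (e["song"],)
        )
        song_id = conn.execute(
            "SELECT song_id FROM dim_song WHERE title = ?", (e["song"],)
        ).fetchone()[0]
        # Each event becomes one fact row that points at its dimensions.
        conn.execute(
            "INSERT INTO fact_play (user_id, song_id, ts) VALUES (?, ?, ?)",
            (e["user_id"], song_id, e["ts"]),
        )
    conn.commit()
    return conn


def plays_per_user(conn):
    """Example analytics query: join the fact table to a dimension."""
    return conn.execute(
        """
        SELECT u.user_name, COUNT(*)
        FROM fact_play f
        JOIN dim_user u ON u.user_id = f.user_id
        GROUP BY u.user_name
        ORDER BY u.user_name
        """
    ).fetchall()
```

Calling plays_per_user(run_etl(RAW_EVENTS)) joins the fact table to the user dimension and counts plays per user. In a real pipeline the same shape would apply, with the source files and the target database swapped in.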