A city traffic department wants to collect traffic data using swarm UAVs (drones) from a number of locations in the city, and to use the collected data to improve traffic flow and to support a number of other tasks. The objective of this project is to create a scalable data warehouse that hosts the vehicle trajectory data extracted by analyzing footage taken by swarm drones and static roadside cameras.
The data can be found here
pip
Apache Airflow
Python 3.5 or above
Docker and Docker Compose
You can find the full list of requirements in the requirements.txt file.
It is highly recommended to create a new virtual environment and install all required modules and libraries in that environment.
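As a minimal sketch of the virtual environment recommendation above, the commands below create and activate an environment with Python's built-in venv module. They assume a Linux or macOS shell with python3 on the PATH. The folder name .venv is an arbitrary choice for illustration and is not something the project prescribes.

```shell
# Create an isolated environment in a folder named .venv
# (the folder name is an assumption; any name works)
python3 -m venv .venv

# Activate it for the current shell session
# (Linux/macOS; on Windows the script is .venv\Scripts\activate)
. .venv/bin/activate

# Print the active prefix to confirm python now resolves inside .venv
python -c "import sys; print(sys.prefix)"
```

The environment stays active in the same shell after changing directories, so the clone and pip install steps that follow install into it rather than into the system Python.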
- You can clone and run the project using the following instructions:
git clone https://github.com/Data-warehouse_DBT_Airflow.git
cd Data-warehouse_DBT_Airflow
pip install -r requirements.txt
Details on the use and implementation of the pipelines built with Apache Airflow, DBT, PostgreSQL, and Redash can be found here.
The notebooks used in this project, including EDA and data cleaning, are found in the Notebooks folder.
All the scripts and modules used for this project are found in the scripts folder.
All the unit and integration tests are found in the tests folder.
👤 Akubazgi Gebremariam
Give a ⭐ if you like this project, and feel free to contact me at any time.