GitHub - Dorianteffo/dbt_python_docker: Use dbt to create a star schema

Architecture

Tech Stack

DBT
Docker
SQL
Python

Project Overview

The primary goal of this project is to create a data warehouse with a star schema on PostgreSQL based on a CSV file, facilitating analytics and dashboarding. We used Python to ingest raw data into the data warehouse on a specific schema and then utilized DBT to transform data, building all the dimension tables and the fact table (Dev schema). Subsequently, we applied tests to the resulting data before deploying it to our prod schema.

Data model

Run the Project

Spin up the containers using:

make up

Run the Ci and test

make ci

Ingest the CSV file data into PostgreSQL:

make ingest

Run the DBT models (this will run into the dev schema):

make dbt-debug  #to check if the connection to postgres works
make dbt-deps #install packages
make dbt-run
make dbt-test

Deploy all the models to the production schema:

make dbt-deploy

To query the PostgreSQL database, run:

make warehouse

And then write some queries :

Best items in terms of retail sales:

SELECT 
    item_id, 
    ROUND(CAST(SUM(retail_sales) AS numeric),2) AS total_sales
FROM analytics.fact_sales
GROUP BY item_id
ORDER BY total_sales DESC 
LIMIT 10

or

Warehouse and retail sales by year and months

SELECT 
    year,
    month,
    ROUND(CAST(SUM(warehouse_sales) AS numeric), 2) AS total_warehouse_sales,
    ROUND(CAST(SUM(retail_sales) AS numeric), 2) AS total_retail_sales
FROM analytics.fact_sales a
JOIN analytics.dim_date d ON a.date_id = d.date_id
GROUP BY year, month
ORDER BY year, month;

Name		Name	Last commit message	Last commit date
Latest commit History 11 Commits
assets		assets
data		data
dbt		dbt
ingestion		ingestion
sql		sql
tests		tests
.env.example		.env.example
.gitignore		.gitignore
Dockerfile		Dockerfile
Makefile		Makefile
README.md		README.md
docker-compose.yaml		docker-compose.yaml
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

assets

assets

data

data

dbt

dbt

ingestion

ingestion

sql

sql

tests

tests

.env.example

.env.example

.gitignore

.gitignore

Dockerfile

Dockerfile

Makefile

Makefile

README.md

README.md

docker-compose.yaml

docker-compose.yaml

requirements.txt

requirements.txt

Repository files navigation

Architecture

Tech Stack

Project Overview

Data model

Run the Project

About

Releases

Packages

Languages

Dorianteffo/dbt_python_docker

Folders and files

Latest commit

History

Repository files navigation

Architecture

Tech Stack

Project Overview

Data model

Run the Project

About

Topics

Resources

Stars

Watchers

Forks

Languages