
Template for a DuckDB-based, Codespace-oriented sandbox project that is also dbt Cloud compatible and includes code-first BI tooling via Evidence. Dagster and dbt's semantic layer have also been added.

wisemuffin/jaffle-shop-template


🥪 The Jaffle Shop 🦘

This is a template for creating a fully functional dbt project for teaching, learning, writing, demoing, or any other scenarios where you need a basic project with a synthesized jaffle shop business.

Architecture

  • Data ingestion - Airbyte
  • Data transformation - dbt
  • Data orchestration - Dagster
  • CI/CD - TBC

Setup Option 1 (easy): running remotely on GitHub Codespaces

1. Click the big green 'Use this template' button and 'Create a new repository'.

Click use template

This will create a new repository exactly like this one, and navigate you there. Make sure to execute the next instructions in that repo.

2. Click 'Code', then 'Codespaces', then 'Create codespace on main'.

Create codespace on main

This will create a new codespace, a sandboxed devcontainer with everything you need for a dbt project. Once the codespace is finished setting up, you'll be ready to run a dbt build.

3. Make sure to wait until the codespace has finished setting up.

Codespaces setup screen at postCreateCommand

After the container is built and connected, VS Code runs a few clean-up commands and then a postCreateCommand, a set of commands run after the container is set up. This is where we install our dependencies, such as dbt, the duckdb adapter, and other necessities, and run dbt deps to install the dbt packages we want to use. That screen will look something like the above. When it's completed, it will close and leave you at a fresh terminal prompt. From there you're ready to do some analytics engineering!
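As a quick sanity check once the terminal prompt appears, a small script can confirm that the command-line tools this README relies on are on the PATH. This is a minimal sketch and not part of the template: the function name missing_tools is hypothetical, and the default tool list (dbt, duckdb, task) is an assumption taken from the commands used elsewhere in this README.

```python
import shutil


def missing_tools(tools=("dbt", "duckdb", "task")):
    """Return the names in `tools` that cannot be found on PATH.

    The default list is an assumption based on commands shown in this
    README; adjust it to whatever your codespace actually installs.
    """
    return [tool for tool in tools if shutil.which(tool) is None]
```

Calling missing_tools() returns an empty list when everything is installed. A non-empty list suggests the postCreateCommand has not finished or did not complete successfully.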

Setup Option 2 (more complex): running locally

  • Requires Docker and an understanding of dev containers.
  • Get Docker up and running, then use VS Code's Dev Containers extension to reopen the workspace in a dev container.

Demos

data orchestration with dagster

task dagit

See notes on how to run locally in orchestration/readme.md

dbt power users

  • running dbt sql in vscode (ctrl + enter)
  • generate model from source (click source in .yml and click model)
  • compile sql (ctr +)
  • execute tests
  • go to definition (click a ref() or macro/util)
  • auto complete

🚧 data ingestion with airbyte

See notes on how to run locally in orchestration/readme.md

🚧 duckdb not working

duckdb ./reports/jaffle_shop.duckdb

.tables

select * exclude (subtotal) from analytics.orders;

from information_schema.tables;

.exit
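The interactive session above can also be scripted. The sketch below only builds the argument list for the duckdb CLI, using its -readonly and -c options so that a query runs non-interactively and without requesting write access. The helper name duckdb_command is hypothetical. Read-only mode is no guarantee: the open may still fail if another process holds a write lock on the file.

```python
def duckdb_command(db_path, sql, read_only=True):
    """Build an argv list that runs one SQL statement through the duckdb CLI.

    -readonly opens the database read-only; -c runs the statement and exits.
    """
    cmd = ["duckdb"]
    if read_only:
        cmd.append("-readonly")
    cmd += [db_path, "-c", sql]
    return cmd
```

The resulting list can be passed to subprocess.run, for example with "./reports/jaffle_shop.duckdb" as the path and the select statement shown above as the SQL.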

🚧 dbt semantic layer

Limitation: there is a package conflict. dbt-metricflow[duckdb] doesn't support dbt-duckdb 1.7 yet.

🚧 dbt & datafold regression testing

dbt run -s orders && data-diff --dbt
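The && in the command above means data-diff only runs if the dbt run succeeds. If you want the same behaviour from a script, for example inside an orchestration step, it can be sketched as below. The names run_chain and REGRESSION_CHECK are hypothetical. The runner parameter exists only so the logic can be exercised without dbt or data-diff installed.

```python
import subprocess


def run_chain(commands, runner=subprocess.run):
    """Run each command in order and stop at the first failure, like `&&`.

    Returns 0 if every command succeeded, otherwise the first
    non-zero exit code.
    """
    for cmd in commands:
        result = runner(cmd)
        if result.returncode != 0:
            return result.returncode
    return 0


# The two steps from the shell one-liner above.
REGRESSION_CHECK = [
    ["dbt", "run", "-s", "orders"],
    ["data-diff", "--dbt"],
]
```

Calling run_chain(REGRESSION_CHECK) from the project root would run both steps with the default runner.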

🚧 evidence.io for reports

cd reports
npm run dev

Then, in the UI, add the database called jaffle_shop.duckdb.

🚧 linting

  • sqlfluff (TODO: sqlfluff lint is not working)

Issues

dagster can't load dbt project with semantic layer

see ./orchestration/readme.md for more details

see task temp_limitation for workaround

duckdb locks

duckdb ./reports/jaffle_shop.duckdb

Error: unable to open database "./reports/jaffle_shop.duckdb": IO Error: Could not set lock on file "./reports/jaffle_shop.duckdb": Resource temporarily unavailable

DuckDB raises this error when another process already holds a lock on the database file.

airbyte's duckdb version is not compatible with dbt-duckdb

Airbyte's DuckDB connector (alpha) is currently set to DuckDB v0.6.0. More recent versions of DuckDB are available, and the connector's version is not aligned with dbt-duckdb. For now, Airbyte writes to duckdb_airbyte.duckdb (older version of DuckDB), and Dagster then passes the data on to duckdb.duckdb (newer version of DuckDB).

duckdb error - The database file was created with an newer version of DuckDB.

https://duckdb.org/internals/storage.html

waiting for v1 of duckdb to become stable
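To see which storage version a given .duckdb file was written with, you can read its header directly. This is a minimal sketch based on the layout described on the storage page linked above: an 8-byte checksum, then the magic bytes DUCK, then the storage version as a 64-bit unsigned integer. The function name read_storage_version is hypothetical, and little-endian byte order is an assumption that holds on typical x86 and ARM machines.

```python
import struct

MAGIC = b"DUCK"


def read_storage_version(path):
    """Return the storage version number stored in a DuckDB file header.

    Assumed layout: bytes 0-7 checksum, bytes 8-11 magic "DUCK",
    bytes 12-19 storage version (uint64, assumed little-endian).
    """
    with open(path, "rb") as f:
        header = f.read(20)
    if len(header) < 20 or header[8:12] != MAGIC:
        raise ValueError(f"{path} does not look like a DuckDB database file")
    (version,) = struct.unpack_from("<Q", header, 12)
    return version
```

Comparing the numbers returned for duckdb_airbyte.duckdb and duckdb.duckdb against the version table on the storage page shows which DuckDB release wrote each file.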
