A dbt package to support modelling event data via split tables for use in downstream tools and systems.

snowplow/dbt-snowplow-normalize

snowplow-normalize

This dbt package:

  • Provides macros that simplify building models which normalize Snowplow event data into one table per event type, plus a reduced events table and a latest user context table, for easier integration with downstream tools.
  • Includes a Python script that generates these models from a simple configuration file. The script reads the schemas used in your pipelines to identify the columns, which greatly reduces the upfront effort of creating the models.

Please refer to the doc site for a full breakdown of the package.

Getting Started

The easiest way to get started is to follow our QuickStart guide.

Adapter Support

The latest version of the snowplow-normalize package supports BigQuery, Databricks & Snowflake. For previous versions see our package docs.

Requirements

  • A dataset of Snowplow events must be available in the database.

Installation

Check dbt Hub for the latest installation instructions.

Configuration & Operation

Please refer to the doc site for details on how to configure and run the package.

Models

This package contains only the incremental models needed to process new events efficiently for your models; see the docs on incremental logic for more information. The package does, however, produce three types of models:

| Model | Description |
| --- | --- |
| event_name | A table for each of your specified event names, with flat columns from self-describing events and contexts. |
| filtered_events | A table that contains the event_id, collector_tstamp, and event name of each event, plus the name of the table that event has been normalized into. Note that it does not contain events that have not been split out into individual tables. |
| event_users | A table with the latest user context columns for any user_ids in your events table. |
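
To make the relationship between the three model types concrete, here is a minimal conceptual sketch in plain Python. It is not the package's implementation (the package builds these tables with dbt macros and SQL in your warehouse). The sample events, the `plan` and `sku` columns, the `EVENT_TABLES` mapping, and the `normalize` function are all hypothetical, and building the users table from every event that has a `user_id` is an assumption made for illustration.

```python
from collections import defaultdict

# Hypothetical sample events. event_id, collector_tstamp, event_name and
# user_id are standard Snowplow fields; "plan" and "sku" are made-up
# stand-ins for flattened self-describing event / context columns.
events = [
    {"event_id": "e1", "collector_tstamp": "2024-01-01T10:00:00",
     "event_name": "page_view", "user_id": "u1", "plan": "free"},
    {"event_id": "e2", "collector_tstamp": "2024-01-01T10:05:00",
     "event_name": "add_to_cart", "user_id": "u1", "plan": "pro", "sku": "A-1"},
    {"event_id": "e3", "collector_tstamp": "2024-01-01T10:07:00",
     "event_name": "link_click", "user_id": None, "plan": None},
]

# Hypothetical mapping: which event names are split out, and into which table.
EVENT_TABLES = {"page_view": "page_view", "add_to_cart": "add_to_cart"}


def normalize(events, event_tables):
    """Split events into per-event tables, a filtered events table,
    and a latest-user-context table (illustrative only)."""
    per_event = defaultdict(list)   # one "table" (list of rows) per event name
    filtered_events = []            # reduced table of normalized events only
    latest_users = {}               # latest user context per user_id

    # ISO 8601 timestamps sort correctly as strings.
    for ev in sorted(events, key=lambda e: e["collector_tstamp"]):
        table = event_tables.get(ev["event_name"])
        if table is not None:
            per_event[table].append(ev)
            filtered_events.append({
                "event_id": ev["event_id"],
                "collector_tstamp": ev["collector_tstamp"],
                "event_name": ev["event_name"],
                "event_table_name": table,
            })
        # Assumption: user context is taken from any event with a user_id.
        if ev.get("user_id"):
            latest_users[ev["user_id"]] = {
                "user_id": ev["user_id"],
                "plan": ev.get("plan"),
                "latest_collector_tstamp": ev["collector_tstamp"],
            }
    return dict(per_event), filtered_events, latest_users


tables, filtered, users = normalize(events, EVENT_TABLES)
```

In this sketch the `link_click` event appears in neither a per-event table nor the filtered events list, mirroring the note above that events not split out into individual tables are excluded.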

For more detailed information, see the doc site.

Join the Snowplow community

We welcome all ideas, questions and contributions!

For support requests, please use our community support Discourse forum.

If you find a bug, please report an issue on GitHub.

Copyright and license

The snowplow-normalize package is Copyright 2022-2024 Snowplow Analytics Ltd.

This distribution is licensed under the Snowplow Personal and Academic License. (If you are uncertain how it applies to your use case, check our answers to frequently asked questions.)