PR #1997 adds a new way of maintaining an SCD Type 2 model by detecting changes to the source table's columns.
This issue tracks the idea of extending that behaviour to build an SCD Type 2 model from a table that contains periodic snapshots of the source data.
Imagine a source table that looks like this:
name,price
foo,20
And some periodic process that takes a snapshot of this data and makes those snapshots available in another table, e.g.:
These snapshots allow tracking the changes that were made to the individual rows (by comparing the values), but they also contain a timestamp that can be used to determine when those changes occurred. As such, an SCD Type 2 dimension can be built from this data which might look like this:
Ideally, the new SCD_TYPE_2_BY_COLUMN model kind would allow specifying a column (snapshot_date in this case) as the timestamp to use for determining when a row has changed instead of using execution_time.
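To make the requested behaviour concrete, here is a rough sketch in plain Python of collapsing periodic snapshots into SCD Type 2 rows, taking the change time from snapshot_date rather than from execution_time. It is not how SQLMesh implements anything: the function name build_scd2, its parameters, the valid_from/valid_to column names and the sample snapshot rows (beyond foo,20) are all made up for illustration.

```python
from datetime import date
from itertools import groupby
from operator import itemgetter


def build_scd2(snapshots, key="name", check_columns=("price",),
               time_column="snapshot_date"):
    """Collapse periodic snapshots into SCD Type 2 rows (hypothetical helper).

    A new version is opened only when one of `check_columns` differs from
    the previous snapshot of the same key; unchanged snapshots are ignored.
    """
    dimension = []
    ordered = sorted(snapshots, key=itemgetter(key, time_column))
    for _, rows in groupby(ordered, key=itemgetter(key)):
        open_row = None      # the currently valid version for this key
        open_values = None   # its values in the checked columns
        for row in rows:
            values = tuple(row[c] for c in check_columns)
            if open_row is not None and values == open_values:
                continue  # nothing changed in this snapshot
            if open_row is not None:
                # Close the previous version at the snapshot's timestamp.
                open_row["valid_to"] = row[time_column]
            open_row = {
                key: row[key],
                **{c: row[c] for c in check_columns},
                "valid_from": row[time_column],
                "valid_to": None,
            }
            open_values = values
            dimension.append(open_row)
    return dimension


# Hypothetical snapshot table: the price of "foo" changes on the third day.
snapshots = [
    {"snapshot_date": date(2024, 1, 1), "name": "foo", "price": 20},
    {"snapshot_date": date(2024, 1, 2), "name": "foo", "price": 20},
    {"snapshot_date": date(2024, 1, 3), "name": "foo", "price": 25},
]

dimension = build_scd2(snapshots)
for row in dimension:
    print(row)
```

With this sample data the second snapshot is skipped because the price is unchanged, and the first version is closed at the third snapshot's date. The sketch does not handle rows that disappear from later snapshots (deletes); that would need a separate decision.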
Adding a +1. It would be ideal to have a third model kind that combines the two SCD2 model types, where I can specify the date column (in this case a snapshot date or loaded-at timestamp) and the columns to check for changes. When a change is detected, the date column is used as the change time; if nothing changed, the row is ignored.