You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Is your feature request related to a problem? Please describe.
We use weekly partitioned bigquery tables for our mart layer. We use dbt to create and update them. We often need to ensure that table row counts do not vary too much week by week, but there is no inbuilt test currently in dbt-expectations that allows to check this
Describe the solution you'd like
A new test that checks table row count deviations of partitions over a threshold. Something like below that throws a warning if counts deviate by more than 20% for a table my_mart_table when compared to the directly preceding partitions
Describe alternatives you've considered
We have implemented a custom test like below:
{% macro test_not_large_deviation_count(model, threshold, partition_column='partition_date', row_condition='1=1') %}
WITH counts AS (
SELECT
{{ partition_column }} AS partition_date,
COUNT(*) AS observation_count,
FROM {{ model }}
WHERE {{ row_condition }}
GROUP BY 1
),
compare AS (
SELECT
partition_date,
observation_count AS current_value,
LEAD (observation_count) OVER (ORDER BY partition_date DESC) AS preceding_value
FROM counts
)
SELECT
*
FROM compare
WHERE ABS(SAFE_DIVIDE(current_value - preceding_value, preceding_value)) > {{ threshold }}
{% endmacro %}
Additional context
It would be nice to have other column level comparison based tests to check for partition-on-partition deviations in values like max, min, avg etc for columns in a table just like row counts
The text was updated successfully, but these errors were encountered:
praveen-prashant
changed the title
[Feature Request]
[Feature Request] Support for large deviation tests for partitioned bigquery tables
Jan 10, 2023
Is your feature request related to a problem? Please describe.
We use weekly partitioned bigquery tables for our mart layer. We use dbt to create and update them. We often need to ensure that table row counts do not vary too much week by week, but there is no inbuilt test currently in dbt-expectations that allows to check this
Describe the solution you'd like
A new test that checks table row count deviations of partitions over a threshold. Something like below that throws a warning if counts deviate by more than 20% for a table
my_mart_table
when compared to the directly preceding partitionsDescribe alternatives you've considered
We have implemented a custom test like below:
Additional context
It would be nice to have other column level comparison based tests to check for partition-on-partition deviations in values like
max
,min
,avg
etc for columns in a table just likerow counts
The text was updated successfully, but these errors were encountered: