[FEATURE] Dimension tracking #342

mateuszklimek · 2022-08-24T08:27:31Z

Tell us about the problem you're trying to solve

We often want to detect anomalies in the counts of specific (categorical) values in a column.
Getting anomalies on those would be really useful for some re_data users.

fgiroud · 2022-09-15T09:45:36Z

That would be an awesome feature to have
In many scenarios we track anomalies and metrics on a per-dimension basis (like country, brand, or category).
I see a few benefits:

Anomaly detection would work better when the dimension values are not equally distributed. Without it, an anomaly in an under-represented dimension could be missed.
It would save a lot of debugging time, especially for tests, when a test fails for a dimension that doesn't really matter.
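
To illustrate the first point, here is a minimal sketch in plain Python. It is not re_data's API: the `z_score` helper, the threshold of 3 and the row counts are all made up for the example. It shows how a sharp drop in a small dimension can stay hidden in the aggregate count while being obvious per dimension.

```python
from statistics import mean, pstdev


def z_score(history, latest):
    """Z-score of `latest` against `history`; 0.0 when history has no variance."""
    sigma = pstdev(history)
    if sigma == 0:
        return 0.0
    return (latest - mean(history)) / sigma


# Hypothetical daily row counts per country (illustrative numbers only).
history = {
    "US": [10000, 10200, 9900, 10100, 9800],
    "PL": [100, 104, 98, 102, 96],
}
latest = {"US": 10050, "PL": 40}

THRESHOLD = 3  # assumed cut-off for flagging an anomaly

# Aggregate view: sum all dimensions per day, then score the latest total.
total_history = [sum(day) for day in zip(*history.values())]
aggregate_z = z_score(total_history, sum(latest.values()))

# Per-dimension view: score each country against its own history.
per_dimension_z = {
    country: z_score(history[country], latest[country]) for country in history
}

print("aggregate anomalous:", abs(aggregate_z) > THRESHOLD)
for country, z in per_dimension_z.items():
    print(country, "anomalous:", abs(z) > THRESHOLD)
```

With these numbers the total barely moves, so the aggregate check passes, while the check for "PL" alone fails.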

From a UI perspective, we would need to be able to stack the metrics per dimension, having basically everything "by dimension", so I'm guessing that's a big effort for re_data.

We tried to implement something similar, and we faced the following issues:

When the dimension contains too many values, the UI becomes extremely difficult to read. On some occasions (20K dimensions) it simply breaks our testing UI, forcing us to disable the metrics by dimension.
Need to "stack" all the charts
Performance bottlenecks
Having to nest basically every test configuration (level, alerts, config) per dimension
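
One possible way to soften the high-cardinality problem is to keep only the most frequent values and fold the rest into a single bucket before computing metrics. The sketch below is an assumption for discussion, not something re_data provides: `cap_dimension_values`, `max_values` and the `__other__` label are hypothetical names.

```python
from collections import Counter


def cap_dimension_values(counts, max_values, other_label="__other__"):
    """Keep the `max_values` most frequent values; fold the rest into one bucket."""
    ranked = Counter(counts).most_common()
    kept = dict(ranked[:max_values])
    remainder = sum(count for _, count in ranked[max_values:])
    if remainder:
        kept[other_label] = kept.get(other_label, 0) + remainder
    return kept


# Hypothetical row counts per country (illustrative numbers only).
counts = {"US": 500, "DE": 300, "PL": 120, "FR": 50, "IT": 30}
capped = cap_dimension_values(counts, max_values=2)
print(capped)
```

The trade-off is that anomalies inside the folded bucket are only visible in aggregate, so the cap would need to be chosen per dimension.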

whanata · 2023-04-19T04:50:08Z

+1
