[FEATURE] Extend RatioOfSums to support other aggregations #556

Open
mentekid opened this issue Apr 11, 2024 · 0 comments
Labels
enhancement New feature or request

Comments

@mentekid
Contributor

Is your feature request related to a problem? Please describe.
PR 552 introduced a RatioOfSums analyzer, which computes the ratio between the sums of two columns (for example, to check that the two columns add up to roughly the same number). We can generalize this analyzer into a RatioOfAggregation that accepts any kind of Spark aggregation, e.g. average.

Describe the solution you'd like
There should be a generic RatioOfAggregation check that accepts two columns and an aggregation function. An implementation of that would be RatioOfSums, which sets aggregation to sum.
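
To make the proposal concrete, here is a minimal sketch of the shape this could take. It is only an illustration: plain Scala collections stand in for Spark columns, and every name in it (`RatioSketch`, `Aggregation`, `compute`, `sumAgg`, `avgAgg`) is hypothetical rather than part of the existing analyzer API.

```scala
// Hypothetical sketch only: none of these names are the library's actual API.
// Seq[Double] stands in for a Spark column to keep the example self-contained.
object RatioSketch {
  // An aggregation reduces a column of values to a single number.
  type Aggregation = Seq[Double] => Double

  val sumAgg: Aggregation = _.sum
  val avgAgg: Aggregation = xs => if (xs.isEmpty) Double.NaN else xs.sum / xs.size

  // Generic analyzer: apply the same aggregation to both columns, then divide.
  class RatioOfAggregation(aggregation: Aggregation) {
    def compute(numerator: Seq[Double], denominator: Seq[Double]): Double =
      aggregation(numerator) / aggregation(denominator)
  }

  // RatioOfSums becomes the special case with the aggregation fixed to sum.
  object RatioOfSums extends RatioOfAggregation(sumAgg)

  // Any other aggregation comes for free, e.g. a ratio of averages.
  object RatioOfAverages extends RatioOfAggregation(avgAgg)
}
```

With `col1 = Seq(3.0, 3.0, 3.0, 3.0)` and `col2 = Seq(4.0, 8.0)`, `RatioOfSums.compute(col1, col2)` gives 1.0 (12 / 12), while `RatioOfAverages.compute(col1, col2)` gives 0.5 (3 / 6). A real implementation would presumably take a Spark aggregation over `Column`s instead of a function over `Seq[Double]`.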

Describe alternatives you've considered
The alternative would be to let users define Check assertions as a function of another aggregator's value. Rather than saying this:

VerificationSuiteBuilder()
    ...
    .ratioOfSums("col1", "col2", _ > 0.9)

they could define their checks as

VerificationSuiteBuilder()
    ...
    .sum("col1", _ > 0.9 * sum("col2"))

(this is pseudocode, but basically pass an analyzer as part of the assertion)
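
For comparison, the alternative could look roughly like the sketch below, where an assertion receives a way to evaluate other analyzers. This is again a toy model under stated assumptions: `AssertionSketch`, `Table`, `Analyzer`, `Check`, and `run` are hypothetical names, and an in-memory `Map` stands in for a Spark DataFrame.

```scala
// Hypothetical sketch only: a toy model, not the library's actual API.
object AssertionSketch {
  // Column name -> values; a stand-in for a Spark DataFrame.
  type Table = Map[String, Seq[Double]]

  // An analyzer computes a single metric from the table.
  final case class Analyzer(name: String, compute: Table => Double)

  def sum(column: String): Analyzer =
    Analyzer(s"sum($column)", table => table(column).sum)

  // A check pairs an analyzer with an assertion. The assertion receives the
  // analyzer's own value plus a function for evaluating any other analyzer.
  final case class Check(
      analyzer: Analyzer,
      assertion: (Double, Analyzer => Double) => Boolean) {
    def run(table: Table): Boolean =
      assertion(analyzer.compute(table), other => other.compute(table))
  }
}
```

Usage, mirroring the pseudocode above:

```scala
import AssertionSketch._

val table: Table = Map("col1" -> Seq(5.0, 5.0), "col2" -> Seq(4.0, 6.0))
val check = Check(sum("col1"), (value, metric) => value > 0.9 * metric(sum("col2")))
check.run(table) // sum(col1) = 10.0 and 0.9 * sum(col2) = 9.0, so this holds
```

This form is more flexible than a dedicated ratio analyzer, but the check then depends on two metrics rather than one, which a real implementation would need to schedule and report on.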

@mentekid mentekid added the enhancement New feature or request label Apr 11, 2024