[FEATURE] Supporting aggregation metrics for a group #528

Open
theajay87 opened this issue Jan 8, 2024 · 0 comments
Labels
enhancement New feature or request

Comments

@theajay87

Is your feature request related to a problem? Please describe.
We have a use case where we need to generate aggregation metrics such as SUM and Mean, and scannable metrics such as MAX, MIN, MIN-LENGTH and MAX-LENGTH, for each group defined by a column (or columns) in a DataFrame.

Describe the solution you'd like
Currently, ScanShareableFrequencyBasedAnalyzer only has CountDistinct, Distinctness, Entropy, Uniqueness and UniqueValueRatio implementations. I would like to add similar implementations for all other scannable and aggregation metrics, so that each metric can be computed at the group level.
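
To make the requested behaviour concrete, here is a minimal sketch in plain Python of computing SUM, Mean, MIN and MAX per group in a single pass. It is an illustration only and does not use or mirror the library's API: `GroupState`, `grouped_metrics`, and the sample columns `country` and `amount` are hypothetical names.

```python
from collections import defaultdict
from dataclasses import dataclass
import math


@dataclass
class GroupState:
    """Running state for one group (hypothetical, not a library class)."""
    count: int = 0
    total: float = 0.0
    minimum: float = math.inf
    maximum: float = -math.inf

    def update(self, value):
        # Fold one value into the running aggregates.
        self.count += 1
        self.total += value
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    def metrics(self):
        return {
            "sum": self.total,
            "mean": self.total / self.count,
            "min": self.minimum,
            "max": self.maximum,
        }


def grouped_metrics(rows, group_cols, value_col):
    """Compute per-group metrics in a single pass over the rows."""
    states = defaultdict(GroupState)
    for row in rows:
        # The group key is the tuple of values in the grouping columns.
        key = tuple(row[c] for c in group_cols)
        states[key].update(row[value_col])
    return {key: state.metrics() for key, state in states.items()}


# Hypothetical sample data.
rows = [
    {"country": "DE", "amount": 10.0},
    {"country": "DE", "amount": 30.0},
    {"country": "US", "amount": 5.0},
]
result = grouped_metrics(rows, ["country"], "amount")
```

The point of the sketch is that all groups are updated during one scan of the data, which is the property I would like the group-level analyzers to keep.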

Describe alternatives you've considered

  • One option is to run a groupBy on the DataFrame externally, split the DataFrame by group, and then iterate over the resulting pieces, calling the Analyzer on each group.

Additional context
Add any other context or screenshots about the feature request here.
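
For reference, the split-then-iterate alternative described above could look roughly like the following plain-Python sketch. `split_by_group` and `length_analyzer` are hypothetical stand-ins (the latter for whatever analyzer would be run on each group), and the sample columns `team` and `name` are made up for illustration.

```python
from collections import defaultdict


def split_by_group(rows, group_cols):
    """Partition rows into one list per distinct group key."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[c] for c in group_cols)].append(row)
    return groups


def length_analyzer(rows, column):
    """Hypothetical stand-in for an analyzer run on one group's rows."""
    lengths = [len(row[column]) for row in rows]
    return {"min_length": min(lengths), "max_length": max(lengths)}


# Hypothetical sample data.
rows = [
    {"team": "a", "name": "Al"},
    {"team": "a", "name": "Alexandra"},
    {"team": "b", "name": "Bo"},
]

# One analyzer call per group, after the data has been split.
per_group = {
    key: length_analyzer(group_rows, "name")
    for key, group_rows in split_by_group(rows, ["team"]).items()
}
```

The drawback of this approach is that the analyzer runs once per group, so the amount of orchestration grows with the number of groups instead of being handled in one pass.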
@theajay87 added the enhancement label Jan 8, 2024