[BUG] Running ProfitabilityProsecutor with an already suppressed dataset (GUI and API) #457

Open
mhalilovic opened this issue Feb 9, 2024 · 5 comments

@mhalilovic

Anonymizing a fully suppressed dataset through the API fails with an error, and the GUI shows similar behavior.

Steps to reproduce using the ARX GUI:
Import a fully suppressed dataset (every value is *) and apply generalization hierarchies that have only a single level, *.
Configure ProfitabilityProsecutor with a suppression limit of 100%.

When attempting to anonymize, I get the message: Cannot anonymize data: Value (NaN) out of range [0,1]

Description of the API behavior:
The same issue appears to occur when using the Java API.
Here is part of my log:
Caused by: java.lang.IllegalStateException: Value (NaN) out of range [0,1]
at org.deidentifier.arx.metric.v2.MetricSDNMEntropyBasedInformationLoss.getEntropyBasedInformationLoss(MetricSDNMEntropyBasedInformationLoss.java:109)
at org.deidentifier.arx.criteria.ProfitabilityProsecutor.isAnonymous(ProfitabilityProsecutor.java:121)
at org.deidentifier.arx.framework.check.groupify.HashGroupify.isPrivacyModelFulfilled(HashGroupify.java:758)
at org.deidentifier.arx.framework.check.groupify.HashGroupify.analyzeWithEarlyAbort(HashGroupify.java:653)
at org.deidentifier.arx.framework.check.groupify.HashGroupify.stateAnalyze(HashGroupify.java:447)
at org.deidentifier.arx.framework.check.TransformationChecker.check(TransformationChecker.java:217)
at org.deidentifier.arx.framework.check.TransformationChecker.check(TransformationChecker.java:170)
at org.deidentifier.arx.algorithm.FLASHAlgorithmImpl.traverse(FLASHAlgorithmImpl.java:128)
at org.deidentifier.arx.ARXAnonymizer.anonymize(ARXAnonymizer.java:777)
at org.deidentifier.arx.ARXAnonymizer.anonymize(ARXAnonymizer.java:226)
at org.deidentifier.arx.distributed.ARXWorkerLocal$1.call(Unknown Source)
at org.deidentifier.arx.distributed.ARXWorkerLocal$1.call(Unknown Source)
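
For readers unfamiliar with how a NaN can reach a [0,1] range check, here is a minimal sketch. It does not reproduce ARX's code: the class `NormalizedLossSketch` and its methods are hypothetical, and the assumption that the value is a ratio whose denominator becomes zero for a fully suppressed dataset is only one plausible explanation, not something the log confirms.

```java
public class NormalizedLossSketch {

    /** Range check modeled on the exception message in the log above (hypothetical code). */
    public static double checkRange(double value) {
        // NaN fails every ordered comparison, so it is tested explicitly.
        if (Double.isNaN(value) || value < 0d || value > 1d) {
            throw new IllegalStateException("Value (" + value + ") out of range [0,1]");
        }
        return value;
    }

    /** Assumed normalization: observed loss divided by the maximum possible loss. */
    public static double normalize(double loss, double maxLoss) {
        // In Java, 0.0 / 0.0 evaluates to NaN rather than throwing.
        return loss / maxLoss;
    }

    public static void main(String[] args) {
        double value = normalize(0d, 0d);
        System.out.println(value); // prints "NaN"
        try {
            checkRange(value);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // prints "Value (NaN) out of range [0,1]"
        }
    }
}
```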

@prasser
Collaborator

prasser commented Feb 9, 2024

This should be relatively easy to fix. Can you please investigate the semantics of the value in [0, 1] that getEntropyBasedInformationLoss usually returns? Is it 0 for no information loss and 1 for maximum information loss, or the other way around (0 for maximum information loss and 1 for no information loss)? Please let me know here.

@mhalilovic
Author

0 for no information loss and 1 for maximum information loss
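
With that convention (0 = no loss, 1 = maximum loss), an undefined result can be mapped to an explicit in-range value before the range check runs. The sketch below is only an illustration of that idea: `LossConvention` and `orFallback` are hypothetical names, and which value the actual fix substitutes is not stated in this thread, so the fallback is left as a parameter.

```java
public class LossConvention {

    public static final double NO_LOSS = 0d;  // convention confirmed in the comment above
    public static final double MAX_LOSS = 1d;

    /**
     * Returns the loss unchanged if it lies in [0,1], the caller-chosen
     * fallback if it is NaN, and throws for any other out-of-range value.
     */
    public static double orFallback(double loss, double fallback) {
        if (Double.isNaN(loss)) {
            return fallback; // assumption: the caller decides what "undefined" means
        }
        if (loss < NO_LOSS || loss > MAX_LOSS) {
            throw new IllegalStateException("Value (" + loss + ") out of range [0,1]");
        }
        return loss;
    }

    public static void main(String[] args) {
        System.out.println(orFallback(0.25d, NO_LOSS));      // prints "0.25"
        System.out.println(orFallback(Double.NaN, NO_LOSS)); // prints "0.0"
    }
}
```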

prasser added a commit that referenced this issue Feb 9, 2024
@prasser
Collaborator

prasser commented Feb 9, 2024

Please check whether the recent commit 984f38f fixes the problem.

@mhalilovic
Author

My issue with the API is resolved. Thank you!

The GUI now also "anonymizes" the dataset without showing an error message.
However, most quality models now show NaN or N/A values in the Quality models tab. I do not know whether this is expected behavior.

@prasser
Collaborator

prasser commented Feb 9, 2024

> Most quality models have NaN or N/A values in the Quality models tab now. I do not know if this is expected behavior.

Are you sure that this is caused by this commit? Please check.

@prasser prasser reopened this Feb 9, 2024