Scan: Add a robustness detector to the scan that perturbs categorial values #1847

kevinmessiaen · 2024-03-14T08:02:37Z

🚀 Feature Request

Add a robustness detector to the scan that perturbs categorical values.

The scan should be able to report a set of issues that capture the perturbations needed on a single categorical feature to:

(a) change the predicted label (classification)
(b) change the prediction by an amount that exceeds a certain threshold (regression)
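
A minimal sketch of how the two criteria above could be checked, given predictions before and after a perturbation. The function name `flag_changed_predictions`, its parameters, and the default threshold are assumptions for illustration only; they are not part of the existing scan API, and the absolute difference is just one possible reading of "an amount that exceeds a certain threshold".

```python
from typing import List, Sequence


def flag_changed_predictions(
    original: Sequence,
    perturbed: Sequence,
    regression: bool = False,
    threshold: float = 0.05,  # hypothetical default, not taken from the issue
) -> List[bool]:
    """Return one flag per row: True if the perturbation changed the prediction."""
    if len(original) != len(perturbed):
        raise ValueError("prediction sequences must have the same length")
    if regression:
        # (b) regression: the absolute change must exceed the threshold
        return [abs(p - o) > threshold for o, p in zip(original, perturbed)]
    # (a) classification: the predicted label must differ
    return [p != o for o, p in zip(original, perturbed)]


# Classification: only the second row changes label.
label_flags = flag_changed_predictions(["adopted", "adopted"], ["adopted", "returned"])

# Regression: only the second row moves by more than the threshold.
value_flags = flag_changed_predictions(
    [10.0, 20.0], [10.2, 25.0], regression=True, threshold=1.0
)
```

The share of flagged rows could then be compared against a failure rate to decide whether to raise an issue, but that rate is also a design choice left open here.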

🔈 Motivation

Currently, the scan does not have any categorical perturbation.

ChatBear · 2024-04-10T15:54:53Z

Is this issue still active? I would like to contribute to it.

alexcombessie · 2024-04-10T15:58:53Z

@kevinmessiaen I'll let you guide here. This seems easy to add, and a great idea for a contribution!

kevinmessiaen · 2024-04-11T03:43:51Z

Hello @ChatBear

Yes, this is still an active issue, and I can assign you to it. We would be grateful to have your contribution. Let me know if you have any questions about it.

ChatBear · 2024-04-11T07:49:44Z

Thanks, I'll try to contribute. I'll need a bit of time to understand the repo; after that I'll try to post a PR.

ChatBear · 2024-04-14T17:46:33Z

Hello, I have a few questions about the issue.

What kind of perturbations do you expect? I was thinking of changing the feature column with a probability of 0.1 (chosen arbitrarily).

And do I need to create another detector from scratch, or can I use a detector from BaseTextPerturbationDetector?

Also, I tried to create a branch, but I can't push to it (I forked the repo, but I am having trouble creating the pull request; I am kind of new to open source, so I apologize in advance if this question is inappropriate).

kevinmessiaen · 2024-04-15T07:29:58Z

Hello,

The perturbation should be applied to a categorical feature. It should only perturb one column of the dataset; the goal is to ensure that the model isn't too sensitive to noise. In this case a probability is not necessary, since we want to test that the result isn't impacted when the value changes (a probability makes sense for text, where we have a typo rate, for example).

For example, take a breed category with potential values ['Labrador', 'Husky', 'Beagle', ...]. The idea is to switch every `Labrador` value to any other breed, and so on for the other values.

It won't work to reuse BaseTextPerturbationDetector, since it casts the column to str and we can have numerical categories, for example. But you can take inspiration from it.
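
A rough sketch of the value-switching idea described above, using plain Python dicts as stand-ins for dataset rows. The helper name `perturb_category` and the row representation are hypothetical and do not come from the repository; the point is only that each category is swapped for every other one in a single column, with values kept in their original type rather than cast to str.

```python
from typing import Any, Dict, Iterator, List, Tuple

Row = Dict[str, Any]


def perturb_category(
    rows: List[Row], column: str
) -> Iterator[Tuple[Any, Any, List[Row]]]:
    """Yield (source, target, perturbed_rows) for each ordered pair of distinct categories."""
    # Collect distinct values in first-seen order, keeping their original types.
    categories = list(dict.fromkeys(row[column] for row in rows))
    for source in categories:
        for target in categories:
            if target == source:
                continue
            # Copy every row; only rows holding `source` get the new value.
            perturbed = [
                {**row, column: target} if row[column] == source else dict(row)
                for row in rows
            ]
            yield source, target, perturbed


rows = [
    {"breed": "Labrador", "age": 3},
    {"breed": "Husky", "age": 5},
    {"breed": "Labrador", "age": 1},
]
perturbations = list(perturb_category(rows, "breed"))

# Numerical categories are handled the same way, without any str cast.
numeric_perturbations = list(perturb_category([{"size": 1}, {"size": 2}], "size"))
```

Each perturbed copy would then be passed to the model, and its predictions compared with those on the original rows. Only the chosen column changes, and the input rows are left untouched.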

ChatBear · 2024-04-15T13:39:48Z

OK, thanks, I can continue.

kevinmessiaen added the enhancement (New feature or request) and good first issue (Good for newcomers) labels Mar 14, 2024

kevinmessiaen assigned ChatBear Apr 11, 2024

This was referenced Apr 21, 2024

Test contribution/robustness detector #1907

Closed

Test contribution/robustness detector #1908

Draft
