CV in ADBench #8

Open
prabhant opened this issue Aug 26, 2022 · 1 comment

Comments

@prabhant

Hi,

I am fairly new to anomaly detection, and I am curious about the right way to do cross-validation with ADBench, since the train and test samples are already split by the datagenerator. An easy way would be to concatenate the train and test datasets and run them through a CV loop, but is there a cleaner way?

@Minqi824
Owner

I sincerely apologize for my late reply. In anomaly detection problems, the training set often contains only a few labeled samples (e.g., 5 labeled anomalies), and cross-validation (CV) reduces the labeled samples available in each fold even further.
Some suggestions:

  1. You can apply a data augmentation method such as random oversampling or SMOTE, and then run CV on the concatenation of the training and testing datasets.
  2. You can set la (the ratio of labeled anomalies) to 1.00 so that all anomalies in the training set are labeled. The training set can then be concatenated with the testing set for cross-validation, although anomalies may still be very rare in some datasets.
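
The two suggestions above could be combined roughly as follows. This is only a minimal sketch under several assumptions, not ADBench's own procedure: the arrays X_train, y_train, X_test, y_test are assumed to be NumPy arrays with labels 0 (normal) and 1 (anomaly), with y_train holding the true labels as it would with la set to 1.00; the data here is synthetic; the helper names oversample_anomalies and cross_validate are hypothetical; plain random duplication stands in for SMOTE; and a random forest stands in for whichever detector is being evaluated. One design choice that differs slightly from suggestion 1 as worded: the oversampling is applied inside each fold, to the fitting portion only, so that duplicated anomalies cannot end up in the validation portion.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import average_precision_score
from sklearn.model_selection import StratifiedKFold


def oversample_anomalies(X, y, rng):
    """Randomly duplicate anomalies (label 1) until they match the normal count."""
    anomaly_idx = np.flatnonzero(y == 1)
    normal_idx = np.flatnonzero(y == 0)
    n_extra = len(normal_idx) - len(anomaly_idx)
    if len(anomaly_idx) == 0 or n_extra <= 0:
        return X, y  # nothing to duplicate, or already balanced
    extra = rng.choice(anomaly_idx, size=n_extra, replace=True)
    keep = np.concatenate([np.arange(len(y)), extra])
    return X[keep], y[keep]


def cross_validate(X_train, y_train, X_test, y_test, n_splits=5, seed=0):
    """Stratified CV over the concatenated splits; returns one AUC-PR per fold."""
    X = np.concatenate([X_train, X_test])
    y = np.concatenate([y_train, y_test])
    rng = np.random.default_rng(seed)
    # Stratification keeps the (rare) anomalies spread evenly across folds.
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = []
    for fit_idx, val_idx in skf.split(X, y):
        # Oversample the fitting portion only; validation data stays untouched.
        X_fit, y_fit = oversample_anomalies(X[fit_idx], y[fit_idx], rng)
        model = RandomForestClassifier(n_estimators=50, random_state=seed)
        model.fit(X_fit, y_fit)
        val_scores = model.predict_proba(X[val_idx])[:, 1]
        scores.append(average_precision_score(y[val_idx], val_scores))
    return scores


# Synthetic stand-in for an already-split dataset (5% anomalies).
data_rng = np.random.default_rng(0)
X_all = np.concatenate([
    data_rng.normal(0.0, 1.0, size=(380, 4)),  # normal samples
    data_rng.normal(4.0, 1.0, size=(20, 4)),   # anomalies
])
y_all = np.concatenate([np.zeros(380, dtype=int), np.ones(20, dtype=int)])
perm = data_rng.permutation(len(y_all))
X_all, y_all = X_all[perm], y_all[perm]
X_train, y_train = X_all[:280], y_all[:280]
X_test, y_test = X_all[280:], y_all[280:]

scores = cross_validate(X_train, y_train, X_test, y_test)
print("AUC-PR per fold:", np.round(scores, 3))
```

With very few anomalies, n_splits should not exceed the number of anomalies in the concatenated data, otherwise some folds would contain none. If SMOTE is preferred over random duplication, the imbalanced-learn package provides an implementation that could replace oversample_anomalies at the same point in the loop.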
