Bug in One-Class SVM implementation #56

Open
compleathorseplayer opened this issue Jul 26, 2020 · 3 comments

@compleathorseplayer

Because of some suspicious results when using OneClassSVM with a linear kernel, I constructed a toy example that is solvable without error. Although svmtrain claimed convergence, the solution it returned was incorrect.

I double-checked by projecting the problem into a 2-class one and running svmtrain with (default) SVC and the correct solution was quickly found by the routine. This is an acceptable workaround but much slower, especially for large datasets.

It looks to me like OneClassSVM has a fundamental bug.

Here is how to reproduce:
Generate 500 random 2-vectors x with standard normal components, and keep only the ones for which x1 - x2 > 1 (there should be about 120 of them on average). Next, run OneClassSVM on these. It should be able to bound all of the observations by a hyperplane (a line in 2D), but it does not, despite claiming convergence.
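A minimal sketch of the data-generation step described above, in plain Python rather than Julia so that it has no package dependencies. The function names and the seed are illustrative assumptions, not part of the original report. Since x1 - x2 is normal with variance 2, the expected number of retained points can be computed in closed form and compared against the sample.

```python
import math
import random

def make_half_plane_sample(n=500, seed=0):
    """Draw n standard-normal 2-vectors and keep those with x1 - x2 > 1."""
    rng = random.Random(seed)  # seed is an arbitrary choice for repeatability
    points = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]
    return [(x1, x2) for x1, x2 in points if x1 - x2 > 1.0]

def expected_kept(n=500):
    """Expected number of retained points.

    x1 - x2 ~ N(0, 2), so P(x1 - x2 > 1) = P(Z > 1/sqrt(2)) = 0.5 * erfc(0.5).
    """
    return n * 0.5 * math.erfc(0.5)

sample = make_half_plane_sample()
print(len(sample), round(expected_kept(), 1))
```

Every retained point lies strictly on one side of the line x1 - x2 = 1, so the sample is separable by construction.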

I would appreciate someone verifying this bug (and hopefully it being put on the 'to-do' list).

Thank you!

@aviks
Contributor

aviks commented Jul 27, 2020

So this package wraps the original C libsvm library. To fix this, it would be necessary to check if the issue is in the underlying C library, or in the wrapper. I would guess something like this is in the C library.

Would you be able to test this against the C library? Alternatively, you will need to provide some code that we can use to replicate this exactly.

@compleathorseplayer
Author

compleathorseplayer commented Jul 27, 2020

I was under the impression that someone had translated parts of LIBSVM into native Julia(?) [otherwise, why not port the whole thing?]

Either way, for me the One-Class option failed the first test I gave it.

In the linear case, One-Class should be finding a bound to a half-space containing as many points as possible (or all of them, in the separable case). The simplest test, then, is just to generate 2D data in a half-plane and try to find a bounding line. This failed for all the examples I tried (in the sense that a converged model did not separate in the separable case). If you can't reproduce this, then I can give you exact code.

As I mentioned above, in each case, I was able to transform the problem artificially into a 2-class problem, and LIBSVM - SVC worked fine in that case (i.e. found the respective separators).

I should add that I know of two versions of the 'One-Class' problem - with and without respect to an origin. In my opinion by far the most useful (sensible?) one is the first - where separating hyperplanes are assumed to go through the origin [otherwise separation/isolation is always trivial for any finite sample].

I had hoped that this implementation would be that (sensible) one, but it is not. Nevertheless, even with the bias term version, if the error cost is increased, total separation should be achieved with a high enough error cost setting. I tried this on the examples of the above type and could not achieve this with the LIBSVM implementation.

@compleathorseplayer
Author

Update: on Julia Discourse it has been pointed out that the default parameter values (nu in particular) are not very effective for One-Class; the generic default for nu seems to be set for reasonable performance on the other SVM types. For considerably smaller values of nu, the routine appears to work for One-Class as expected.
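To illustrate why nu matters so much here, the following is a simplified sketch and not LIBSVM's solver. It assumes the standard one-class objective, 1/2 ||w||^2 - rho + (1 / (nu * n)) * sum of slacks, and holds the direction w fixed at the true normal (1, -1)/sqrt(2), so only the offset rho is optimised. Under that assumption the best rho sits at roughly the nu-quantile of the projected points, so about a fraction nu of the training data ends up on the wrong side. LIBSVM's documented default is nu = 0.5. All function names are illustrative.

```python
import math
import random

def best_offset(scores, nu):
    """Return the rho minimising -rho + sum(max(0, rho - s)) / (nu * n)."""
    n = len(scores)

    def objective(rho):
        slack = sum(max(0.0, rho - s) for s in scores)
        return -rho + slack / (nu * n)

    # The objective is convex and piecewise linear in rho, so a minimum
    # lies at one of its breakpoints, i.e. at one of the scores.
    return min(scores, key=objective)

def outlier_fraction(scores, nu):
    """Fraction of points that fall strictly below the chosen offset."""
    rho = best_offset(scores, nu)
    return sum(s < rho for s in scores) / len(scores)

rng = random.Random(0)
points = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(500)]
kept = [(a, b) for a, b in points if a - b > 1.0]

# Assumption: w is fixed to the true normal; a real solver also optimises w.
scores = [(a - b) / math.sqrt(2.0) for a, b in kept]

for nu in (0.5, 0.1, 0.001):
    print(nu, outlier_fraction(scores, nu))
```

With nu = 0.5 about half of the points are left outside the boundary even though the data are separable, while a very small nu places the boundary at the innermost point and excludes none of them. This is consistent with the observation that much smaller values of nu give the expected behaviour.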
