decision_function instead of predict_proba #157

Open
lkurlandski opened this issue Apr 21, 2022 · 5 comments

Comments

@lkurlandski

Several non-probabilistic estimators, SVMs in particular, can be used with uncertainty sampling. Scikit-learn estimators that expose a decision_function method can be used with the closest-to-hyperplane selection algorithm [Bloodgood]. This is a very popular strategy in AL research and would be very easy to implement.

@e-rich

e-rich commented Apr 28, 2022

If you just want to use closest-to-hyperplane selection with a linear SVM, you could also write your own query strategy like:

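import numpy as np
from modAL.models.base import BaseLearner  # assumed import path, needed only for the type hint

def closest_to_hyperplane_sampling(linearSVC_learner: BaseLearner, X_pool):
    # Signed score per pool sample (binary case: a 1-D array).
    y = linearSVC_learner.estimator.decision_function(X_pool)
    # Dividing by ||w|| turns the scores into geometric distances to the hyperplane.
    w_norm = np.linalg.norm(linearSVC_learner.estimator.coef_)
    dist = y / w_norm
    # The smallest absolute distance marks the sample nearest the decision boundary.
    query_idx = np.argmin(np.abs(dist))
    return query_idx, X_pool[query_idx]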
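
For anyone who wants to try the idea outside the learner wrapper, here is a minimal, self-contained sketch that applies the same selection rule directly to a fitted scikit-learn `LinearSVC` on synthetic data. The helper name `closest_to_hyperplane`, its `n_instances` parameter, and the toy dataset are assumptions made up for illustration, not part of any library. It also assumes a binary problem, where `decision_function` returns one signed score per sample.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC


def closest_to_hyperplane(estimator, X_pool, n_instances=1):
    """Return indices of the pool samples with the smallest |decision score|.

    Hypothetical helper; assumes a binary classifier whose
    decision_function returns a 1-D array of signed scores.
    """
    scores = np.abs(estimator.decision_function(X_pool))
    # argsort is ascending, so the first entries lie nearest the boundary.
    return np.argsort(scores)[:n_instances]


# Toy data: a small labeled seed set and a larger unlabeled pool.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_seed, y_seed, X_pool = X[:20], y[:20], X[20:]

clf = LinearSVC(max_iter=10000).fit(X_seed, y_seed)
query_idx = closest_to_hyperplane(clf, X_pool, n_instances=3)
print(query_idx, X_pool[query_idx].shape)
```

Sorting the absolute scores rather than taking a single `argmin` makes it straightforward to request more than one instance per query, which may or may not suit your labeling budget.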

@lkurlandski
Author

This is great, thank you for the tip. Did not realize how simple it would be.

@lkurlandski
Author

lkurlandski commented May 3, 2022

I am curious why you normalize the confidence scores. I can't see how applying a linear scaling factor of (1 / w_norm) to the y vector would change the return value of the function. Is it simply to ensure all values are between 0 and 1?

@e-rich

e-rich commented May 3, 2022

Yes, you are right. You do not need this step.
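
A quick way to check this: dividing every score by the same positive constant preserves the ordering of their absolute values, so `argmin` picks the same index either way. The sketch below uses made-up random scores and a made-up weight vector as stand-ins for the `decision_function` output and `coef_`. Note that the scaled values are geometric distances to the hyperplane and are not confined to the range 0 to 1.

```python
import numpy as np

rng = np.random.default_rng(0)
scores = rng.normal(size=50)   # stand-in for decision_function(X_pool)
w = rng.normal(size=5)         # stand-in for estimator.coef_
w_norm = np.linalg.norm(w)     # strictly positive for any nonzero w

raw_choice = np.argmin(np.abs(scores))
scaled_choice = np.argmin(np.abs(scores / w_norm))

# Same index either way, because x -> x / w_norm is monotone for w_norm > 0.
assert raw_choice == scaled_choice
```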

@lkurlandski
Author

Thought so. Cheers!
