Self Normalized Estimator _estimate_round_rewards is wrong? #187

Open
szsb26 opened this issue Nov 28, 2022 · 0 comments

szsb26 commented Nov 28, 2022

In `SelfNormalizedInverseProbabilityWeighting._estimate_round_rewards`, the denominator of the returned value is `iw.mean()`, when in fact it should be `iw.sum()`. I think this affects the computation of the confidence intervals for this class.

I found this issue when I noticed that the SNIPS estimator had unusually high variance compared with the IPW estimator.
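For anyone who wants to reproduce this kind of comparison outside the library, here is a minimal NumPy sketch. Everything in it is an assumption made for illustration: the two-action synthetic problem, the function names `ipw_estimate`, `snips_estimate`, and `sampling_variances`, and all of the numbers. It does not call the library and it measures the sampling variance of the point estimates over repeated synthetic datasets, which is not the same thing as the library's confidence-interval computation.

```python
import numpy as np


def ipw_estimate(reward, iw):
    """Plain IPW: average of the importance-weighted rewards."""
    return np.mean(reward * iw)


def snips_estimate(reward, iw):
    """Textbook SNIPS: weighted rewards divided by the sum of the weights."""
    return np.sum(reward * iw) / np.sum(iw)


def sampling_variances(n_rounds=1000, n_reps=300, seed=0):
    """Variance of each point estimate across repeated synthetic datasets."""
    rng = np.random.default_rng(seed)
    # Hypothetical two-action problem; every number below is made up.
    pi_behavior = np.array([0.8, 0.2])      # logging policy
    pi_eval = np.array([0.2, 0.8])          # evaluation policy
    expected_reward = np.array([0.3, 0.6])  # Bernoulli mean per action

    ipw_values, snips_values = [], []
    for _ in range(n_reps):
        action = rng.choice(2, size=n_rounds, p=pi_behavior)
        reward = rng.binomial(1, expected_reward[action])
        iw = pi_eval[action] / pi_behavior[action]
        ipw_values.append(ipw_estimate(reward, iw))
        snips_values.append(snips_estimate(reward, iw))
    return np.var(ipw_values), np.var(snips_values)


var_ipw, var_snips = sampling_variances()
print(f"IPW variance:   {var_ipw:.6f}")
print(f"SNIPS variance: {var_snips:.6f}")
```

Swapping in real logged data for the synthetic arrays would show whether the gap I observed comes from the point estimates themselves or only from the interval computation.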

If the denominator is changed to `iw.sum()`, then `_estimate_policy_value` in `InverseProbabilityWeighting` (the base class) may need to change as well: it returns the `.mean()` of the round rewards, and the definition of the SNIPS estimator contains no such averaging constant.
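To make the two conventions concrete, here is a small NumPy sketch of how the per-round normalization interacts with the final aggregation. It uses synthetic `reward` and `iw` arrays and hypothetical variable names, and it does not call the library's code. It only checks the point estimate, so it does not settle whether the confidence-interval code handles the per-round terms correctly.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
# Synthetic data, made up for illustration only.
reward = rng.binomial(1, 0.4, size=n).astype(float)
iw = rng.uniform(0.1, 5.0, size=n)  # importance weights

# Textbook SNIPS: sum of weighted rewards over sum of weights.
snips_direct = np.sum(reward * iw) / np.sum(iw)

# Convention A: normalize each round by iw.mean(), aggregate with .mean().
round_rewards_mean = reward * iw / iw.mean()
snips_via_mean = round_rewards_mean.mean()

# Convention B: normalize each round by iw.sum(), aggregate with .sum().
round_rewards_sum = reward * iw / iw.sum()
snips_via_sum = round_rewards_sum.sum()

# Mixed: normalize by iw.sum() but still aggregate with .mean().
mixed = round_rewards_sum.mean()

print("direct SNIPS:        ", snips_direct)
print("iw.mean() + .mean(): ", snips_via_mean)
print("iw.sum()  + .sum():  ", snips_via_sum)
print("iw.sum()  + .mean(): ", mixed)
```

Algebraically, conventions A and B both reduce to the textbook ratio, while the mixed version is smaller by a factor of `n`. The per-round terms in A are `n` times those in B, so any code that consumes the per-round terms directly depends on which convention is used.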

szsb26 changed the title from "Self Normalized Estimator _estimate_round_rewards is wrong." to "Self Normalized Estimator _estimate_round_rewards is wrong?" on Nov 28, 2022