
Offline policy evaluation

Implementations and examples of common offline policy evaluation methods in Python. For more information on offline policy evaluation, see this tutorial.

Installation

pip install offline-evaluation

Usage

import pandas as pd

from ope.methods import doubly_robust

Get some historical logs generated by a previous policy:

df = pd.DataFrame([
    {"context": {"p_fraud": 0.08}, "action": "blocked", "action_prob": 0.90, "reward": 0},
    {"context": {"p_fraud": 0.03}, "action": "allowed", "action_prob": 0.90, "reward": 20},
    {"context": {"p_fraud": 0.02}, "action": "allowed", "action_prob": 0.90, "reward": 10},
    {"context": {"p_fraud": 0.01}, "action": "allowed", "action_prob": 0.90, "reward": 20},
    {"context": {"p_fraud": 0.09}, "action": "allowed", "action_prob": 0.10, "reward": -20},
    {"context": {"p_fraud": 0.40}, "action": "allowed", "action_prob": 0.10, "reward": -10},
])

Define a function that computes P(action | context) under the new policy:

def action_probabilities(context):
    # New policy: block transactions with a high fraud score and allow the
    # rest, keeping a small exploration probability epsilon on the other action.
    epsilon = 0.10
    if context["p_fraud"] > 0.10:
        return {"allowed": epsilon, "blocked": 1 - epsilon}
    return {"allowed": 1 - epsilon, "blocked": epsilon}

Conduct the evaluation:

doubly_robust.evaluate(df, action_probabilities)
> {'expected_reward_logging_policy': 3.33, 'expected_reward_new_policy': -28.47}

This means the new policy is substantially worse than the logging policy. Instead of A/B testing this new policy online, it would be better to test some other policies offline first.
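Under the hood, the doubly robust estimator combines a reward model (the direct method) with an importance-weighted correction based on the logged action probabilities, so it remains reasonable when either the reward model or the logged propensities are accurate. The sketch below is only an illustration of that idea, not the package's implementation; dr_estimate is a hypothetical helper and its per-action mean reward "model" is deliberately crude. It reuses the df and action_probabilities defined above:

def dr_estimate(df, action_probabilities):
    # Illustrative sketch of the doubly robust estimator, not ope's internals.
    # Crude direct-method reward model: mean logged reward per action.
    # A real implementation would fit a model on context and action features.
    reward_model = df.groupby("action")["reward"].mean().to_dict()

    def per_row(row):
        new_probs = action_probabilities(row["context"])
        # Direct-method term: model reward averaged over the new policy's action probabilities.
        dm = sum(p * reward_model.get(a, 0.0) for a, p in new_probs.items())
        # Importance-weighted correction on the residual of the logged action.
        weight = new_probs[row["action"]] / row["action_prob"]
        return dm + weight * (row["reward"] - reward_model.get(row["action"], 0.0))

    return float(df.apply(per_row, axis=1).mean())

Because the reward model here is so simple, its numbers will not match doubly_robust.evaluate exactly, but the structure of the estimate is the same.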

See examples for more detailed tutorials.

Supported methods

  • Inverse propensity scoring
  • Direct method
  • Doubly robust (paper)
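To give a sense of the simplest of these, inverse propensity scoring reweights each logged reward by the ratio of the new policy's probability of the logged action to the logging policy's recorded probability, then averages. The sketch below is illustrative only (ips_estimate is a hypothetical helper, not the library's API) and reuses the df and action_probabilities defined above:

def ips_estimate(df, action_probabilities):
    # Illustrative sketch of inverse propensity scoring, not ope's internals.
    # Importance weight: new-policy probability of the logged action divided
    # by the logging policy's recorded probability.
    weights = df.apply(
        lambda row: action_probabilities(row["context"])[row["action"]] / row["action_prob"],
        axis=1,
    )
    # Estimated expected reward per decision under the new policy.
    return float((weights * df["reward"]).mean())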