README

The core procedure of the paper is simple -- we shuffle the data K times, and take half as training set and half as validation set.

In this code base, we provide two simulated environments that we used and data partitions we used.

envs/tutorbot/tutor_env.py shows how TutorEnv works, which was not described in detail in the appendix.
envs/sepsis/create_behavior_policy.py shows how we generate the dataset for the Sepsis domain.

The BVFT code with our FQE implementation is in the ope/ folder.

P-MDP code is inside the pmdp folder.

POIS code is inside the pois folder.

We ran the Robomimic experiments by modifying https://github.com/ARISE-Initiative/robomimic

results folder contains the aggregate CSV files of our experiment run. These won't work with the *_evaluation.py files; though they can still be processed and examined!

We ran the D4RL experiments by using the d3rlpy repository.

For additional inquiries about our experimental details or direct access to our logged results or our original code, please contact the corresponding author of the paper.

We plan to release a more user-friendly version of the code in the near future.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

envs

envs

ope

ope

pmdp

pmdp

pois

pois

README.md

README.md

Repository files navigation

README

About

Releases

Packages

Contributors 2

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 16 Commits
envs		envs
ope		ope
pmdp		pmdp
pois		pois
README.md		README.md

StanfordAI4HI/Split-select-retrain

Folders and files

Latest commit

History

Repository files navigation

README

About

Resources

Stars

Watchers

Forks

Languages