User guide #325

TimotheeMathieu · 2023-06-27T13:19:46Z

I propose we write a user guide for rlberry. The outline would be something like this:

Installation
Basic Usage
- Quick Start RL
- Quick Start Deep RL
Set up of an experiment
- Agent Manager, agent, environment.
- Training phase, evaluation phase
- Logging
- Parallelization how to
Running an experiment
- Train an agent
- Evaluate agents
- Tune hyperparameters
- Plot relevant statistics
Saving and Loading
- Save and Load of agent
- Save and Load of managers
- Writers
- Save and Load of data for plots
Make your own agent or environment
- Interaction with Gymnasium
- Using environments from Gymnasium
- Using agents from Stable-Baselines
- Deep RL agents
  - Neural network utils
  - Interactions with torch
- Seeding
Using Bandits in rlberry

Feel free to suggest any change to this outline. Once we all agree on the outline, we can distribute the work among us.

TimotheeMathieu · 2023-07-13T07:28:53Z

And I suggest we use rundoc or something similar to verify that the code in the user guide actually runs and exits with code 0.

I think this should go into the long tests, because the user guide will contain code that trains agents and it would be too heavy for Azure.
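To make the check concrete, here is a minimal sketch of what such a verification could look like. It is not rundoc and makes no claim about rundoc's interface: it simply pulls the Python fenced blocks out of a markdown string, runs each one in a fresh interpreter, and reports the blocks whose exit code is not 0. The function names (`extract_python_blocks`, `run_block`, `check_markdown`) and the 60-second timeout are assumptions made for illustration.

```python
import re
import subprocess
import sys
import tempfile
from pathlib import Path

# Build the fence marker programmatically so this example can itself
# live inside a fenced block without confusing markdown parsers.
FENCE = "`" * 3

# Matches a fenced block opened with the "python" info string and
# captures its body (everything up to the closing fence line).
FENCE_RE = re.compile(
    r"^" + FENCE + r"python[ \t]*\n(.*?)^" + FENCE + r"[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def extract_python_blocks(markdown_text):
    """Return the bodies of all Python fenced blocks, in document order."""
    return FENCE_RE.findall(markdown_text)


def run_block(code, timeout=60):
    """Run one code block in a fresh interpreter and return its exit code."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "block.py"
        script.write_text(code, encoding="utf-8")
        result = subprocess.run(
            [sys.executable, str(script)],
            capture_output=True,
            timeout=timeout,
        )
    return result.returncode


def check_markdown(markdown_text):
    """Return (block_index, exit_code) for every block that did not exit with 0."""
    failures = []
    for index, code in enumerate(extract_python_blocks(markdown_text)):
        exit_code = run_block(code)
        if exit_code != 0:
            failures.append((index, exit_code))
    return failures
```

In a long-test job, one could call `check_markdown(Path(page).read_text())` on each user guide page and fail the job if the returned list is not empty. Each block runs in isolation here, so pages whose blocks depend on earlier blocks would need the blocks concatenated first.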

KohlerHECTOR · 2023-07-13T16:22:18Z

An example of a user guide section from PR #276: https://rlberry--276.org.readthedocs.build/en/276/basics/comparison.html

We can try Jupytext to edit markdown in jupyter.

riiswa · 2023-07-21T12:18:41Z

I'm adding notes concerning Philippe's remarks (check your mailbox):

The user guide should explain how rlberry should be used. For example, experiments should be reproducible, so we must make sure that all the examples we give are reproducible.
Example of documentation that needs to be clearer: eval([eval_horizon, n_simulations, gamma]): "Monte-Carlo policy evaluation [1] of an agent to estimate the value at the initial state."
- What do we evaluate? Do we evaluate the initial state, or do we evaluate a policy/trained agent?
- Define the 3 arguments.
How do we seed an agent? By calling reseed() or in some other way? The description of reseed() is very unclear to me: do we provide a sequence of numbers, or a single number/seed?
kwargs should be explained, with their attributes listed for each case. (See Handling **kwargs #334)
- Regarding the save() method, what does "Overwrite the 'save' function to manage CPU vs GPU save/load in torch agent" mean? Does it save the rlberry agent or just its Q-network? The Q-network(s) in the case of DDQN? ...
  Same thing for load(). Moreover, we don't care that it overloads another method (See Consistent naming #341). We want to know what it does.
Include all the arguments in the docstring.
Why is the default value indicated for some arguments and not for all?
Give more details on how to evaluate an agent during training.
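As an illustration of the level of detail asked for in the eval() remark, here is a sketch of one possible reading of "Monte-Carlo policy evaluation at the initial state", with every argument and default spelled out in the docstring. This is not rlberry's implementation: the argument names come from the quoted signature, while the default values, the `ChainEnv` toy environment, and the `reset()` / `step()` interface returning `(state, reward, done)` are assumptions made for the example.

```python
class ChainEnv:
    """Hypothetical toy environment: walk right along a chain to reach a goal."""

    def __init__(self, goal=2):
        self.goal = goal
        self.state = 0

    def reset(self):
        """Put the walker back at state 0 and return that initial state."""
        self.state = 0
        return self.state

    def step(self, action):
        """Action 1 moves right, any other action stays; reward 1.0 on reaching the goal."""
        if action == 1:
            self.state += 1
        done = self.state >= self.goal
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def monte_carlo_eval(env, policy, eval_horizon=100, n_simulations=10, gamma=1.0):
    """Estimate the value of ``policy`` at the initial state of ``env``.

    What is evaluated is the policy (for example that of a trained agent),
    not the state: the result is the average discounted return obtained by
    following ``policy`` from the state returned by ``env.reset()``.

    Parameters
    ----------
    env : object
        Environment with ``reset() -> state`` and
        ``step(action) -> (state, reward, done)``.
    policy : callable
        Maps a state to an action.
    eval_horizon : int, default=100
        Maximum number of steps in each simulated episode.
    n_simulations : int, default=10
        Number of episodes averaged to form the estimate.
    gamma : float, default=1.0
        Discount factor applied to the reward at each step.

    Returns
    -------
    float
        Mean discounted return over ``n_simulations`` episodes.
    """
    total = 0.0
    for _ in range(n_simulations):
        state = env.reset()
        discount = 1.0
        episode_return = 0.0
        for _ in range(eval_horizon):
            state, reward, done = env.step(policy(state))
            episode_return += discount * reward
            discount *= gamma
            if done:
                break
        total += episode_return
    return total / n_simulations


# Usage: evaluate the "always move right" policy on the toy chain.
value = monte_carlo_eval(ChainEnv(goal=2), lambda state: 1, eval_horizon=10, gamma=0.9)
```

Written this way, the docstring answers both questions from the remark: the policy is what gets evaluated, and each of the three arguments has a definition and a stated default.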

Basically, we should go through each function/method and rewrite its documentation where needed, so that everything is documented and explicit.
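On the seeding question in the notes above, a small sketch may help show the kind of behaviour the documentation should state explicitly. It uses only the Python standard library and does not describe how rlberry's reseed() actually works: `spawn_seeds`, `RandomAgent`, and this `reseed()` taking a single integer are hypothetical names and choices made for illustration.

```python
import random


def spawn_seeds(master_seed, n_children):
    """Derive ``n_children`` integer seeds from one master seed, deterministically."""
    master = random.Random(master_seed)
    return [master.getrandbits(32) for _ in range(n_children)]


class RandomAgent:
    """Hypothetical agent that picks uniformly random actions from its own generator."""

    def __init__(self, n_actions, seed):
        self.n_actions = n_actions
        self.rng = random.Random(seed)

    def reseed(self, seed):
        """Replace the agent's generator with a new one built from a single integer seed."""
        self.rng = random.Random(seed)

    def act(self):
        """Return a random action index in ``range(n_actions)``."""
        return self.rng.randrange(self.n_actions)


# Usage: one master seed for the experiment, one derived seed per agent.
seeds = spawn_seeds(42, n_children=3)
agents = [RandomAgent(n_actions=4, seed=s) for s in seeds]
```

With this design, the whole experiment is reproducible from the single master seed, and reseeding an agent with the same integer replays the same sequence of actions. Whatever rlberry's actual convention is, the docstring should say just as plainly whether one seed or a sequence is expected.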

KohlerHECTOR added the documentation (Improvements or additions to documentation) and Marathon (To do during Marathon) labels Jul 13, 2023

KohlerHECTOR added this to To do in Marathon rlberry Jul 13, 2023

KohlerHECTOR mentioned this issue Jul 24, 2023

Improve docstring and code documentation #342

Closed

KohlerHECTOR moved this from To do to In progress in Marathon rlberry Jul 24, 2023

KohlerHECTOR self-assigned this Jul 24, 2023

KohlerHECTOR pinned this issue Jul 24, 2023

KohlerHECTOR linked a pull request Jul 24, 2023 that will close this issue

[WIP] User guide #325 #353

Closed

KohlerHECTOR mentioned this issue Jul 24, 2023

[WIP] User guide #325 #353

Closed

brahimdriss mentioned this issue Jul 24, 2023

Improve docstring and code documentation #355

Closed

6 tasks

JulienT01 mentioned this issue Sep 19, 2023

User guide #364

Merged

KohlerHECTOR mentioned this issue Apr 3, 2024

update user guide #446

Open
