Reinforcement Learning Theory Book (in Russian)

Full book on arXiv: https://arxiv.org/abs/2201.09746

  • Ch. 1: Introduction
  • Ch. 2: Meta-heuristics
    • NEAT, WANN
    • CEM, OpenAI-ES, CMA-ES
  • Ch. 3: Classic theory
    • Bellman equations
    • RPI, policy improvement theorem
    • Value Iteration, Generalized Policy Iteration
    • Temporal Difference, Q-learning, SARSA
    • Eligibility Traces, TD-lambda, Retrace
  • Ch. 4: Value-based
    • DQN
    • Double DQN, Dueling DQN, PER, Noisy DQN, Multi-step DQN
    • C51, QR-DQN, IQN, Rainbow DQN
  • Ch. 5: Policy Gradient
    • REINFORCE, A2C, GAE
    • TRPO, PPO
  • Ch. 6: Continuous Control
    • DDPG, TD3
    • SAC
  • Ch. 7: Model-based
    • Bandits
    • MCTS, AlphaZero, MuZero
    • LQR
  • Ch. 8: Next Stage
    • Imitation Learning / Inverse Reinforcement Learning
    • Intrinsic Motivation
    • Multi-Task and Hindsight
    • Hierarchical RL
    • Partial observability
    • Multi-Agent RL