Implemented Algorithms in MALib

Population-based Learning Algorithms

  • PSRO: Lanctot, Marc, et al. "A unified game-theoretic approach to multiagent reinforcement learning." Advances in neural information processing systems 30 (2017). [arXiv] | [official code]
  • P2SRO: McAleer, Stephen, et al. "Pipeline PSRO: A scalable approach for finding approximate Nash equilibria in large games." Advances in neural information processing systems 33 (2020): 20238-20248. [arXiv] | [official code]
  • EPSRO: Zhou, Ming, et al. "Efficient Policy Space Response Oracles." arXiv preprint arXiv:2202.00633 (2022). [arXiv] | [official code]
  • ODO: Dinh, Le Cong, et al. "Online Double Oracle." arXiv preprint arXiv:2103.07780 (2021). [arXiv] | [official code]
  • XDO: McAleer, Stephen, et al. "XDO: A double oracle algorithm for extensive-form games." Advances in Neural Information Processing Systems 34 (2021): 23128-23139. [arXiv] | [official code]
  • NeuPL: Liu, Siqi, et al. "NeuPL: Neural Population Learning." International Conference on Learning Representations. 2021. [arXiv] | [official code]

Multi-agent Reinforcement Learning Algorithms

  • MADDPG: Lowe, Ryan, et al. "Multi-agent actor-critic for mixed cooperative-competitive environments." Advances in neural information processing systems 30 (2017). [arXiv]
  • MAPPO: Yu, Chao, et al. "The surprising effectiveness of PPO in cooperative, multi-agent games." arXiv preprint arXiv:2103.01955 (2021). [arXiv]
  • QMIX: Rashid, Tabish, et al. "QMIX: Monotonic value function factorisation for deep multi-agent reinforcement learning." International conference on machine learning. PMLR, 2018. [arXiv]

Single-agent Reinforcement Learning Algorithms

  • A3C: Mnih, Volodymyr, et al. "Asynchronous methods for deep reinforcement learning." International conference on machine learning. PMLR, 2016. [arXiv]
  • DDPG: Lillicrap, Timothy P., et al. "Continuous control with deep reinforcement learning." arXiv preprint arXiv:1509.02971 (2015). [arXiv]
  • SAC: Haarnoja, Tuomas, et al. "Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor." International conference on machine learning. PMLR, 2018. [arXiv]
  • DQN: Mnih, Volodymyr, et al. "Human-level control through deep reinforcement learning." Nature 518.7540 (2015): 529-533. [arXiv]
  • PG: Sutton, Richard S., et al. "Policy gradient methods for reinforcement learning with function approximation." Advances in neural information processing systems 12 (1999). [arXiv]
  • PPO: Schulman, John, et al. "Proximal policy optimization algorithms." arXiv preprint arXiv:1707.06347 (2017). [arXiv]