How would you implement a minimax q-learner with coax?
Hi there! I love the package and how accessible it is to relative newbies. The tutorials are pretty great and the accompanying videos are very helpful!
I was wondering what the best way to implement a minimax algorithm would be. Would you recommend using two policies, pi1 and pi2? Or is there something better suited for this?
I'd like to re-implement something like this old blogpost of mine in coax to get a better feel of the library.
Any help would be greatly appreciated :)
It would be great to see multi-agent style setups in coax. I haven't thought much about it, to be honest.
The simplest setup would be to use separate policies and either update them individually or write your own policy objective that updates multiple policies at the same time.
Having said that, I'm not an expert in multi-agent RL myself, so I'm not aware of all the subtleties associated with such a setup.
But of course, I welcome contributions and I'm curious to see what you come up with!
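For reference, here is a minimal, library-agnostic sketch of the tabular minimax-Q update (Littman, 1994) that such a setup would need to approximate. To keep it short, the state value is taken as a maximin over pure actions; the full algorithm instead solves a small linear program for the mixed minimax strategy. All names here are illustrative, not part of the coax API.

```python
import numpy as np

def minimax_q_update(Q, s, a, o, r, s_next, done, alpha=0.1, gamma=0.9):
    """One tabular minimax-Q update for a two-player zero-sum game.

    Q has shape (n_states, n_actions, n_opponent_actions); `a` is our
    action, `o` is the opponent's. The next-state value is the maximin
    over pure actions (a simplification of Littman's LP-based version).
    """
    v_next = 0.0 if done else np.max(np.min(Q[s_next], axis=1))
    td_target = r + gamma * v_next
    Q[s, a, o] += alpha * (td_target - Q[s, a, o])
    return Q

# Tiny sanity check: a one-shot matrix game with payoffs R[a, o] and a
# pure saddle point at (a=0, o=1), so the game value is R[0, 1] = 1.
R = np.array([[2.0, 1.0], [0.0, -1.0]])
Q = np.zeros((1, 2, 2))
for _ in range(1000):
    for a in range(2):
        for o in range(2):
            Q = minimax_q_update(Q, 0, a, o, R[a, o], 0, done=True)

game_value = np.max(np.min(Q[0], axis=1))  # maximin of the learned Q-table
```

In a coax-style function-approximation setting, the same idea would translate to a q-function conditioned on both agents' actions, with the bootstrap target computed from this maximin (or LP) value instead of a plain max.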