
Inquiry for possible performance improvement #162

Open
EddieMataEwy opened this issue Feb 25, 2023 · 5 comments

Comments

@EddieMataEwy

The performance of this repo is already amazing, but I wanted to ask a question.
Have you checked the family of improvements defined in this paper? (https://realworld-sdm.github.io/paper/27.pdf)
It re-derives existing algorithms like CFR+ and DCFR by computing "instant updates" to the counterfactual value, the regret, and the strategy.
I don't know how much complexity this would add to the existing codebase, but it allows for even faster convergence.
For example, it would make CFR+ converge faster than DCFR without having to tune alpha, beta, and gamma.
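To make the "instant update" idea concrete, here is a minimal, hypothetical sketch (not code from this repo or from the paper) of what it looks like at a single decision node: after the regrets are updated, the new strategy σt+1 is derived by regret matching and the node's value is immediately recomputed under it, instead of returning the value obtained under the old strategy σt. The function names are illustrative only.

```cpp
#include <cstddef>
#include <vector>
using namespace std;

// Regret matching: normalize the positive part of the cumulative
// regrets into a strategy; fall back to uniform if all are <= 0.
vector<float> regretMatching(const vector<float>& regrets) {
    vector<float> strategy(regrets.size());
    float sum = 0.0f;
    for (size_t a = 0; a < regrets.size(); ++a) {
        strategy[a] = regrets[a] > 0.0f ? regrets[a] : 0.0f;
        sum += strategy[a];
    }
    for (size_t a = 0; a < regrets.size(); ++a)
        strategy[a] = sum > 0.0f ? strategy[a] / sum
                                 : 1.0f / static_cast<float>(regrets.size());
    return strategy;
}

// Node value under a given strategy: sum over actions of the
// strategy probability times that action's utility.
float nodeValue(const vector<float>& strategy,
                const vector<float>& actionUtils) {
    float v = 0.0f;
    for (size_t a = 0; a < strategy.size(); ++a)
        v += strategy[a] * actionUtils[a];
    return v;
}

// One "instant" iteration at a node: update regrets against the
// old value, then return the value recomputed under sigma_{t+1}.
float instantUpdate(vector<float>& regrets,
                    const vector<float>& actionUtils) {
    vector<float> sigmaT = regretMatching(regrets);
    float oldValue = nodeValue(sigmaT, actionUtils);
    for (size_t a = 0; a < regrets.size(); ++a)
        regrets[a] += actionUtils[a] - oldValue;
    vector<float> sigmaT1 = regretMatching(regrets);
    return nodeValue(sigmaT1, actionUtils);  // value under the NEW strategy
}
```

In vanilla CFR the value passed to the parent would be `oldValue`; the instant variant passes the value under σt+1 instead, which is what the snippet later in this thread does with `payoffs`.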

@bupticybee
Owner

No, I haven't read the paper; I will read it. Sounds promising.

@xuzy1975

xuzy1975 commented Mar 2, 2023

I don't understand step (5): where is the instant counterfactual value updated by σt+1 actually used?

@xuzy1975

xuzy1975 commented Mar 3, 2023

// ICFR: recompute payoffs here with the new strategy
const vector<float> current_strategy_new = trainable->getcurrentStrategy();
fill(payoffs.begin(), payoffs.end(), 0);
// accumulate the per-hand values under the new strategy
for (int action_id = 0; action_id < actions.size(); action_id++) {
    vector<float>& action_utilities = results[action_id];
    if (action_utilities.empty())
        continue;
    for (int hand_id = 0; hand_id < action_utilities.size(); hand_id++) {
        float strategy_prob = current_strategy_new[hand_id + action_id * node_player_private_cards.size()];
        payoffs[hand_id] += strategy_prob * action_utilities[hand_id];
    }
}

Added to the end of actionUtility(), it does improve performance on some boards, such as 6h6c6d, 7d7h2h...

@EddieMataEwy
Author

I believe you need calculation (5) to proceed with the parent-node calculations. I don't understand it very well; that is why I opened an issue instead of coding it myself and submitting a pull request.

@xuzy1975

xuzy1975 commented Mar 5, 2023

It seems you need to recalculate the payoff using the new strategy. I tried it: in some cases, like the benchmark settings, it converges faster, but in a large-scale game it works worse. Maybe I misunderstood something.
