BAR round_update method is wrong? #26

Open
BeniaminC opened this issue Dec 11, 2023 · 2 comments
@BeniaminC

Currently, I am trying to convert all the rating systems to support teams.

I have been trying to understand the BAR round_update method, which follows Algorithm 1 in https://jmlr.csail.mit.edu/papers/volume12/weng11a/weng11a.pdf. The variable info is instantiated on line 51. Line 63 accumulates into it, but then line 67 sets it again, replacing the accumulated value.

```rust
standings.into_par_iter().for_each(|(player, my_lo, _)| {
    let my_rating = &player.approx_posterior;
    let old_sig_sq = my_rating.sig.powi(2);
    let mut info = 0.;
    let mut update = 0.;
    for (rating, lo) in &all_ratings {
        let outcome = match my_lo.cmp(lo) {
            std::cmp::Ordering::Less => 1.,
            std::cmp::Ordering::Equal => 0.5,
            std::cmp::Ordering::Greater => 0.,
        };
        let c_sq = old_sig_sq + rating.sig.powi(2) + 2. * sig_perf_sq;
        let c = c_sq.sqrt();
        let probability = self.win_probability(c, my_rating, rating);

        info += probability * (1. - probability) / c_sq;
        update += (outcome - probability) / c;
    }
    // Treat the round as one highly informative match
    info = 0.25 / (old_sig_sq + 2. * sig_perf_sq); // value replaced?
    update /= all_ratings.len() as f64;
```
EbTech (Owner) commented Dec 12, 2023

That would be a really useful contribution which Paul and I would be happy to support you on, if you'd like to put us in the loop!

Regarding BAR, I regret not documenting my reasoning better. The overwrite was on purpose, though I really should have commented out line 63. I kept this first formula because it has a natural mathematical interpretation (likely related to the original paper, though I can't recall exactly). However, in practice it was hard to get BAR to work well with large numbers of players. The second formula (line 67) is a solution based on the same reasoning as Elo-MMR: namely, that the number of opponents is so large that we get a near-perfect measurement of each player's performance (but of course, not a perfect measurement of their underlying skill level). This seems to work better.
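For concreteness, the two alternatives can be sketched in Python (a minimal sketch; the function and variable names are mine, not from the crate):

```python
def accumulated_info(old_sig_sq, opp_sig_sqs, sig_perf_sq, probs):
    """Per-opponent information sum, as on the accumulating (overwritten) line:
    each opponent contributes p * (1 - p) / c^2."""
    total = 0.0
    for opp_sig_sq, p in zip(opp_sig_sqs, probs):
        c_sq = old_sig_sq + opp_sig_sq + 2.0 * sig_perf_sq
        total += p * (1.0 - p) / c_sq
    return total

def one_match_info(old_sig_sq, sig_perf_sq):
    """The overwrite: treat the whole round as a single highly informative
    match, using the maximum of p * (1 - p), i.e. 0.25, and dropping the
    opponent's uncertainty term from c^2."""
    return 0.25 / (old_sig_sq + 2.0 * sig_perf_sq)
```

With a single perfectly-known opponent (sig = 0) in an even match (p = 0.5) the two formulas coincide; with many opponents, the accumulated sum grows with the field size while the overwrite stays bounded.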

BeniaminC (Author) commented Dec 12, 2023

Recently, I took on a hobby project for an Elo rating system, and eventually the codebase got massive. A Discord community needed a rating system more robust than Microsoft TrueSkill. Numerous problems arose from using TrueSkill: for example, good players can farm rating from noobs (mainly because their deviation and rating are higher than their true skill), among other issues.

Some ways of circumventing this issue are "placement lobbies" (letting noobs reach a proper rating before they affect other players' ratings), adding priors from the previous season (ordering the players by relative rating and fitting them to a normal distribution), using game statistics such as kills/assists/deaths/wins/losses, etc.
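The season-prior idea can be sketched like this (a minimal Python sketch under my own assumptions; the function name and the default mean/sigma are hypothetical):

```python
from statistics import NormalDist

def seeded_priors(players_by_rating, mean=1500.0, sigma=300.0):
    """players_by_rating: player ids sorted from weakest to strongest.
    Returns {player: prior_rating} spread along a normal curve by mapping
    each rank to a midpoint quantile of the distribution."""
    n = len(players_by_rating)
    dist = NormalDist(mu=mean, sigma=sigma)
    priors = {}
    for rank, player in enumerate(players_by_rating):
        q = (rank + 0.5) / n  # midpoint quantile in (0, 1)
        priors[player] = dist.inv_cdf(q)
    return priors
```

Midpoint quantiles keep the extreme ranks finite; the mean and sigma would be tuned to the community's rating scale.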

Every season (3 months) the system is adjusted with new methods and parameter tuning. But these added features have their own problems. If a pro player plays in noob lobbies, he may not see his rating change for a while until the noobs gain proper placements. There is a balance between the enjoyment of using an Elo system and adding constraints to prevent exploitation. Every feature added came with an exploit (e.g., enforcing a balancer increased lobby dodging; only allowing games longer than 15 minutes created leavers; etc.). Generally, adding more constraints makes the Elo system less enjoyable.

It is an interesting and unique scenario in which the host of the game lobby determines balance (there is no matchmaking system). Last season the Elo system enforced balanced lobbies. Given the set of possible team combinations, only the intersection of three conditions is allowed: consecutive pairs of players ordered by rating cannot be on the same team (i.e., the top two players cannot be on the same team, the next pair of players cannot be on the same team, etc.); the average Elo rating of both teams must be under a specific threshold; and, with the candidate splits sorted by the difference between the two teams' average Elo, only the top k results are allowed. However, even with this enforcement, pro players can still farm noobs.
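As I understand the three conditions, a brute-force filter would look roughly like this (a Python sketch; all names are mine, and a real implementation would need to handle larger lobbies more efficiently):

```python
from itertools import combinations

def balanced_splits(ratings, threshold, k):
    """ratings: {player: rating} for an even number of players.
    Returns up to k (team_a, team_b) splits satisfying:
      1. each consecutive pair in the rating order is split across teams,
      2. both teams' average ratings are under `threshold`,
      3. only the k splits with the smallest average-rating gap survive."""
    players = sorted(ratings, key=ratings.get, reverse=True)
    n = len(players)
    pairs = [players[i:i + 2] for i in range(0, n, 2)]
    valid = []
    # Fix the top player on team A so each split is generated only once.
    for rest in combinations(players[1:], n // 2 - 1):
        team_a = (players[0],) + rest
        team_b = tuple(p for p in players if p not in team_a)
        # Condition 1: top-two, next-two, ... cannot share a team.
        if any(all(p in team_a for p in pair) or all(p in team_b for p in pair)
               for pair in pairs):
            continue
        avg_a = sum(ratings[p] for p in team_a) / (n // 2)
        avg_b = sum(ratings[p] for p in team_b) / (n // 2)
        # Condition 2: both team averages must be under the threshold.
        if avg_a >= threshold or avg_b >= threshold:
            continue
        valid.append((team_a, team_b, abs(avg_a - avg_b)))
    # Condition 3: keep only the k most balanced splits.
    valid.sort(key=lambda t: t[2])
    return [(a, b) for a, b, _ in valid[:k]]
```

For a four-player lobby this enumerates only the splits that separate the top two and bottom two players, then ranks them by how close the team averages are.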

I rewrote your entire codebase in Python to support PyTorch for GPU acceleration. There are also some nice libraries to support convex optimization, such as SciPy: https://docs.scipy.org/doc/scipy/tutorial/optimize.html#. Hopefully, by next season (Dec. 20th), I can get the systems and parameters set. I will test some methods, let you know the outcome, and perhaps find some ideas for a more robust rating system for teams.
