Idea: New way to organize many tests #1353

Open · marcharper opened this issue Jul 2, 2020 · 4 comments

@marcharper (Member)
In the course of finding new seeds for many tests for #1288, it occurred to me that we can probably organize these tests in a more useful way. There are many tests in the library like the following:

actions = [(C, C), (C, D), (D, C), (D, D), (C, C), (C, D), (C, C), (D, D), (D, C), (C, D)]
self.versus_test(axelrod.Alternator(), expected_actions=actions, seed=887)

where the player is given by self.Player() from TestPlayer. Sometimes these tests use a MockPlayer as the opponent rather than an actual strategy. Searching for a new seed involves manually extracting the info (opponent, histories, etc.) and looping over seeds until one that reproduces the same behavior is found.
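For concreteness, the manual search amounts to something like the following minimal sketch. The players and target history are made up for illustration, and it assumes a Match.set_seed method for seeding an individual match:

import axelrod as axl

C, D = axl.Action.C, axl.Action.D

# Target joint history, (player_action, opponent_action) per turn.
expected = [(C, C), (D, D), (D, C), (C, D)]

for seed in range(1, 100000):
    match = axl.Match((axl.Random(), axl.Alternator()), turns=len(expected))
    match.set_seed(seed)
    if match.play() == expected:
        print("Seed found:", seed)
        break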

I think a better approach would be to have a large file of expected matches, encoding the range of expected behaviors of every strategy, with rows of the form:
Player1Class, Player2Class, expected_history_1, expected_history_2, seed, other params (like noise), ...
e.g.

Cooperator, Defector, CCCCC, DDDDD, None, ...

Essentially it's a dataframe of tests. We could include a description of the test and other metadata. Maybe some other format would be better, but hopefully you get the idea; a rough sketch of how such a file could drive tests is below.
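As a sketch of how such a file could drive tests, assuming a CSV layout: the file name, column names, and run_expected_match are all illustrative, not an existing API.

import csv

import axelrod as axl
from axelrod.action import str_to_actions


def run_expected_match(row):
    # Instantiate both players by class name from the row.
    player = getattr(axl, row["player1"])()
    coplayer = getattr(axl, row["player2"])()
    turns = len(row["expected_history_1"])
    match = axl.Match((player, coplayer), turns=turns)
    if row["seed"]:
        match.set_seed(int(row["seed"]))
    match.play()
    # Compare the played histories against the encoded expectations.
    assert list(player.history) == list(str_to_actions(row["expected_history_1"]))
    assert list(coplayer.history) == list(str_to_actions(row["expected_history_2"]))


with open("expected_matches.csv") as f:
    for row in csv.DictReader(f):
        run_expected_match(row)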

Such a structure has a few benefits:

  1. The file can encode, in one simple place, all the expected behaviors of a strategy as the histories it should at some point yield, rather than scattering them across various tests as we do currently. This removes a lot of redundant code (regardless of whether a seed is required).

  2. It's easier to find new seeds when we need them. An auxiliary script can easily scan for new seeds if we change something about how seeding works, how a strategy works, etc. Right now there's no easy way to extract all the expected outcomes to systematically find new seeds, because the necessary data is hard-coded into test functions, and often there is more than one "row" per test function.

  3. Similarly, when adding a new strategy, generic search functions can look for an opponent, a seed, etc. that generates a specific sequence of outcomes (see the sketch after this list). We're all currently doing these as one-offs, with MockPlayers and the like.

  4. The associated tests will be more single-issue than some of our current test_strategy functions, which test several different things at once. Each row becomes its own test, so a run reports all the failures, whereas a compound test stops at its first failing subtest.

  5. The collection of expected matches might itself be useful somehow.
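To make benefit 3 concrete, here is a minimal sketch of what such a generic search could look like. find_opponent_and_seed is hypothetical, and short_run_time_strategies is just one convenient pool of opponents to scan:

import axelrod as axl
from axelrod.action import str_to_actions


def find_opponent_and_seed(player_class, target_actions, max_seed=1000):
    # Scan (opponent, seed) pairs until the player produces the target history.
    target = list(str_to_actions(target_actions))
    for opponent_class in axl.short_run_time_strategies:
        for seed in range(1, max_seed):
            match = axl.Match((player_class(), opponent_class()), turns=len(target))
            match.set_seed(seed)
            match.play()
            if list(match.players[0].history) == target:
                return opponent_class, seed
    return None


# e.g. find an opponent and seed that make TitForTat play C, C, D.
print(find_opponent_and_seed(axl.TitForTat, "CCD"))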

Similarly, we have a lot of example tournaments and Moran processes with seed-dependent expected outputs; perhaps they could be encoded in a similar manner (a hypothetical sketch follows below). Not every test can be written this way, but I would guess that the majority could be. This also bumps up against #421, having a way to configure a tournament in a code-free way.
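For example, a tournament analogue could look something like the following. This is purely a hypothetical sketch: none of these keys exist yet, and the expected ranking is only illustrative.

---
players: [Cooperator, Defector, TitForTat]
turns: 200
repetitions: 10
seed: 42
expected_ranked_names: [Defector, TitForTat, Cooperator]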

Thoughts?

@drvinceknight (Member)
I like this idea a lot.

We'll be able to document this dataset as well, and theoretically it could be a useful research asset in its own right. 👍 💪

@marcharper (Member, Author) commented Jul 3, 2020

Great. I hacked out an early version to help me find seeds for #1288 by recording all the invocations of versus_test. It's not perfect, because if a subtest fails the remaining ones are not run, but it did capture a lot of matches, serialized with dataclasses and YAML, like the two below.

---
coplayer:
  init_kwargs: {}
  name: Cooperator
expected_outcome:
  coplayer_actions: CCCCCCCCCCCCCC
  player_actions: CCCCCCDDDDDDDD
  player_attributes: null
match_parameters:
  game: null
  noise: null
  prob_end: null
  seed: null
  turns: null
player:
  init_kwargs:
    initial_plays: null
  name: Adaptive
---
coplayer:
  init_kwargs: {}
  name: Defector
expected_outcome:
  coplayer_actions: DDDDDDDDDDDDDD
  player_actions: CCCCCCDDDDDDDD
  player_attributes: null
match_parameters:
  game: null
  noise: null
  prob_end: null
  seed: null
  turns: null
player:
  init_kwargs:
    initial_plays: null
  name: Adaptive
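For reference, the dataclasses behind that serialization presumably look roughly like the following. This is a reconstruction from the YAML fields above and the attribute accesses in the script below, not the actual implementation:

from dataclasses import dataclass
from typing import Optional

import axelrod


@dataclass
class PlayerConfig:
    name: str
    init_kwargs: dict

    def __call__(self):
        # Look up the strategy class by name and instantiate it,
        # dropping any kwargs left as null in the YAML.
        cls = getattr(axelrod, self.name)
        kwargs = {k: v for k, v in self.init_kwargs.items() if v is not None}
        return cls(**kwargs)


@dataclass
class MatchParameters:
    turns: Optional[int] = None
    noise: Optional[float] = None
    prob_end: Optional[float] = None
    game: Optional[object] = None
    seed: Optional[int] = None


@dataclass
class ExpectedOutcome:
    player_actions: str
    coplayer_actions: str
    player_attributes: Optional[dict] = None


@dataclass
class MatchConfig:
    player: PlayerConfig
    coplayer: PlayerConfig
    match_parameters: MatchParameters
    expected_outcome: ExpectedOutcome

    def __call__(self):
        # Build a playable Match; default the length to the expected history.
        turns = self.match_parameters.turns or len(self.expected_outcome.player_actions)
        noise = self.match_parameters.noise or 0
        return axelrod.Match((self.player(), self.coplayer()), turns=turns, noise=noise)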

From there I was able to cook up a script to run these matches and look for new seeds, or potentially new opponents. It's rough, but here's how it works:

import axelrod
from axelrod import load_matches


def verify_match_outcomes(match, expected_actions1, expected_actions2, attrs):
    # Check that the played sequences match the expected histories.
    player1, player2 = match.players
    for (play, expected_play) in zip(player1.history, expected_actions1):
        if play != expected_play:
            return False
    for (play, expected_play) in zip(player2.history, expected_actions2):
        if play != expected_play:
            return False
    # Check that the final player attributes are as expected.
    if attrs:
        for attr, value in attrs.items():
            if getattr(player1, attr) != value:
                return False
    return True


def run_matches():
    match_configs = list(load_matches())
    for match_config in match_configs:
        try:
            match = match_config()
        except AttributeError:
            continue
        player, coplayer = match.players
        if isinstance(player, axelrod.Human) or isinstance(coplayer, axelrod.Human):
            continue

        print(match_config)

        seed = match_config.match_parameters.seed
        attrs = match_config.expected_outcome.player_attributes
        player_actions = match_config.expected_outcome.player_actions
        coplayer_actions = match_config.expected_outcome.coplayer_actions

        if seed is None:
            # Deterministic match: play it once and verify the outcome.
            match.play()
            print(verify_match_outcomes(match, player_actions, coplayer_actions, attrs))
        else:
            # Stochastic match: search for a seed that reproduces the outcome.
            for seed in range(1, 200000):
                match.set_seed(seed)
                match.play()
                if verify_match_outcomes(match, player_actions, coplayer_actions, attrs):
                    print("Seed found:", seed)
                    break
        print()


if __name__ == "__main__":
    run_matches()

@marcharper (Member, Author)
I'd like to do the same thing for full tournaments and Moran processes.

@drvinceknight (Member)
> I'd like to do the same thing for full tournaments and Moran processes.

Sounds good to me.
