Adding heterogeneous observation social dilemma environment #4

Draft: wants to merge 38 commits into base: main

marimeireles
Collaborator

No description provided.

…observations = 0.5 for partial observable agents

@@ -346,10 +347,64 @@ def TransitionTensor(self):

    def RewardTensor(self):
        return histSjA_RewardTensor(self.baseenv, self.h)


    def ObservationTensor(self):
Collaborator Author

I've changed this function so that it can generate a different observation tensor for each agent.
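As a rough illustration of what a per-agent observation tensor could look like, here is a minimal sketch. It is not the code from this PR: the helper name `make_observation_tensor`, the `noise_per_agent` parameter, and the `(agents, states, observations)` axis layout are all assumptions made for the example. With two states, a fully noisy agent ends up with 0.5 in every entry.

```python
import numpy as np


def make_observation_tensor(n_states, noise_per_agent):
    """Build a hypothetical per-agent observation tensor O[i, s, o].

    O[i, s, o] is the probability that agent i observes o when the
    true state is s. noise_per_agent[i] = 0.0 gives agent i perfect
    observations; 1.0 gives it a uniform (uninformative) observation.
    """
    n_agents = len(noise_per_agent)
    identity = np.eye(n_states)                              # perfect observation
    uniform = np.full((n_states, n_states), 1.0 / n_states)  # no information
    O = np.empty((n_agents, n_states, n_states))
    for i, noise in enumerate(noise_per_agent):
        # Convex mix of the two extremes, so every row still sums to 1.
        O[i] = (1.0 - noise) * identity + noise * uniform
    return O


# Agent 0 observes the state perfectly; agent 1 learns nothing from its observation.
O = make_observation_tensor(n_states=2, noise_per_agent=[0.0, 1.0])
print(O)
```

Because each agent's slice is a convex combination of two row-stochastic matrices, every row `O[i, s, :]` remains a valid probability distribution regardless of the noise level chosen per agent.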

@@ -0,0 +1,170 @@
# AUTOGENERATED! DO NOT EDIT! File to edit: ../../nbs/Environments/02_HeterogeneousObservationsEnv.ipynb.
Collaborator Author

Most of the changes are within this file. It largely adapts the ebase file to handle multiple observations.

@@ -0,0 +1,127 @@
# AUTOGENERATED! DO NOT EDIT! File to edit: ../../nbs/Environments/12_MultipleObsSocialDilemma.ipynb.
Collaborator Author

This file simply implements the social dilemma layer on top of the heterogeneous observation environment file.
I initially tried to incorporate the "contract" idea because I saw it in the Uncertain Environment file. However, I don't really understand the dynamics of the contract and I don't think it's fully functional, so I need to work on it.
I thought it wasn't relevant for our project, as the IPD only has one state. Please let me know if I misunderstood this.

@marimeireles marimeireles marked this pull request as draft March 15, 2024 15:09
Collaborator Author

marimeireles commented Mar 15, 2024

I'm also a bit unsure whether an observation tensor's entries can sum to more or less than 1. I guess the only reason they can't is generate_stochastic_observations, but we could change that.
I'm not sure it makes sense, but I thought an agent could have, for example, a 0% chance of observing something, or tensors that look like [0, 0.8, 0.6, 0.4] or [0.7, 0., 0., 0.]. If this isn't possible, are there reasons other than the use of generate_stochastic_observations in the step function?
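One way to look at this question is a small sketch. It does not reproduce generate_stochastic_observations, whose implementation is not shown here. It only assumes that observations are sampled as mutually exclusive outcomes, which is what NumPy's `Generator.choice` requires of its `p` argument. The helpers `is_valid_observation_row` and `add_null_observation` are hypothetical names introduced for the example.

```python
import numpy as np


def is_valid_observation_row(row, tol=1e-8):
    """True if row is a probability distribution (non-negative, sums to 1)."""
    row = np.asarray(row, dtype=float)
    return bool(np.all(row >= 0) and abs(row.sum() - 1.0) <= tol)


def add_null_observation(row):
    """Append a hypothetical 'observed nothing' outcome that absorbs
    whatever probability mass is missing from the row."""
    row = np.asarray(row, dtype=float)
    missing = 1.0 - row.sum()
    if missing < 0:
        raise ValueError("row sums to more than 1; padding cannot fix it")
    return np.append(row, missing)


rng = np.random.default_rng(0)
sub = [0.7, 0.0, 0.0, 0.0]   # sums to 0.7
over = [0.0, 0.8, 0.6, 0.4]  # sums to 1.8

# Sampling directly from a row that does not sum to 1 is rejected by NumPy.
try:
    rng.choice(len(over), p=over)
    sampling_failed = False
except ValueError:
    sampling_failed = True

# A sub-stochastic row becomes valid once the missing 0.3 is made explicit.
padded = add_null_observation(sub)
obs = rng.choice(len(padded), p=padded)

# A row summing to more than 1 can only be rescaled, which changes its meaning.
normalized = np.asarray(over) / np.sum(over)
```

Under that assumption, a row such as [0.7, 0., 0., 0.] is expressible by adding an explicit "no observation" outcome, while [0, 0.8, 0.6, 0.4] would either have to be normalized or be reinterpreted as independent per-observation detection probabilities, which would be a different observation model than a single categorical draw.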
