Neighboring states are being affected by partial observability #6
Comments
Or maybe that is expected behavior after all, and I'm just confused. Number of agents, which in this case is 2. I think I'm mostly confused about how to correctly represent states such as the ones @wbarfuss and I discussed: p(s=DD, o=D)=1, which gives rise to an observation matrix looking like the following:
The question is: how can one know for sure whether one is observing state D or C when the matrix represents observation pairs?
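To make the question concrete, here is a minimal sketch of one possible representation. It assumes a hypothetical labelling of four joint states (`CC`, `CD`, `DC`, `DD`) and single-letter observations (`C`, `D`), with the agent deterministically seeing one letter of the joint state. The names `STATES`, `OBS`, and `observation_matrix` are illustrative only and are not taken from this project's code.

```python
import numpy as np

STATES = ["CC", "CD", "DC", "DD"]  # hypothetical joint-state labels
OBS = ["C", "D"]                   # hypothetical single-letter observations


def observation_matrix(which):
    """Row-stochastic O[s, o]: the agent observes the letter at position
    `which` of the joint state with probability 1 (an assumption)."""
    O = np.zeros((len(STATES), len(OBS)))
    for i, state in enumerate(STATES):
        O[i, OBS.index(state[which])] = 1.0
    return O


# Agent observes the second letter of the joint state.
O = observation_matrix(1)
print(O)
```

Under these assumptions the row for `DD` puts all of its mass on observation `D`, which is one reading of the p(s=DD, o=D)=1 condition above; each row still sums to 1.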
Closed, as this is what makes RL special.
I suspect there's something wrong with the way we're stepping through the different iterations to generate the flow graphs. I believe that the probabilities of observing states in one tensor are affecting the probabilities of seeing other states in other tensors.
Here's a `h=(2,2,2)` history plot. Agent 1 has partial observability in which the agent's `c,c.|c,c.|` state is completely obscured:

And agent 0 has complete observability.

Here's the plot of the first `c,c.|c,c.|` state and the next 3 states.

My understanding is that obscuring state `c,c.|c,c.|` shouldn't have any influence on the neighboring states, but it does. We can see this if I now plot a history `h=(2,2,2)` with both agents homogeneously fully observing the environment. The expected result is that the three states preceding `c,c.|c,c.|` are exactly the same; however, we observe that they're different:

We observe similar results for heterogeneous agents (as described previously) in a `h=(1,2,2)` scenario:

And for comparison, here's the plot of homogeneous agents for a scenario with `h=(1,2,2)`:
As you can see, the changes are very slight! But they're more pronounced in 4x4 matrices, because instead of 0.0625 we have 0.25, which has a stronger influence on the neighboring graphs. My understanding is that this behaviour shouldn't be happening.
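A small sketch of why an obscured state could leak into its neighbors. It assumes, purely for illustration, that an obscured state has a uniform observation row (0.25 for 4 states, 0.0625 for 16) and that beliefs over states are formed by Bayes' rule with a uniform prior. Neither assumption is confirmed in this thread, and the function names are hypothetical.

```python
import numpy as np


def observation_matrix(n_states, obscured=()):
    """Identity (full observability) except that each obscured state
    gets a uniform observation row. Rows are P(o | s)."""
    O = np.eye(n_states)
    for s in obscured:
        O[s, :] = 1.0 / n_states
    return O


def posterior(O, prior):
    """Bayes' rule: P(s | o) is proportional to P(o | s) * P(s).
    Returns a matrix whose column o holds the belief over states."""
    joint = prior[:, None] * O
    return joint / joint.sum(axis=0, keepdims=True)


for n in (4, 16):
    O = observation_matrix(n, obscured=(0,))
    belief = posterior(O, np.full(n, 1.0 / n))
    # Belief in state 1 after observing o=1; it would be 1.0 if no
    # state were obscured.
    print(n, belief[1, 1])
```

Under these assumptions the belief in an unobscured state drops below 1 because the obscured state's row contributes mass to every observation column. The drop is larger with 4 states than with 16, which is consistent with the stronger effect reported for 4x4 matrices.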