Feature normalization? #32

Open
unrealwill opened this issue Oct 30, 2018 · 0 comments

Hello, I just read the paper today, and there are two points that remain unclear to me.
I looked at the code to try to understand them better, but they are still not clear.

The first point:
In model.py, the feature functions that transform the input state into feature space are defined in nipsHead, universeHead, etc.
In these definitions and their usage, I see no trace of normalization (something like an L2 normalize).
I would expect to see normalization because it seems very easy for the network to cheat: to maximize the reward, it just has to scale the features up (and scale them down in the inverse model so it is not penalized).
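
To make the scaling issue concrete, here is a toy sketch (plain NumPy, with made-up feature vectors; the `intrinsic_reward` helper is hypothetical and only mirrors the forward-model error from the paper). Scaling the feature space by a constant c multiplies the reward by c², while an L2 normalization removes that degree of freedom:

```python
import numpy as np

# Toy illustration (not code from this repo): the intrinsic reward is the
# forward-model prediction error in feature space,
#   r_t = 1/2 * || phi(s_{t+1}) - f(phi(s_t), a_t) ||^2,
# so scaling all features by a constant c scales r_t by c^2.
rng = np.random.default_rng(0)
phi_next = rng.normal(size=8)   # stand-in for phi(s_{t+1})
phi_pred = rng.normal(size=8)   # stand-in for the forward-model prediction

def intrinsic_reward(target, pred):
    return 0.5 * np.sum((target - pred) ** 2)

def l2_normalize(x, eps=1e-8):
    return x / (np.linalg.norm(x) + eps)

c = 10.0
print(intrinsic_reward(phi_next, phi_pred))          # baseline
print(intrinsic_reward(c * phi_next, c * phi_pred))  # c^2 = 100x larger
# With L2 normalization the scale c cancels out entirely:
print(intrinsic_reward(l2_normalize(c * phi_next), l2_normalize(c * phi_pred)))
```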

The second point:
It seems to me that every time the parameters of the feature function are modified, the intrinsic rewards, and therefore the rewards for the whole episode, change, so the generalized advantages for the whole episode must be recomputed. Does this mean that episodes must be processed in their entirety? How does this interact with experience replay? Is there an approximation that avoids recomputing the advantages after every update?
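
For clarity, here is a sketch of the recomputation I have in mind (all names are illustrative, not from this repo; `gae` follows the standard generalized advantage estimation recursion, and `intrinsic_reward_fn` stands in for the current forward-model error):

```python
import numpy as np

def gae(rewards, values, gamma=0.99, lam=0.95):
    """Standard GAE over one episode; `values` has len(rewards) + 1 entries."""
    deltas = rewards + gamma * values[1:] - values[:-1]
    advantages = np.zeros_like(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = deltas[t] + gamma * lam * running
        advantages[t] = running
    return advantages

# Hypothetical helper: states/actions must be stored so the intrinsic rewards
# can be re-evaluated under the *current* feature parameters before GAE.
def recompute_advantages(states, actions, extrinsic, values, intrinsic_reward_fn):
    intrinsic = np.array([intrinsic_reward_fn(s, a, s_next)
                          for s, a, s_next in zip(states[:-1], actions, states[1:])])
    return gae(extrinsic + intrinsic, values)
```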

Thanks.
