A question about the designHead weights in LstmPolicy and StateActionPredictor classes #23

lihuang3 · 2018-04-07T23:46:01Z

Hello, thanks for your great work!

I noticed that in /src/a3c.py, lines 271-277,
self.network = LSTMPolicy(env.observation_space.shape, numaction, designHead)
is defined within the scope "local", and
self.ap_network = StatePredictor(env.observation_space.shape, numaction, designHead, unsupType)
is defined within the scope "predictor", which is nested under the scope "local". I think (based on a test I ran with a simple CNN on MNIST) that this means the designHead weights used by the two classes are different, even though the designHead structures are the same, because they are created under different scopes.
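To make the scoping point concrete, here is a toy sketch in plain Python of how I understand name-keyed variable lookup to behave. It is not TensorFlow and not the code in a3c.py: `VariableStore`, `design_head`, and the variable name `head/W` are hypothetical names I made up for illustration. Unlike TensorFlow, this toy store shares a variable automatically when the same full name is requested again, whereas TensorFlow would require reuse to be enabled on the scope.

```python
import contextlib
import random


class VariableStore:
    """Toy name-keyed variable store (hypothetical, not TensorFlow's API)."""

    def __init__(self):
        self.variables = {}
        self._scopes = []

    @contextlib.contextmanager
    def scope(self, name):
        # Push a scope name for the duration of the `with` block.
        self._scopes.append(name)
        try:
            yield
        finally:
            self._scopes.pop()

    def get_variable(self, name):
        # The full name is the chain of active scopes plus the local name.
        full_name = "/".join(self._scopes + [name])
        if full_name not in self.variables:
            self.variables[full_name] = [random.random()]
        return full_name, self.variables[full_name]


def design_head(store):
    """Stand-in for designHead: requests one weight under the current scope."""
    return store.get_variable("head/W")


store = VariableStore()
with store.scope("local"):
    policy_name, policy_w = design_head(store)   # plays the role of LSTMPolicy
    with store.scope("predictor"):
        pred_name, pred_w = design_head(store)   # plays the role of StatePredictor

# Requesting the head again directly under "local" resolves to the same name.
with store.scope("local"):
    shared_name, shared_w = design_head(store)

print(policy_name)           # local/head/W
print(pred_name)             # local/predictor/head/W
print(policy_w is pred_w)    # False: different scopes, separate weights
print(shared_w is policy_w)  # True: same full name, same weights
```

The same head function produces two separate weights only because the enclosing scope differs, which is what I believe happens with designHead here.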

In the LSTMPolicy class, the inputs are fed into designHead, and its outputs are fed into the LSTM for policy and value function prediction.
However, in the StatePredictor/StateActionPredictor classes, the forward and inverse models are built on a designHead with different weights because, as mentioned above, LSTMPolicy and StatePredictor are in different scopes.

I was wondering why, at /src/a3c.py lines 271-277, LSTMPolicy and StatePredictor are not placed under the same scope so that their designHead would share weights. In other words, if they use different weights, it seems that the forward and inverse models are trained independently of the A3C policy and value function, while the A3C policy and value function are still affected by the forward loss, which serves as the intrinsic reward.

Thank you,

Li
