This repository has been archived by the owner on Dec 11, 2022. It is now read-only.

TimeDistributed LSTM Middleware #461

Open
OGordon100 opened this issue Sep 8, 2020 · 0 comments
OGordon100 commented Sep 8, 2020

For many real-world situations, the task may have hidden state or partially observable features, making the Markov assumption only partially valid.

One way around this is frame stacking, which is already doable in Coach with filters.observation.observation_stacking_filter. It may be even better to use an LSTM (or a bidirectional LSTM). Agents for this already exist, with the very well-cited DRQN being one of them.
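For context, here is a minimal sketch of what frame stacking does conceptually. This is not Coach's filter implementation; the `FrameStacker` class, the window size `k`, and the observation shape are all hypothetical, purely to illustrate the idea of presenting the last k observations as one stacked input.

```python
from collections import deque

import numpy as np


class FrameStacker:
    """Illustrative frame stacker: keeps the last k observations and
    presents them as one stacked observation of shape (k, *obs_shape)."""

    def __init__(self, k: int, obs_shape: tuple):
        self.k = k
        self.frames = deque([np.zeros(obs_shape)] * k, maxlen=k)

    def observe(self, obs: np.ndarray) -> np.ndarray:
        # Append the newest observation; the deque drops the oldest one.
        self.frames.append(obs)
        return np.stack(list(self.frames), axis=0)


# Usage: stack the last 4 frames of an (84, 84) observation.
stacker = FrameStacker(k=4, obs_shape=(84, 84))
stacked = stacker.observe(np.random.rand(84, 84))  # shape (4, 84, 84)
```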

Coach currently has the LSTMMiddleware layer. However, from what I understand of the source code, it runs along the observations axis (for inputs such as text). TensorFlow, of course, has the TimeDistributed wrapper (together with LSTM's return_sequences=True) to run an LSTM along the temporal axis, across transitions.

Could a time-distributed LSTM be added as a middleware? (Or at the very least "hacked" in; it would be of immense benefit to my current research, where I am using a simple behavioural cloning agent.) A sketch of the kind of architecture I mean is below.
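To make the request concrete, here is a plain tf.keras sketch of the desired architecture, not Coach's middleware API. The window length, observation dimension, layer sizes, and the per-timestep logits head are all assumptions chosen for illustration: a shared embedder is applied to each transition via TimeDistributed, and an LSTM then runs along the temporal axis.

```python
import tensorflow as tf

# Hypothetical shapes: a window of 8 consecutive transitions,
# each with a flat 64-dimensional observation.
time_steps, obs_dim = 8, 64

inputs = tf.keras.Input(shape=(time_steps, obs_dim))

# Apply the same embedder to every transition in the window ...
x = tf.keras.layers.TimeDistributed(
    tf.keras.layers.Dense(128, activation="relu")
)(inputs)

# ... then run an LSTM along the temporal axis, across transitions.
# return_sequences=True keeps one output per timestep (DRQN-style);
# set it to False to keep only the final hidden state.
x = tf.keras.layers.LSTM(256, return_sequences=True)(x)

# Example head: per-timestep action logits (e.g. for behavioural cloning).
outputs = tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(4))(x)

model = tf.keras.Model(inputs, outputs)
model.summary()
```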
