Modifies loss storage in TD3 and Soft Actor Critic #195
base: master
Conversation
Friendly reminder on this :)

The changes will probably not break anything, but can you elaborate on why you suggest them? If it is just for consistency, I think it is better to force recorded losses to be floats, not scalar ndarrays, for simplicity.
To justify it properly I would need more data; I did it mainly for consistency. That said, it may require less memory in some instances (at least some informal experiments showed slight improvements in memory usage, though I can't be sure without a more complete experiment).
To be clear, we do indeed force them to be floats, right? We simply do … Do you remember why we did …?
Ah, sorry, I missed that.
IIRC I was just unaware that it is possible to directly cast a torch.Tensor to float. After I learned it is safe to cast (pytorch/pytorch#1129), I switched. I think it is better to choose the simpler way unless there is a memory leak issue with it. There seems to be yet another way of doing this,
Oh, thanks! Yes, this tells me that it is implicitly detaching from the computation graph (which is apparently what we do when we call detach()). Perhaps I can replace instances of …
Sounds good. I think it is OK to write it like …
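For reference, the three conversion styles discussed in this thread can be sketched as follows. This is a minimal illustration, not code from the PR; it only assumes a scalar loss tensor:

```python
import torch

# A scalar loss tensor still attached to the computation graph.
loss = torch.tensor(2.0, requires_grad=True) ** 2

# Three ways to record it as a plain number:
as_float = float(loss)        # implicit .item(); returns a detached Python float
as_item = loss.item()         # explicit; also returns a Python float
as_ndarray = loss.detach().cpu().numpy()  # returns a 0-d numpy.ndarray, not a float

print(as_float, as_item, type(as_ndarray))
```

Both `float(loss)` and `loss.item()` copy the value out without keeping the graph alive, which is why recording losses this way avoids the memory concern; `.detach().cpu().numpy()` is detached too, but yields a scalar ndarray rather than a float.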
Note that the changes we have here follow what we do in DQN and in DDPG.
For example:
pfrl/pfrl/agents/dqn.py, line 358 (commit b29533b)
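The recording pattern being aligned on can be sketched as below. This is a hypothetical sketch (class and attribute names are illustrative, not pfrl's actual API), showing only the idea of appending plain floats to a bounded statistics buffer:

```python
from collections import deque

import torch


class LossRecorder:
    """Illustrative sketch: record scalar losses as plain Python floats."""

    def __init__(self, maxlen=100):
        # Bounded buffer so recorded statistics cannot grow without limit.
        self.loss_record = deque(maxlen=maxlen)

    def record(self, loss):
        # float(...) detaches and converts, so no computation graph
        # (or GPU tensor) is kept alive by the recorded value.
        self.loss_record.append(float(loss))


recorder = LossRecorder()
recorder.record(torch.tensor(0.5, requires_grad=True))
print(recorder.loss_record)
```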