Balance and order volume are not explained in the docs. Reward-gaming behavior observed in some models. #90

Open
astrologos opened this issue May 17, 2023 · 3 comments

astrologos commented May 17, 2023

Hi @AminHP @bionicles @super-pirata, could you please update the docs to explain the agent's asset pool and how the volume of an order is determined?
An example of how to change these would also be helpful, as the README is opaque in this regard.

I have observed what I suspect is reward-gaming behavior in my trained agents, and I'm wondering whether it is due to an under-specified environment. My agents (tested with DQN, PPO, and A2C) settle into a stable valley where the learned policy is to sell first, then buy and keep buying. This results in a profit slightly shy of 1.

Clearly this should not be valid behavior in any trading environment.
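
For anyone who wants to check whether their own agents have collapsed into the same pattern, here is a minimal sketch that scans logged actions per episode. It is not part of the environment: the `SELL = 0` / `BUY = 1` encoding and the helper names `is_sell_then_hold_buy` and `degenerate_fraction` are assumptions for illustration, so adjust them to match your setup.

```python
from typing import Sequence

# Assumed action encoding (hypothetical); change to match your environment.
SELL, BUY = 0, 1


def is_sell_then_hold_buy(actions: Sequence[int]) -> bool:
    """Return True if the episode is a single SELL followed only by BUY actions."""
    if len(actions) < 2:
        return False
    return actions[0] == SELL and all(a == BUY for a in actions[1:])


def degenerate_fraction(episodes: Sequence[Sequence[int]]) -> float:
    """Fraction of logged episodes that follow the sell-then-keep-buying pattern."""
    if not episodes:
        return 0.0
    hits = sum(is_sell_then_hold_buy(ep) for ep in episodes)
    return hits / len(episodes)
```

A fraction close to 1.0 across evaluation episodes would suggest the policy ignores the price signal entirely, which is consistent with the behavior described above.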

Thanks!
Jack

@astrologos
Author

[Attachment: 873b8264-0ca6-404f-b70e-2079a6047691]
Training loss associated with this policy is 0.0008.

@astrologos
Author

[Attachment: 087e6047-fbd1-4ac8-975b-749c3befe08a]

@astrologos
Author

astrologos commented May 22, 2023

Hi, I am writing to report that I've created an alternative trading environment that is more advanced, highly customizable, and simpler to use, render and evaluate. It dodges the issues above.

You can find the repo here:
https://www.github.com/astrologos/tradinggym
