Do the trading environments only allow for 1 share to be held at a time? #100

Open
ShinyOrbThing opened this issue Mar 8, 2024 · 2 comments

Comments

@ShinyOrbThing

ShinyOrbThing commented Mar 8, 2024

Hi, I am reading through the code since I plan to use this environment for a project, and I've realised that an action is only considered a trade if we change from a buy-action sequence to a sell-action or vice versa.

If at a time point t, the agent buys, and then at the next time point t+1 it also suggests a buy action, will the agent now hold 2 shares? I haven't been able to find evidence in the code that the second buy action is considered, and that a second share is bought. It seems that once there is a buy action, all subsequent buy actions pertain to "holding"?

Could you outline how the subsequent buy orders are handled please?

@AminHP
Owner

AminHP commented Mar 20, 2024

Hi @ShinyOrbThing, that's right. The second buy action is actually a hold. The share amount is always 1: you can only buy or sell one share.
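To make the "repeated buy is a hold" behaviour concrete, here is a minimal, self-contained sketch of that position logic. It is not the library's code: `step_position`, `count_trades`, and the choice of `Positions.Short` as the starting position are assumptions made for illustration, and the enums are redefined locally so the snippet runs on its own.

```python
from enum import Enum


class Actions(Enum):
    Sell = 0
    Buy = 1


class Positions(Enum):
    Short = 0
    Long = 1


def step_position(position, action):
    """Return (new_position, traded) for a single step (hypothetical helper)."""
    # A trade happens only when the action contradicts the current position.
    flips = (
        (action == Actions.Buy and position == Positions.Short)
        or (action == Actions.Sell and position == Positions.Long)
    )
    if flips:
        new_position = (
            Positions.Long if position == Positions.Short else Positions.Short
        )
        return new_position, True
    # Same-direction action: nothing is bought or sold, the position is held.
    return position, False


def count_trades(actions, start=Positions.Short):
    """Replay a list of actions and count how many actually traded."""
    position, trades = start, 0
    for action in actions:
        position, traded = step_position(position, action)
        trades += int(traded)
    return position, trades


# Buy, Buy, Buy, Sell: only the first Buy and the final Sell are trades.
final_position, n_trades = count_trades(
    [Actions.Buy, Actions.Buy, Actions.Buy, Actions.Sell]
)
```

Under these assumptions, the second and third `Buy` leave the position unchanged, so the holding never grows beyond one unit.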

@ShinyOrbThing
Author

ShinyOrbThing commented Apr 1, 2024

> Hi @ShinyOrbThing , that's right. The second buy action is actually holding. The share amount is always 1, you can buy or sell one share.

Hi Amin, as a follow-up: It appears that the update_profit calculation is assuming an "all-in" position every time we buy.

In stocks_env.py, the profit is calculated as:

    if self._position == Positions.Long:
        shares = (self._total_profit * (1 - self.trade_fee_ask_percent)) / last_trade_price
        self._total_profit = (shares * (1 - self.trade_fee_bid_percent)) * current_price

Ignoring the ask and bid fees, the profit update reduces to

$$\text{new profit} = \text{current profit} \times \frac{\text{current price}}{\text{last trade price}}$$

so if the current price is 0, the profit becomes 0 regardless of its previous value. In other words, a long position appears to be treated as all-in on every buy rather than as a single share, or am I missing something? This differs from the reward calculation, which uses absolute price differences instead, so the two metrics often disagree. Please see the example below from a signal I simulated.

[Image: a2c_sine8_plot]
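The disagreement between the two metrics can also be reproduced with a small numeric sketch. The functions below are hypothetical stand-ins, not the environment's code: `total_profit` compounds multiplicatively (fees ignored, starting from 1.0), while `total_reward` sums absolute price differences for one unit per trade. The prices are made up for illustration.

```python
def total_profit(entry_exit_pairs, start=1.0):
    """Compounding: each long trade multiplies the running total by exit / entry."""
    total = start
    for entry, exit_price in entry_exit_pairs:
        total *= exit_price / entry
    return total


def total_reward(entry_exit_pairs):
    """Additive: each long trade contributes the price difference exit - entry."""
    return sum(exit_price - entry for entry, exit_price in entry_exit_pairs)


# Two long trades given as (entry price, exit price); values are illustrative.
trades = [(10.0, 5.0), (100.0, 110.0)]

profit = total_profit(trades)  # 1.0 * 0.5 * 1.1: the all-in view loses about 45%
reward = total_reward(trades)  # -5 + 10: the one-share view gains 5

# A single exit at price 0 wipes out the compounded total entirely.
wiped = total_profit([(10.0, 5.0), (5.0, 0.0)])
```

Here the compounded profit ends below its starting value while the summed reward is positive, which is the kind of mismatch described above.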
