Binance_trading_simulation

This project is about finding the optimal fee mechanism for an exchange. RL agents act as traders under a given fee policy, and we observe how their behavior changes as the fee mechanism changes. The fee mechanism affects the total trade volume and the total fee collected. This project is maintained as part of the Binance Fellowship.

Project overview video: https://youtu.be/kBjv4KmkEHU

Our project environment is based on https://github.com/Yvictor/TradingGym/

Project explanation: https://medium.com/decon-simulation/dynamic-fee-mechanism-simulation-with-reinforcement-learning-97c847aa5c

Project explanation (Korean): https://medium.com/@jeffrey_7616/dynamic-fee-mechanism-simulation-with-reinforcement-learning-6d15951dec05

Structure

  1. agent stores the trading agents and specifies how to train and use them.
  2. data stores the historical data used to train the agents.
  3. env stores the environments in which the different fee mechanisms are applied (a minimal sketch of such an environment follows this list).
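
As a rough illustration of what env provides, here is a minimal sketch of a fee-parameterized environment. FeePolicy, FeeAwareEnv and settle are hypothetical names used only for this example; the real environments are built on TradingGym and have their own API.

# Minimal, hypothetical sketch of a fee-parameterized environment.
from dataclasses import dataclass

@dataclass
class FeePolicy:
    rate: float = 0.003                     # flat fee rate, e.g. 0.3%

    def fee(self, trade_value: float) -> float:
        return abs(trade_value) * self.rate

class FeeAwareEnv:
    def __init__(self, policy: FeePolicy):
        self.policy = policy
        self.total_volume = 0.0             # metrics the simulation tracks
        self.total_fee = 0.0

    def settle(self, trade_value: float) -> float:
        """Book a trade and return the fee charged on it."""
        fee = self.policy.fee(trade_value)
        self.total_volume += abs(trade_value)
        self.total_fee += fee
        return fee

env = FeeAwareEnv(FeePolicy(rate=0.005))
env.settle(1000.0)                          # a trade worth 1000 -> fee of 5.0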

Simulation Method

  1. Train RL agents using the trading gym.

  2. Transfer the agents to environments where different fee mechanisms are applied. Each agent is trained for 500 more episodes to adapt to its environment. Agents are also differentiated by varying the risk_aversion ratio, so that some agents prefer risk while others avoid it.

  3. Observe how the agents behave in each environment, in particular the total_volume and total_fee produced in each environment, and derive insights into which characteristics of a fee mechanism make the difference. (A sketch of this procedure follows the list.)
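
A hedged sketch of that procedure is below. make_env, train_agent and clone are hypothetical placeholders, as are the risk_aversion values; the actual entry points are the scripts under agent/ (for example agent/DQN/transfer_learning.py).

# Sketch of the simulation procedure; only the control flow mirrors the README,
# the helper names are hypothetical.
FEE_ENVS = ["no_fee", "fee_0.003", "fee_0.005",
            "bollinger", "rsi", "macd", "stochastic_slow"]
RISK_AVERSIONS = [0.1, 0.5, 1.0]                 # some agents prefer risk, others avoid it

def run_simulation(make_env, train_agent, base_agent):
    results = {}
    for env_name in FEE_ENVS:                    # step 2: transfer to each fee environment
        for risk in RISK_AVERSIONS:
            agent = base_agent.clone(risk_aversion=risk)
            env = make_env(env_name)
            train_agent(env, agent, episodes=500)        # adapt for 500 more episodes
            results[(env_name, risk)] = (env.total_volume, env.total_fee)  # step 3: observe
    return results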

Future Plan

Provide an environment where limit orders, and hence lagged order matching, are available, to reflect a more realistic trading environment.

Adopted fee mechanisms (more could be added; a sketch of an indicator-bound fee follows the list)

  1. No fee
  2. fee = 0.003 (0.3%)
  3. fee = 0.005 (0.5%)
  4. Bollinger band bound environment
  5. RSI bound environment
  6. MACD bound environment
  7. Stochastic slow bound environment
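
The README does not spell out how the indicator-bound environments map an indicator value to a fee, so the following is only a guess at the general idea: scale the fee rate between a lower and an upper bound using the indicator. rsi_bound_fee and the 0.1%/0.5% bounds are assumptions for illustration.

# Hedged sketch of an indicator-bound dynamic fee (illustrative only; the
# actual rule lives in env/ and may differ).
import numpy as np

def rsi(closes: np.ndarray, period: int = 14) -> float:
    """Classic RSI computed on the last `period` price changes."""
    deltas = np.diff(closes[-(period + 1):])
    gains = deltas[deltas > 0].sum()
    losses = -deltas[deltas < 0].sum()
    if losses == 0:
        return 100.0
    rs = gains / losses
    return 100.0 - 100.0 / (1.0 + rs)

def rsi_bound_fee(closes: np.ndarray, low: float = 0.001, high: float = 0.005) -> float:
    """Map RSI in [0, 100] linearly onto a fee rate in [low, high]."""
    return low + (high - low) * rsi(closes) / 100.0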

Algorithms used for the trading agents

PPO

https://arxiv.org/abs/1707.06347
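
For reference, the core of PPO in the linked paper is the clipped surrogate objective; below is a minimal PyTorch-style sketch of that loss, not the code used in agent/PPO.

# Clipped surrogate loss from the PPO paper (arXiv:1707.06347); a textbook
# form, not the project's own loss implementation.
import torch

def ppo_clip_loss(log_probs, old_log_probs, advantages, clip_eps=0.2):
    ratio = torch.exp(log_probs - old_log_probs)                 # pi_theta / pi_theta_old
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantages
    return -torch.min(unclipped, clipped).mean()                 # negate to minimize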

Rainbow

https://arxiv.org/abs/1710.02298

Attention

http://nlp.seas.harvard.edu/2018/04/03/attention.html

Performance at the trading gym

[Performance GIF]

Usage

pip install -r requirements.txt
  1. Train the original agents

cd agent/PPO
python ppo_start.py

cd agent/Attention
python attention_start.py

cd agent/DQN
python dqn_start.py

If you want to train multiple agents:

cd agent/DQN
bash run.sh
  2. Transfer learning
cd agent/DQN
python transfer_learning.py --environment=[environment]

If you want to run transfer learning for multiple agents:

cd agent/DQN
bash transfer.sh
  3. Observation
cd agent/DQN

Open the Observation notebook and run all cells.

Results

Total fee and total volume under different fee rates

[total_fee and total_volume plots]

How data features affect the trading agent's decisions

Using integrated gradients, we can interpret how the agents observe the data. The X axis represents actions and the Y axis represents the data features. The graph shows how each data feature affects the trading agent's action decision. You can see that the weight distribution over features differs depending on the training algorithm.
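
A minimal sketch of integrated gradients for a policy network is below, assuming a PyTorch model that maps an observation batch to action logits; the model, input shape and target_action are placeholders, and the Observation notebook's own implementation may differ.

# Integrated gradients: attribute an action logit to the input features by
# averaging gradients along a straight-line path from a zero baseline.
import torch

def integrated_gradients(model, x, target_action, steps=50):
    baseline = torch.zeros_like(x)                          # all-zero reference input
    alphas = torch.linspace(0.0, 1.0, steps).view(-1, *([1] * x.dim()))
    path = baseline + alphas * (x - baseline)               # (steps, *x.shape)
    path.requires_grad_(True)
    logits = model(path)[:, target_action]                  # logit of the chosen action
    grads = torch.autograd.grad(logits.sum(), path)[0]
    return (x - baseline) * grads.mean(dim=0)               # per-feature attribution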

[Rainbow attribution plot]

Same agent and same OHLCV state, but different decisions

[DifferentDecision figure] The figure shows the trading volume of Agent1 under different fee environments: the same agent, in the same OHLCV situation, makes different decisions.
