
outstanding balance function #3059

Open
orehunt wants to merge 15 commits into freqtrade:develop from orehunt:outstanding_balance_function

Conversation

@orehunt (Contributor) commented Mar 12, 2020

Summary

This adds a function that should be able to calculate the balance of all open trades on each candle, using close prices.
However, the results don't match those of the function in the issue :)
closes #3020

Quick changelog

  • added outstanding balance function

@coveralls commented Mar 12, 2020

Coverage Status

Coverage decreased (-0.3%) to 97.806% when pulling def8dec on orehunt:outstanding_balance_function into d758b0c on freqtrade:develop.

- comments for analyze_trade_parallelism
@orehunt orehunt force-pushed the outstanding_balance_function branch from 065dd75 to dcd85e3 Compare March 12, 2020 13:47
@xmatthias (Member):

I still struggle to see why we need this - and what exactly it delivers.

I think this is a rather odd approach to the calculation - and honestly, I think the formula used is wrong, since in theory all you should need to do is:

  • create a timeseries dataframe (per pair) spanning the total length of the timerange - exploding every trade into multiple lines (one line per candle) - code for this is available in analyze_trade_parallelism() and only needs to be modified a bit
  • multiply the amount of the trade by the selected price column at that point (probably the open or close price)
  • combine all "value" columns (output of the above) into one dataframe, one column per pair
  • sum over all columns -> total amount per candle.

The calculation (1 - open_rate / close - slippage) should not be necessary - and looks wrong to me - unless I completely misunderstood what you're trying to do.
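The explode-and-sum recipe described above could be sketched roughly like this. This is not the actual freqtrade API - the column names ("pair", "open_time", "close_time", "amount") mirror the backtest results dataframe but are assumptions here:

```python
import pandas as pd

def open_value_per_candle(trades: pd.DataFrame,
                          candles: dict,
                          freq: str = "5min") -> pd.Series:
    """Explode each trade into one row per candle it was open, value it at
    the candle close, and sum across pairs. candles maps pair -> OHLCV df."""
    per_pair = []
    for pair, ohlc in candles.items():
        rows = trades[trades["pair"] == pair]
        parts = []
        for _, t in rows.iterrows():
            # one timestamp per candle the trade was open
            dates = pd.date_range(t["open_time"], t["close_time"], freq=freq)
            parts.append(pd.DataFrame({"date": dates, "amount": t["amount"]}))
        if not parts:
            continue
        exploded = pd.concat(parts).set_index("date")
        prices = ohlc.set_index("date")["close"]
        # value of the open position at each candle
        value = (exploded["amount"] * prices).dropna()
        per_pair.append(value.groupby(level=0).sum().rename(pair))
    # one column per pair; row-wise sum -> total value per candle
    return pd.concat(per_pair, axis=1).fillna(0).sum(axis=1)
```

With position stacking the date index repeats per trade, which the per-pair groupby-sum collapses before the columns are combined.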

@orehunt (Contributor, Author) commented Mar 15, 2020

i still struggle to see why we need this - and what exactly this delivers.

The point is to show how much volatility there was intra-trade. You only know that each trade stayed above the stoploss and below the roi during that time, but you don't know by how much.

* create a timeseries dataframe (per pair) with the total length of the timerange - exploding every trade into multiple lines (one line per candle) - code for this is available from `analyze_trade_parallelism()` and only needs to be modified a bit

I also did it this way for speed - this is very fast (although mine also uses mostly numpy).

* multiply the amount of the trade with the selected pricecolumn at that point (probably open or close price)

this doesn't work if you have position stacking enabled; you need to keep track of the duration of each trade. Unless I am mistaken, BacktestResult doesn't have amounts in either base or quote currency, but with the base amount it could maybe be done by collapsing all the amounts... or not, hmm, not sure

  • combine all "value" columns (output of the above) into one dataframe, one column per pair

  • sum over all columns -> get total amount per candle.

this is what's done at the end

The calculation (1 - open_rate / close - slippage) should not be necessary - and is wrong to me - unless i completely misunderstood what you're trying to do.

you're right in the sense that, calculated this way, it assumes an amount of 1 for each pair, so the amounts would be unequal between pairs - but that should be inconsequential for what I was looking for. It could be fixed by replacing open_rate with stake / open_rate.
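The stake / open_rate fix could look like this as a toy calculation (all numbers made up, slippage fraction assumed):

```python
stake = 0.05        # stake per trade, in stake currency (assumed)
open_rate = 8000.0  # entry price
close = 8200.0      # candle close price
slippage = 0.002    # assumed fees/slippage fraction

amount = stake / open_rate               # base-currency amount bought
value = amount * close * (1 - slippage)  # current value in stake currency
```

This values every pair with the same stake, instead of implicitly assuming an amount of 1 per pair.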

@xmatthias (Member) left a comment

I've been trying to engineer a case where I can see the output or test how this works for at least the last hour, but neither with data from a database (loaded via load_trades_from_db()) nor with data from a backtest result (loaded via load_backtest_data()) have I been able to get any output other than a one-line dataframe with 0 balance and the date of the last day of the first month - so I'll have to assume it's somehow broken / non-functioning.

Please provide us with a working example of how this can be used so we can play with and test it - otherwise we'll have to close this, as it's currently non-functioning (at least in every way I tried to call it).
This can be either a test case which asserts the output (showing us how to call it and how it works) - or a small example in a comment / notebook so we can test it.

Review thread on freqtrade/data/btanalysis.py (outdated, resolved)
@orehunt (Contributor, Author) commented Apr 30, 2020

I added an implementation as a loss function based on Sortino.
Bear in mind that this is basically an upsampling, which means it adds noise - which is generally not what you want on smaller timeframes. So it could be useful as a buy-and-hold metric, e.g. with a 1d timeframe and 3-7d trade durations, where you can see volatility day by day.

Note that hyperopt needs to pass the OHLCV data (processed) as a kwarg.

@xmatthias (Member) left a comment

Due to the implementation of this (looping for hops_max), this is very slow.

Running this for a 5m timeframe with 10 pairs over ~1 month takes upwards of 12s (after fixing the timeframe bug, obviously).

With that kind of performance, using this for hyperopt is near impossible.

I'm also still not 100% certain what this should return - is it the balance (in stake currency) at every moment in time?

I think a calculation like the following should work too - while avoiding the "max_hops" loop and therefore being rather performant (<1s for the same dataset as the above test).

import numpy as np
import pandas as pd
from datetime import datetime
from typing import Dict

from freqtrade.exchange import timeframe_to_minutes


def calculate_outstanding_balance_1(
    results: pd.DataFrame,
    timeframe: str,
    min_date: datetime,
    max_date: datetime,
    hloc: Dict[str, pd.DataFrame],
    slippage: float = 0.002,
) -> pd.DataFrame:
    """
    Sums the value of each trade (both open and closed) on each candle
    :param results: Results Dataframe
    :param timeframe: Frequency used for the backtest
    :param min_date: date of the first trade opened (results.open_time.min())
    :param max_date: date of the last trade closed (results.close_time.max())
    :param hloc: historical DataFrame of each pair tested
    :param slippage: optional fraction (slippage / fees) to subtract per trade
    :return: DataFrame of outstanding balance at each timeframe
    """
    # Assume a stake amount of 0.1 (should be a parameter)
    stake_amount = 0.1
    results['amount'] = stake_amount / results['open_rate']

    timeframe_min = timeframe_to_minutes(timeframe)
    dates = [pd.Series(pd.date_range(row[1].open_time, row[1].close_time,
                                     freq=f"{timeframe_min}min", tz='UTC'))
             for row in results[['open_time', 'close_time']].iterrows()]
    deltas = [len(x) for x in dates]
    dates1 = pd.Series(pd.concat(dates).values, name='date')
    # df2 is a dataframe containing one row per candle for the duration the trade was open
    df2 = pd.DataFrame(np.repeat(results.values, deltas, axis=0), columns=results.columns)
    df3 = pd.concat([dates1, df2], axis=1)
    df3 = df3.set_index('date')
    # df3 has the dates set correctly (one date per candle / trade). Date is only unique per pair.
    values = {}
    for pair in hloc:
        ohlc = hloc[pair].set_index('date')
        df_pair = df3.loc[df3['pair'] == pair].copy()
        # filter on pair and convert dateindex to utc
        # * Temporary workaround
        df_pair.index = pd.to_datetime(df_pair.index, utc=True)

        # Combine trades with ohlc data
        df4 = df_pair.merge(ohlc, left_on=['date'], right_on=['date'])
        # Calculate the value at each candle
        df4['current_value'] = df4['amount'] * df4['open']
        # subtract slippage / fees (default 0.002)
        df4['value'] = df4['current_value'] * (1 - slippage)
        values[pair] = df4

    balance = pd.concat([df[['value']] for df in values.values()])
    return balance
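Since the concatenated output stacks per-pair 'value' rows (so the date index repeats), a groupby-sum over the index would collapse it to one total per candle. A sketch with made-up values:

```python
import pandas as pd

# Hypothetical stacked per-pair 'value' rows, as the function above returns
# them: two pairs contribute a row for each of two candles.
balance = pd.DataFrame(
    {"value": [10.0, 5.0, 11.0, 6.0]},
    index=pd.to_datetime(["2020-01-01 00:00", "2020-01-01 00:00",
                          "2020-01-01 00:05", "2020-01-01 00:05"]),
)
# one total (in stake currency) per candle
total = balance.groupby(balance.index)["value"].sum()
```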

Results seem similar (the graphical representation is similar) to yours - but I think the scale is different... or there is some other detail I'm currently missing.

(This is the reason this took a while - I had to find some uninterrupted time to work on this...)

timedelta = pd.Timedelta(timeframe)

date_index: pd.DatetimeIndex = pd.date_range(
    start=min_date, end=max_date, freq=timeframe, normalize=True
Review comment (Member):

This is wrong when used for short timeframes.

Assuming "5m" - if that's passed to pd.date_range, it'll create an index with a frequency of "5M" - which is 5 months.

What should be used instead is something along the following lines (a sample of this can be found in analyze_trade_parallelism):

    from freqtrade.exchange import timeframe_to_minutes
    timeframe_min = timeframe_to_minutes(timeframe)
    ... 
    freq=f"{timeframe_min}min"
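The safe pattern can be sketched end to end like this; the lookup table below is a stand-in for freqtrade's timeframe_to_minutes(), not its actual implementation:

```python
import pandas as pd

# Stand-in for freqtrade.exchange.timeframe_to_minutes (assumption)
TIMEFRAME_MINUTES = {"1m": 1, "5m": 5, "15m": 15, "1h": 60, "4h": 240, "1d": 1440}

def timeframe_freq(timeframe: str) -> str:
    """Turn a timeframe like '5m' into a pandas-safe frequency like '5min'."""
    return f"{TIMEFRAME_MINUTES[timeframe]}min"

idx = pd.date_range("2020-01-01", periods=4, freq=timeframe_freq("5m"))
# consecutive candles are now 5 minutes apart, not 5 months
```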

Review thread on freqtrade/data/btanalysis.py (outdated, resolved)
@orehunt (Contributor, Author) commented Jun 10, 2020

I imagine that doesn't work with position stacking?

# df3 has the dates set correctly (one date per candle / trade). Date is only unique per pair.

This makes me think so

@xmatthias (Member):

it should work with position stacking as well ...

It was just a quick comment thrown in there to explain what it does ...
Mostly to explain that the date is NOT unique - if you look above those lines, you'll notice that it's not per pair, but per trade.

@orehunt (Contributor, Author) commented Jun 10, 2020

A couple of things I noticed:

  • the value is calculated using open, while I was always using close - except for the close_time candles (rows), which use the profit (abs) column. The reason is that when you buy, and you know the trade is not closed, using open doesn't give you information... however, since the open of the next candle is usually around the same value, it doesn't change much.
  • that calculation gives a compact array, while mine returned a sparse array; 0 rows are important both as a loss function and as a source for plotting
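The valuation rule described in the first point could be sketched like this (column names such as profit_abs are assumptions, made-up numbers):

```python
import numpy as np
import pandas as pd

# Hypothetical per-candle rows for one trade: use the candle close while
# the trade is open, and the absolute profit on the candle where it closes.
df = pd.DataFrame({
    "date": pd.date_range("2020-01-01", periods=3, freq="60min"),
    "close": [100.0, 102.0, 101.0],
    "close_time": [pd.Timestamp("2020-01-01 02:00")] * 3,
    "amount": [0.5] * 3,
    "profit_abs": [1.2] * 3,  # realized profit, only used on the last row
})
is_closing = df["date"] == df["close_time"]
df["value"] = np.where(is_closing, df["profit_abs"], df["amount"] * df["close"])
```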

I'm also still not 100% certain what this should return - is it the balance (in stake currency) at every moment in time?

Yes. I would resample the values like this, since you already used absolute values:

balance = balance.resample(freq).agg({"value": "sum"}).asfreq(freq)

and maybe do a cumsum and plot it like plot-profit

Due to the implementation of this (looping for hops_max) this is very slow.

It was the heaviest part, that's also why there was the other loop which was of similar perf. iirc I was testing with 1h tf and avg trade duration of max 6h so worst case was 6 hops while if you keep same trade duration over 5m it would be worst 72 hops

@xmatthias (Member):

and maybe do a cumsum and plot it like plot-profit

Cumsum cannot be used on this - the output is the total balance in trades at any point in time ...
If I own $100 now, and $101 in 1 hour - then I won't own $201 (which would be the outcome of a cumsum).
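A tiny numeric illustration of that point (values invented):

```python
import pandas as pd

# Made-up per-candle "balance tied up in open trades"
in_trades = pd.Series([100.0, 101.0, 99.0])
# A cumsum would claim $201 after two candles, but the second value
# already *is* the total in trades at that moment.
wrong = in_trades.cumsum()
```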


I'm also not certain of the usefulness of this in a loss function - all it'll do is analyze how much money was in trades at any point in time. It'll not give you the account balance at that moment, so based on this function you can't determine if the bot is successful or not.

This function can however be useful to analyze how much money is in trades at any point in time - so you might decide to use a lower (or higher) max_open_trades.

Objective function, returns smaller number for more optimal results.
Uses Sortino Ratio calculation.
"""
hloc = kwargs["processed"]
Review comment (Contributor):

How can you access the "processed" data? In my loss function, it is not in the kwargs.

Review comment (Member):

I don't think this is a working example as-is ...

Following the PR history, it's an example to show how the data provided via the function below can / could be used.

@xmatthias xmatthias force-pushed the outstanding_balance_function branch from 933aea7 to 5aaa05f Compare March 25, 2021 19:33
Successfully merging this pull request may close these issues.

a column for outstanding balance
4 participants