Garman-Klass Volatility Estimator Returns Empty Series for Valid OHLC Data in mlfinlab 2.3.0 #539

Open
shrikantad opened this issue Jan 30, 2024 · 1 comment

Comments

@shrikantad

shrikantad commented Jan 30, 2024

Description
When I call the garman_klass function from mlfinlab 2.3.0 on a dataset of 31 OHLC rows with a window of 30, I expect a non-empty Series of volatility estimates (exactly 1 value, to be precise). Instead, the function returns an empty Series and emits a RuntimeWarning about an invalid value encountered in a square root operation. This suggests a bug in how the function handles the input data or in the computation itself.

To Reproduce

  1. Install mlfinlab via pip (pip install mlfinlab==2.3.0).
  2. Load the OHLC data from the attached CSV file.
  3. Execute the garman_klass function with the DataFrame and a window size of 30.

import pandas as pd
from mlfinlab.features.volatility_estimators import garman_klass

# Load the attached OHLC data (replace with the actual path to the CSV)
ohlc = pd.read_csv("data/ohlc_data.csv")

# 31 rows with window=30 should yield at least one estimate
result = garman_klass(ohlc, window=30)
print(result)

Expected behavior
The garman_klass function should compute and return a Pandas Series with at least one volatility estimate based on the provided OHLC data.

Actual Behavior
The function returns an empty pandas Series and emits the following warning:

/home/shrk/micromamba/envs/qc/lib/python3.9/site-packages/pandas/core/arraylike.py:396: RuntimeWarning: invalid value encountered in sqrt
  result = getattr(ufunc, method)(*inputs, **kwargs)
Series([], dtype: float64)

Environment
Operating System: Windows 11 (Version 23H2, OS Build 22631.3085)
Python Version: 3.9.18
mlfinlab Version: 2.3.0
Pandas Version: 2.0.0

Attachments
ohlc_data.csv (attached) contains the dataset that triggers the issue. I obtained this data from QuantConnect (basic S&P 500 ETF TradeBar data for 31 days in 2016).

Additional context
The attached CSV file reproduces the issue. It contains 31 rows of OHLCV data, which should be enough for the garman_klass function to calculate at least one value with a window size of 30.
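
To make the row-count reasoning concrete, here is a small sketch of how many values a 30-row rolling mean leaves over 31 rows. It uses plain pandas on placeholder numbers, not mlfinlab, and whether garman_klass rolls and drops NaN values this way internally is an assumption. With pandas defaults, a window that contains any NaN produces NaN.

```python
import numpy as np
import pandas as pd

window = 30
terms = pd.Series(np.full(31, 1e-4))  # placeholder per-bar variance terms, not real data

# All 31 terms valid: the windows ending at rows 29 and 30 are both complete.
complete = terms.rolling(window=window).mean().dropna()

# First term missing (e.g. a quantity that needs the previous close): one window left.
first_missing = terms.copy()
first_missing.iloc[0] = np.nan
one_left = first_missing.rolling(window=window).mean().dropna()

# A single NaN in the middle falls inside both windows, so nothing survives.
middle_missing = terms.copy()
middle_missing.iloc[15] = np.nan
none_left = middle_missing.rolling(window=window).mean().dropna()

print(len(complete), len(one_left), len(none_left))  # 2 1 0
```

So with so few rows beyond the window, a single bad row anywhere in the file could be enough to empty the result, if the function behaves like this sketch.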

@sorensenj50

It might be a problem with the data. I was testing my own implementation of the GK estimator on a futures dataset including the EuroBond, and I got NaN results because the H / L term was smaller than the C / C term. That made the value under the square root negative, which comes out as NaN. The same thing came up in the NK, Z, and G contracts. If you look at the formula, nothing stops it from breaking when the H / L spread is small enough.
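
A quick sketch of that failure mode on one made-up bar. Reading the "C / C term" as the log return from the previous close to the current close is an assumption; the textbook Garman-Klass term uses ln(C/O) instead. The helper name gk_term and all prices below are hypothetical.

```python
import numpy as np

K = 2.0 * np.log(2.0) - 1.0  # Garman-Klass weight on the second term, about 0.386

def gk_term(high, low, numer, denom):
    """Per-bar variance term: 0.5*ln(H/L)^2 - K*ln(numer/denom)^2."""
    return 0.5 * np.log(high / low) ** 2 - K * np.log(numer / denom) ** 2

# Made-up bar: gaps up 2% from the previous close, then barely moves intraday.
prev_close = 100.0
open_, high, low, close = 102.0, 102.0, 101.9, 102.0

term_open_close = gk_term(high, low, close, open_)        # textbook form, ln(C/O)
term_close_close = gk_term(high, low, close, prev_close)  # assumed variant, ln(C_t/C_{t-1})

print(term_open_close)   # small but positive
print(term_close_close)  # negative

# Without errstate, NumPy emits "invalid value encountered in sqrt" here.
with np.errstate(invalid="ignore"):
    vol = np.sqrt(term_close_close)  # NaN
```

For the open-to-close form, a bar with high >= max(open, close) and low <= min(open, close) has ln(H/L) >= |ln(C/O)|, so its term cannot go negative. If negative terms show up with that form, the bars themselves are probably inconsistent (for example a high below the close).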

Implement the formula yourself and test to see if this is the problem.
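
Something like the following could serve as a starting point. It is a minimal sketch of the textbook open-to-close formula, not mlfinlab's implementation. The function name garman_klass_sketch, the lowercase column names, and the synthetic bars standing in for the attached CSV are all assumptions.

```python
import numpy as np
import pandas as pd

def garman_klass_sketch(ohlc: pd.DataFrame, window: int = 30) -> pd.DataFrame:
    """Textbook Garman-Klass estimate; hypothetical helper, not mlfinlab's code."""
    log_hl = np.log(ohlc["high"] / ohlc["low"])
    log_co = np.log(ohlc["close"] / ohlc["open"])
    term = 0.5 * log_hl**2 - (2.0 * np.log(2.0) - 1.0) * log_co**2
    variance = term.rolling(window=window).mean()
    # Mask negative variances first so sqrt yields NaN without a RuntimeWarning.
    vol = np.sqrt(variance.where(variance >= 0))
    return pd.DataFrame({"term": term, "variance": variance, "vol": vol})

# Synthetic, internally consistent bars (assumed column names: open/high/low/close).
rng = np.random.default_rng(0)
n = 31
open_ = 100.0 * np.exp(np.cumsum(rng.normal(0.0, 0.01, n)))
close = open_ * np.exp(rng.normal(0.0, 0.01, n))
high = np.maximum(open_, close) * (1.0 + np.abs(rng.normal(0.0, 0.005, n)))
low = np.minimum(open_, close) * (1.0 - np.abs(rng.normal(0.0, 0.005, n)))
bars = pd.DataFrame({"open": open_, "high": high, "low": low, "close": close})

out = garman_klass_sketch(bars, window=30)
print((out["term"] < 0).sum())  # number of bars with a negative per-bar term
print(out["vol"].dropna())      # estimates that survive the rolling window
```

Swapping bars for the real DataFrame (after renaming columns to match) and looking at the term and variance columns should show whether negative values or NaN rows are what empties the result.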
