Invoking BigVar with pandas DataFrame for data does not work, to_numpy() provides workaround #23

Open
extrospective opened this issue Jul 19, 2021 · 1 comment

extrospective commented Jul 19, 2021

It is common to pass a pandas DataFrame where an array is expected.
(I am not sure whether it is the library's responsibility to handle this.)

With xyz as a DataFrame, I successfully invoked:

mod=BigVAR(xyz, p=lag_max, struct="Basic", gran=[150,10], T1=T1, T2=T2, VARX={})

But then rolling_validate(mod) fails with an error such as:

--> 190         trainY = np.array(Y[p:Y.shape[0], :], copy=True)
    191         print(f'trainY shape is {trainY.shape}')
    192         trainY = np.array(trainY[0:T2, :], copy=True)

d:\Anaconda3\envs\py39\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
   3022             if self.columns.nlevels > 1:
   3023                 return self._getitem_multilevel(key)
-> 3024             indexer = self.columns.get_loc(key)
   3025             if is_integer(indexer):
   3026                 indexer = [indexer]

d:\Anaconda3\envs\py39\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   3078             casted_key = self._maybe_cast_indexer(key)
   3079             try:
-> 3080                 return self._engine.get_loc(casted_key)
   3081             except KeyError as err:
   3082                 raise KeyError(key) from err

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

TypeError: '(slice(2, 365, None), slice(None, None, None))' is an invalid key
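
Reading the traceback, the `Y[p:Y.shape[0], :]` expression ends up in `DataFrame.__getitem__`, which treats the tuple of slices as a column key rather than as a row/column slice. A minimal sketch of that behavior, independent of BigVAR (the frame, the value of `p`, and the helper name `tuple_slice_fails` are made up for illustration; the exact exception type appears to vary across pandas versions, so the sketch catches any exception):

```python
import numpy as np
import pandas as pd

# Small stand-in for the data in the report: 6 rows, 2 series.
df = pd.DataFrame(np.arange(12.0).reshape(6, 2), columns=["a", "b"])
p = 2  # hypothetical lag order


def tuple_slice_fails(frame, p):
    """Return True if NumPy-style 2-D slicing raises on this object."""
    try:
        frame[p:frame.shape[0], :]
    except Exception:  # TypeError on older pandas; may differ on newer ones
        return True
    return False


# The same expression fails on the DataFrame...
frame_fails = tuple_slice_fails(df, p)

# ...but works on the underlying array, or positionally through .iloc.
array_rows = df.to_numpy()[p:df.shape[0], :]
iloc_rows = df.iloc[p:df.shape[0], :].to_numpy()
```

So the slicing itself is fine; it just needs either an ndarray or `.iloc` on the pandas side.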

Workaround: convert the DataFrame to a NumPy array before passing it to BigVAR.
The following call does not hit the same error when Y is sliced:

mod=BigVAR(xyz.to_numpy(), p=lag_max, struct="Basic", gran=[150,10], T1=T1, T2=T2, VARX={}) # {'k':var_count, 's':1})

Here, to_numpy() converts the DataFrame to an array before BigVAR ever sees it.
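
If the conversion were to happen in one place instead of at every call site, something like the helper below might do. This is only a sketch: `as_2d_array` is a hypothetical name, not part of BigVAR, and it assumes the data is purely numeric so a float conversion is safe.

```python
import numpy as np
import pandas as pd


def as_2d_array(data):
    """Coerce a DataFrame, Series, list, or ndarray to a 2-D float ndarray.

    Hypothetical helper, not part of BigVAR.
    """
    arr = np.asarray(data, dtype=float)
    if arr.ndim == 1:
        # Treat a single series as one column.
        arr = arr.reshape(-1, 1)
    if arr.ndim != 2:
        raise ValueError(f"expected 1-D or 2-D data, got {arr.ndim}-D")
    return arr


# Example with a small made-up frame.
frame = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]})
converted = as_2d_array(frame)

# Intended use, assuming the BigVAR call from this report:
# mod = BigVAR(as_2d_array(xyz), p=lag_max, struct="Basic", gran=[150,10], T1=T1, T2=T2, VARX={})
```

Note that this drops the column names and index, same as to_numpy() does.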

@extrospective

A "python" label seems appropriate here; maybe labels should be added to the project.
