Description
First, thank you for this wonderful library. It handles many pandas operations well under memory constraints (except perhaps cumsum(), which I am eagerly waiting for).
I have an Arrow file of ~8 GB, which I load into a vaex DataFrame of shape (27_416_244, 32). Available system RAM: ~8 GB. I do a groupby-aggregate like this:
# summary_df is a MultiIndex pandas DataFrame with 76k rows and 20 columns
index_names = list(summary_df.index.names)
strfmt = '%Y-%m-%d'
vdf['_Period'] = vdf['Date'].dt.strftime(strfmt)
gd_column_ops_map = {
'PnL % Capital':'sum', 'PnL':'sum', '% High':'mean',
'% Close':'mean', '% Low':'mean', 'Charges':'sum', 'Sell Val':'sum', 'Buy Val':'sum',
'Qty':'sum', 'Cash Flow':'sum'
}
grpby_cols = index_names + ['_Period']
# The kernel CRASHES on the next line, after the groupby starts (perhaps during agg)
grp_trades_vdf = vdf.groupby(grpby_cols, progress=True).agg(gd_column_ops_map)
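For a sense of scale, here is a rough back-of-the-envelope estimate built only from the figures quoted above. It is a sketch, not a measurement: it assumes "~8GB" means 8 GiB, that every date has a four-digit year (so '%Y-%m-%d' renders 10 characters), and that the derived '_Period' strings cost roughly one 4-byte offset per row on top of the characters if they are ever materialized. The constant names are hypothetical and not part of vaex.

```python
# Rough size estimate from the numbers in this report (assumptions labeled).
N_ROWS = 27_416_244               # rows in the vaex DataFrame
N_COLS = 32                       # columns in the vaex DataFrame
FILE_BYTES = 8 * 1024**3          # assumption: "~8GB" taken as 8 GiB
PERIOD_CHARS = len("2000-01-01")  # '%Y-%m-%d' with a four-digit year
OFFSET_BYTES = 4                  # assumption: one 32-bit offset per string

# Average on-disk footprint per row and per cell.
bytes_per_row = FILE_BYTES / N_ROWS
bytes_per_cell = bytes_per_row / N_COLS

# Approximate cost of the '_Period' string column if fully materialized.
period_bytes = N_ROWS * (PERIOD_CHARS + OFFSET_BYTES)

print(f"average bytes per row:  {bytes_per_row:.1f}")
print(f"average bytes per cell: {bytes_per_cell:.1f}")
print(f"'_Period' if materialized: {period_bytes / 1024**2:.0f} MiB")
```

Under these assumptions the file alone is about the size of the available RAM, so any intermediate data created for the groupby has very little headroom.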
Software information