Example of the issue. I can create some full reproduction script later if needed.
There are two datasets, `data_raw` and `pos_join_enriched`, both made of parquet files with exactly the same structure. The file names are the same and the rows are aligned. The datasets are loaded via pyarrow and then registered with `con.register`. A third relation, `data_enriched`, is then built from them with a positional join query.
The filter query is quite efficient. The `cik` column is an ordered int, so I think zonemaps are used implicitly.
However, the same query on the positionally joined data is 100x slower. I would expect it to be at most 2x slower.
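To make the reasoning behind that expectation concrete, here is a minimal pure-Python sketch of zonemap-style pruning on a sorted column. It is only an illustration of the idea, not DuckDB's actual implementation: `build_zonemap`, `filter_eq`, the toy data and the row-group size are all hypothetical names and values I made up for this post.

```python
def build_zonemap(values, group_size):
    """Record (start, stop, min, max) for each row group of `values`."""
    zones = []
    for start in range(0, len(values), group_size):
        chunk = values[start:start + group_size]
        zones.append((start, start + len(chunk), min(chunk), max(chunk)))
    return zones


def filter_eq(values, zones, target):
    """Return matching row positions and the number of row groups scanned."""
    hits, scanned = [], 0
    for start, stop, lo, hi in zones:
        if target < lo or target > hi:
            continue  # min/max rules this row group out, so skip it entirely
        scanned += 1
        hits.extend(i for i in range(start, stop) if values[i] == target)
    return hits, scanned


# Hypothetical stand-in for the sorted cik column: 1,000 keys, 10 rows each.
cik = [key for key in range(1000) for _ in range(10)]
# Hypothetical stand-in for the row-aligned enriched column.
extra = [f"enriched_{i}" for i in range(len(cik))]

zones = build_zonemap(cik, group_size=1000)
hits, scanned = filter_eq(cik, zones, target=421)

# Because the rows are aligned, the surviving positions index the second
# dataset directly; only those rows of `extra` need to be read.
joined = [(cik[i], extra[i]) for i in hits]
print(f"scanned {scanned} of {len(zones)} row groups, {len(joined)} rows matched")
```

In this idealised picture the filter touches one row group on the left side, and the aligned right side only has to supply the same row positions, which is why I would expect the cost to roughly double rather than grow by 100x.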