Cannot filter on decimal fields in parquet files using is_in() #16150
Labels
bug, needs triage, python
Checks
Reproducible example
If you call scan_parquet and then filter the decimal column value_decimal with is_in() on float values, you'll get an error:
InvalidOperationError: is_in cannot check for Float64 values in Decimal(Some(1), Some(1)) data
You can avoid the error by casting, for example:
df.filter(pl.col('value_decimal').cast(pl.Float32).is_in([0.0, 0.1])).collect() # ???
This runs without errors but does not return the row with value 0.1.
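A likely reason for the missing 0.1 row, sketched here with the standard library only (no Polars involved, so this is an assumption about the cause rather than a trace of what Polars does internally): once 0.1 is rounded to 32-bit precision it no longer compares equal to the 64-bit literal 0.1, whereas values such as 0.0, 3.0 and 4.0 are exactly representable at both widths. The helper name `to_float32` is made up for illustration.

```python
import struct


def to_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float, then widen it back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# Values used in the filters from this report.
candidates = [0.0, 0.1, 3.0, 4.0]

# True means the value is unchanged by the float32 round trip,
# so an equality test against the float64 literal would still match.
survives = {x: to_float32(x) == x for x in candidates}

print(survives)          # {0.0: True, 0.1: False, 3.0: True, 4.0: True}
print(to_float32(0.1))   # 0.10000000149011612
```

If this is indeed the cause, it would also explain why the Float32 cast gives correct results for whole-number values.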
Filtering on "integer" decimals (scale 0) fails as well:
df.filter(pl.col('integer').is_in([3.0,4.0])).collect()
InvalidOperationError: is_in cannot check for Float64 values in Decimal(Some(1), Some(0)) data
Casting does work correctly here, though:
df.filter(pl.col('integer').cast(pl.Float32).is_in([3.0,4.0])).collect()
Log output
Issue description
InvalidOperationError: is_in cannot check for Float64 values in Decimal(Some(1), Some(1)) data
InvalidOperationError: is_in cannot check for Int64 values in Decimal(Some(1), Some(0)) data
Expected behavior
The filter does not fail and returns the correct results.
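As a rough illustration of what "correct results" could mean, here is a pure-Python reference model that compares values exactly in decimal. It is only a sketch: `decimal_is_in` is a hypothetical helper, not a Polars function, and converting float candidates through `str()` is one assumed convention among several possible ones.

```python
from decimal import Decimal


def decimal_is_in(values, candidates):
    """Hypothetical reference: keep the values equal to any candidate, compared as decimals."""
    # Go through str() so that the float 0.1 becomes Decimal('0.1')
    # instead of the long binary expansion produced by Decimal(0.1).
    wanted = {Decimal(str(c)) for c in candidates}
    return [v for v in values if v in wanted]


# Stand-ins for the two columns discussed in this report.
value_decimal = [Decimal("0.0"), Decimal("0.1"), Decimal("0.2")]
integer = [Decimal("2"), Decimal("3"), Decimal("4")]

fractional_hits = decimal_is_in(value_decimal, [0.0, 0.1])
integer_hits = decimal_is_in(integer, [3.0, 4.0])

print(fractional_hits)  # [Decimal('0.0'), Decimal('0.1')]
print(integer_hits)     # [Decimal('3'), Decimal('4')]
```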
Installed versions