There are often situations where I want to do some computation on a LazyFrame and conditionally throw an error, but I don't actually want to collect the LazyFrame yet. I'm curious whether we could have some kind of lazily evaluated error expression that only raises when it is actually evaluated.
For example:
assert isinstance(df, pl.LazyFrame)
print(df.schema)
>>> OrderedDict([('x', Int64), ('y', Int64)])
df = df.with_columns(
    pl.when(pl.col('y') != 0)
    .then(pl.col('x') / pl.col('y'))
    .otherwise(pl.raise("Division by zero"))
)
# (nothing happens yet)
df = df.collect()
# Iff there are any zeros in column 'y':
>>> ComputeError: encountered error 'Division by zero'
Just to stress: I understand the above division produces inf, but this is just an example. The point is that there are other computations I may want to do, with business logic that isn't really captured by the type system.
I like this idea, but unfortunately the when/then architecture doesn't allow for this: when you supply a when/then chain, all columns are computed in parallel, and then filtered.
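To make the point above concrete, here is a toy pure-Python model of "compute every branch, then filter". It is only a sketch of the behaviour described in the comment, not Polars' actual implementation; `when_then_otherwise`, `divide`, `raise_error`, and `fill_zero` are hypothetical names invented for this illustration.

```python
from typing import Callable, List


def when_then_otherwise(
    mask: List[bool],
    then_branch: Callable[[], List[float]],
    otherwise_branch: Callable[[], List[float]],
) -> List[float]:
    """Toy model: evaluate both branches in full, then pick a value per row."""
    # Step 1: every branch is computed over the whole column,
    # regardless of what the mask says.
    then_values = then_branch()
    otherwise_values = otherwise_branch()
    # Step 2: the mask only decides which already-computed value is kept.
    return [t if m else o for m, t, o in zip(mask, then_values, otherwise_values)]


x = [1, 2, 3]
y = [1, 2, 4]  # no zeros, so the "otherwise" value is never selected
mask = [value != 0 for value in y]


def divide() -> List[float]:
    return [a / b for a, b in zip(x, y)]


def raise_error() -> List[float]:
    # Stand-in for the proposed raising expression.
    raise ValueError("Division by zero")


def fill_zero() -> List[float]:
    return [0.0] * len(x)


# A non-raising fallback works as expected.
safe_result = when_then_otherwise(mask, divide, fill_zero)

# A raising fallback fails even though no row needs it,
# because the branch is evaluated before the mask is applied.
try:
    when_then_otherwise(mask, divide, raise_error)
    raised = False
except ValueError:
    raised = True
```

In this model the raising branch fires even when the mask is true for every row, which is why an error expression placed inside `otherwise` could not be made conditional under that evaluation order.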