What happened:
I have a lot of tables that were created with an older version of delta-rs, around 0.15.3. I'm looking to upgrade, but I'm running into errors that I think come from the 0.16.0 breaking change to how timestamps work.
These tables have a timestamp column that, because of the older delta-rs versions, was written without a timezone. After upgrading to 0.17.2, when I read the table, the schema says timestamp[us, tz=UTC]. Okay, that's reasonable, it's attaching a UTC timezone.
But, when I try to write to the table via a merge and the predicate compares the timestamp columns, it fails with
My theory is that delta-rs is interpreting the schema of the table as having a UTC timezone, whereas the physical data actually reflects the table having no timezone. Thus, when it casts the source table I pass in, it applies the UTC timezone, which Arrow then cannot compare against the physical table, which has no timezone.
What you expected to happen:
Is there any way to upgrade delta-rs such that this doesn't break? Would be really unfortunate to have to rewrite all of these tables because of this breaking change.
How to reproduce it:
Run this with an older (< 0.16) delta-rs:
Then, run this with a new delta-rs:
Ideally we cast the table scan to use the delta schema, but we have never added this. You could look into that or just rewrite the tables
@ion-elgreco Okay, it would be really difficult for us to rewrite all of our tables. Can you provide some pointers to code locations that we'd need to change to get the table scan to use the delta schema? I can look into that.
Somewhere in the scan builder I guess, and then on the parquet scan
Environment
Delta-rs version: 0.15.3 => 0.17.2
Binding: python