Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support timezone conversion in filter pushdown #26819

Open
sthm opened this issue Apr 26, 2024 · 1 comment
Open

Support timezone conversion in filter pushdown #26819

sthm opened this issue Apr 26, 2024 · 1 comment
Labels
A-PERSIST Area: persistence of (meta)data C-feature Category: new feature or request

Comments

@sthm
Copy link
Contributor

sthm commented Apr 26, 2024

Feature request

It's not uncommon to use timezone conversion when working with timestamps, but they are currently not supported by filter pushdown.

EXPLAIN SELECT * FROM impressions WHERE mz_now() <= created_at AT TIME ZONE 'America/Los_Angeles' + INTERVAL '1hour';
                                                           Optimized Plan
------------------------------------------------------------------------------------------------------------------------------------
 Explained Query:                                                                                                                  +
   Filter (mz_now() <= timestamp_tz_to_mz_timestamp((timezone_America/Los_Angeles_ts(#3{created_at}) + 01:00:00))) // { arity: 23 }+
     ReadStorage materialize.public.impressions // { arity: 23 }                                                                   +
                                                                                                                                   +
 Source materialize.public.impressions                                                                                             +
   filter=((mz_now() <= timestamp_tz_to_mz_timestamp((timezone_America/Los_Angeles_ts(#3{created_at}) + 01:00:00))))               +
                                                                                                                                   +
 Target cluster: mz_introspection                                                                                                  +

(1 row)

Timezone conversion seems non-monotonic when transitioning from daylight savings to standard time. But it would be great if we could find a way to support them nonetheless.

In the case of a recent prospect, supporting timestamp conversion in filter pushdown would have helped to reduce the amount of data that is read from persist from 3.5 TB to well below 11 GB. https://materializeinc.slack.com/archives/C06U7BLEXHS/p1714136385095289?thread_ts=1714081612.940729&cid=C06U7BLEXHS

@sthm sthm added C-feature Category: new feature or request A-PERSIST Area: persistence of (meta)data labels Apr 26, 2024
@bkirwi
Copy link
Contributor

bkirwi commented Apr 26, 2024

Quick braindump:

  • Timezone conversion is not monotonic, so this is not the "easy" case.
  • OTOH, there are some special cases that are probably not too hard to support? But some timestamp expert should weigh in. It's possible that we could handle some interesting cases with not too much work.
    • If the timezone is a fixed offset, maybe this function is monotonic?
    • If we're just annotating a timezone on a tz-free timestamp, maybe this function is a noop?
  • Even though arbitrary timezone conversion is nondeterministic, I think there's an upper bound on how much it can change the value of a timestamp by. (Say, up to 24h?) If that's true then you could probably special-case this function by implementing SpecialUnary for it. But that's substantially more work / risk than the trivial cases.
    • You can do a hacky version of this manually by adding a second filter clause: EXPLAIN SELECT * FROM impressions WHERE mz_now() <= created_at AT TIME ZONE 'America/Los_Angeles' + INTERVAL '1hour' AND mz_now() <= created_at + '25 hours';. Gross, but maybe an acceptable workaround in a pinch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-PERSIST Area: persistence of (meta)data C-feature Category: new feature or request
Projects
None yet
Development

No branches or pull requests

2 participants