feat: Add projection for batch exports on inserted_at #21839
Closed
Problem
The most confusing thing for batch export users is data missing from their exports because it was ingested with a delay. A proper projection that sorts the data by COALESCE(inserted_at, _timestamp) would allow us to remove the timestamp bounds that exclude delayed events, thus solving the problem. It also removes the need for the list of exceptions set via UNCONSTRAINED_TIMESTAMP_TEAM_IDS.
Changes
Adds the projection and materializes it for the last month. Materializing only the last month should be enough for ongoing realtime exports; we'll worry about backfills later.
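For illustration, here is a rough sketch of the kind of ClickHouse statements this change amounts to. It is not the migration from this PR: the table name, the projection name, the assumption that the table is partitioned by toYYYYMM(timestamp), and the reading of "last month" as the previous and current month partitions are all hypothetical. Only the ADD PROJECTION and MATERIALIZE PROJECTION ... IN PARTITION statement forms and the COALESCE(inserted_at, _timestamp) sort key are taken as given.

```python
from datetime import date


def month_partition(day: date) -> str:
    """Partition value, assuming the table is partitioned by toYYYYMM(timestamp)."""
    return f"{day.year:04d}{day.month:02d}"


def previous_month(day: date) -> date:
    """First day of the month before `day`."""
    if day.month == 1:
        return date(day.year - 1, 12, 1)
    return date(day.year, day.month - 1, 1)


def projection_statements(table: str, projection: str, today: date) -> list[str]:
    """Build the ALTER statements: add the projection, then materialize recent partitions.

    `table` and `projection` are hypothetical names supplied by the caller.
    """
    statements = [
        # The projection stores the rows sorted by the batch export sort key.
        f"ALTER TABLE {table} ADD PROJECTION IF NOT EXISTS {projection} "
        "(SELECT * ORDER BY COALESCE(inserted_at, _timestamp))"
    ]
    # Assumption: "last month" means the previous and the current month partitions.
    for day in (previous_month(today), today):
        statements.append(
            f"ALTER TABLE {table} MATERIALIZE PROJECTION {projection} "
            f"IN PARTITION {month_partition(day)}"
        )
    return statements


if __name__ == "__main__":
    for statement in projection_statements(
        "sharded_events", "inserted_at_projection", date.today()
    ):
        print(statement)
```

Materializing partition by partition keeps the mutation small, which fits the plan of handling only ongoing exports now and leaving older partitions for a later backfill.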
👉 Stay up-to-date with PostHog coding conventions for a smoother review.
Does this work well for both Cloud and self-hosted?
How did you test this code?
Takes a second to create and materialize in ClickHouse cloud, what could go wrong?