Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changes to add display timezone based on user timezone #21929

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

amerissa
Copy link

@amerissa amerissa commented May 10, 2024

A Trino query against an iceberg table timestamp column renders the query result differently from how Spark renders the query result for the same iceberg table:

Spark-SQL:

set time zone local;

CREATE TABLE spark_catalog.ts_iceberg.example_timestamp_ltz (ltz TIMESTAMP) USING iceberg 
LOCATION '<a_path>/ts_iceberg.db/example_timestamp_ltz' 
TBLPROPERTIES ('current-snapshot-id' = 'none', 
                                 'format' = 'iceberg/parquet', 
                                 'format-version' = '2')

insert into ts_iceberg.example_timestamp_ltz values(timestamp '2024-03-25 12:12:12.123456 America/Toronto');


select * from ts_iceberg.example_timestamp_ltz;
2024-03-25 16:12:12.123456  <- UTC normalized and timestamp is rendered using local time zone (UTC)

set time zone 'America/Toronto';
select * from ts_iceberg.example_timestamp_ltz;
2024-03-25 12:12:12.123456  <- timestamp rendered using time zone ‘America/Toronto’

Trino SQL (against the same Spark generated iceberg table above):

set time zone local;
select * from iceberg.ts_iceberg.example_timestamp_ltz;
2024-03-25 16:12:12.123456 UTC  <- timestamp rendered with ‘UTC’ suffix which is different than Spark

set time zone 'America/Toronto';
select * from iceberg.ts_iceberg.example_timestamp_ltz;
2024-03-25 16:12:12.123456 UTC  <- set time zone‘America/Toronto’ ignored – UTC localized with UTC suffix

Postgres rendering behavior for a table with a timestamptz column is like Spark:

create table public.test_iceberg (test_column timestamptz NULL);

ez_utility_uat=> show timezone;
TimeZone
----------
 UTC
(1 row)

ez_utility_uat=> insert into public.test_iceberg_ts_wtz values(timestamptz '2024-03-25 12:12:12.123456 America/Toronto');
INSERT 0 1

ez_utility_uat=> select * from public.test_iceberg_ts_wtz;
          test_column
-------------------------------
 2024-03-25 16:12:12.123456+00  <- UTC normalized and timestamp is rendered using local time zone (UTC)
(1 row)

ez_utility_uat=> set time zone 'America/Toronto';
SET

ez_utility_uat=> select * from public.test_iceberg_ts_wtz;
          test_column
-------------------------------
 2024-03-25 12:12:12.123456-04  <- timestamp rendered using session time zone ‘America/Toronto’
(1 row) 

The only option available in Trino for rendering the timestamp column at the desired timezone is to employ the Trino at_timezone function. This is undesirable as it requires a code change to existing queries. Adding the ability for Trino to transparently render the Timestamp based on a set/configured Timezone similar to how Spark and Postgres behaves is preferred because it is less intrusive.

This PR introduces configurable Trino query behavior for rendering Iceberg table timestamp columns in the same manner that Spark and Postgres render timestamp with time zone data type columns. It simplifies and encourages the adoption of Iceberg format tables that have columns of type timestamp with time zone.

The enhancement ensures that an Iceberg table timestamp column value:

Is normalized to UTC time
Is rendered at query time according to a default or specified session Time Zone

Copy link

cla-bot bot commented May 10, 2024

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Amer Issa.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email email@example.com
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

@amerissa amerissa changed the title [WIP] Changes to add display timezone based on user timezone Changes to add display timezone based on user timezone May 10, 2024
Copy link

cla-bot bot commented May 10, 2024

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Amer Issa.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email email@example.com
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

@findepi
Copy link
Member

findepi commented May 10, 2024

Should this be a client-side thing, @martint ?

@sbrackenbury-teranet
Copy link

This PR originates from this Issue:
#21871

Copy link

cla-bot bot commented May 11, 2024

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Amer Issa.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email email@example.com
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

Copy link

cla-bot bot commented May 11, 2024

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Amer Issa.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email email@example.com
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

@martint
Copy link
Member

martint commented May 11, 2024

Should this be a client-side thing, @martint ?

Yes, see my comment in the issue. IMO, it should either be a CLI rendering option or an addition to the protocol to allow clients to specify that they want all timestamp with time zone converted to the client's time zone before they are transported to the client. I'm not a big fan of the latter, as it introduces an entire new layer of processing and complexity.

The downside is that either of these options requires updates to clients and client applications. Controlling the latter it via a session property instead of a protocol option also has the downside that it only works for clients that support multi-statement interactions with the server.

@wendigo
Copy link
Contributor

wendigo commented May 14, 2024

@martint WITH SESSION that allows for tz switch would address the last concern, correct?

@martint
Copy link
Member

martint commented May 14, 2024

WITH SESSION that allows for tz switch would address the last concern, correct?

Yes, but that's the smallest of the issues. It still requires introducing a processing layer to translate all values to such timezone before returning the results to the client.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

None yet

5 participants