New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CALCITE-6358] Support all PostgreSQL 14 date/time patterns #3773
base: main
Are you sure you want to change the base?
Conversation
normanj-bitquill
commented
Apr 25, 2024
- Splits the PostgreSQL toChar function off to its own function
- Does not implement FX or TM prefix
- Does not implement SP suffix
- Timezone patterns are supported but all datetimes are in local timezone
Need to properly separate the MySQL/Oracle implementation of |
The amount of testing is very impressive. These won't be easy to review. I assume they were checked against postgres. |
8ec9cf8
to
54a2004
Compare
These were tested against PostgreSQL 14. The CI issues have been addressed. Is there anything that I can be done to make reviewing this easier? |
The statement that they were run against Postgres helps a lot. Thanks! |
VI | ||
!ok | ||
|
||
select to_char(timestamp '2022-06-03 13:15:48.678', 'rm'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we have a day of the week test for a date before the gregorian calendar?
I noticed in the past that these could be wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a test for this on line 316. It tests Jan 1 of year 1 to see if it is a Monday.
final Locale originalLocale = Locale.getDefault(); | ||
try { | ||
Locale.setDefault(Locale.US); | ||
assertEquals(" MONDAY", PostgresqlDateTimeFormatter.toChar("DAY", date1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the rule about leading spaces?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From this table:
https://www.postgresql.org/docs/14/functions-formatting.html#FUNCTIONS-FORMATTING-DATETIME-TABLE
it says that days are blank padded to 9 characters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The padding should be at the end. I'll update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
private interface FormatPattern { | ||
/** | ||
* Checks if the current position in the format string applies to this format | ||
* element. Will produce a formatted string if matched that is the format element |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find the second phrase difficult to parse,maybe you can rephrase it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've tried to reword this.
"a.m.", "p.m."), | ||
new NumberFormatPattern(dt -> { | ||
final String formattedYear = String.format(Locale.ROOT, "%0,4d", dt.getYear()); | ||
if (formattedYear.length() == 4 && formattedYear.charAt(0) == '0') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this implies that years above 10000 wont' be correct. Maybe that's within spec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
%0,4d
means to print out at least 4 digits and formatted with commas. A value like 12345
will become 12,345
.
There is the special case, where we get a 4 digit string back that has at least one leading 0. In this case, Java will not add the comma. For example 1
will become 0001
. Need to insert the comma to match PostgreSQL behaviour.
return yearString.substring(yearString.length() - 1); | ||
}, | ||
"I"), | ||
new StringFormatPattern( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure Calcite can handle dates BC, but maybe this code should not concern itself with that point.
But you may not be able to test this codepath end-to-end.
dt -> { | ||
final String monthName = | ||
dt.getMonth().getDisplayName(TextStyle.FULL, | ||
Locale.getDefault()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure about the Locale?
I thought that the spec requires the month name to be in English.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see what you mean. It should default to English and I should only translate when the TM
prefix is provided. I'll make the changes.
https://www.postgresql.org/docs/14/functions-formatting.html#FUNCTIONS-FORMATTING-DATETIME-TABLE
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to make use of the TM flag.
dt -> { | ||
final String monthName = | ||
dt.getMonth().getDisplayName(TextStyle.SHORT, | ||
Locale.ROOT); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar question
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my answer above. I'll make the changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to make use of the TM flag.
I have approved, but also left some questions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember that calcite currently has support for the postgres to_char function. Should we remove the previous support?
1d02fc0
to
6d69b71
Compare
Until https://issues.apache.org/jira/browse/CALCITE-6281 has a solution, the best way to test this may be to create a Quidem test file (.iq suffix). If that file is identical with the Postgres shell output, you're done. (I have no idea whether the quidem output is compatible with Postgres, though). |
The existing TO_CHAR function is used by MySQL and Oracle, so it is left in here. This PR adds a new implementation that only PostgreSQL uses. |
Thanks, I'll take a look. |
@mihaibudiu Here is the output of a PostgreSQL script running all of the This is the SQL script. This is the output when running it on PostgreSQL 14. |
This is pretty close, the existing file |
Added the queries here: Here is a script that is able to generate the *.iq file: Ran the |
This looks very good, when they unfreeze the main branch we can merge this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
23fa524
to
4f2459b
Compare
@LibraryOperator(libraries = {POSTGRESQL}) | ||
public static final SqlFunction TO_CHAR_PG = | ||
new SqlBasicFunction("TO_CHAR", SqlKind.OTHER_FUNCTION, | ||
SqlSyntax.FUNCTION, true, ReturnTypes.VARCHAR, null, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not ReturnTypes.VARCHAR_NULLABLE here? The original Calcite TO_CHAR function result should also be nullable, by the way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, this has been updated.
4f2459b
to
b0a732d
Compare
I don't understand why the CI fails. |
b0a732d
to
d1cfc2a
Compare
@mihaibudiu I have rebased to the latest for main. CI still fails for "Windows (JDK 8)" and "Windows (JDK 17)". |
* Splits the PostgreSQL toChar function off to its own function * Does not implement SP suffix * Timezone patterns are supported but all datetimes are in local timezone
d1cfc2a
to
3e87490
Compare
I have no idea about the Windows build errors. It fails trying to do task I don't see why that is missing due to the changes for this PR. |
Locally with a fresh checkout I get:
For the Windows build, I get:
|
@mihaibudiu I tried creating a new PR here, that did build correctly. If you are OK with it, we should just use that PR and close this one. I think that there are a couple of things going on.
|
Quality Gate passedIssues Measures |
Anything that works correctly is fine for me. Please note that I am on vacation so I won't have time to review this soon. |