HIVE-28208: WITH column list doesn't work with CTE materialization #5232

Open
wants to merge 1 commit into
base: master

@okumin okumin commented May 2, 2024

What changes were proposed in this pull request?

ANSI SQL supports column aliases for CTEs (the WITH column list). Hive also supports them, but queries that use them fail when the CTE is materialized.

https://issues.apache.org/jira/browse/HIVE-28208

Why are the changes needed?

This patch allows materialized CTEs to obey the standard.

Does this PR introduce any user-facing change?

No. The affected queries currently always fail.

Is the change a dependency upgrade?

No.

How was this patch tested?

I added test queries to cte_mat_1.q.

@@ -8379,7 +8402,9 @@ private ColsAndTypes deriveFileSinkColTypes(
String colName = colInfo.getInternalName(); //default column name
if (columns != null) {
FieldSchema col = new FieldSchema();
if (!("".equals(nm[0])) && nm[1] != null) {
if (i < withColList.size()) {

@okumin okumin May 2, 2024

When the with-column-list is shorter than the input, the remaining columns keep their input names. This follows the behavior of non-materialized CTEs.
If I understand correctly, a with-column-list is supported only when its number of columns is exactly equal to that of the query expression. So another option would be to reject the query when colInfos.size() != withColList.size().
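
The two options discussed above can be sketched in isolation as follows. This is a minimal illustration, not Hive code: the class `WithColumnNames` and the methods `resolve` and `requireSameSize` are hypothetical names, and plain string lists stand in for the `colInfos` and `withColList` objects used in `deriveFileSinkColTypes`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class WithColumnNames {
    // Lenient option (assumed to mirror the `i < withColList.size()` check in the diff):
    // use the alias from the with-column-list when one exists at position i,
    // otherwise fall back to the name produced by the query expression.
    // Aliases beyond the number of query columns are ignored by this sketch.
    public static List<String> resolve(List<String> queryColumnNames, List<String> withColList) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < queryColumnNames.size(); i++) {
            if (i < withColList.size()) {
                result.add(withColList.get(i));
            } else {
                result.add(queryColumnNames.get(i));
            }
        }
        return result;
    }

    // Strict option: reject a with-column-list whose size differs from the
    // number of columns in the query expression.
    public static void requireSameSize(List<String> queryColumnNames, List<String> withColList) {
        if (queryColumnNames.size() != withColList.size()) {
            throw new IllegalArgumentException(
                "WITH column list has " + withColList.size()
                    + " columns but the query expression has " + queryColumnNames.size());
        }
    }

    public static void main(String[] args) {
        // One alias for a two-column query: the second column keeps its input name.
        System.out.println(resolve(Arrays.asList("key", "value"), Arrays.asList("a")));
        // prints [a, value]
    }
}
```

With the strict option, the same input (one alias for two columns) would be rejected up front instead of producing a partially renamed schema.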

set hive.optimize.cte.materialize.threshold=2;
set hive.optimize.cte.materialize.full.aggregate.only=false;
-- Use a format that retains column names
set hive.default.fileformat=parquet;

The default file format accesses physical columns by position, so with it the tests would miss some kinds of errors.

sonarcloud bot commented May 2, 2024

Quality Gate passed

Issues
6 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
No data about Duplication

See analysis details on SonarCloud

2 participants