Implement large array constructing through chunk concatenation #22622

Open
wants to merge 8 commits into
base: master

@arararz arararz commented Apr 28, 2024

Description

This PR splits long array literals (more than 200 elements) into chunks that are concatenated together. It addresses the difficulty of creating large arrays, which previously had to be assembled by hand from smaller arrays.
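
The splitting idea can be sketched in isolation as follows. This is a minimal illustration on plain Java lists, not the code from `SqlToRowExpressionTranslator.java`; the class name `ArrayChunker` and the methods `chunk` and `concat` are hypothetical, and only the chunk size of 200 comes from the commit notes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of the Presto code base.
public class ArrayChunker {
    // Chunk size mentioned in the PR's commit notes.
    static final int CHUNK_SIZE = 200;

    // Split the elements into consecutive sublists of at most chunkSize items.
    static <T> List<List<T>> chunk(List<T> elements, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < elements.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, elements.size());
            chunks.add(new ArrayList<>(elements.subList(start, end)));
        }
        return chunks;
    }

    // Concatenate the chunks back together, preserving element order.
    static <T> List<T> concat(List<List<T>> chunks) {
        List<T> result = new ArrayList<>();
        for (List<T> chunk : chunks) {
            result.addAll(chunk);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> values = new ArrayList<>();
        for (int i = 1; i <= 1000; i++) {
            values.add(i);
        }
        List<List<Integer>> chunks = chunk(values, CHUNK_SIZE);
        System.out.println(chunks.size());                 // 5
        System.out.println(concat(chunks).equals(values)); // true
    }
}
```

In the PR itself the chunks would be array constructor expressions and the concatenation a function call in the generated RowExpression, rather than in-memory lists.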

Motivation and Context

Previously, array literals with 255 or more elements were not supported, so users had to concatenate many smaller arrays to create a large array (e.g. 1000 elements), which made large arrays very inconvenient to use.
Fixes #21601

Impact

Users can now create arrays with 255 or more elements directly (the new maximum is 50800).
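
One plausible reading of the 50800 figure, sketched below, is the chunk size multiplied by the largest number of chunks a single concatenation call can take. The limit of 254 arguments is an assumption made for this illustration and is not stated in the PR; the class and constant names are hypothetical.

```java
// Hypothetical illustration of where a 50800-element maximum could come from.
public class MaxArrayLiteralSize {
    // Chunk size mentioned in the PR's commit notes.
    static final int CHUNK_SIZE = 200;
    // ASSUMPTION: the concatenation call accepts at most 254 arguments.
    static final int MAX_CONCAT_ARGUMENTS = 254;

    // Largest literal that fits when every chunk is full.
    static int maxElements(int chunkSize, int maxChunks) {
        return Math.multiplyExact(chunkSize, maxChunks);
    }

    public static void main(String[] args) {
        System.out.println(maxElements(CHUNK_SIZE, MAX_CONCAT_ARGUMENTS)); // 50800
    }
}
```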

Test Plan

Added unit tests for the function that implements the large-array splitting logic, verifying that the resulting array is correct. Confirmed that arrays with 255 or more elements can be created directly.

Contributor checklist

  • Please make sure your submission complies with our development, formatting, commit message, and attribution guidelines.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== RELEASE NOTES ==

General Changes
* Add automatic splitting and concatenation for array literals with more than 200 elements. When a query contains a large array literal, such as `ARRAY[1,2,3..1000]`, Presto now splits it into multiple smaller arrays and concatenates them during the plan/RowExpression generation phase. Users no longer need to segment large arrays manually.

arararz and others added 3 commits April 27, 2024 21:26
- Updated SqlToRowExpressionTranslator.java
- Split array into chunks of 200 and concatenated if the size is greater than 200
- Added test case to ensure that creating arrays of size 255 or more is now supported
Tested for arrays with size larger than 200
Checked that each chunk/sublist is concatenated correctly
@arararz arararz requested review from jaystarshot and a team as code owners April 28, 2024 02:12
@arararz arararz requested a review from presto-oss April 28, 2024 02:12
linux-foundation-easycla bot commented Apr 28, 2024

CLA Signed


The committers listed above are authorized under a signed CLA.

arararz and others added 5 commits April 27, 2024 22:17
- Tested for arrays with size larger than 200
- Checked that each chunk/sublist is concatenated correctly
- Fix style issues in SqlToRowExpressionTranslator.java and TestRowExpressionSerde.java
@arararz arararz changed the title from "Implement large array constructing with concatenating smaller chunks" to "Implement large array constructing through chunk concatenation" Apr 28, 2024
Development

Successfully merging this pull request may close these issues.

Split long array literals to use concat
2 participants