awswrangler.athena.to_iceberg does not support concurrent/parallel Lambda instances #2651
Comments
Hi @B161851, if you are inserting concurrently, you need to make sure …
@kukushking hi, I am facing the same issue, even with only two concurrent writers (Lambdas). The table exists. I am trying to perform an upsert (MERGE INTO) operation. In my case the upsert fails even on different partitions (different parts of the table), so I don't think it's a race condition.
ICEBERG_COMMIT_ERROR: Failed to commit Iceberg update to table …
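A common mitigation for optimistic-concurrency commit conflicts like this (not a fix proposed by the maintainers in this thread) is to retry the write with exponential backoff. A minimal sketch, assuming the conflict surfaces as a Python exception whose message contains `ICEBERG_COMMIT_ERROR`; the helper name `commit_with_retry` is hypothetical:

```python
import random
import time


def commit_with_retry(write_fn, max_attempts=5, base_delay=1.0):
    """Retry a zero-argument write callable on Iceberg commit conflicts.

    write_fn could be e.g. functools.partial(wr.athena.to_iceberg, df=df, ...).
    Only errors whose message mentions ICEBERG_COMMIT_ERROR are retried;
    anything else is re-raised immediately.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return write_fn()
        except Exception as exc:
            if "ICEBERG_COMMIT_ERROR" not in str(exc) or attempt == max_attempts:
                raise
            # Exponential backoff with jitter so concurrent Lambdas desynchronise.
            time.sleep(base_delay * (2 ** (attempt - 1)) * (0.5 + random.random()))
```

This does not remove the conflict, it only spreads the commits out; with many concurrent writers the retry budget can still be exhausted.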
Just wanted to bump this issue up as well. In our use case we have had to lock Lambda concurrency to 1 to avoid the error.
Marking this issue as stale due to inactivity. This helps our maintainers find and focus on the active issues. If this issue receives no comments in the next 7 days it will automatically be closed.
bump
bump. Addressing this will be very helpful.
All, this looks like a service-side issue. Please raise a support request. @ChanTheDataExplorer @Salatich @vibe is it also …
On my side it is just …
We're seeing a lot of …
Describe the bug
For parallel writes, if keep_files=True we get duplicate rows. I tried appending a nanosecond timestamp to the temporary path so that each invocation's path is unique, but then I get "ICEBERG_COMMIT_ERROR".
If keep_files=False, ingesting Iceberg data in parallel fails with "HIVE_CANNOT_OPEN_SPLIT NoSuchKey Error".
We observed that with keep_files=False the library removes the entire temp_path from S3, which causes the error above.
In short, writing to an Iceberg table with awswrangler from concurrent Lambda invocations does not work.
How can we overcome these issues when writing to an Iceberg table in parallel from Lambda using awswrangler?
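One way to reconcile the two failure modes described above is to give every invocation its own staging prefix, so that keep_files=False only deletes that invocation's files. This is a sketch, not a confirmed fix from the maintainers; `unique_temp_path` and `write_to_iceberg` are hypothetical helper names, while `temp_path` and `keep_files` are real parameters of `awswrangler.athena.to_iceberg`:

```python
import uuid


def unique_temp_path(temp_bucket: str) -> str:
    # A fresh prefix per invocation: cleanup with keep_files=False then
    # removes only this writer's staging objects, not another concurrent
    # writer's (the suspected cause of HIVE_CANNOT_OPEN_SPLIT / NoSuchKey).
    return f"s3://{temp_bucket}/staging/{uuid.uuid4().hex}/"


def write_to_iceberg(df, database: str, table: str, temp_bucket: str) -> None:
    import awswrangler as wr  # deferred so the helper above is testable offline

    wr.athena.to_iceberg(
        df=df,
        database=database,
        table=table,
        temp_path=unique_temp_path(temp_bucket),
        keep_files=False,  # safe now: only this invocation's prefix is deleted
    )
```

Note this addresses the cleanup collision only; concurrent commits can still hit ICEBERG_COMMIT_ERROR and may need retries on top.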
How to Reproduce
We observed that with keep_files=False the library removes the entire temp_path from S3, which results in "HIVE_CANNOT_OPEN_SPLIT NoSuchKey Error".
If the library removed only the particular parquet files it wrote instead of the entire temp_path, I think it would avoid the above error.
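The failure mechanism described above can be illustrated without AWS at all, using a local-filesystem analogue: two writers share one staging directory, and one writer's wholesale cleanup deletes the other writer's not-yet-committed file. `simulate_writer` is a hypothetical stand-in for the library's write-then-clean behaviour with keep_files=False:

```python
import shutil
from pathlib import Path


def simulate_writer(temp_root: Path, name: str) -> bool:
    """Stage a file under the SHARED temp_root, then (as the library does
    with keep_files=False) delete the whole temp_root.

    Returns whether this writer's staged file still exists afterwards.
    Any other writer's staged files under temp_root are wiped too, which
    is the local analogue of the NoSuchKey error during commit.
    """
    temp_root.mkdir(parents=True, exist_ok=True)
    staged = temp_root / f"{name}.parquet"
    staged.write_text("data")
    # ... the commit step would read `staged` here ...
    shutil.rmtree(temp_root, ignore_errors=True)  # deletes everyone's files
    return staged.exists()
```

Running two such writers against the same `temp_root` shows the second cleanup removing the first writer's file, which is why a per-invocation prefix (or per-file deletion) avoids the problem.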
Expected behavior
No response
Your project
No response
Screenshots
No response
OS
Win
Python version
3.8
AWS SDK for pandas version
12
Additional context
No response