[Question] Is there a mechanism to detect when the rotatingWriter finishes writing to a file and to be notified of the file that was written? #344

Closed
calvinlfer opened this issue Feb 20, 2024 · 1 comment

Comments

@calvinlfer
Contributor

calvinlfer commented Feb 20, 2024

I was looking at the postWriteHandler mechanism on viaParquet. I see that there's a flush mechanism, but I'm a bit confused about the semantics.

  • It seems like postWriteHandler gets called after each chunk of data is written?
  • Is there a way to use this mechanism such that you know when a file has been written and which file it was?

To add some context: I am trying to add Parquet files written by Parquet4S to an Iceberg table, integrating with Apache Iceberg's Java API.

@mjakubowski84
Owner

mjakubowski84 commented Feb 21, 2024

> It seems like postWriteHandler gets called after each chunk of data is written?

That's true. You can use postWriteHandler to implement flushing based on your own business logic.
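
As a rough illustration of "flushing based on your own business logic", here is a minimal, library-independent sketch. `WriteState`, `modifiedPartitions`, `flush`, and `handler` are hypothetical stand-ins invented for this example; they are not Parquet4S's actual types or signatures, so check the library's documentation for the real shape of the handler's argument.

```scala
object FlushPolicySketch {
  // Hypothetical stand-in for the state handed to a post-write handler.
  // Assumption: it exposes per-partition write counts and a flush action.
  final case class WriteState(
      modifiedPartitions: Map[String, Long], // partition path -> writes so far
      flush: () => Unit                      // forces buffered data to be flushed
  )

  // Example business rule: flush each time the total number of writes
  // across all reported partitions reaches a multiple of `flushEvery`.
  def handler(flushEvery: Long)(state: WriteState): Unit = {
    val totalWrites = state.modifiedPartitions.values.sum
    if (totalWrites > 0 && totalWrites % flushEvery == 0) state.flush()
  }
}
```

The rule assumes the counts are cumulative and that the handler runs once per write; if either assumption does not hold for your version of the library, a different trigger (elapsed time, record content) may be a better fit.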

> Is there a way to use this mechanism such that you know when a file has been written and which file it was?

The handler receives a parameter (PartitionState) that tells you which partitions (not which particular files) have been modified and how many writes have been made to those partitions so far. Looking at those counts, I think there might be a bug there; I need to check.
There's no handler for file disposal.
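
Since the handler reports partitions rather than files, one possible workaround (not something the library provides, and not proposed in this thread) is to snapshot the Parquet files under the output directory before the write and compare afterwards. The sketch below assumes a local file system reachable through `java.nio.file` and that the writer has already closed its files; `NewFilesSketch`, `parquetFiles`, and `newFiles` are hypothetical names.

```scala
import java.nio.file.{Files, Path}
import scala.jdk.CollectionConverters._

object NewFilesSketch {
  // All regular *.parquet files currently under `dir`, searched recursively.
  def parquetFiles(dir: Path): Set[Path] = {
    val stream = Files.walk(dir)
    try
      stream.iterator().asScala
        .filter(p => Files.isRegularFile(p) && p.getFileName.toString.endsWith(".parquet"))
        .toSet // materialise before the stream is closed
    finally stream.close()
  }

  // Files present now that were absent from the earlier snapshot.
  def newFiles(before: Set[Path], dir: Path): Set[Path] =
    parquetFiles(dir) -- before
}
```

The resulting paths could then be registered with the Iceberg table. Files that are still open for writing may show up in a listing, so the comparison is only meaningful once the write stream has completed.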
