
feat(python, rust): allow for deferring optimize.compact() commits in python client #2267

Closed
wants to merge 5 commits

Conversation


@gacharya gacharya commented Mar 8, 2024

Description

This pull request allows for deferring commits when calling optimize.compact().

This is useful when running a compaction thread asynchronously. For example, an application may run separate writer and compaction threads. We cannot run them completely asynchronously, as simultaneous commits to the transaction log may result in a race condition and subsequent data loss. Having a separate commit step allows the compaction to run asynchronously and only block the writer when committing, which takes far less time than compacting.

This commit also releases the GIL in the compact_optimize function in src/lib.rs.
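For illustration, a minimal sketch of the intended call pattern from the Python side. The `commit=False` flag, the returned commit context, and the separate `commit(...)` call are placeholder names for what this PR proposes, not a finalized API surface:

```python
from deltalake import DeltaTable

dt = DeltaTable("s3://bucket/my-table")  # example table location

# Run compaction but defer writing the commit to the Delta log
# (placeholder flag and return value for the API proposed in this PR).
metrics, commit_context = dt.optimize.compact(commit=False)

# ... the writer thread can keep committing in the meantime ...

# Later, apply the deferred compaction commit in one short step.
dt.commit(commit_context)  # placeholder name for the surfaced commit API
```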

Related Issue(s)

Documentation

gautham acharya and others added 4 commits March 8, 2024 07:39
fix: clean up unnecessary mut and dead code warnings
allow for optimize without commits, surface commit API to python client for deferred committing
@github-actions github-actions bot added the binding/python and binding/rust labels Mar 8, 2024

github-actions bot commented Mar 8, 2024

ACTION NEEDED

delta-rs follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

@gacharya gacharya changed the title feat(python, rust): Allow for deferring optimize.compact() commits in python client feat(python, rust): allow for deferring optimize.compact() commits in python client Mar 8, 2024
Some(CommitContext {
    actions: total_actions,
    app_metadata,
    snapshot: Some(snapshot.clone()), // using original table snapshot as all commits are added at once
Author


Not sure if I should have used table.snapshot()? instead of snapshot.clone().

table.snapshot() is the updated version; if any commits happened while this was running, they won't be picked up by snapshot.clone(). Is that correct?

@gacharya gacharya force-pushed the main branch 2 times, most recently from 1099229 to 1b30730 on March 9, 2024 at 04:49
@Blajda
Collaborator

Blajda commented Mar 10, 2024

Hi @gacharya, I'd like to hear more about the race conditions and data loss you are encountering while using optimize. Even with concurrent operations on the table this must not occur; the expected behavior is for the conflict checker to abort the commit to prevent any corruption.

There is an existing PR #2208 to address this problem.
If the above PR is merged, is there still a need for this change? Once the corruption issues are resolved, optimize can be scheduled periodically / right after all writers are finished.

@gacharya
Author

gacharya commented Mar 11, 2024

> Would like to hear more about the race conditions and data loss you are encountering while using optimize. Even with concurrent operations on the table this must not occur; the expected behavior is for the conflict checker to abort the commit to prevent any corruption.

@Blajda Happy to clarify! In this case, we have two threads -- a writer thread and a compaction thread.

We want the writers to be able to continuously write without stopping. We do not want to halt writing in order to run a compaction process, due to latency concerns.

Therefore, I've created this change, which allows the optimizer to compact a partition while the writer continues to commit. Then, we temporarily halt the writer while committing the optimize.compact() results.

Essentially, writes will be happening constantly -- every second. We want to halt writes as little as possible, and halting the writer for the entire compaction would be undesirably slow. So, I've added this change, which allows the client to halt the writer only while committing the optimize.compact() results. The commit is fast, so halting the writer during the commit is not a concern.

> If the above PR is merged, is there still a need for this change? Once the corruption issues are resolved, optimize can be scheduled periodically / right after all writers are finished.

Yes, this is still necessary, as the use case here is being able to compact and then commit separately. Here is a timeline that demonstrates the use case:

  1. Writer is writing to table
  2. Optimization starts
  3. Writer continues to write to table
  4. Optimization completes, but is not committed
  5. Writer continues to write to table
  6. Optimizer thread acquires lock, writer thread pauses
  7. Optimizer thread commits
  8. Writer thread resumes writing to table.

This ensures that we can constantly write to the table, while continuously compacting without dealing with the small file problem.
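A minimal sketch of that timeline with two Python threads, assuming the deferred-commit API proposed in this PR (`compact(commit=False)` returning a commit context and a separate `commit(...)` call are placeholder names, not confirmed signatures). Only the final commit step holds the lock shared with the writer:

```python
import threading

import pyarrow as pa
from deltalake import DeltaTable, write_deltalake

TABLE_URI = "s3://bucket/my-table"  # example table location
commit_lock = threading.Lock()      # serializes commits between the two threads

def writer_loop(stop: threading.Event) -> None:
    while not stop.is_set():
        batch = pa.table({"id": [1, 2, 3]})  # stand-in for real data
        with commit_lock:  # steps 1, 3, 5: writer keeps appending
            write_deltalake(TABLE_URI, batch, mode="append")

def compaction_loop() -> None:
    dt = DeltaTable(TABLE_URI)
    # Steps 2-4: rewrite small files without touching the log
    # (placeholder flag/return value for this PR's proposal).
    metrics, commit_context = dt.optimize.compact(commit=False)
    # Steps 6-8: block the writer only for the short commit.
    with commit_lock:
        dt.commit(commit_context)  # placeholder deferred-commit call
```

The lock here stands in for whatever coordination the application already uses between its writer and compaction threads; the point is that it is held for the commit only, never for the compaction itself.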

@Blajda
Collaborator

Blajda commented Mar 13, 2024

@gacharya are your writers only appending / inserting data to the table? If so, then I still don't see a need for this locking mechanism. You can have a background process continuously execute optimize without conflicts, as mentioned in this article.

Sounds like the root cause of your issue lies with the current conflict checker implementation. The behavior you are observing is not consistent with the mentioned article: when you attempt to execute optimize with concurrent writers, a failure occurs somewhere in your workload.

Fixing the conflict checker will result in greater benefits, since fine-grained locking between writers is removed.

@gacharya
Author

gacharya commented Mar 13, 2024

> @gacharya are your writers only appending / inserting data to the table? If so, then I still don't see a need for this locking mechanism. You can have a background process continuously execute optimize without conflicts, as mentioned in this article.
>
> Sounds like the root cause of your issue lies with the current conflict checker implementation. The behavior you are observing is not consistent with the mentioned article: when you attempt to execute optimize with concurrent writers, a failure occurs somewhere in your workload.
>
> Fixing the conflict checker will result in greater benefits, since fine-grained locking between writers is removed.

Can you elaborate on how the conflict checker resolves the issue that I am seeing?

From first principles, if two threads are trying to commit at the same time, we need to make sure the commits are serialized. Otherwise, with S3 being an eventually consistent store, we may have a race condition and subsequent data loss. The commit file cannot be concurrently written to.

@Blajda
Collaborator

Blajda commented Mar 13, 2024

> From first principles, if two threads are trying to commit at the same time, we need to make sure the commits are serialized. Otherwise, with S3 being an eventually consistent store, we may have a race condition and subsequent data loss.

Got it. The fact that you are writing to S3 storage without a locking client was a key piece of information that was missing. If you don't want to use the existing DynamoDB implementation due to latency requirements, then I recommend creating your own custom log store with your locking logic there.

Once your writers are configured to use your log store, you can use Delta operations without needing an escape hatch.
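For reference, the DynamoDB-backed locking mentioned above can be enabled from the Python client through storage options; a rough sketch is below. The exact option keys vary across delta-rs versions, so treat these as assumptions to verify against the docs for the release in use:

```python
import pyarrow as pa
from deltalake import write_deltalake

storage_options = {
    "AWS_S3_LOCKING_PROVIDER": "dynamodb",
    # The table-name key differs between releases (e.g. DELTA_DYNAMO_TABLE_NAME
    # vs. DYNAMO_LOCK_TABLE_NAME); check the docs for the version in use.
    "DELTA_DYNAMO_TABLE_NAME": "delta_log_lock",
}

data = pa.table({"id": [1, 2, 3]})  # stand-in for real data

write_deltalake(
    "s3://bucket/my-table",
    data,
    mode="append",
    storage_options=storage_options,
)
```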

@Blajda
Collaborator

Blajda commented Mar 13, 2024

One reason I'm a bit hesitant to merge this is that it creates a special escape hatch for operations, i.e. perform the work for the operation but don't commit it to the log. If we want to enable this type of functionality then I'd recommend waiting for #2154 to be merged. Then remove your commit_writes and instead have a to_pre_commit which returns a PreCommit (i.e. similar to your CommitContext) and the operation's metrics.
You can then finalize your PreCommit to the log however you wish.
