
feat(python, rust): allow for deferring optimize.compact() commits in python client #2267

Closed
wants to merge 5 commits

Conversation


@gacharya gacharya commented Mar 8, 2024

Description

This pull request allows for deferring commits when calling optimize.compact().

This is useful when running a compaction thread asynchronously. For example, an application may run separate writer and compaction threads. We cannot run them completely asynchronously, as simultaneous commits to the transaction log may result in a race condition and subsequent data loss. Having a separate commit step allows the compaction to run asynchronously and only block the writer when committing, which takes far less time than compacting.

This commit also releases the GIL in the compact_optimize function in src/lib.rs.
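For illustration, a minimal sketch of the intended call pattern from the Python side. The `commit=False` flag, the returned commit context, and the separate `commit(...)` call are placeholder names for what this PR proposes, not a finalized API surface:

```python
from deltalake import DeltaTable

dt = DeltaTable("s3://bucket/my-table")  # example table location

# Run compaction but defer writing the commit to the Delta log
# (placeholder flag and return value for the API proposed in this PR).
metrics, commit_context = dt.optimize.compact(commit=False)

# ... the writer thread can keep committing in the meantime ...

# Later, apply the deferred compaction commit in one short step.
dt.commit(commit_context)  # placeholder name for the surfaced commit API
```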

Related Issue(s)

Documentation

gautham acharya and others added 4 commits March 8, 2024 07:39
fix: clean up unnecessary mut and dead code warnings
allow for optimize without commits, surface commit API to python client for deferred committing
@github-actions github-actions bot added the binding/python and binding/rust labels Mar 8, 2024

github-actions bot commented Mar 8, 2024

ACTION NEEDED

delta-rs follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

@gacharya gacharya changed the title feat(python, rust): Allow for deferring optimize.compact() commits in python client feat(python, rust): allow for deferring optimize.compact() commits in python client Mar 8, 2024
Some(CommitContext {
    actions: total_actions,
    app_metadata,
    snapshot: Some(snapshot.clone()), // using original table snapshot as all commits are added at once
Author


Not sure if I should have used table.snapshot()? instead of snapshot.clone().

table.snapshot() is the updated version; if any commits happened while this was running, they won't be picked up by snapshot.clone(). Is that correct?

@gacharya gacharya force-pushed the main branch 2 times, most recently from 1099229 to 1b30730 on March 9, 2024 at 04:49
@Blajda
Collaborator

Blajda commented Mar 10, 2024

Hi @gacharya, I'd like to hear more about the race conditions and data loss you are encountering while using optimize. Even with concurrent operations on the table this must not occur; the expected behavior is for the conflict checker to abort the commit to prevent any corruption.

There is an existing PR #2208 to address this problem.
If the above PR is merged, is there still a need for this change? Once the corruption issues are resolved, optimize can be scheduled periodically / right after all writers are finished.

@gacharya
Author

gacharya commented Mar 11, 2024

> Would like to hear more about the race conditions and data loss you are encountering while using optimize. Even with concurrent operations on the table this must not occur; the expected behavior is for the conflict checker to abort the commit to prevent any corruption.

@Blajda Happy to clarify! In this case, we have two threads -- a writer thread and a compaction thread.

We want the writers to be able to continuously write without stopping. We do not want to halt writing in order to run a compaction process, due to latency concerns.

Therefore, I've created this change, which allows the optimizer to compact a partition while the writer continues to commit. Then, we temporarily halt the writer while committing the optimize.compact() results.

Essentially, writes will be happening constantly -- every second. We want to halt writes as little as possible, and halting the writer for the entire compaction would be undesirably slow. So, I've added this change, which allows the client to halt the writer only while committing the optimize.compact() results. The commit is fast, so halting the writer during the commit is not a concern.

> If the above PR is merged, is there still a need for this change? Once the corruption issues are resolved, optimize can be scheduled periodically / right after all writers are finished.

Yes, this is still necessary, as the use case here is being able to compact and then commit separately. Here is a timeline that demonstrates the use case:

  1. Writer is writing to table
  2. Optimization starts
  3. Writer continues to write to table
  4. Optimization completes, but is not committed
  5. Writer continues to write to table
  6. Optimizer thread acquires lock, writer thread pauses
  7. Optimizer thread commits
  8. Writer thread resumes writing to table.

This ensures that we can constantly write to the table, while continuously compacting without dealing with the small file problem.
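A minimal sketch of that timeline with two Python threads, assuming the deferred-commit API proposed in this PR (`compact(commit=False)` returning a commit context and a separate `commit(...)` call are placeholder names, not confirmed signatures). Only the final commit step holds the lock shared with the writer:

```python
import threading

import pyarrow as pa
from deltalake import DeltaTable, write_deltalake

TABLE_URI = "s3://bucket/my-table"  # example table location
commit_lock = threading.Lock()      # serializes commits between the two threads

def writer_loop(stop: threading.Event) -> None:
    while not stop.is_set():
        batch = pa.table({"id": [1, 2, 3]})  # stand-in for real data
        with commit_lock:  # steps 1, 3, 5: writer keeps appending
            write_deltalake(TABLE_URI, batch, mode="append")

def compaction_loop() -> None:
    dt = DeltaTable(TABLE_URI)
    # Steps 2-4: rewrite small files without touching the log
    # (placeholder flag/return value for this PR's proposal).
    metrics, commit_context = dt.optimize.compact(commit=False)
    # Steps 6-8: block the writer only for the short commit.
    with commit_lock:
        dt.commit(commit_context)  # placeholder deferred-commit call
```

The lock here stands in for whatever coordination the application already uses between its writer and compaction threads; the point is that it is held for the commit only, never for the compaction itself.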

@Blajda
Collaborator

Blajda commented Mar 13, 2024

@gacharya are your writers only appending / inserting data to the table? If so, then I still don't see a need for this locking mechanism. You can have a background process continuously execute optimize without conflicts, as mentioned in this article.

Sounds like the root cause of your issue lies with the current conflict checker implementation. The behavior you are observing is not consistent with the mentioned article: when you attempt to execute optimize with concurrent writers, a failure occurs somewhere in your workload.

Fixing the conflict checker will result in greater benefits, since fine-grained locking between writers is removed.

@gacharya
Author

gacharya commented Mar 13, 2024

> @gacharya are your writers only appending / inserting data to the table? If so, then I still don't see a need for this locking mechanism. You can have a background process continuously execute optimize without conflicts, as mentioned in this article.
>
> Sounds like the root cause of your issue lies with the current conflict checker implementation. The behavior you are observing is not consistent with the mentioned article: when you attempt to execute optimize with concurrent writers, a failure occurs somewhere in your workload.
>
> Fixing the conflict checker will result in greater benefits, since fine-grained locking between writers is removed.

Can you elaborate on how the conflict checker resolves the issue that I am seeing?

From first principles, if two threads are trying to commit at the same time, we need to make sure the commits are serialized. Otherwise, with S3 being an eventually consistent store, we may have a race condition and subsequent data loss. The commit file cannot be concurrently written to.

@Blajda
Collaborator

Blajda commented Mar 13, 2024

> From first principles, if two threads are trying to commit at the same time, we need to make sure the commits are serialized. Otherwise, with S3 being an eventually consistent store, we may have a race condition and subsequent data loss.

Got it. The fact that you are writing to S3 storage without a locking client was a key piece of information that was missing. If you don't want to use the existing DynamoDB implementation due to latency requirements, then I recommend creating your own custom log store with your locking logic there.

Once your writers are configured to use your log store, you can use Delta operations without needing an escape hatch.
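For reference, the DynamoDB-backed locking mentioned above can be enabled from the Python client through storage options; a rough sketch is below. The exact option keys vary across delta-rs versions, so treat these as assumptions to verify against the docs for the release in use:

```python
import pyarrow as pa
from deltalake import write_deltalake

storage_options = {
    "AWS_S3_LOCKING_PROVIDER": "dynamodb",
    # The table-name key differs between releases (e.g. DELTA_DYNAMO_TABLE_NAME
    # vs. DYNAMO_LOCK_TABLE_NAME); check the docs for the version in use.
    "DELTA_DYNAMO_TABLE_NAME": "delta_log_lock",
}

data = pa.table({"id": [1, 2, 3]})  # stand-in for real data

write_deltalake(
    "s3://bucket/my-table",
    data,
    mode="append",
    storage_options=storage_options,
)
```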

@Blajda
Collaborator

Blajda commented Mar 13, 2024

One reason I'm a bit hesitant to merge this is that it creates a special escape hatch for operations, i.e. perform the work for the operation but don't commit it to the log. If we want to enable this type of functionality then I'd recommend waiting for #2154 to be merged. Then remove your commit_writes and instead have a to_pre_commit which returns a PreCommit (i.e. similar to your CommitContext) and the operation's metrics.
You can then finalize your PreCommit to the log however you wish.
