Implement Delta Lake as storage backend #5350

Open
pfwnicks opened this issue Mar 21, 2024 · 1 comment
Labels
feature Change that does not break compatibility, but affects the public interfaces.

Comments

@pfwnicks
Motivation

Delta Lake is a performant, robust storage layer that provides ACID (atomicity, consistency, isolation, durability) transactions. That would make it a good alternative to an SQL table, especially when computing in a parallel environment.

Description

It would be nice to have delta-rs (https://github.com/delta-io/delta-rs) integrated into the framework in some way. As a first step, this could probably be done in a similar way to the JournalFileStorage and JournalStorage implementations. In the long run, it would be nice to have it available as a fully fledged storage backend, so that one could just provide a URL pointing to a local or remote path, plus some storage options to configure access to that path.
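To make the "similar to JournalFileStorage" idea concrete, here is a minimal sketch of the kind of append-only log interface such a backend would probably need to satisfy. The method names `append_logs` and `read_logs` mirror what I understand the journal backends to expose; the class name `AppendOnlyLogBackend` is hypothetical, and a local JSON-lines file is used purely as a stand-in so the example runs without delta-rs installed. It is not the implementation proposed in this issue.

```python
import json
import os
import tempfile
from typing import Any, Dict, List


class AppendOnlyLogBackend:
    """Hypothetical stand-in for a journal-style log backend.

    Assumption: a Delta Lake version would keep the same two methods and
    replace the file I/O below with delta-rs table writes and reads.
    """

    def __init__(self, path: str) -> None:
        self._path = path

    def append_logs(self, logs: List[Dict[str, Any]]) -> None:
        # Append each log entry as one JSON line. In a Delta Lake version
        # this would plausibly become an append-mode table write.
        with open(self._path, "a", encoding="utf-8") as f:
            for log in logs:
                f.write(json.dumps(log) + "\n")

    def read_logs(self, log_number_from: int) -> List[Dict[str, Any]]:
        # Return every log entry whose position is >= log_number_from.
        if not os.path.exists(self._path):
            return []
        with open(self._path, "r", encoding="utf-8") as f:
            lines = f.readlines()
        return [json.loads(line) for line in lines[log_number_from:]]


def demo() -> List[Dict[str, Any]]:
    # Write three entries in two batches, then read from position 1 onward.
    with tempfile.TemporaryDirectory() as tmp:
        backend = AppendOnlyLogBackend(os.path.join(tmp, "journal.log"))
        backend.append_logs([{"op": "create_study"}, {"op": "create_trial"}])
        backend.append_logs([{"op": "set_trial_value", "value": 0.5}])
        return backend.read_logs(1)
```

This stand-in does no locking, so it is not safe for concurrent writers. Ordering and atomicity of appends are exactly the part that Delta Lake's transaction log would be expected to provide.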

Alternatives (optional)

As a first step, it could of course be implemented along the lines of JournalFileStorage. I am working on this already and will post some code when it is ready.

Additional context (optional)

https://delta-io.github.io/delta-rs/how-delta-lake-works/delta-lake-acid-transactions/

https://delta-io.github.io/delta-rs/

pfwnicks added the feature label on Mar 21, 2024
@Ademord
Ademord commented Apr 25, 2024

@pfwnicks I came here to ask because I am running into problems in a distributed tuning scenario where SQLite is crashing, I assume because of shared access. Have you tried "JournalFileStorage", and do you know if it helps? How would it differ from Delta Lake?
