Merge Two Deeplake datasets #2187

Closed Answered by istranic
gilvikra asked this question in General
Feb 18, 2023 · 1 comments · 2 replies

Hey @gilvikra thank you for raising this question. We have partial solutions to your issue, and we also have additional improvements on our roadmap. Here's a summary.

  • Currently, the best way to accelerate ingestion is via the deeplake.compute API. It only works on a single machine, but it uses multiple processes and threads.
  • We're releasing speedups to the API above later this week. Here's the PR.
  • In the future, we will extend the API above to work using Ray on multi-node systems.
  • A very popular request from users has been to combine datasets virtually (i.e. datasets are stored separately, but can be concatenated into a single virtual dataset that can be used in the Deep Lake API). We wi…
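
To make the "virtual dataset" idea in the last bullet concrete, here is a minimal sketch in plain Python. It is not the Deep Lake API: the class name `VirtualConcat` is hypothetical, and the "datasets" are assumed to be ordinary indexable sequences. It only illustrates how separately stored datasets could be exposed as a single indexable view without copying any data.

```python
from bisect import bisect_right
from itertools import accumulate


class VirtualConcat:
    """Read-only view that presents several sequences as one, without copying.

    Hypothetical illustration only; this is not a Deep Lake class.
    """

    def __init__(self, *parts):
        self._parts = parts
        # Cumulative end offsets, e.g. part lengths (3, 2) give [3, 5].
        self._ends = list(accumulate(len(p) for p in parts))

    def __len__(self):
        return self._ends[-1] if self._ends else 0

    def __getitem__(self, index):
        if not isinstance(index, int):
            raise TypeError("only integer indices are supported in this sketch")
        if index < 0:
            index += len(self)
        if not 0 <= index < len(self):
            raise IndexError(index)
        # Find the first part whose end offset lies beyond the global index.
        part = bisect_right(self._ends, index)
        start = self._ends[part - 1] if part else 0
        return self._parts[part][index - start]


# Two stand-in "datasets" that stay stored separately.
ds_a = ["a0", "a1", "a2"]
ds_b = ["b0", "b1"]

combined = VirtualConcat(ds_a, ds_b)
print(len(combined))  # total number of samples across both parts
print(combined[3])    # first sample of the second part
```

The lookup uses a binary search over cumulative lengths, so indexing stays cheap even when many datasets are combined. Writes, slicing, and schema checks between the parts are left out of this sketch.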

Answer selected by tatevikh
2 participants