-
I like this lib, but I am now having a storage issue with just a small number of objects. In my use case, it fetches more embeddings and calls save_collection at a fixed interval. My question is: what is the right way to call save_collection and persist the collection?
-
Hey, first of all, thank you for using OasysDB 😁 It'd be great if you could share a minimal reproducible example of the code you use to save the collection so I can help you troubleshoot it better. But in general, what you want to do is:
# This is the directory the database will be stored in.
# This directory shouldn't change unless you want to create a new database.
db = Database("data/example")
# Create a collection or get collection from the database.
# ...
# If you only want to save an existing collection, make sure to use the same namespace.
# Otherwise, it will create a new collection.
db.save_collection("my_collection", collection)
Let me know if that helps 😁
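For the periodic-save use case from the question, the pattern above could be sketched roughly like this. This is only an illustration, not OasysDB code: `save_after_each_batch`, `add_batch`, and `FakeDatabase` are hypothetical names, and the fake database is an in-memory stand-in so the snippet runs without the library. The only call taken from the reply above is `save_collection(name, collection)`.

```python
from typing import Any, Callable, Iterable


def save_after_each_batch(
    db: Any,
    name: str,
    collection: Any,
    batches: Iterable[Any],
    add_batch: Callable[[Any, Any], None],
) -> int:
    """Add each batch to the collection, then save it under one fixed name."""
    saves = 0
    for batch in batches:
        add_batch(collection, batch)
        # Reuse the same name every time; per the reply above, a different
        # name would create a new collection instead of updating this one.
        db.save_collection(name, collection)
        saves += 1
    return saves


class FakeDatabase:
    """Hypothetical in-memory stand-in, only here to make the sketch runnable."""

    def __init__(self) -> None:
        self.saved = {}

    def save_collection(self, name: str, collection: Any) -> None:
        # Store a snapshot of the collection under the given name.
        self.saved[name] = list(collection)


fake_db = FakeDatabase()
records = []
count = save_after_each_batch(
    fake_db,
    "my_collection",
    records,
    [[1, 2], [3]],
    lambda coll, batch: coll.extend(batch),
)
```

With a real database object in place of `FakeDatabase`, the point is the same: one database directory, one collection name, and a save after each fetch.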
-
Hey @Catstyle 👋 Sorry for the ping. I came up with a solution in PR #84. This adds 2 methods. I'll release v0.4.5 with this change, so you should be able to use it soon. After you try out this solution, please let me know if the new feature solves your issue.
-
Hi @Catstyle, I just want to let you know that we just released v0.5.0, which should solve the problem we had when performing the save operation repeatedly. I have tested this functionality in a notebook, and the database size is around 70MB for both a one-time save and periodic saves. Please let me know if this works for you 😁
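If you want to check the on-disk size yourself after a one-time save versus periodic saves, a small helper like the one below could do it. This is a generic sketch using only the Python standard library. `directory_size_bytes` is a hypothetical name, not part of OasysDB.

```python
from pathlib import Path


def directory_size_bytes(path: str) -> int:
    """Return the total size in bytes of all regular files under path."""
    return sum(f.stat().st_size for f in Path(path).rglob("*") if f.is_file())
```

For example, calling `directory_size_bytes("data/example")` after each save and printing the result lets you see whether the database directory keeps growing between saves.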
You are actually right!
I tested the code you provided against the optimal way to do persistence, and they produce totally different results:
https://colab.research.google.com/drive/1bCOa0e6F8eZvRVo89h5CXp66V9438K4y?usp=sharing
This is the most storage-efficient way to save a collection.