using BadgerDB: DropCollection returns a "Txn is too big to fit into one request" #147

Open
willie68 opened this issue Mar 4, 2024 · 2 comments


willie68 commented Mar 4, 2024

I just ran a simple test: importing 1,000,000 simple documents and then calling DropCollection.
It returns the error "Txn is too big to fit into one request".
I think the error comes from Clover's DropCollection implementation, which simply tries to delete all documents in a single BadgerDB transaction. However, BadgerDB transactions are limited in size. In my case, the limit is reached at around 35,000 documents.
Specifically, the limit is 15% of the table size.
For more information: dgraph-io/badger#1325

Owner

ostafen commented Mar 4, 2024

Hey, @willie68, the quickest option I see is to delete documents in batches inside DropCollection().
However, such an approach would tie clover a bit too closely to a specific storage engine, in this case badgerdb, so I'm not sure about the benefit of implementing it, since different storage engines may have very different characteristics and limitations.
Since you can achieve the same result by calling clover's Delete() method and selecting documents in batches (using offset and limit), I would suggest doing that.
Does this help you?
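
A minimal sketch of the batching idea, for illustration only. It does not use clover's real API: `batchDeleter`, `DeleteBatch`, `memStore` and `errTxnTooBig` are hypothetical names, and the in-memory fake merely imitates a per-transaction limit. With clover, the body of `DeleteBatch` would presumably be a Delete() call on a query restricted with a limit. Note that the offset can stay at zero, because every deleted batch shifts the remaining documents to the front.

```go
package main

import (
	"errors"
	"fmt"
)

// errTxnTooBig stands in for the storage engine's "transaction too big" error.
var errTxnTooBig = errors.New("txn is too big to fit into one request")

// batchDeleter is a hypothetical abstraction, not part of clover's API.
// DeleteBatch removes at most limit documents from the collection in one
// transaction and reports how many were removed.
type batchDeleter interface {
	DeleteBatch(collection string, limit int) (int, error)
}

// deleteAllInBatches empties a collection by repeatedly deleting up to
// batchSize documents until a batch comes back short. It returns the total
// number of documents removed.
func deleteAllInBatches(d batchDeleter, collection string, batchSize int) (int, error) {
	if batchSize <= 0 {
		return 0, errors.New("batchSize must be positive")
	}
	total := 0
	for {
		n, err := d.DeleteBatch(collection, batchSize)
		if err != nil {
			return total, err
		}
		total += n
		if n < batchSize { // a short batch means nothing is left
			return total, nil
		}
	}
}

// memStore is an in-memory fake. It rejects any transaction that would touch
// more than maxTxn documents, imitating a transaction size limit.
type memStore struct {
	docs   map[string]int // collection name -> number of documents
	maxTxn int
}

func (s *memStore) DeleteBatch(collection string, limit int) (int, error) {
	n := s.docs[collection]
	if n > limit {
		n = limit
	}
	if n > s.maxTxn {
		return 0, errTxnTooBig
	}
	s.docs[collection] -= n
	return n, nil
}

func main() {
	s := &memStore{docs: map[string]int{"events": 1_000_000}, maxTxn: 35_000}
	total, err := deleteAllInBatches(s, "events", 10_000)
	fmt.Println(total, err, s.docs["events"])
}
```

The batch size has to stay below whatever the engine accepts per transaction. The 35,000 used for the fake is only the figure observed in this issue, not a general constant.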

Author

willie68 commented Mar 4, 2024

Thank you for the fast feedback.

First I tried bbolt, but it already failed at the query stage. Even a simple search with
results, err := db.FindAll(q.NewQuery(dbTable).Where(q.Field("datatime").Lt(queryTime)))
failed (there was an index on datatime). In addition, the import performance was not sufficient for 1,000,000 records. That's why I am trying BadgerDB, which I have already used successfully in other projects.
Regarding DropCollection: of course I could do this manually. Since I only have one collection at the moment, it is easier in unit tests to simply delete the database files from the file system. In the main application, DropCollection will never be executed.
However, I think it is important that the functions offered work with every storage option.
Perhaps you could extend the store interface and put the engine-specific implementation into store/badger/badger.go.
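
One way such an extension could look is an optional interface that a storage engine may implement, with a generic fallback for engines that do not. This is a sketch under assumptions: `Store`, `PrefixDropper`, `mapStore` and `fastStore` are hypothetical names, and the reduced `Store` interface here is not clover's actual one. A BadgerDB-backed store could, to my knowledge, satisfy the optional interface by delegating to Badger's own DropPrefix.

```go
package main

import (
	"fmt"
	"strings"
)

// Store is a hypothetical, heavily reduced storage interface.
type Store interface {
	Keys(prefix string) []string
	DeleteAll(keys []string) error // deletes all keys in one transaction
}

// PrefixDropper is a hypothetical optional interface. Engines with an
// efficient bulk delete implement it; others simply leave it out.
type PrefixDropper interface {
	DropPrefix(prefix string) error
}

// dropCollection uses the engine-specific path when the store offers one
// and falls back to the generic single-transaction delete otherwise.
func dropCollection(s Store, prefix string) error {
	if d, ok := s.(PrefixDropper); ok {
		return d.DropPrefix(prefix)
	}
	return s.DeleteAll(s.Keys(prefix))
}

// mapStore is a generic in-memory store without a fast path.
type mapStore struct{ data map[string]string }

func (m *mapStore) Keys(prefix string) []string {
	var keys []string
	for k := range m.data {
		if strings.HasPrefix(k, prefix) {
			keys = append(keys, k)
		}
	}
	return keys
}

func (m *mapStore) DeleteAll(keys []string) error {
	for _, k := range keys {
		delete(m.data, k)
	}
	return nil
}

// fastStore adds the optional fast path and counts how often it is used.
type fastStore struct {
	mapStore
	fastPathCalls int
}

func (f *fastStore) DropPrefix(prefix string) error {
	f.fastPathCalls++
	return f.DeleteAll(f.Keys(prefix))
}

func main() {
	plain := &mapStore{data: map[string]string{"c1/a": "1", "c2/a": "2"}}
	fast := &fastStore{mapStore: mapStore{data: map[string]string{"c1/a": "1", "c2/a": "2"}}}
	fmt.Println(dropCollection(plain, "c1/"), len(plain.data))
	fmt.Println(dropCollection(fast, "c1/"), len(fast.data), fast.fastPathCalls)
}
```

The type assertion keeps the core interface unchanged, so engine-specific limits stay inside the engine's own package.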

PS: I just found the problem with the time queries in bbolt, but the performance issue remains.
Inserting 100,000 simple records:
bbolt: 4m22.2910021s
badger: 2.9658109s
