[Question] How to produce dataset with millions of data in batches #677

Q-Bug4 · 2024-04-03T12:36:48Z

Hi, we love using Hollow; it is very nice.

Is there a proper way to produce data in batches? For example, I have 10 million objects to produce, and I want to split them into 10 parts and produce 1 million objects at a time. I need to produce in batches because my VM does not have enough memory to hold 10 million objects.
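To make the batching concrete, here is a minimal sketch in plain Java that does not depend on Hollow. It reads from an iterator and hands over one fixed-size batch at a time, so only one batch is held in memory. `BatchRunner` and `runInBatches` are hypothetical names used only for illustration and are not part of Hollow. The `sink` callback is assumed to be the place where one producer cycle would run for that batch.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper for illustration only; not a Hollow class.
final class BatchRunner {
    private BatchRunner() {}

    /**
     * Drains the source iterator in batches of at most batchSize elements
     * and passes each batch to the sink. Returns the number of batches.
     */
    static <T> int runInBatches(Iterator<T> source, int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        List<T> batch = new ArrayList<>(batchSize);
        while (source.hasNext()) {
            batch.add(source.next());
            if (batch.size() == batchSize) {
                // Assumption: one producer cycle would be run here per batch.
                sink.accept(batch);
                batches++;
                // Start a fresh list so the previous batch can be garbage collected.
                batch = new ArrayList<>(batchSize);
            }
        }
        if (!batch.isEmpty()) {
            // Final, possibly smaller, batch.
            sink.accept(batch);
            batches++;
        }
        return batches;
    }
}
```

With 10 million objects and a batch size of 1 million, the sink would be called 10 times, each time with 1 million objects.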

I am using the Incremental producer with withNumStatesBetweenSnapshots so that a snapshot is published only at the beginning and at the end, which makes it behave as if it ran "in batches". However, I ran into a problem: sometimes the Incremental producer does not publish the dataset, because some batches do not change it.
I have forked hollow-reference-implementation and added 2 test cases to show what we are looking for. You can check my test cases: ProducerTest
