This repository has been archived by the owner on May 13, 2021. It is now read-only.

scout:import on large data (5 million records) taking a lot of time #58

Closed
Kladislav opened this issue Nov 9, 2020 · 8 comments

Comments

@Kladislav

No description provided.

@Kladislav Kladislav changed the title Import on large data (5 million records) taking a lot of time scout:import on large data (5 million records) taking a lot of time Nov 9, 2020
@shokme
Collaborator

shokme commented Nov 9, 2020

@curquiza do you have any idea whether this is related to Scout or to MeiliSearch? I have never worked with such a large dataset.

@curquiza
Member

curquiza commented Nov 9, 2020

Hello @Kladislav and @shokme!
This might indeed be linked to MeiliSearch and not to scout.
@Kladislav, can you explain what you mean by "taking a lot of time"? Do you get any error message? Does the update take a long time to be processed? How long does it take?

@Kladislav
Author

Kladislav commented Nov 10, 2020

I don't know. There are no errors and the API works fine, but after 18 hours only 2 of the 5 million records had been imported. Can I speed it up?

@shokme
Collaborator

shokme commented Nov 11, 2020

If you aren't already doing so, you can try running multiple queues with Scout.

@curquiza
Member

Hello @Kladislav!
MeiliSearch allows you to send documents in batches. You can raise the maximum size of these batches with this MeiliSearch parameter: https://docs.meilisearch.com/guides/advanced_guides/configuration.html#payload-limit-size.
A very large number of batches leads to a very long indexing time, so you should increase the number of documents sent per batch to reduce it. But be careful not to make the batches too big: with big batches the memory usage can be high, and MeiliSearch could be killed.
If your documents have around 20 fields, you can try sending them in batches of 10,000 documents.
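To make the trade-off concrete, here is a minimal Python sketch of the batching idea. It is not Scout's or MeiliSearch's actual code: `chunked` and `count_batches` are hypothetical helpers introduced only for illustration, and the batch size of 500 is an assumed smaller value used for comparison, not a figure from this thread.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def chunked(items: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Yield consecutive lists of at most `size` items (hypothetical helper)."""
    if size <= 0:
        raise ValueError("size must be positive")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def count_batches(total_documents: int, batch_size: int) -> int:
    """Number of batches needed: the ceiling of total_documents / batch_size."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return -(-total_documents // batch_size)


# 25 toy documents split into batches of 10: two full batches and one partial.
toy_documents = ({"id": i} for i in range(25))
batch_lengths = [len(batch) for batch in chunked(toy_documents, 10)]

# For 5 million documents, an assumed batch size of 500 means 10,000 batches,
# while the batch size of 10,000 suggested above means only 500 batches.
small_batches = count_batches(5_000_000, 500)
large_batches = count_batches(5_000_000, 10_000)
```

Fewer, larger batches mean fewer updates for MeiliSearch to process, at the cost of more memory per update, which is why the batch size should be raised gradually rather than set as high as possible.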

Also, we are actively working on improving the core engine to reduce this indexing time 😉

@Kladislav
Author

With php artisan scout:import -c 10000, 5.5 million records were imported in 6 hours. Sounds good.
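As a rough check of the improvement, the two figures reported in this thread can be compared as average throughput. This is only back-of-the-envelope arithmetic in Python: `records_per_second` is a hypothetical helper, the first figure is partial progress (2 million records after 18 hours), and the chunk size used for that first run is not stated in the thread.

```python
def records_per_second(records: int, hours: float) -> float:
    """Average import throughput in records per second."""
    return records / (hours * 3600)


# First run: about 2 million records after 18 hours (chunk size not stated).
before = records_per_second(2_000_000, 18)

# Second run with -c 10000: 5.5 million records in 6 hours.
after = records_per_second(5_500_000, 6)

# Ratio of the two averages; roughly an 8x speedup.
speedup = after / before

print(f"before: {before:.1f} records/s")
print(f"after:  {after:.1f} records/s")
print(f"speedup: {speedup:.2f}x")
```

The numbers come from different runs on the author's machine, so the ratio is indicative only and should not be read as a general benchmark.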

@curquiza
Member

Can I close this issue then, @Kladislav? 🙂

@Kladislav
Author

Kladislav commented Nov 25, 2020

@curquiza =)
