Milvus backup tool: How to speed up restore of 200 million vectors? #32750
-
Hi! I'm trying to restore 200 million vectors using Milvus backup tool v0.4.12 (https://github.com/zilliztech/milvus-backup/). Restore plus index creation takes 6 hours. Ideal would be <1 hour.
Problem
Questions
Thanks!
Details
Cluster
Milvus v2.4.0 on AWS EKS with two c6i.4xlarge instances.
show collection
Milvus backup tool config
Milvus backup log
Replies: 5 comments 13 replies
-
"Restore" calls the bulkinsert interface to import data files. Each data node executes tasks one by one, but multiple data nodes can execute tasks in parallel. The log shows that a task with rowcount=281153 takes 20 seconds, from 23:58:39 to 23:58:59. So, with one data node, 200M rows will take about 14227 seconds = 237 minutes. Try using multiple data nodes to restore.
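The estimate above can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: the function name and the `data_nodes` parameter are made up for illustration, and it assumes every task has the same size and duration as the one in the log and that adding data nodes scales perfectly linearly, which a real cluster is unlikely to achieve.

```python
def estimate_restore_seconds(total_rows, rows_per_task, seconds_per_task, data_nodes=1):
    """Rough restore time when each data node runs import tasks one at a time.

    Assumes uniform tasks and ideal linear scaling across data nodes.
    """
    tasks = total_rows / rows_per_task
    return tasks * seconds_per_task / data_nodes


# Numbers taken from the log: rowcount=281153, 20 seconds per task.
single_node = estimate_restore_seconds(200_000_000, 281_153, 20)
print(round(single_node), "seconds =", round(single_node / 60), "minutes")

# Hypothetical: the same workload spread over several data nodes.
for nodes in (2, 4, 8):
    minutes = estimate_restore_seconds(200_000_000, 281_153, 20, data_nodes=nodes) / 60
    print(nodes, "data nodes:", round(minutes), "minutes (ideal scaling)")
```

Under these assumptions, getting below one hour would need more than four data nodes, or more concurrency per node.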
-
@bigsheeper
-
Milvus 2.4.0+ supports importing multiple segments concurrently per node. However, based on the logs, it appears that the milvus-backup tool restores segments sequentially. To leverage import concurrency in the datanode to the fullest extent, I think we should restore by the partition binlog prefix instead of the segment binlog prefix. @wayblink
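To illustrate the difference between the two granularities, here is a minimal sketch that groups segment-level prefixes under their partition-level prefix. It does not call Milvus or the backup tool. The path layout (`.../insert_log/<collectionID>/<partitionID>/<segmentID>`), the IDs, and the function name are assumptions chosen for illustration only.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_partition_prefix(segment_prefixes):
    """Map each partition-level prefix to the segment IDs found beneath it.

    Assumes the last path component is the segment ID and its parent
    directory is the partition prefix (hypothetical layout).
    """
    groups = defaultdict(list)
    for prefix in segment_prefixes:
        path = PurePosixPath(prefix)
        groups[str(path.parent)].append(path.name)
    return dict(groups)


# Hypothetical segment-level prefixes; the IDs are invented.
segment_prefixes = [
    "backup/binlogs/insert_log/100/200/301",
    "backup/binlogs/insert_log/100/200/302",
    "backup/binlogs/insert_log/100/201/303",
]

groups = group_by_partition_prefix(segment_prefixes)
for partition_prefix, segments in groups.items():
    print(partition_prefix, "->", len(segments), "segments")
```

Submitting one import request per partition prefix, rather than one per segment prefix, would hand the data node several segments at once, which is what lets it import them concurrently.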
-
Test results Milvus Backup Tool v0.4.13-rc1
I've repeated the tests from #32750 (comment) using Backup Tool v0.4.13-rc1. Thanks to everyone involved for your help and the fix!
Cluster
Data set
Same as in #32750 (comment) plus some new vectors. About 207 million vectors. 739 segments.
Restore duration
backup.parallelism.restoreCollection: 8
backup.parallelism.restoreCollection: 32
backup.parallelism.restoreCollection: 739 (= number of segments)
Index creation duration
-
nice!
@schuberttobias Hi, please try the latest version https://github.com/zilliztech/milvus-backup/releases/tag/v0.4.13-rc1