A question about semantic_search speed #55

Closed
xxllp opened this issue Jan 13, 2023 · 6 comments
Labels
question Further information is requested

Comments

@xxllp

xxllp commented Jan 13, 2023

Describe the Question

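When I run the semantic_search method on CPU, the prediction time stays roughly the same no matter how much I enlarge the corpus. That seems odd.
Even if everything is computed in a single pass, more data should make it slower, shouldn't it?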

@xxllp xxllp added the question Further information is requested label Jan 13, 2023
@shibing624
Owner

This is expected. For how it works internally, see https://www.sbert.net/examples/applications/semantic-search/README.html?highlight=_search#util-semantic-search

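Under the hood, up to 100 queries are computed in parallel for speed, which suits large-scale retrieval. It is not a naive pair-by-pair brute-force loop and is many times faster, so users do not notice a slowdown as the amount of data grows.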

By default, up to 100 queries are processed in parallel. Further, the corpus is chunked into set of up to 500k entries. You can increase query_chunk_size and corpus_chunk_size, which leads to increased speed for large corpora, but also increases the memory requirement.
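
To make the chunking described above concrete, here is a minimal, dependency-free sketch of an exhaustive top-k search that walks over query blocks and corpus blocks. It is not the sentence-transformers implementation: `chunked_search` and `cosine` are hypothetical names, and the inner loops stand in for what a tensor library would compute as one matrix product per block. Only the parameter names `query_chunk_size` and `corpus_chunk_size` and their defaults come from the quoted documentation.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def chunked_search(queries, corpus, top_k=2,
                   query_chunk_size=100, corpus_chunk_size=500_000):
    """Hypothetical sketch: exhaustive top-k search, processed block by block.

    Returns, for each query, a list of (score, corpus_id) pairs sorted by
    descending score.
    """
    # One min-heap per query holding the best top_k hits seen so far.
    heaps = [[] for _ in queries]
    for q_start in range(0, len(queries), query_chunk_size):
        q_block = queries[q_start:q_start + query_chunk_size]
        for c_start in range(0, len(corpus), corpus_chunk_size):
            c_block = corpus[c_start:c_start + corpus_chunk_size]
            # A tensor library would score this whole block in one matrix product.
            for qi, q in enumerate(q_block):
                heap = heaps[q_start + qi]
                for ci, c in enumerate(c_block):
                    item = (cosine(q, c), c_start + ci)
                    if len(heap) < top_k:
                        heapq.heappush(heap, item)
                    elif item > heap[0]:
                        heapq.heapreplace(heap, item)
    return [sorted(h, reverse=True) for h in heaps]


# Toy 2-dimensional "embeddings", chosen only for illustration.
corpus = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
queries = [[1.0, 0.0], [0.0, 2.0]]
hits = chunked_search(queries, corpus, top_k=2,
                      query_chunk_size=1, corpus_chunk_size=3)
```

The chunk sizes only change how the work is batched, not the result: every query is still scored against every corpus entry. With vectorized blocks that cost can stay small next to fixed overheads until the corpus becomes large.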

@xxllp
Author

xxllp commented Jan 13, 2023

Thanks a lot!

@xxllp xxllp closed this as completed Jan 13, 2023
@xxllp
Author

xxllp commented Jan 16, 2023

One more question: is there a simple way to compress the current text embeddings from 718 dimensions down to something like 128?

@xxllp xxllp reopened this Jan 16, 2023
@shibing624
Owner

shibing624 commented Jan 16, 2023

The README describes how to do model compression (model distillation).

@xxllp
Author

xxllp commented Jan 16, 2023

Can model compression reduce the embedding dimension? And does that work for the BERT model you open-sourced?

@shibing624
Owner

Yes, with PCA.
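
One plausible reading of the PCA suggestion is to fit PCA on a sample of sentence embeddings and project every embedding onto the leading components. The sketch below illustrates that idea in plain Python using power iteration. The names `fit_pca` and `transform` are hypothetical, the 4-dimensional toy vectors stand in for real embeddings, and in practice a library routine such as `sklearn.decomposition.PCA` would normally be used instead.

```python
import random


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def fit_pca(rows, n_components, n_iter=100, seed=0):
    """Hypothetical sketch: estimate (mean, components) by power iteration.

    Assumes the data has at least n_components directions with non-zero variance.
    """
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    rng = random.Random(seed)
    components = []
    for _ in range(n_components):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for _ in range(n_iter):
            # Multiply by the unnormalised covariance matrix: w = X^T (X v).
            proj = [dot(row, v) for row in centered]
            w = [sum(p * row[j] for p, row in zip(proj, centered)) for j in range(d)]
            # Keep w orthogonal to the components already found.
            for c in components:
                scale = dot(w, c)
                w = [x - scale * y for x, y in zip(w, c)]
            norm = dot(w, w) ** 0.5
            if norm < 1e-12:
                break  # no variance left in the remaining directions
            v = [x / norm for x in w]
        components.append(v)
    return mean, components


def transform(rows, mean, components):
    """Project each row onto the principal components."""
    return [[dot([x - m for x, m in zip(r, mean)], c) for c in components]
            for r in rows]


# Toy 4-dimensional "embeddings" that vary mostly along the direction (1, 1, 0, 0).
embeddings = [[float(t), float(t), 0.1 * ((t % 3) - 1), 0.0] for t in range(-5, 6)]
mean, components = fit_pca(embeddings, n_components=2)
reduced = transform(embeddings, mean, components)  # 11 vectors of length 2
```

The same `mean` and `components` must be applied to both corpus and query embeddings so that similarities are computed in the same reduced space.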

@xxllp xxllp closed this as completed Jan 17, 2023
@shibing624 shibing624 pinned this issue May 12, 2023