Discrepancy Between Documented and Actual Memory Usage for ML Model Allocations in Elasticsearch #107829

Open
oldcodeoberyn opened this issue Apr 24, 2024 · 1 comment
Labels
:ml Machine learning Team:ML Meta label for the ML team

Comments

oldcodeoberyn commented Apr 24, 2024

Description

The current Elasticsearch documentation states that adding more allocations to a deployment scales throughput by allowing more inference requests to run in parallel, and that all allocations assigned to a node share the same copy of the model in memory:

Throughput can be scaled by adding more allocations to the deployment; it increases the number of inference requests that can be performed in parallel. All allocations assigned to a node share the same copy of the model in memory. The model is loaded into memory in a native process that encapsulates libtorch, which is the underlying machine learning library of PyTorch. The number of allocations setting affects the amount of model allocations across all the machine learning nodes. Model allocations are distributed in such a way that the total number of used threads does not exceed the allocated processors of a node.

In practice, however, each additional allocation requires extra memory, and the increase appears to be linear in the number of allocations. As a result, scaling up allocations eventually hits the node's memory limit.
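
To make the concern concrete, here is a minimal sketch of the linear growth described above. It assumes one shared model copy plus a fixed overhead per allocation. The function names and all figures are hypothetical illustrations, not measurements from this deployment and not part of any Elasticsearch API.

```python
def estimated_memory_mb(model_mb, per_allocation_mb, allocations):
    """Assumed linear model: one shared model copy plus a fixed
    per-allocation overhead (hypothetical, not an Elasticsearch formula)."""
    if allocations < 0:
        raise ValueError("allocations must be non-negative")
    return model_mb + per_allocation_mb * allocations


def max_allocations(node_limit_mb, model_mb, per_allocation_mb):
    """Largest allocation count that still fits under the node memory limit,
    under the same assumed linear model."""
    if per_allocation_mb <= 0:
        raise ValueError("per_allocation_mb must be positive")
    available = node_limit_mb - model_mb
    if available < 0:
        return 0
    return available // per_allocation_mb


# Hypothetical figures: 1000 MB model, 250 MB extra per allocation, 4096 MB limit.
for n in (1, 2, 4, 8):
    print(n, estimated_memory_mb(1000, 250, n))
print("max allocations:", max_allocations(4096, 1000, 250))
```

If the documentation held exactly, the per-allocation term would be close to zero and memory would stay roughly constant as allocations are added.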

[Five images attached]

oldcodeoberyn added the >enhancement, needs:triage, and :ml labels and removed the needs:triage and >enhancement labels Apr 24, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

elasticsearchmachine added the Team:ML label Apr 24, 2024