Missing GPU plugin documentation items #1401

Open
Tracked by #28
eero-t opened this issue May 2, 2023 · 2 comments

Comments

eero-t commented May 2, 2023

GPU plugin documentation is missing the following things:

tkatila commented May 4, 2023

I'd add another item to your list: Simplify GPU plugin deployment options.

I would vote to only have two examples:

  • NFD + GPU-plugin with shared-dev-num=1 and monitoring
    • Basic use case, should work for most
  • NFD + GPU-plugin with shared-dev-num>1, resource management, monitoring and extended resources
    • GAS use case, for those who need it

Then I'd add notes about configuration options, using shared-dev-num without GAS etc. into a different file (advanced-deployment.md or similar). As you say, the current README is pretty long.

eero-t commented May 5, 2023

IMHO "using shared-dev-num without GAS" should be documented as being for "dedicated cluster/nodes with a single GPU workload, where the share count equals how many instances of that workload fit into a single GPU (with required QoS)".

AFAIK other uses for it are non-production ones, so they do not need to be mentioned.
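
To make the "share count" rule above concrete, here is a rough sketch of the arithmetic. It assumes a single resource (for example GPU memory) is what limits how many instances fit with the required QoS. The function name `share_count` and the numbers are hypothetical and only for illustration; they do not come from the plugin or its documentation.

```python
def share_count(gpu_capacity: float, per_instance_need: float) -> int:
    """Return how many instances of one workload fit into a single GPU.

    Assumption: both arguments use the same unit (e.g. GiB of GPU memory)
    and that one resource is what bounds the required QoS.
    """
    if gpu_capacity <= 0 or per_instance_need <= 0:
        raise ValueError("capacity and per-instance need must be positive")
    # Floor division: a partially fitting instance does not count.
    # A result of 0 means the workload does not fit on this GPU at all.
    return int(gpu_capacity // per_instance_need)


# Hypothetical example: a 16 GiB GPU and a workload needing 5 GiB per instance.
print(share_count(16, 5))  # 3
```

On a dedicated node, the result would be the candidate value for shared-dev-num. In practice the limiting factor may be compute or bandwidth rather than memory, so the number should be validated against the actual workload.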
