[NeuralChat] Add Multi-Socket LLM Inference Example #1073

Open: letonghan wants to merge 3 commits into main

letonghan (Contributor) commented on Dec 25, 2023

Type of Change

Add NeuralChat example
API not changed

Description

Add Multi-Socket LLM inference example for NeuralChat.
Related DeepSpeed PR: microsoft/DeepSpeed#4750 (not merged yet)

Expected Behavior & Potential Risk

Customers can run multi-socket LLM inference with DeepSpeed by following this example.

How has this PR been tested?

Tested locally on an SPR server.

Dependency Change?

No.

Signed-off-by: LetongHan <letong.han@intel.com>
mengfei25 pushed a commit to mengfei25/intel-extension-for-transformers that referenced this pull request Dec 27, 2023