This repository has been archived by the owner on May 28, 2024. It is now read-only.

[doc] Cannot deploy an LLM model on EKS with KubeRay #80

Open
enori opened this issue Oct 27, 2023 · 3 comments · May be fixed by #86

@enori

enori commented Oct 27, 2023

I deployed the aviary head/worker pods on an EKS cluster using KubeRay and then
tried to deploy an LLM model with the following command.

serve run serve/meta-llama--Llama-2-7b-chat-hf.yaml

However, I couldn't deploy it.

I think the problem is a compatibility issue with the Python package versions.
Is there a requirements.txt file that pins the appropriate versions (e.g. for pydantic)?

The following are the commands I ran and their output.

$ kubectl exec -it aviary-head-vjlb4 -- bash

(base) ray@aviary-head-vjlb4:~$ pwd
/home/ray

(base) ray@aviary-head-vjlb4:~$ export HUGGING_FACE_HUB_TOKEN=${MY_HUGGING_FACE_HUB_TOKEN}

(base) ray@aviary-head-vjlb4:~$ serve run serve/meta-llama--Llama-2-7b-chat-hf.yaml
2023-10-27 00:34:01,394 INFO scripts.py:471 -- Running import path: 'serve/meta-llama--Llama-2-7b-chat-hf.yaml'.
Traceback (most recent call last):
  File "/home/ray/anaconda3/bin/serve", line 8, in <module>
    sys.exit(cli())
  File "/home/ray/anaconda3/lib/python3.9/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/ray/anaconda3/lib/python3.9/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/ray/anaconda3/lib/python3.9/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/ray/anaconda3/lib/python3.9/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ray/anaconda3/lib/python3.9/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/serve/scripts.py", line 473, in run
    import_attr(import_path), args_dict
  File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/_private/utils.py", line 1378, in import_attr
    module = importlib.import_module(module_name)
  File "/home/ray/anaconda3/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 984, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'serve/meta-llama--Llama-2-7b-chat-hf'

(base) ray@aviary-head-vjlb4:~$ serve run models/continuous_batching/meta-llama--Llama-2-7b-chat-hf.yaml
2023-10-27 00:47:36,307 INFO scripts.py:418 -- Running config file: 'models/continuous_batching/meta-llama--Llama-2-7b-chat-hf.yaml'.
Traceback (most recent call last):
  File "/home/ray/anaconda3/bin/serve", line 8, in <module>
    sys.exit(cli())
  File "/home/ray/anaconda3/lib/python3.9/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/ray/anaconda3/lib/python3.9/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/ray/anaconda3/lib/python3.9/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/ray/anaconda3/lib/python3.9/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ray/anaconda3/lib/python3.9/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/serve/scripts.py", line 462, in run
    raise v1_err from None
  File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/serve/scripts.py", line 449, in run
    config = ServeApplicationSchema.parse_obj(config_dict)
  File "pydantic/main.py", line 526, in pydantic.main.BaseModel.parse_obj
  File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for ServeApplicationSchema
import_path
  field required (type=value_error.missing)

Thank you for checking.

@akshay-anyscale
Collaborator

Try using serve run serve_configs/meta-llama--Llama-2-7b-chat-hf.yaml instead. I'll fix the docs to reflect that.
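
For context, serve run with a YAML argument expects a Ray Serve config file that declares an import_path, which is exactly the field the pydantic ValidationError above reports as missing. The files under models/ are model specs rather than Serve configs, and the first command presumably fell back to being treated as an import path because serve/meta-llama--Llama-2-7b-chat-hf.yaml does not exist in the image. A rough sketch of what a serve_configs file contains (the exact import_path and args below are assumptions; check the file shipped in the repo):

applications:
- name: router
  route_prefix: /
  # The entry point Serve imports; this is the field the ValidationError complains about.
  import_path: rayllm.backend:router_application
  args:
    models:
    - ./models/continuous_batching/meta-llama--Llama-2-7b-chat-hf.yaml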

@akshay-anyscale akshay-anyscale self-assigned this Nov 1, 2023
@akshay-anyscale
Collaborator

Docs fixed here #85

@enori
Author

enori commented Nov 2, 2023

Thank you!
I used anyscale/ray-llm as the Docker image, as mentioned in another issue.

The result has changed.
The Ray Serve status is now DEPLOY_FAILED, and the message is: The deployments ['VLLMDeployment:meta-llama--Llama-2-7b-chat-hf'] are UNHEALTHY.
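
A couple of generic checks that usually surface the underlying error (pod name taken from the transcript above; this is a sketch, not something from the docs):

# Inside the head pod: per-application and per-deployment status, including the last error message.
$ serve status

# From outside the cluster: tail the head pod logs for the replica startup failure
# (e.g. a missing GPU, not enough memory, or a missing HUGGING_FACE_HUB_TOKEN).
$ kubectl logs aviary-head-vjlb4 --tail=200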

Moreover, the container I deployed does not have a serve_configs directory, so I had to create one myself.
I think it would be better to either include the serve_configs directory in the Docker image or mention in the documentation that the serve_configs file needs to be copied into the container.
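
One workaround until the image is fixed, assuming a local checkout of the repo (which has the serve_configs directory) and the default namespace, is to copy the file into the head pod:

# Sketch: create the directory in the pod, then copy the config in from the local checkout.
$ kubectl exec aviary-head-vjlb4 -- mkdir -p /home/ray/serve_configs
$ kubectl cp serve_configs/meta-llama--Llama-2-7b-chat-hf.yaml aviary-head-vjlb4:/home/ray/serve_configs/meta-llama--Llama-2-7b-chat-hf.yaml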

I will commit the changes to the documentation later from here as well.

@enori enori linked a pull request Nov 2, 2023 that will close this issue
alanwguo pushed a commit that referenced this issue Jan 25, 2024
- Hide Gradio footer, replace with TOS and PP
- Update stats table for a consistent header font
- add github and anyscale links to the top right corner
- Update model selector buttons
- Make model description tab scrollable
- Render model output as markdown

<img width="1322" alt="image"
src="https://github.com/ray-project/aviary/assets/3463757/d8c4a0be-3cae-4724-81c9-e73904b3517f">
<img width="1278" alt="image"
src="https://github.com/ray-project/aviary/assets/3463757/abd79e88-ba53-4246-a919-88ebf3e984ba">