
Getting error while running start_ui.py #64

Open
dkiran1 opened this issue Jan 18, 2024 · 1 comment

Comments


dkiran1 commented Jan 18, 2024

I followed the steps in the setup doc to build the Docker image for Gaudi (https://github.com/intel/llm-on-ray/blob/main/docs/setup.md), and I could start the Ray server with the command below:
ray start --head --node-ip-address 127.0.0.1 --dashboard-host='0.0.0.0' --dashboard-port=8265
I then ran python -u ui/start_ui.py --master_ip_port "$node_ip:6379", since I could not figure out where to get node_user_name and conda_env_name for the full command:
python -u ui/start_ui.py --node_user_name $user --conda_env_name $conda_env --master_ip_port "$node_ip:6379"
I got many missing-package errors and installed the missing packages. After that I get the error below while starting the UI. Can you please let me know what I might be missing, and where node_user_name and conda_env_name come from?
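
(For illustration, a minimal sketch of how those placeholders are typically filled in; the values below are assumptions, and the reply further down explains where they come from:)

# assumed values for illustration only
user=root                 # username inside the node or docker container running Ray
conda_env=llm-on-ray      # name of the conda environment the project was installed into
node_ip=127.0.0.1         # the head-node IP passed to "ray start --node-ip-address"
python -u ui/start_ui.py --node_user_name $user --conda_env_name $conda_env --master_ip_port "$node_ip:6379"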

Error while connecting to Ray UI
Traceback (most recent call last):
File "/root/llm-ray/ui/start_ui.py", line 26, in
from inference.predictor_deployment import PredictorDeployment
File "/root/llm-ray/ui/../inference/predictor_deployment.py", line 21, in
from ray import serve
File "/usr/local/lib/python3.10/dist-packages/ray/serve/init.py", line 4, in
from ray.serve.api import (
File "/usr/local/lib/python3.10/dist-packages/ray/serve/api.py", line 15, in
from ray.serve.built_application import BuiltApplication
File "/usr/local/lib/python3.10/dist-packages/ray/serve/built_application.py", line 7, in
from ray.serve.deployment import Deployment
File "/usr/local/lib/python3.10/dist-packages/ray/serve/deployment.py", line 22, in
from ray.serve.context import _get_global_client
File "/usr/local/lib/python3.10/dist-packages/ray/serve/context.py", line 12, in
File "/usr/local/lib/python3.10/dist-packages/ray/serve/_private/client.py", line 28, in
from ray.serve._private.deploy_utils import get_deploy_args
File "/usr/local/lib/python3.10/dist-packages/ray/serve/_private/deploy_utils.py", line 8, in
from ray.serve.schema import ServeApplicationSchema
File "/usr/local/lib/python3.10/dist-packages/ray/serve/schema.py", line 141, in
class DeploymentSchema(BaseModel, allow_population_by_field_name=True):
File "/usr/local/lib/python3.10/dist-packages/ray/serve/schema.py", line 269, in DeploymentSchema
def num_replicas_and_autoscaling_config_mutually_exclusive(cls, values):
File "/usr/local/lib/python3.10/dist-packages/pydantic/deprecated/class_validators.py", line 231, in root_validator
return root_validator()(*__args) # type: ignore
File "/usr/local/lib/python3.10/dist-packages/pydantic/deprecated/class_validators.py", line 237, in root_validator
raise PydanticUserError(
pydantic.errors.PydanticUserError: If you use @root_validator with pre=False (the default) you MUST specify skip_on_failure=True. Note that @root_validator is deprecated and should be replaced with @model_validator.

@KepingYan
Contributor

What is the version of pydantic? Could you downgrade it to 1.10 and try again? By the way, the web UI has only been tested on Intel CPUs, so I'm not sure whether it can run successfully on Gaudi. It also needs an environment created by conda/miniconda, so conda_env_name is the name of the environment being used, and node_user_name is the username of the node or docker container.
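
A minimal sketch of the suggested downgrade, run with the same interpreter that starts the UI (the exact 1.10.x patch number is an assumption; any 1.10 release should satisfy the suggestion above):

python -c "import pydantic; print(pydantic.VERSION)"   # check which pydantic the UI's interpreter sees
pip install "pydantic==1.10.13"                        # assumed 1.10.x pin, per the downgrade suggested above

The Ray Serve build shown in the traceback still declares its schemas with pydantic 1's @root_validator, which is why importing ray.serve fails once pydantic 2.x is installed; pinning back to the 1.10 series avoids that import-time error.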

zhangjian94cn pushed a commit to zhangjian94cn/llm-on-ray that referenced this issue Feb 4, 2024
* slim dockerfile

* remove credentials

* rename

* add postfix

* update

* push 1 version of dp

* add new code

* remove the old code

* revert

* remove unused libs

* add new package

* add parquet support

* change name

* use output_dir instead of output_prefix

* merge

* remove unused file

* fix typo

* add more automation

* add dp config

* add saving csv

* add dp config yaml

* add stop containers

* add stop containers

* remove dp config

* tokenizer as input

* add a file to count row numbers

* change dockerfile name

* some refinement

* add file numbers

* add real script

* add multi-processing code

* add file name

* add pyrecdp

* refine

* remove

* remove developer name

* remove pyrecdp

* change name order

* remove files

* add use-slow flag

---------

Co-authored-by: N <matrix.yao@intel.com>