This repository has been archived by the owner on May 28, 2024. It is now read-only.

Podman Error on red hat 9? #127

Open
jayteaftw opened this issue Jan 25, 2024 · 0 comments
Comments

@jayteaftw

I am trying to deploy RayLLM locally. These are the commands I am running:

cache_dir=${XDG_CACHE_HOME:-$HOME/.cache}

podman run -it --device nvidia.com/gpu=0 --security-opt=label=disable --shm-size 20g -p 8000:8000 -e HF_HOME=/home/ray/data -v "$cache_dir":/home/ray/data anyscale/ray-llm:latest bash

# Inside the container
serve run ~/serve_configs/amazon--LightGPT.yaml

However, when I run the serve command, I get:
2024-01-24 16:34:15,360 INFO scripts.py:411 -- Running config file: '/home/ray/serve_configs/amazon--LightGPT.yaml'.
2024-01-24 16:34:18,947 INFO worker.py:1715 -- Started a local Ray instance. View the dashboard at 127.0.0.1:8265

It then hangs for a bit and does not create a Ray instance.
I am running this on Red Hat 9, and I know the GPU is connected because I can run nvidia-smi.

Interestingly, I cannot cd into the data directory inside the container; it says I need sudo permissions. Could this be part of the problem?
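
One way to test the permissions hypothesis is to check, from inside the container, which user the shell runs as and whether that user can enter and write to the mounted directory. The following is a minimal sketch, not a confirmed fix: the helper name check_dir_access is hypothetical, and /home/ray/data is the mount point taken from the podman command above.

```shell
# Hypothetical helper: report whether the current user can enter and
# write to the given directory. Prints one of:
#   missing, no-enter, read-only, ok
check_dir_access() {
  dir=$1
  if [ ! -d "$dir" ]; then
    echo "missing"
  elif [ ! -x "$dir" ]; then
    echo "no-enter"
  elif [ ! -w "$dir" ]; then
    echo "read-only"
  else
    echo "ok"
  fi
}

# Show the effective user and the numeric owner of the mount point,
# then report the access level for that user.
id
ls -ldn /home/ray/data 2>/dev/null || true
check_dir_access /home/ray/data
```

If this prints no-enter or read-only, the container user cannot use the directory that HF_HOME points to, which could plausibly interfere with model downloads. Podman documents the `--userns=keep-id` run option and the `:U` volume suffix for adjusting ownership mapping; whether either resolves the hang described here is untested.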

alanwguo pushed a commit that referenced this issue Jan 25, 2024
addressing the comments left in #121 as follow-up

---------

Signed-off-by: Max Pumperla <max.pumperla@googlemail.com>