This repository has been archived by the owner on May 28, 2024. It is now read-only.

RAY-LLM stuck at replica step #143

Open
NBTrong opened this issue Mar 24, 2024 · 1 comment

@NBTrong

NBTrong commented Mar 24, 2024

Hi,

I'm trying to run rayllm following the tutorial in the README.

But my deployment now seems to be stuck at the replica scheduling step. It looks like this:

[screenshot attached in the original issue]

The warning message:
{"levelname": "WARNING", "asctime": "2024-03-24 03:17:59,398", "component_name": "controller", "component_id": "1684", "message": "deployment_state.py:2152 - Deployment 'VLLMDeployment:amazon--LightGPT' in application 'ray-llm' 1 replicas that have taken more than 30s to be scheduled. This may be due to waiting for the cluster to auto-scale or for a runtime environment to be installed. Resources required for each replica: [{"CPU": 1.0, "accelerator_type_a10": 0.01}, {"CPU": 1.0, "accelerator_type_a10": 0.01, "GPU": 0.1}], total resources available: {}. Use ray status for more details."}
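Reading the warning, each replica requests `CPU`, `GPU`, and `accelerator_type_a10` resources, but the cluster reports `total resources available: {}`, i.e. no node that can satisfy them has joined. A minimal sketch of that comparison (resource values copied from the log above; this is a hypothetical diagnostic illustration, not rayllm code):

```python
# Per-replica resource bundles, copied from the controller warning above.
required = [
    {"CPU": 1.0, "accelerator_type_a10": 0.01},
    {"CPU": 1.0, "accelerator_type_a10": 0.01, "GPU": 0.1},
]
# "total resources available: {}" in the log: the cluster reports nothing.
available = {}

# Any resource key a replica needs that the cluster does not report at all
# can never be satisfied, so the replica stays pending indefinitely.
missing = {key for bundle in required for key in bundle if key not in available}
print(sorted(missing))  # → ['CPU', 'GPU', 'accelerator_type_a10']
```

If `ray status` likewise shows no GPU node with the `accelerator_type_a10` label, the replica will stay pending until such a node is added (or autoscaling brings one up), regardless of how long one waits.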

Can you please help me with this? Should I wait longer, or is there a configuration I have missed?

Thank you.

@nkwangleiGIT

Same issue here.
