This repository has been archived by the owner on May 28, 2024. It is now read-only.

Is this project still actively being maintained? #148

Open
nkwangleiGIT opened this issue Apr 17, 2024 · 8 comments

Comments

@nkwangleiGIT

There has been no release for 3 months and only a few commits recently, so will this project be actively maintained?

I tried serving some LLMs using ray-llm, and needed to update transformers, install tiktoken, update vllm, etc. to make it work.

Hopefully someone can take the time to maintain this project, so we can use Ray as a unified framework for data processing, serving, tuning, and training.

Thanks and looking forward to your response.

@XBeg9

XBeg9 commented Apr 21, 2024

I previously raised a question in the Slack community channel regarding ongoing support for this project. About a month ago there was a discussion promising continued development and updates, but I have not seen any changes since then.

Specifically, I am eager to see support for the new vllm/transformers packages, which are crucial for my current use cases. Could we get an update on the progress towards integrating them? Any timeline or roadmap would be greatly appreciated, as it would help us plan our projects accordingly.

@nkwangleiGIT
Author

I was using FastChat previously, and now plan to use vLLM and Ray Serve for LLM inference; that also seems to work well.
So ray-llm is no longer a dependency for me :-)

@leiwen83

> I was using FastChat previously, and now plan to use vLLM and Ray Serve for LLM inference; that also seems to work well. So ray-llm is no longer a dependency for me :-)

I am also interested in finding a FastChat replacement, but I wonder how to implement a model registry, dynamic auto-scaling, and a unique entry URL with Ray? ;)

@nkwangleiGIT
Author

I think a Ray Serve ingress can handle the model registry, Ray autoscaling can handle scaling, and multi-application deployment may provide the unique entry URL.
I will write a document about how to do this once these are tested. For now I have only tested Ray Serve with vLLM serving, and can scale manually using a serveConfig like the one below (a rough sketch of the matching deployment module follows it):

  serveConfigV2: |
    applications:
      - name: llm-serving-app
        # <module>:<bound deployment variable> inside the working_dir archive
        import_path: llm-serving:deployment
        route_prefix: /
        runtime_env:
          # zip archive that must already be present on the cluster nodes
          working_dir: file:///vllm-workspace/llm-app.zip
        deployments:
          - name: VLLMPredictDeployment
            num_replicas: 2   # bump this to scale manually
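
For reference, the llm-serving module that import_path points at is not shown in this thread; below is a minimal sketch of what it might look like, assuming vLLM's offline LLM API and a placeholder model name (a production deployment would more likely wrap AsyncLLMEngine so requests are not blocked during batch generation):

    from starlette.requests import Request

    from ray import serve
    from vllm import LLM, SamplingParams


    @serve.deployment  # num_replicas is taken from the serveConfig above
    class VLLMPredictDeployment:
        def __init__(self, model: str = "facebook/opt-125m"):  # placeholder model
            # Each replica loads its own copy of the model weights.
            self.llm = LLM(model=model)

        async def __call__(self, request: Request) -> dict:
            body = await request.json()
            params = SamplingParams(max_tokens=body.get("max_tokens", 256))
            outputs = self.llm.generate([body["prompt"]], params)
            return {"text": outputs[0].outputs[0].text}


    # `import_path: llm-serving:deployment` resolves to this bound object.
    deployment = VLLMPredictDeployment.bind()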

@nkwangleiGIT
Author

@leiwen83 here is the doc about how to run Ray Serve with autoscaling:
http://kubeagi.k8s.com.cn/docs/Configuration/DistributedInference/deploy-using-rary-serve/

For the model registry and unique entry URL/ingress, I need to take a further look; we may need to customize the FastAPI ingress?
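
One possible direction, just a sketch and not ray-llm's implementation: a single FastAPI ingress deployment holds handles to the per-model deployments, so every model is reachable under one route prefix. The Router and ChatModel classes and the model names below are made up for illustration:

    from fastapi import FastAPI

    from ray import serve

    app = FastAPI()


    @serve.deployment
    class ChatModel:
        """Placeholder for a per-model deployment (e.g. the vLLM one above)."""

        def __init__(self, model_name: str):
            self.model_name = model_name

        async def generate(self, prompt: str) -> str:
            return f"[{self.model_name}] echo: {prompt}"


    @serve.deployment
    @serve.ingress(app)
    class Router:
        def __init__(self, **model_handles):
            # Static "registry": model name -> Ray Serve deployment handle.
            self.models = model_handles

        @app.post("/v1/{model}/generate")
        async def generate(self, model: str, prompt: str) -> str:
            return await self.models[model].generate.remote(prompt)


    # One entry URL (this app's route_prefix) in front of multiple models.
    entrypoint = Router.bind(
        llama=ChatModel.bind("llama"),
        qwen=ChatModel.bind("qwen"),
    )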

@leiwen83

leiwen83 commented May 1, 2024

A FastAPI change may not be enough... FastChat implements a controller that tracks the status of all workers, which is what makes the registry possible.
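
For comparison, FastChat's controller pattern could be approximated with a small Ray Serve deployment that workers register with and send heartbeats to; the sketch below is only illustrative and every name in it is made up:

    import time

    from ray import serve


    @serve.deployment(num_replicas=1)
    class Controller:
        def __init__(self, heartbeat_timeout_s: float = 60.0):
            self.heartbeat_timeout_s = heartbeat_timeout_s
            # model name -> {worker_id: last heartbeat timestamp}
            self.workers: dict[str, dict[str, float]] = {}

        async def register(self, model: str, worker_id: str) -> None:
            self.workers.setdefault(model, {})[worker_id] = time.time()

        async def heartbeat(self, model: str, worker_id: str) -> None:
            self.workers.setdefault(model, {})[worker_id] = time.time()

        async def list_models(self) -> list[str]:
            # Only report models with at least one recently seen worker.
            now = time.time()
            return [
                model
                for model, members in self.workers.items()
                if any(now - ts < self.heartbeat_timeout_s for ts in members.values())
            ]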

@XBeg9

XBeg9 commented May 1, 2024

@xwu99 is actively working on updates; let's 🤞 and follow the progress in #149

@depenglee1707

I have upgraded vllm to 0.4.1 in my fork (which is based on an earlier version); check the details if you are interested ^_^: https://github.com/OpenCSGs/llm-inference/tree/main/llmserve/backend/llm/engines/vllm
