[FEATURE REQ] Azure.AI.OpenAI: Support HuggingFace chat completion streaming API #44135

Open
dluc opened this issue May 19, 2024 · 1 comment
Labels
Client, needs-team-attention, OpenAI, Service Attention

Comments

Member

dluc commented May 19, 2024

Library name

Azure.AI.OpenAI

Please describe the feature.

The HuggingFace chat completion streaming API is designed to imitate the OpenAI streaming response. However, because of a couple of minor differences, pointing the Azure SDK OpenAIClient at HuggingFace makes the GetCompletionsStreaming method hang indefinitely:

  1. HF doesn't terminate the stream with [DONE], so the while loop in SseAsyncEnumerator never breaks.

  2. HF doesn't support a NucleusSamplingFactor of 0.0 and returns the error {"error":"Input validation error: `top_p` must be > 0.0 and < 1.0","error_type":"validation"}. Unfortunately the response status code is 200 OK, so no exception is triggered. Users can work around this by passing 0.01 instead, but nothing tells them to change the value.
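
For illustration, the client-side workaround for point 2 could look like the sketch below. It is written in Python to stay language-neutral and is not Azure SDK code. The helper name `clamp_top_p` is hypothetical, the 0.01 margin is the value suggested above, and the upper bound is an assumption derived from the `< 1.0` in the error message.

```python
def clamp_top_p(top_p: float, epsilon: float = 0.01) -> float:
    """Keep top_p strictly inside the open interval (0.0, 1.0).

    Hypothetical helper: `epsilon` is the margin kept from both ends,
    defaulting to the 0.01 mentioned in the issue text.
    """
    lower = epsilon
    upper = 1.0 - epsilon
    return min(max(top_p, lower), upper)


# A request that would have sent top_p = 0.0 sends 0.01 instead.
safe_top_p = clamp_top_p(0.0)
```

Clamping silently changes the sampling parameter, so a caller may prefer to log the adjustment or to reject the value outright.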

It would be great if the Azure AI SDK had a way to work around these issues, for instance:

  • Detect non-deserializable responses, e.g. SseAsyncEnumerator<Completions> should throw an exception when asked to deserialize {"error":"Input validation error: `top_p` must be > 0.0 and < 1.0","error_type":"validation"}.
  • Detect when the remote endpoint stops sending data. As far as I know the HTTP connection is closed, so the client could stop waiting instead of running into a Task timeout.
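
The two behaviours requested above can be sketched roughly as follows. This is a Python illustration of the idea, not the SseAsyncEnumerator implementation: the names `iter_sse_chunks` and `StreamErrorPayload` are hypothetical, and it assumes the transport hands over the response as an iterable of text lines that simply ends when the server closes the connection.

```python
import json
from typing import Iterable, Iterator


class StreamErrorPayload(Exception):
    """Raised when the stream carries an error object instead of a chunk."""


def iter_sse_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield parsed JSON chunks from an iterable of response lines.

    Ends on the OpenAI-style "[DONE]" sentinel, or when `lines` is
    exhausted (assumed to mean the server closed the connection).
    """
    for raw in lines:
        line = raw.strip()
        if line.startswith("data:"):
            line = line[len("data:"):].strip()
        elif not line.startswith("{"):
            # Blank separators, SSE comments and other fields carry no chunk.
            continue
        if line == "[DONE]":
            return
        payload = json.loads(line)
        if isinstance(payload, dict) and "error" in payload:
            # An error body delivered with 200 OK becomes an exception.
            raise StreamErrorPayload(payload["error"])
        yield payload
    # Leaving the loop means end of input: treat it as a normal end of stream.


# A stream that ends without a sentinel still terminates.
chunks = list(iter_sse_chunks(['data: {"text": "Hi"}', "", 'data: {"text": "!"}']))
```

The bare `{` branch is there because the error body described in point 2 may arrive as a plain JSON document rather than as a `data:` event; whether HF frames it one way or the other is not confirmed here.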

See also huggingface/text-generation-inference#1896 and microsoft/kernel-memory#388

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @jpalvarezl @trrwilson.

github-actions bot added the Client, needs-team-attention, OpenAI, and Service Attention labels on May 19, 2024