[FEATURE REQ] Azure.AI.OpenAI: Support HuggingFace chat completion streaming API #44135
Labels
Client: This issue points to a problem in the data-plane of the library.
needs-team-attention: This issue needs attention from the Azure service team or SDK team.
OpenAI
Service Attention: The Azure service team is responsible for this issue.
Library name
Azure.AI.OpenAI
Please describe the feature.
HuggingFace's chat completion streaming API is designed to imitate the OpenAI streaming response. However, because of a couple of minor differences, the `GetCompletionsStreaming` method hangs indefinitely when the Azure SDK `OpenAIClient` is pointed at HuggingFace:

- HF doesn't terminate a stream with `[DONE]`, so the `while` loop in `SseAsyncEnumerator` never breaks.
- HF doesn't support a `NucleusSamplingFactor` of 0.0 and returns the error {"error":"Input validation error: `top_p` must be > 0.0 and < 1.0","error_type":"validation"}. Unfortunately, the response status code is 200 OK, so no exception is triggered. Users can work around this by passing 0.01 instead, but no exception tells them to change the value.

It would be great if the Azure AI SDK had a way to work around these issues, for instance:

- `SseAsyncEnumerator<Completions>` should throw an exception when deserializing {"error":"Input validation error: `top_p` must be > 0.0 and < 1.0","error_type":"validation"}.

See also huggingface/text-generation-inference#1896 and microsoft/kernel-memory#388.
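
To make the requested behavior concrete, here is a rough, language-neutral sketch in Python. It is not the SDK's code. The names `iter_sse_chunks` and `StreamErrorPayload` are made up for illustration, and it assumes the error object arrives as a regular `data:` event on the stream. The reader stops on `[DONE]` or when the stream simply ends, and raises if an event carries an `error` field instead of a completion chunk.

```python
import json
from typing import Iterable, Iterator


class StreamErrorPayload(Exception):
    """Hypothetical exception: an SSE data event carried an error object."""


def iter_sse_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded JSON chunks from an iterable of SSE lines.

    Iteration ends on the `[DONE]` sentinel or when the input is
    exhausted, whichever comes first.
    """
    for raw in lines:
        line = raw.strip()
        # Skip blank separators, comments, and non-data fields.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        event = json.loads(payload)
        # Assumption: errors are sent as a JSON object with an "error" key,
        # like the validation error quoted in this issue.
        if isinstance(event, dict) and "error" in event:
            kind = event.get("error_type", "error")
            raise StreamErrorPayload(f"{kind}: {event['error']}")
        yield event
    # Input ended without [DONE]: finish instead of waiting for more data.
```

With this shape, a stream that closes without `[DONE]` just ends the iteration, and the `top_p` validation error surfaces as an exception the caller can act on, even though the HTTP status was 200 OK.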