
Add support for AzureOpenAI #197

Open
rajib76 opened this issue Nov 21, 2023 · 8 comments

@rajib76 commented Nov 21, 2023:

I see that LangKit does not support Azure OpenAI. When will this be supported?

@jamie256 (Collaborator) commented:

Hi @rajib76! Thanks for opening an issue.

Are you looking for support for an Azure-deployed OpenAI LLM in LangKit's OpenAIDefault style, like what we do in the Choosing an LLM example, or something else?

I have been wanting to get better support for Azure-hosted models into LangKit soon. We could probably focus on changes to support the Azure OpenAI models (e.g. gpt-35-turbo, gpt-4) as a first iteration, if that would be helpful.

@rajib76 (Author) commented Nov 21, 2023:

Yes, I am looking for support in LangKit's OpenAIDefault. Currently, if I need to do hallucination checks, it looks like I cannot do it using Azure OpenAI; it is only supported for OpenAI. I am looking to use LangKit to evaluate responses from Azure OpenAI for hallucination, prompt injection, contextual relevance, and so on.

@jamie256 (Collaborator) commented:

Ok, working on it. If you want to try an initial dev build:
pip install langkit==0.0.26.dev0

The new class and usage look like this:

from langkit import response_hallucination
from langkit.openai import OpenAIAzure

response_hallucination.init(llm=OpenAIAzure(engine="LangKit-test-01"), num_samples=1)

You also need to set these new env vars (placeholder values shown):

import os

os.environ["AZURE_OPENAI_ENDPOINT"] = "https://<your-resource>.openai.azure.com/"
os.environ["AZURE_OPENAI_KEY"] = "<your-api-key>"

As referenced in this example: https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/chatgpt?tabs=python&pivots=programming-language-chat-completions#working-with-the-gpt-35-turbo-and-gpt-4-models
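
For anyone trying the dev build end to end, here is a minimal sketch of how the score might be read out. It assumes the env vars above are already set, and that the dev build exposes the same langkit.extract helper and "response.hallucination" metric name used in LangKit's hallucination example notebook; exact names may differ in 0.0.26.dev0.

from langkit import extract, response_hallucination
from langkit.openai import OpenAIAzure

# "engine" is the name of your Azure OpenAI deployment, not the base model name
response_hallucination.init(llm=OpenAIAzure(engine="LangKit-test-01"), num_samples=1)

# Score a single prompt/response pair
result = extract({
    "prompt": "Who won the 2018 FIFA World Cup?",
    "response": "France won the 2018 FIFA World Cup.",
})
print(result["response.hallucination"])  # assumed metric key; check the dev build's docs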

@rajib76 (Author) commented Nov 22, 2023:

I was able to run it, but saw one thing. I tried the example code for the hallucination score. I did not see a way to add a context and then tell the LLM to grade the response based on the question and context.

@rajib76 (Author) commented Nov 22, 2023:

I think I understand how it works. It is self-check validation: the prompt is sent to the same model to get an answer again, and that answer is then checked against the original response. Is it possible to do the following?

  1. Send a ground truth instead of having the LLM create a sample.
  2. If I need to create a sample with an LLM, can I use a different LLM than the one that actually does the hallucination check (or is the hallucination check done by the LLM or by an ML model)?

@FelipeAdachi (Contributor) commented:

Hi, @rajib76

Yes, this is exactly how response_hallucination works. To try to answer your questions:

  1. This module was designed with the zero-resource scenario in mind (no ground truth or context available), so sending a ground truth is not currently supported. This is something we could look into, though. Would it work for your use case if we had a variant where you pass the response and ground truth/context columns, thus removing the need to generate additional samples? (One LLM call would still be required for the consistency check.)

  2. Currently, both processes (creating the samples and the consistency check) use the same LLM, so it is not possible to use one LLM to create the samples and another to do the hallucination check. Can you explain your scenario a bit more?
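
To make the self-check mechanism concrete for readers following the thread, here is a rough illustrative sketch of the loop described above; it is not LangKit's actual implementation. Extra answers are drawn for the same prompt, and the same model is then asked whether each of them supports the original response. It uses the openai v1 AzureOpenAI client; the deployment name, API version, and judging prompt are placeholders.

import os

from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_KEY"],
    api_version="2023-07-01-preview",  # placeholder API version
)
DEPLOYMENT = "LangKit-test-01"  # your Azure OpenAI deployment name

def ask(prompt: str, temperature: float = 1.0) -> str:
    out = client.chat.completions.create(
        model=DEPLOYMENT,
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature,
    )
    return out.choices[0].message.content

def self_check_score(prompt: str, response: str, num_samples: int = 3) -> float:
    # Draw additional answers to the same prompt from the same model
    samples = [ask(prompt) for _ in range(num_samples)]
    # Ask the model whether each sampled answer supports the original response
    inconsistent = 0
    for sample in samples:
        verdict = ask(
            f"Context: {sample}\n\nSentence: {response}\n\n"
            "Is the sentence supported by the context? Answer Yes or No.",
            temperature=0.0,
        )
        if not verdict.strip().lower().startswith("yes"):
            inconsistent += 1
    return inconsistent / num_samples  # 0.0 = consistent, 1.0 = likely hallucinated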

@rajib76 (Author) commented Nov 28, 2023:

Thanks Felipe. For #1, it will work if we can just pass the response and the ground truth. But do we need an LLM to do the consistency check? Could we not have an option to do a semantic match with an embedding model and apply a threshold score?

For #2, I am planning to implement chain of verification, as I mentioned in this recording. I wanted to check whether this could be available out of the box in LangKit:

https://www.youtube.com/watch?v=qRyCmi0DeU8
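
As a concrete illustration of the embedding-based alternative proposed in #1, here is a rough sketch using sentence-transformers cosine similarity with an arbitrary threshold. This is not an existing LangKit feature; the model name and the 0.8 threshold are placeholders.

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # any sentence embedding model could be used

def is_consistent(response: str, ground_truth: str, threshold: float = 0.8) -> bool:
    # Embed both texts and compare them with cosine similarity
    embeddings = model.encode([response, ground_truth], convert_to_tensor=True)
    score = util.cos_sim(embeddings[0], embeddings[1]).item()
    return score >= threshold  # the threshold is arbitrary and should be tuned per use case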

@FelipeAdachi (Contributor) commented:

Thanks for the reply, @rajib76.

For #1, yes, it should be possible to perform the semantic-similarity-based consistency check without an LLM.

And #2 also makes a lot of sense for your scenario and others'.

I created two issues to reflect both topics we are discussing:

We'll plan those changes in future sprints.
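
For reference, the chain-of-verification idea mentioned in #2 roughly follows a draft, plan verification questions, answer them independently, then revise loop. Below is an illustrative sketch only, with the LLM call abstracted as a caller-supplied ask function (for example, the ask helper sketched earlier in the thread); it is not an existing LangKit feature, and the prompts are placeholders.

from typing import Callable, List

def chain_of_verification(question: str, ask: Callable[[str], str]) -> str:
    # 1. Draft a baseline answer
    baseline = ask(question)
    # 2. Plan verification questions that probe the facts in the baseline
    plan = ask(
        f"Question: {question}\nAnswer: {baseline}\n"
        "List short verification questions, one per line, that would check the facts in this answer."
    )
    verification_questions: List[str] = [q.strip() for q in plan.splitlines() if q.strip()]
    # 3. Answer each verification question independently, without showing the baseline
    verifications = [(q, ask(q)) for q in verification_questions]
    # 4. Revise the baseline so it agrees with the independently verified facts
    evidence = "\n".join(f"Q: {q}\nA: {a}" for q, a in verifications)
    return ask(
        f"Original question: {question}\nDraft answer: {baseline}\n"
        f"Verified facts:\n{evidence}\n"
        "Rewrite the draft answer so it is consistent with the verified facts."
    )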
