Ensure 'name' on initial message #2635
base: main
Conversation
I've updated existing test cases that now show the name on the messages.
Thanks! I think this addition is useful! Have you tested on non-OpenAI but OpenAI-compatible endpoints? E.g., LiteLLM + Ollama.
I haven't yet; I've been trying to clarify with a LiteLLM dev, and they don't think Ollama takes it in. I will run this test program and see if it does utilise the name. If not, then we definitely need to consider getting the name into the content, either through message transforms or some other way.
Okay, testing with LiteLLM + Ollama... Name included in messages:
Output
So, no name included. Tried again with a more direct question.
Again, no name included, and it also referred to Joe as Cathy. I noticed this on a couple of different message sets. Now, I'll try adding the name directly into the content.
So, that's with the name added directly into the content. Now, adding a more direct system message about there being multiple people in the conversation.
This shows improvement, but not enough. So, we'll try a different set of messages to try to get them to have a conversation with each other. In this one we put in who is speaking to whom.
And by adding that:
Looks much better; they act as their namesakes and feel like they are interacting with each other. I also ran this again with the system messages reduced. Okay - so I think we can make the following observations for LiteLLM + Ollama, and likely other non-OpenAI setups:
I'll continue to try different system messages and content adjustments, but I think we would need to be able to transform messages so we can put in the sender name and the recipient name. Here's the joke one again with that change.
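The message transform suggested above - injecting the sender's name into the content for endpoints that ignore the `name` field - could be sketched roughly like this (the function and message shapes here are my assumptions for illustration, not code from this PR):

```python
# Hedged sketch: inject speaker names into message content for endpoints
# (e.g. Ollama via LiteLLM) that ignore the OpenAI-style "name" field.
def inject_names(messages):
    transformed = []
    for msg in messages:
        msg = dict(msg)  # copy so the caller's list is untouched
        name = msg.pop("name", None)
        # Only user/assistant turns with non-empty content get the prefix.
        if name and msg.get("role") in ("user", "assistant") and msg.get("content"):
            msg["content"] = f"{name}: {msg['content']}"
        transformed.append(msg)
    return transformed

messages = [
    {"role": "system", "content": "Two comedians are chatting."},
    {"role": "user", "content": "Tell me a joke.", "name": "joe"},
]
transformed = inject_names(messages)
```

A transform like this keeps the `name` information visible to endpoints that would otherwise silently drop it.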
Thanks for the update! This is indeed very interesting. Looks like adding the `name` into the content helps. For endpoints that bark at the `name` field, we may need a way to remove it.
Does the `name` field work on other endpoints, e.g., LM Studio or the Mistral AI API?
In LM Studio it's accepted, but it's not being used. For the Mistral AI API, I'll check that. On a side note with LM Studio, it won't accept messages where content is blank, which does happen sometimes with local LLMs; I'm investigating that.
@ekzhu, I've tested the Mistral models through Together.ai, and the "name" field is being accepted (not breaking) but isn't being used. The only way to get the name known is to inject it into the content itself. However, testing through Mistral.ai's API, it did fail when the `name` field was present. Through the testing, it seems to me that having the `name` field can break non-OpenAI endpoints, so we may need a way to remove it for them.
I feel one way to do this is to have a built-in client for each API endpoint, similar to AzureOpenAI, so the user can specify the API type using `api_type`.
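As a hedged illustration of that idea: autogen's config lists already use `api_type` to select the Azure client, and the suggestion here is to extend the same mechanism to other providers. The non-Azure entry below is hypothetical, not a currently supported value:

```python
# Hedged sketch of an llm_config config list. "api_type": "azure" is real
# autogen usage; "api_type": "mistral" is the hypothetical extension the
# comment proposes, with a per-provider client behind it.
config_list = [
    {"model": "gpt-4", "api_type": "azure", "api_key": "...", "base_url": "..."},
    {"model": "mistral-large-latest", "api_type": "mistral", "api_key": "..."},
]
```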
Thanks! The analysis of the alt APIs is super helpful. We can merge this once #2748 is addressed -- so that the changes in this PR don't break two-agent chats for the Mistral AI API.
Included "name" on messages in select speaker nested chat, as per this comment.
Why are these changes needed?
When initiating a chat through ConversableAgent's `initiate_chat`, the passed-in message doesn't get the `name` of the agent initiating the conversation attached to it. The name is therefore not passed through to the LLM, which cannot use that name information. So, running this block of code:
will result in a list of messages to the LLM like this:
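(The actual message list isn't reproduced here; as a hedged illustration, with roles and contents assumed rather than copied from the PR, it would take roughly this shape, with no `name` key on the second message:)

```python
# Illustrative only - the exact messages from the PR are not shown here.
messages = [
    {"content": "You are Cathy, a comedian.", "role": "system"},  # assumed system prompt
    {"content": "Tell me a joke.", "role": "user"},  # initiating message: no "name" key
]
```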
I would expect there to be a `name` key, with the value `joe`, attached to the second message, as that agent initiated the chat. The response from the LLM is:
'Why did Cathy and her comedy partner decide to open a bakery together?\n\nBecause they realized they "dough"n\'t just make people laugh, they can also make them delicious pastries!'
... and it can be seen that it doesn't reference Joe's name.
By including the `name` key/value in the messages, the LLM can use the name and returns this:
'Why did Cathy and Joe go to the comedy club?\n\nBecause they heard it was a great place for two funny people to crack jokes and make people laugh!'
... indicating that the name field is utilised (and LLMs aren't great at jokes!).
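For illustration, the "with name" version of the message list might look like this (a hedged sketch; contents assumed, not copied from the PR):

```python
# Illustrative only: the same kind of message list with "name" attached
# to the initiating agent's message.
messages_with_name = [
    {"content": "You are Cathy, a comedian.", "role": "system"},
    {"content": "Tell me a joke.", "role": "user", "name": "joe"},
]
```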
To fix this, ConversableAgent's `_append_oai_message` has been updated to add the `name` field if it doesn't already exist and the message is not a function/tool message. To support this addition, an `is_sending` parameter is used on the function to indicate whether the `self` agent or the `conversation_id` agent is the sender and, hence, whose name to attach to the message. I have not added this to documentation or updated tests for this. Please let me know if it needs specific tests added.
Related issue number
No related issue.
Checks