[FEATURE] avoid second round trip for function call #656

Open

Grogdunn wants to merge 1 commit into main from avoid-second-roundtrip-in-function-calls
Conversation

Grogdunn
Contributor

Refers to issue #652

For some use cases, after the local function is called there is no need to pass the results back to the LLM (e.g. data extraction from documents).

@Grogdunn Grogdunn marked this pull request as ready for review April 30, 2024 10:12
@Grogdunn Grogdunn force-pushed the avoid-second-roundtrip-in-function-calls branch from d57914f to 78e24fd Compare April 30, 2024 14:33
@tzolov tzolov self-assigned this May 4, 2024
@tzolov tzolov added this to the 1.0.0-M1 milestone May 4, 2024
Collaborator

@tzolov tzolov left a comment


Thank you @Grogdunn,
I believe we are getting closer, and this will be a very useful function calling feature.

Still, I believe that the CompleteRoundTripBox and the runtime detection of needCompleteRoundTrip are unnecessary.
Since in the FunctionCallbackContext you have already detected the Void-response (and Consumer) functions, it should be possible to add a new boolean noReturnFunction() method to the FunctionCallback interface.
Then, in AbstractFunctionCallSupport#handleFunctionCallOrReturn, you should be able to use this to decide how to handle the return.
This way you could implement everything inside the abstract support classes without changing the Model implementations.
Maybe I'm missing some detail?
Let me know if you would be interested in exploring the suggested approach, or whether I should give it a try.
Thank you very much for the contribution!
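
A minimal sketch of that suggestion, for illustration. The method name noReturnFunction() and the class names follow the comment above; the signatures are simplified stand-ins, not the actual Spring AI types:

```java
// Hedged sketch only: simplified analogues of the Spring AI function-calling types.
public interface FunctionCallback {

	String getName();

	String call(String functionArguments);

	// Proposed addition: true for the Void/Consumer-backed functions
	// already detected in FunctionCallbackContext.
	default boolean noReturnFunction() {
		return false;
	}
}

abstract class AbstractFunctionCallSupportSketch {

	// Simplified analogue of handleFunctionCallOrReturn: when the callback declares
	// no return value, stop after the local call instead of sending the result back.
	String handleFunctionCallOrReturn(FunctionCallback callback, String toolArguments) {
		String localResult = callback.call(toolArguments);
		if (callback.noReturnFunction()) {
			return localResult; // e.g. null/empty: skip the second round trip to the model
		}
		return callModelWithToolResponse(localResult); // usual second round trip
	}

	abstract String callModelWithToolResponse(String toolResult);
}
```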

@tzolov
Collaborator

tzolov commented May 13, 2024

@Grogdunn I'm afraid you've committed many changes unrelated to this PR.
Please fix the PR or create a new one that includes only the related changes.

To prevent such situations, please don't reformat the project code, especially code not related to the PR.
Also, use rebase pulls (e.g. git pull -r upstream main) to rebase on top of main. Never merge with upstream main.

@Grogdunn Grogdunn force-pushed the avoid-second-roundtrip-in-function-calls branch from 5948013 to 87308e2 Compare May 13, 2024 09:32
@Grogdunn
Contributor Author

@tzolov I removed the commit with all the import fixes and re-applied them only to the touched subproject.
Rebased on top of upstream main (as requested).

Conflicts are gone! 🎉

@tzolov
Collaborator

tzolov commented May 17, 2024

Thanks @Grogdunn.
Still, the majority of the changes (> 100 files) in the PR are code reformatting unrelated to the PR goal, for example import reordering in AnthropicChatOptions.java, AnthropicApi.java, ....
I presume the actual changes would affect far fewer files.
Can you please undo those changes (I guess your IDE is doing them automatically) or copy only the related changes into a new PR?

@Grogdunn
Contributor Author

Sure! Today I'll try to reduce the PR changeset.

@Grogdunn Grogdunn force-pushed the avoid-second-roundtrip-in-function-calls branch from 87308e2 to 5edafc9 Compare May 17, 2024 08:28
@Grogdunn
Contributor Author

Ok, squashed, rebased onto main, cleaned up.

@tzolov
Collaborator

tzolov commented May 19, 2024

Hi @Grogdunn,
I've been reviewing the PR and making some improvements when I realised that this approach (and likely any approach) is incorrect.

First, I noticed that the PR doesn't support streaming function calling, and while reasoning about ways to provide such support I realised that we are on the wrong path.

Basically, we cannot abruptly stop the function calling message exchange and still leave the conversation in a consistent state.

Simply put, we don't know what other actions the LLM might need to perform before it can answer the user prompt.
For example, if your prompt is "Cancel my flight and let me know what the weather is in Amsterdam" and the Function<FlightToCancel, Void> doesn't return a value, the PR will abruptly stop the conversation and will not allow the LLM to call the second WeatherInfo function and complete the conversation.
It gets even more complicated in the streaming case.

So IMO the only way this can ever work is if the LLM function calling APIs provide a protocol for sending an empty response for a tool call. Any other heuristic will be incorrect.

Please let me know what you think.

@Grogdunn
Contributor Author

Grogdunn commented May 20, 2024

🤔
Well, the PR was made before the other PR with streaming function calling landed on main, so I can port the changes to the streaming path.

I think that is possible because before calling a local function we wait for the streaming-stop signal (I don't remember the correct term); the stream stops and then we call the local functions.
After that, with your example, we have two results: 1 - null and 2 - {"temperature": 30, "sky": "sunny", "blabla": "lorem ipsum"}
So we evaluate all the local responses and complete the second round trip because we have answer number 2.

I'll try to spike that today.

@tzolov
Collaborator

tzolov commented May 20, 2024

@Grogdunn, the problem is not specific to streaming; it applies to the sync calls as well.
Simply put, the current LLM APIs demand that you always return a function call result message back to the LLM, even for void functions.
Perhaps a way to emulate void function call behaviour is to return an empty string or something like "DONE" for void functions.

If you do not return a value, you break the conversation and prevent the LLM from making follow-up function calls.
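
For illustration, a minimal sketch of that emulation idea using plain java.util.function types; the "DONE" sentinel follows the comment above, and everything else is a hypothetical example, not Spring AI code:

```java
import java.util.function.Consumer;
import java.util.function.Function;

public final class VoidFunctionAdapter {

	// Wraps a side-effect-only Consumer into a Function that still returns a
	// tool-result string, so the mandatory second round trip can be completed.
	public static <T> Function<T, String> toToolFunction(Consumer<T> sideEffect) {
		return request -> {
			sideEffect.accept(request);
			return "DONE"; // sentinel result sent back to the model
		};
	}
}
```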

@Grogdunn
Contributor Author

Well, in terms of conversation you are right. But you can use an LLM for something other than a chatbot. Some of my customers need to extract structured data from unstructured data: name, surname, and other information from emails and attachments, information from datasheets, and so on.
In this case we do not have a "conversation" but a single-shot request with a lot of context that ends with the function call.

A simple prompt could be, for instance:

U: Extract name, surname, emotional state, address, city, and summarize the following text, and call the function SAVEITNOW with extracted data:
--
Lorem ipsum dolor sit amet...[a lot of text]

The function is described as usual. No other interaction with this data will happen afterwards.
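
A hypothetical shape of such a function in plain Java (names like ExtractedData and SaveItNow are illustrative only; the Spring AI registration details are omitted):

```java
import java.util.function.Consumer;

// Illustrative record for the data the model is asked to extract.
record ExtractedData(String name, String surname, String emotionalState,
		String address, String city, String summary) {
}

// SAVEITNOW as a side-effect-only function: it stores the data and returns nothing,
// which is exactly the case where the second round trip feels wasted.
class SaveItNow implements Consumer<ExtractedData> {

	@Override
	public void accept(ExtractedData data) {
		// persist the extracted data, e.g. write it to a database or a queue
	}
}
```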

@Grogdunn
Contributor Author

I'd add that, in my case, the second round trip is useless and doubles the cost of every interaction.

@tzolov
Collaborator

tzolov commented May 20, 2024

I understand your use case and the goal of reducing the cost.
But the solution is a sort of hack that breaks the LLM APIs/protocols, so we would be reluctant to add such hacks to the general-purpose Spring AI.
Perhaps the Structured Output Converter can help return a one-shot structured response for this use case?
Another option is to think of some extension/customisation points that would allow one to hack the Spring AI function calling API in their own application.

In the meantime, I've reworked the PR to return "DONE" in the case of a void function definition, which is still a valid use case, and modified the test like this:
https://github.com/tzolov/spring-ai/blob/8f832e337c6d4fc7a244de68cd61c5bd8e0c5876/models/spring-ai-azure-openai/src/test/java/org/springframework/ai/azure/openai/function/AzureOpenAiChatClientFunctionCallIT.java#L123

This seems to work (most of the time) for sync and stream calls.
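
For reference, the Structured Output Converter approach mentioned above looks roughly like this. A sketch only: it assumes Spring AI's BeanOutputConverter and the ExtractedData record from the earlier example, and exact package names and chat-client wiring may differ by version:

```java
import org.springframework.ai.converter.BeanOutputConverter;

class StructuredExtractionSketch {

	// 'rawModelOutput' stands in for the text returned by whichever chat model call is used.
	static ExtractedData extract(String documentText, String rawModelOutput) {
		var converter = new BeanOutputConverter<>(ExtractedData.class);

		// The converter's format instructions are appended to the prompt so the
		// model answers with JSON matching the ExtractedData shape.
		String prompt = """
				Extract name, surname, emotional state, address, city, and summarize the following text.
				%s
				--
				%s
				""".formatted(converter.getFormat(), documentText);

		// ... send 'prompt' to the chat model, then map the reply text back to the record.
		return converter.convert(rawModelOutput);
	}
}
```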

@Grogdunn
Contributor Author

The "done" message is exactly how we have addressed the "second round trip" at the moment.

I think, if I've understood well, the Structured Output Converter is not enough because when you extract data and call the functions the LLM better respects the data structure, while the other way around (parse the text in output and grab the JSON provided) is more prone to hallucinations (I've used some time ago before functions/tools era).

So the better way is to think an extension point to leave the hack external to te spring-ai.
