Support tool use assertions for Claude #640

CamdenClark · 2024-04-06T01:07:54Z

Hey, I'd like to work on support for tool use in Claude, which was just launched https://docs.anthropic.com/claude/docs/tool-use

Unfortunately, Anthropic didn't use the OpenAI pattern for supporting this and changed a few things.

Tool definition

here's the tool definition in claude's API.

[
  {
    "name": "get_stock_price",
    "description": "Get the current stock price for a given ticker symbol.",
    "input_schema": {
      "type": "object",
      "properties": {
        "ticker": {
          "type": "string",
          "description": "The stock ticker symbol, e.g. AAPL for Apple Inc."
        }
      },
      "required": ["ticker"]
    }
  }
]

Basically all that's changed here is that there's no base type of function and input_schema is changed from parameters. I think we can one to one map between anthropic and openai tool definitions. So I'm thinking we can support both the openai and anthropic tool definitions in anthropic.

Forcing tool use

Anthropic doesn't support forcing tool use unless you specifically reference the tool in a user message. Kind of a strange choice but I don't think there's anything to worry about here.

Tool responses

So, putting in the tool responses is a bit annoyingly different from the behavior of OpenAI.

They have tool_use responses as part of the content block that's sent back from Anthropic.

{
  "id": "msg_01Aq9w938a90dw8q",
  "model": "claude-3-opus-20240229",
  "stop_reason": "tool_use",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "<thinking>I need to use the get_weather, and the user wants SF, which is likely San Francisco, CA.</thinking>"
    },
    {
      "type": "tool_use",
      "id": "toolu_01A09q90qw90lq917835lq9",
      "name": "get_weather",
      "input": {"location": "San Francisco, CA", "unit": "celsius"}
    }
  ]
}

Then, you have to send these as part of a content block of a user message.

{
  "role": "user",
  "content": [
    {
      "type": "tool_result",
      "tool_use_id": "toolu_01A09q90qw90lq917835lq9",
      "content": "65 degrees"
    }
  ]
}

is_error

you can also support giving an explicit flag that the tool had an error.

{
  "role": "user",
  "content": [
    {
      "type": "tool_result",
      "tool_use_id": "toolu_01A09q90qw90lq917835lq9",
      "content": "ConnectionError: the weather service API is not available (HTTP 500)",
      "is_error": true
    }
  ]
}

There's no way to map this from openai right now so I think I'll just allow it as part of an anthropic response.

I'm going to give how to approach this from a high-level some more thought over the next day or two. I'm thinking directionally promptfoo should try to support the OpenAI prompt format as the lingua franca, and then maybe we can support anthropic specific prompts too?

The text was updated successfully, but these errors were encountered:

typpo · 2024-04-06T12:31:08Z

Really appreciate the rundown of Claude tools, which I haven't had the chance to look into yet, and the thought you've put into next steps.

I'm thinking directionally promptfoo should try to support the OpenAI prompt format as the lingua franca, and then maybe we can support anthropic specific prompts too?

I agree with this approach, and in some cases we're starting to move in that direction for regular prompting too. It's still good to give the user fine-grained control over provider-specific prompts if they desire.

CamdenClark · 2024-04-07T02:58:57Z

Really appreciate the rundown of Claude tools, which I haven't had the chance to look into yet, and the thought you've put into next steps.

I'm thinking directionally promptfoo should try to support the OpenAI prompt format as the lingua franca, and then maybe we can support anthropic specific prompts too?

I agree with this approach, and in some cases we're starting to move in that direction for regular prompting too. It's still good to give the user fine-grained control over provider-specific prompts if they desire.

Awesome, will get to work on it.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support tool use assertions for Claude #640

Support tool use assertions for Claude #640

CamdenClark commented Apr 6, 2024 •

edited

typpo commented Apr 6, 2024

CamdenClark commented Apr 7, 2024

Support tool use assertions for Claude #640

Support tool use assertions for Claude #640

Comments

CamdenClark commented Apr 6, 2024 • edited

Tool definition

Forcing tool use

Tool responses

is_error

typpo commented Apr 6, 2024

CamdenClark commented Apr 7, 2024

CamdenClark commented Apr 6, 2024 •

edited