Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bedrock AI models add usage information. #605

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

maxjiang153
Copy link
Contributor

This PR closes: #600

When Bedrock invokes model API, Bedrock will return input, output token count, and latency information from the SDKHttpResponse header.

extract this information as ChatResponseMetadata to ChatResponse

@tzolov tzolov added this to the 1.0.0-M1 milestone Apr 20, 2024
@tzolov tzolov self-assigned this Apr 20, 2024
@tzolov
Copy link
Collaborator

tzolov commented Apr 21, 2024

@wmz7year thank you for looking at this. It would be a valuable contribution to add more, structured usage metadata.

From the PR code, I have the impression that we can implement (most of) this inside AnthropicChatBedrockApi.java without having to expose it via the AmazonBedrockInvocationContex and reduce some necessary code repetition.

For example It looks like the BedrockAnthropicChatResponseMetadata , BedrockAnthropic3ChatResponseMetadata, CohereChatResponseMetadata, BedrockAi21Jurassic2ChatResponseMetadata, BedrockLlama2ChatResponseMetadata and BedrockTitanChatResponseMetadata are the same? Then why not create one instance and use it inside the AnthropicChatBedrockApi to generate the metadata inside the AnthropicChatBedrockApi?

What do you think about this approach?

@maxjiang153
Copy link
Contributor Author

The idea behind AmazonBedrockInvocationContext is that the Amazon Bedrock invoke model API will return input/output tokens and latency information from HTTP headers, not only from the response body. So I wrap it up as an AmazonBedrockInvocationContext to include more information besides the API response.

Check this out:

And also, the reason why I create multiple metadata classes like BedrockAnthropicChatResponseMetadata is because each model has a different response structure. Currently, it seems to look similar, but for example, the response message ID is different. So I decided to keep this up for future changes in each model.

How do you think about this? btw I can reduce metadata classes to a single one maybe we can add a dynamic field like Map<String, Object> attribute fields to cover the different metadata fields?

@tzolov tzolov modified the milestones: 1.0.0-M1, 1.0.0-M2 Apr 26, 2024
@tzolov
Copy link
Collaborator

tzolov commented Apr 26, 2024

Thanks, @wmz7year I need to look into this further, but wont be able to do it in coming 2 weeks.

@maxjiang153
Copy link
Contributor Author

Thanks, @wmz7year I need to look into this further, but wont be able to do it in coming 2 weeks.

Sure, if you have any thoughts just let me know

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BedrockAnthropic3ChatClient returns ChatResponse without AnthropicUsage
2 participants