Best practices to control the size of the ChatHistory to avoid exceeding a model's maximum context length #6155
Beyond samples, I think we should have some built-in support for this, e.g. an interface that can be queried for to reduce the size of the chat history, along with some readily available implementations of it: ones that trim to a max number of tokens or messages, ones that summarize the previous history and replace it with just the most salient points, ones that remove less important messages and keep only the important ones, etc. This (and possibly other features) might drive the need for taking a dependency on a tokenizer; we'll want to think that through in conjunction with the abstraction for a tokenizer in Microsoft.ML.Tokenizers. cc: @tarekgh (Tarek, and @ericstj, we should think about whether the Tokenizer abstraction should be moved to an abstractions library... today, in order to get the abstraction, you also need to pay to get all the implementations.)
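One possible shape for such an abstraction is sketched below. This is purely illustrative: `IChatHistoryReducer`, `MaxTokensChatHistoryReducer`, and their members are hypothetical names, not an existing Semantic Kernel API; `ChatHistory` and `ChatMessageContent` are Semantic Kernel types, and `Tokenizer.CountTokens` is assumed to be available from Microsoft.ML.Tokenizers.

```csharp
using Microsoft.ML.Tokenizers;
using Microsoft.SemanticKernel.ChatCompletion;

// Hypothetical abstraction, not an existing Semantic Kernel API.
public interface IChatHistoryReducer
{
    // Returns a reduced history that fits within the reducer's budget,
    // or an equivalent history if no reduction was needed.
    Task<ChatHistory> ReduceAsync(ChatHistory history, CancellationToken cancellationToken = default);
}

// One possible implementation: keep only the most recent messages whose
// combined token count fits under a fixed budget.
public sealed class MaxTokensChatHistoryReducer : IChatHistoryReducer
{
    private readonly Tokenizer _tokenizer; // from Microsoft.ML.Tokenizers
    private readonly int _maxTokens;

    public MaxTokensChatHistoryReducer(Tokenizer tokenizer, int maxTokens)
    {
        _tokenizer = tokenizer;
        _maxTokens = maxTokens;
    }

    public Task<ChatHistory> ReduceAsync(ChatHistory history, CancellationToken cancellationToken = default)
    {
        // Walk backwards from the newest message, keeping messages
        // until the token budget is exhausted.
        var kept = new List<Microsoft.SemanticKernel.ChatMessageContent>();
        int budget = _maxTokens;
        for (int i = history.Count - 1; i >= 0; i--)
        {
            int cost = _tokenizer.CountTokens(history[i].Content ?? string.Empty);
            if (cost > budget) break;
            budget -= cost;
            kept.Add(history[i]);
        }

        kept.Reverse();
        var reduced = new ChatHistory();
        foreach (var message in kept) reduced.Add(message);
        return Task.FromResult(reduced);
    }
}
```

Other implementations of the same interface could summarize older messages instead of dropping them, which is why an interface rather than a single concrete helper is attractive here.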
In what situation would the abstraction be necessary without requiring one of the specific tokenizers? The scenario mentioned doesn't seem to clarify this for me.
I'll turn the question around and ask: what's the reason for having the Tokenizer abstraction at all if every use would require a specific tokenizer? :) Imagine for this issue there were an IChatHistoryReducer with a method like … It's a similar need for something like TextChunker. Today it has methods that take a delegate to do token counting, but all uses of that today just point to a token-counting method. It'd be nice if overloads on TextChunker could just take a Tokenizer directly, for example.
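To make the TextChunker point concrete: the first call below uses the actual delegate-based API; the second shows the kind of overload being suggested, which does not exist today (the `tokenizer` parameter is hypothetical).

```csharp
using Microsoft.ML.Tokenizers;
using Microsoft.SemanticKernel.Text;

// Today (actual API): TextChunker takes a token-counting delegate,
// which in practice just wraps a concrete tokenizer anyway.
List<string> lines = TextChunker.SplitPlainTextLines(
    text,
    maxTokensPerLine: 128,
    tokenCounter: input => tokenizer.CountTokens(input));

// Suggested (hypothetical) overload: accept the Tokenizer abstraction
// directly, removing the delegate boilerplate.
// List<string> lines = TextChunker.SplitPlainTextLines(text, maxTokensPerLine: 128, tokenizer);
```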
Thanks for the thoughts @stephentoub. I have some experience with Encoding in the framework. In .NET Core, we attempted to separate most of the concrete encodings (those with significant data) into their own libraries. We retained only the abstraction and a few concrete encodings that we believed would be commonly used, such as UTF-8. However, we found that many users wanted access to the other encodings, leading us to include these concrete encodings by default. I'm asking to gain insight into whether we might encounter similar situations with the Tokenizers, or if we anticipate that many libraries will rely on the abstraction without requiring real concrete implementations.
Here's an example of the type of error a developer can run into:
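For instance (the exact exception type and message vary by connector and model; this is an illustrative shape, not a verbatim capture from the issue):

```csharp
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;

try
{
    // chatService is an IChatCompletionService; chatHistory has grown
    // unboundedly over a long conversation.
    var reply = await chatService.GetChatMessageContentAsync(chatHistory);
}
catch (HttpOperationException ex)
{
    // Typical service-side rejection when the history is too large, e.g.:
    // "This model's maximum context length is 4096 tokens. However, your
    //  messages resulted in more tokens. Please reduce the length of the
    //  messages."
    Console.WriteLine(ex.Message);
}
```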
Some options to mitigate this:
- Trim the history to a maximum number of tokens or messages before each request.
- Summarize older messages and replace them with just the most salient points.
- Remove less important messages and keep only the important ones.
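The simplest of these options, trimming to a fixed number of recent messages, can be sketched as follows. `TrimToLastMessages` is an assumed helper for illustration, not a built-in API; `ChatHistory` and `AuthorRole` are Semantic Kernel types.

```csharp
using Microsoft.SemanticKernel.ChatCompletion;

// Minimal sketch (assumed helper, not a built-in API): keep an initial
// system prompt plus only the last N messages before sending to the model.
static ChatHistory TrimToLastMessages(ChatHistory history, int maxMessages)
{
    var trimmed = new ChatHistory();

    // Preserve an initial system message, if any, since it usually
    // carries instructions the model must always see.
    if (history.Count > 0 && history[0].Role == AuthorRole.System)
        trimmed.Add(history[0]);

    int start = Math.Max(trimmed.Count, history.Count - maxMessages);
    for (int i = start; i < history.Count; i++)
        trimmed.Add(history[i]);

    return trimmed;
}
```

Trimming by message count is crude (messages vary widely in token cost), which is one argument for a tokenizer-aware reducer as discussed above.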