rfcs: add proposal for indirect KV cache support #1894

Open
wants to merge 1 commit into
base: rfcs
from

Conversation

igorsafo
Contributor

@igorsafo igorsafo commented May 3, 2024

Description

A link to the document: link

RFC PR

  • Does RFC document follow the template?
  • Have you added a link to the rendered document?

@igorsafo added the "RFC: A design document" label on May 3, 2024
@igorsafo force-pushed the igorsafo/rfcs/indirect-kv-cache branch 6 times, most recently from 3496d5c to 49965e6, on May 4, 2024 04:09
@igorsafo force-pushed the igorsafo/rfcs/indirect-kv-cache branch from 49965e6 to 4579bf2 on May 7, 2024 16:51
that is responsible for the influence of the context on the output token.

Here is the diagram:
![MHA pattern](mha-pattern.png)
Contributor

In this pattern graph, the output shape of BMM(N, T) seems to be (N, H, T, T), not (N, H, 1, T)?
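
For reference, the two shapes in question can be reproduced with a small NumPy sketch. It assumes the usual MHA layout, which the diagram may or may not follow: Q of shape (N, H, S_q, D) and K of shape (N, H, T, D), where `S_q` (query length) and `D` (head size) are names introduced here for illustration only. With those assumptions the Q·Kᵀ batched matmul yields (N, H, S_q, T), so the result is (N, H, T, T) when the whole sequence is the query and (N, H, 1, T) when a single new token attends to T cached keys.

```python
import numpy as np

def attention_score_shape(n, h, s_q, t, d):
    """Return the shape of Q @ K^T for the assumed MHA layout.

    n: batch size, h: number of heads, s_q: query length (hypothetical name),
    t: number of key tokens (context length), d: head size (hypothetical name).
    """
    q = np.zeros((n, h, s_q, d))         # queries
    k = np.zeros((n, h, t, d))           # keys, e.g. read from a KV cache
    scores = q @ np.swapaxes(k, -1, -2)  # batched matmul over (n, h)
    return scores.shape

# Full-sequence query (S_q == T): scores are (N, H, T, T).
prefill_shape = attention_score_shape(n=2, h=4, s_q=8, t=8, d=16)

# Single-token query (S_q == 1) against T cached keys: scores are (N, H, 1, T).
decode_shape = attention_score_shape(n=2, h=4, s_q=1, t=8, d=16)

print(prefill_shape)
print(decode_shape)
```

Which of the two shapes applies depends on whether the pattern in the diagram is meant to cover the first iteration over the full prompt or the per-token generation step.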


2 participants