feat: AI Proxy streaming #7293
✅ Deploy Preview for kongdocs ready!
{% mermaid %}
sequenceDiagram
actor Client
participant Kong
Note right of Kong: AI Proxy plugin
Client->>+Kong:
Kong->>+Cloud LLM: Sets proxy request information
Cloud LLM->>+Kong:
Kong->>+Readframe:
Readframe->>+Transform frame:
Transform frame->>+Kong:
Kong->>+Client: ngx.EXIT
{% endmermaid %}
@lena-larionova Is there a better way to configure the transform frame and read frame part of the original diagram in mermaid?
My diagram in Mermaid: https://deploy-preview-7293--kongdocs.netlify.app/hub/kong-inc/ai-proxy/how-to/streaming/
Would a diagram type other than sequence work better?
You might be able to do this in flowchart, but you'll likely have to use a subgraph to make it work.
Edit: as I thought, it won't be straightforward. I really wish mermaid wouldn't throw nodes into random places sometimes.
Big pain with various subgraphs and styles. Here's something you can work with though:
flowchart LR
A(client)
B(Kong Gateway with
AI Proxy plugin)
C(Cloud LLM)
D[[transform frame]]
E[[read frame]]
subgraph main
direction LR
subgraph 1
A
end
subgraph 3
C
end
subgraph 2
D
E
end
A --> B --request--> C
C --response--> B
B --> D-->E
E --> B
B --> A
end
linkStyle 2,3,4,5,6 stroke:#b6d7a8,color:#b6d7a8
style 1 color:#fff,stroke:#fff
style 2 color:#fff,stroke:#fff
style 3 color:#fff,stroke:#fff
style main color:#fff,stroke:#fff
@lena-larionova Thanks! Those diagrams look great, so I'll use those!
I'm not 100% sure that this diagram illustrates the flow properly, honestly. The simpler one is fine, but this one keeps moving elements around in a way that isn't intuitive, so you might want to play around with it a bit more.
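For anyone reading this thread without the original image, the "read frame" and "transform frame" steps in the first diagram can be sketched roughly as below. This is only an illustration in Python, not the plugin's code. It assumes the upstream LLM streams server-sent events separated by blank lines, and the function names and the `text` field of the upstream payload are hypothetical.

```python
import json
from typing import Iterable, Iterator


def read_frames(chunks: Iterable[bytes]) -> Iterator[str]:
    """Yield complete frames; assumes a frame ends at a blank line."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        # A network chunk may hold a partial frame or several frames.
        while b"\n\n" in buffer:
            raw, buffer = buffer.split(b"\n\n", 1)
            if raw.strip():
                yield raw.decode("utf-8")


def transform_frame(frame: str) -> str:
    """Rewrite one upstream frame as an OpenAI-style chunk frame.

    The upstream payload shape ({"text": ...}) is a made-up example.
    """
    payload = frame[len("data:"):].strip() if frame.startswith("data:") else frame.strip()
    if payload == "[DONE]":
        return "data: [DONE]\n\n"
    event = json.loads(payload)
    chunk = {"choices": [{"index": 0, "delta": {"content": event["text"]}}]}
    return "data: " + json.dumps(chunk) + "\n\n"


def stream_to_client(upstream_chunks: Iterable[bytes]) -> Iterator[str]:
    """Read each frame, transform it, and pass it on as soon as it is complete."""
    for frame in read_frames(upstream_chunks):
        yield transform_frame(frame)
```

The point of the buffer is that frame boundaries and network chunk boundaries do not have to line up, which is why the diagram shows reading a frame as a separate step before transforming it.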
{% mermaid %}
sequenceDiagram
actor Client
participant Kong
Note right of Kong: AI Proxy plugin
Client->>+Kong:
Kong->>+Cloud LLM: Sets proxy request information
Cloud LLM->>+Client: Sends chunk to client
{% endmermaid %}
@lena-larionova Is there a better way to configure the mermaid diagram I made so it looks more like the original?
Mine: https://deploy-preview-7293--kongdocs.netlify.app/hub/kong-inc/ai-proxy/how-to/streaming/
If you use a flowchart, you can use something like this:
flowchart LR
A(client)
B(Kong Gateway with
AI Proxy plugin)
C(Cloud LLM)
A --> B
B --sends request
information--> C
C --> A
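As a side note on the "Sends chunk to client" arrow, here is a rough sketch of what the client end of that flow might look like. It assumes the chunks arrive as OpenAI-style server-sent event lines (`data: {...}` followed by a final `data: [DONE]`). The function name is hypothetical, and real clients would normally rely on an SDK instead of parsing lines by hand.

```python
import json
from typing import Iterable


def collect_stream_text(lines: Iterable[str]) -> str:
    """Join the text deltas from a stream of OpenAI-style SSE lines."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything else
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)
```

Each chunk carries only a small delta, so the client sees partial output as it is generated and only has the full message once the end marker arrives.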
Hi @ttyS0e ! I left some questions for you on this PR, if you wouldn't mind checking them out when you have the time. I'll be out on PTO all next week, so I'll respond to any comments/get to revisions when I'm back on the 13th. Thanks!
I added quite a lot here, including “how to use with an SDK”, which was requested by many users and customers. This is ready to go from my tech standpoint.
a8181c8 to 7938181
@tysoekong Thanks for adding those pages! I made some copyedits and left a few more questions on the PR for you.
Signed-off-by: Diana <75819066+cloudjumpercat@users.noreply.github.com>
2ac0c1a to c5a8da8
… fix Vale errors Signed-off-by: Diana <75819066+cloudjumpercat@users.noreply.github.com>
Co-authored-by: lena-larionova <54370747+lena-larionova@users.noreply.github.com> Co-authored-by: Angel <Guaris@users.noreply.github.com>
* Llama2 (raw, OLLAMA, and OpenAI formats)
The following table describes which providers and requests the AI Proxy plugin supports:

| Provider | Chat | Completion | Streaming |
Do we support streaming with all of these as well? This feels like a strange table, but I guess it doesn't hurt to have.
@lena-larionova I wasn't quite sure, so I asked @tysoekong to review this for accuracy.
Left one comment on the new table, mostly because I didn't realize it was going to be all checkmarks 😁 . Otherwise, LGTM - merge whenever you feel like it's ready.
Description
Adds documentation for the new streaming mode for the AI Proxy plugin for 3.7.
DOCU-3768
Testing instructions
Preview link:
Checklist