feat: AI Proxy streaming #7293

Merged
merged 15 commits into from May 22, 2024

Contributor

@cloudjumpercat cloudjumpercat commented Apr 23, 2024

Description

Adds documentation for the AI Proxy plugin's new streaming mode in Gateway 3.7.

DOCU-3768

Testing instructions

Preview link:

Checklist

@cloudjumpercat cloudjumpercat added the "do not merge" and "review:sme" labels Apr 23, 2024
@cloudjumpercat cloudjumpercat added this to the Gateway 3.7 milestone Apr 23, 2024

netlify bot commented Apr 23, 2024

Deploy Preview for kongdocs ready!

Name Link
🔨 Latest commit e032f17
🔍 Latest deploy log https://app.netlify.com/sites/kongdocs/deploys/664e0060f1a9300009023889
😎 Deploy Preview https://deploy-preview-7293--kongdocs.netlify.app
Lighthouse
9 paths audited
Performance: 94 (🟢 up 1 from production)
Accessibility: 93 (no change from production)
Best Practices: 98 (🟢 up 8 from production)
SEO: 91 (no change from production)
PWA: -

Comment on lines 30 to 62
{% mermaid %}
sequenceDiagram
actor Client
participant Kong
Note right of Kong: AI Proxy plugin
Client->>+Kong:
Kong->>+Cloud LLM: Sets proxy request information
Cloud LLM->>+Kong:
Kong->>+Readframe:
Readframe->>+Transform frame:
Transform frame->>+Kong:
Kong->>+Client: ngx.EXIT
{% endmermaid %}
Contributor Author

@lena-larionova Is there a better way to configure the transform frame and read frame part of the original diagram in mermaid?

Original:
317228164-fd763491-a540-4209-a1f7-fe54e5453660

My diagram in Mermaid: https://deploy-preview-7293--kongdocs.netlify.app/hub/kong-inc/ai-proxy/how-to/streaming/

Would a diagram type other than sequence work better?

Contributor

You might be able to do this in flowchart, but you'll likely have to use a subgraph to make it work.

Edit: as I thought, it won't be straightforward. I really wish mermaid wouldn't throw nodes into random places sometimes.

Contributor

Big pain with various subgraphs and styles. Here's something you can work with though:

flowchart LR
  A(client)
  B(Kong Gateway with 
  AI Proxy plugin)
  C(Cloud LLM)
  D[[transform frame]]
  E[[read frame]]

subgraph main
direction LR
  subgraph 1
  A
  end
  subgraph 3
  C
  end
  subgraph 2
  D
  E
  end
  A --> B --request--> C
  C --response--> B
  B --> D-->E
  E --> B
  B --> A
end

  linkStyle 2,3,4,5,6 stroke:#b6d7a8,color:#b6d7a8
  style 1 color:#fff,stroke:#fff
  style 2 color:#fff,stroke:#fff
  style 3 color:#fff,stroke:#fff
  style main color:#fff,stroke:#fff

Here's the editor link: https://mermaid.live/edit#pako:eNqFk01v2zAMhv-KoF0yQAWazyU5DGiS7gNLgWEZdmjcA23RiVBZ8mh5iVfkv09SktZdN_QikHwfUjQtPvDMSuRTnmu7y7ZAji2_JYaxq06mFRr3NjizzhdrNuwjONxBw3bKbVmEPrOvZPcNK3W9USay885c21qy5fIm-ov12hGYKrdUsJygwLu7EL9erwlBPoUSU9XphqDcsgKUd6UizJyy5tTSo9yNd4cDjXym9GMH_1J6sZd4cUu-YhcX79nMn4Q_a6xccGP-PMaq0poKIxPnEPGFP2KRa_aX4nuKhUNEK3O_co1G1hN9MRBDMWKVI3uP0zfpSL6DscistnT2YreR77KTkOe5OKd4-4novUr0XyXCjP8DccELJA9I_zAeQkrC3RYLTPjUmylU3hKt-A8gBanGKgAxIUi5Ne4DFEo3x7xPqH-hUxmcks_MSv0-Ve4Oy31LLEkVQM08NHkE4ie8AGaWJFIbG8B4hOOX5Hfcu2fcZJiNBi3O_zVsA5NJepldJjzoh8Qc_GigdnbVmIxPHdUoeF1KvxcLBf6hFXyag658FKVylm6OyxV3TPASzK21Z-bwB3ekHU8

Contributor Author

@lena-larionova Thanks! Those diagrams look great, so I'll use those!

Contributor

I'm not 100% sure that this diagram illustrates the flow properly, honestly. The simpler one is fine, but this one keeps moving elements around in a way that isn't intuitive, so you might want to play around with it a bit more.
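For anyone trying to picture the "read frame" and "transform frame" steps from the diagram under discussion, here is a minimal Python sketch of that kind of loop. It is not the plugin's implementation: it assumes the upstream sends server-sent events with OpenAI-style JSON chunks and a `[DONE]` sentinel, and the function names `read_frames`, `transform_frame`, and `stream_text` are hypothetical, chosen only for illustration.

```python
import json
from typing import Iterable, Iterator, List, Optional


def read_frames(lines: Iterable[str]) -> Iterator[str]:
    """Group SSE lines into frames and yield each frame's data payload."""
    buffer: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith("data:"):
            value = line[len("data:"):]
            # SSE allows one optional space after the colon.
            if value.startswith(" "):
                value = value[1:]
            buffer.append(value)
    if buffer:
        yield "\n".join(buffer)


def transform_frame(payload: str) -> Optional[str]:
    """Assumed transform: extract the text delta from an OpenAI-style chunk.

    Returns None when the (assumed) end-of-stream sentinel is seen.
    """
    if payload == "[DONE]":
        return None
    chunk = json.loads(payload)
    choices = chunk.get("choices") or [{}]
    return choices[0].get("delta", {}).get("content", "")


def stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Read each frame, transform it, and pass non-empty text to the client."""
    for payload in read_frames(lines):
        text = transform_frame(payload)
        if text is None:
            break
        if text:
            yield text


# Made-up sample stream, standing in for the upstream response body.
sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}\n',
    "\n",
    'data: {"choices": [{"delta": {"content": "lo"}}]}\n',
    "\n",
    "data: [DONE]\n",
    "\n",
]
print("".join(stream_text(sample)))  # prints "Hello"
```

The point of the sketch is only that each frame is read and transformed on its own before being handed back, which is the loop the flowchart is trying to show.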

Comment on lines 18 to 25
{% mermaid %}
sequenceDiagram
actor Client
participant Kong
Note right of Kong: AI Proxy plugin
Client->>+Kong:
Kong->>+Cloud LLM: Sets proxy request information
Cloud LLM->>+Client: Sends chunk to client
{% endmermaid %}
Contributor Author

@lena-larionova Is there a better way to configure the mermaid diagram I made so it looks more like the original?

Original:
317228104-9b9b41ff-4cbb-4512-bf04-06c2092b573b

Mine: https://deploy-preview-7293--kongdocs.netlify.app/hub/kong-inc/ai-proxy/how-to/streaming/

Contributor

If you use a flowchart, you can use something like this:

flowchart LR
  A(client)
  B(Kong Gateway with 
  AI Proxy plugin)
  C(Cloud LLM)

  A --> B
  B --sends request 
  information--> C
  C --> A

@cloudjumpercat
Contributor Author

Hi @ttyS0e ! I left some questions for you on this PR, if you wouldn't mind checking them out when you have the time. I'll be out on PTO all next week, so I'll respond to any comments/get to revisions when I'm back on the 13th. Thanks!

@Guaris Guaris self-assigned this May 8, 2024
@tysoekong tysoekong marked this pull request as ready for review May 13, 2024 22:52
@tysoekong tysoekong requested a review from a team as a code owner May 13, 2024 22:52
@tysoekong
Contributor

I added quite a lot here, including “how to use with an SDK”, which many users and customers requested.

This is ready to go from a technical standpoint.

@tysoekong tysoekong force-pushed the feat/ai-proxy-streaming branch 2 times, most recently from a8181c8 to 7938181 Compare May 13, 2024 22:57
@cloudjumpercat
Contributor Author

@tysoekong Thanks for adding those pages! I made some copyedits and left a few more questions on the PR for you.

cloudjumpercat and others added 8 commits May 16, 2024 10:29
Signed-off-by: Diana <75819066+cloudjumpercat@users.noreply.github.com>
… fix Vale errors

Signed-off-by: Diana <75819066+cloudjumpercat@users.noreply.github.com>
app/_hub/kong-inc/ai-proxy/how-to/_streaming.md: 10 outdated review threads, resolved
app/_hub/kong-inc/ai-proxy/_changelog.md: 1 review thread, resolved
app/_hub/kong-inc/ai-proxy/how-to/_sdk-usage.md: 4 outdated review threads, resolved
cloudjumpercat and others added 2 commits May 17, 2024 15:05
Co-authored-by: lena-larionova <54370747+lena-larionova@users.noreply.github.com>
Co-authored-by: Angel <Guaris@users.noreply.github.com>
Signed-off-by: Diana <75819066+cloudjumpercat@users.noreply.github.com>
* Llama2 (raw, OLLAMA, and OpenAI formats)
The following table describes which providers and requests the AI Proxy plugin supports:

| Provider | Chat | Completion | Streaming |
Contributor

Do we support streaming with all of these as well? This feels like a strange table, but I guess it doesn't hurt to have.

Contributor Author

@lena-larionova I wasn't quite sure, so I asked @tysoekong to review this for accuracy.

Contributor

@lena-larionova lena-larionova left a comment

Left one comment on the new table, mostly because I didn't realize it was going to be all checkmarks 😁 . Otherwise, LGTM - merge whenever you feel like it's ready.

@Guaris Guaris removed their assignment May 21, 2024
Signed-off-by: Diana <75819066+cloudjumpercat@users.noreply.github.com>
@cloudjumpercat cloudjumpercat merged commit 23ac541 into main May 22, 2024
15 checks passed
@cloudjumpercat cloudjumpercat deleted the feat/ai-proxy-streaming branch May 22, 2024 14:38
Labels
"do not merge", "review:sme"
4 participants