Add support for Vision API #500

lectrician1 · 2023-11-26T07:25:19Z

Adds the ability to use the vision API in the client. Resolves #488 and #473

Message interface:

Editing interface:

Add image-to-base64 conversion when a file is uploaded, and store the base64 data in the browser (localStorage). This is not perfect, since localStorage might run out of space, but I don't have a better way to keep the images from previous chats, and IndexedDB looks too hard to implement.
Add image detail selection so users can choose the detail level at which their image is processed. See: https://platform.openai.com/docs/guides/vision/low-or-high-fidelity-image-understanding
Add a new data structure to differentiate models by the types of input they accept, and update the editor to check whether the selected model supports images. I added a const modelTypes and check against it in EditView.tsx.
Clean up the image upload UI (it's not perfect, but it works).
Write a data structure version migration function so that clients with the old data structures get theirs updated (I need help with this).
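
For readers following along, here is a rough sketch of how a modelTypes lookup like the one mentioned above could gate the image selector. The model names, the `ModelInputType` type, and the `supportsImages` helper are assumptions made for illustration; the actual definitions in the PR may look different.

```typescript
// Hypothetical sketch: the entries and the shape of `modelTypes` are
// assumptions, not the PR's actual definitions.
type ModelInputType = 'text' | 'image';

const modelTypes: Record<string, ModelInputType[]> = {
  'gpt-3.5-turbo': ['text'],
  'gpt-4': ['text'],
  'gpt-4-vision-preview': ['text', 'image'],
};

// Returns true when the model is listed as accepting image input.
// Models missing from the table are treated as text-only.
function supportsImages(model: string): boolean {
  const inputs = modelTypes[model] ?? ['text'];
  return inputs.some((input) => input === 'image');
}

const canAttachImage = supportsImages('gpt-4-vision-preview'); // true
const textOnly = supportsImages('gpt-3.5-turbo'); // false
```

Treating unknown models as text-only is a conservative default: the image selector stays hidden unless a model is explicitly marked as image-capable.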

Notable changes

This PR changes the data structures of ChatInterface and MessageInterface, and adds ContentInterface, TextContentInterface and ImageContentInterface, so that messages follow the updated API JSON structure for prompts that contain images.

Before:

export interface MessageInterface {
  role: Role;
  content: string;
}

After:

export type Content = 'text' | 'image_url';

export interface ImageContentInterface extends ContentInterface {
  type: 'image_url';
  image_url: {
    url: string;
  };
}

export interface TextContentInterface extends ContentInterface {
  type: 'text';
  text: string;
}

export interface ContentInterface {
  type: Content;
}

export interface MessageInterface {
  role: Role;
  content: ContentInterface[];
}
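
As an illustration of how code consuming the new structure might work, the sketch below builds a message with one text part and one image part, then uses type guards to narrow each part. The interfaces are repeated from above so the example stands alone. The `Role` union, the optional `detail` field (mirroring the API's low/high/auto option), and the helper names are assumptions, not code from the PR.

```typescript
// Interfaces repeated from the PR description so the sketch is self-contained.
type Role = 'user' | 'assistant' | 'system'; // assumed definition of Role
type Content = 'text' | 'image_url';

interface ContentInterface {
  type: Content;
}

interface TextContentInterface extends ContentInterface {
  type: 'text';
  text: string;
}

interface ImageContentInterface extends ContentInterface {
  type: 'image_url';
  image_url: {
    url: string;
    detail?: 'low' | 'high' | 'auto'; // assumed; not in the interface above
  };
}

interface MessageInterface {
  role: Role;
  content: ContentInterface[];
}

// Type guards narrow a generic content part to its concrete shape.
function isTextContent(part: ContentInterface): part is TextContentInterface {
  return part.type === 'text';
}

function isImageContent(part: ContentInterface): part is ImageContentInterface {
  return part.type === 'image_url';
}

// Joins the text parts of a message and skips the image parts.
function messageText(message: MessageInterface): string {
  return message.content
    .filter(isTextContent)
    .map((part) => part.text)
    .join('\n');
}

// Parts are typed individually first, because an object literal placed
// directly in a ContentInterface[] fails the excess-property check.
const textPart: TextContentInterface = {
  type: 'text',
  text: 'What is in this picture?',
};
const imagePart: ImageContentInterface = {
  type: 'image_url',
  image_url: { url: 'blob:https://example.invalid/1234', detail: 'low' },
};

const example: MessageInterface = {
  role: 'user',
  content: [textPart, imagePart],
};
```

Here `messageText(example)` returns only the text part, which is the kind of helper that code still expecting a plain string (titles, token counting, search) would likely need after this change.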

The url field stores the local URL of the image (the blob: URL). At generation time, the client converts every blob URL to base64 so that the API can accept it.
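
The conversion step can be sketched as follows. This is not the PR's code: in a browser one would more likely read the blob with FileReader.readAsDataURL, while this version encodes raw bytes by hand so it also runs outside the browser. The function names `bytesToBase64` and `toDataUrl` are hypothetical.

```typescript
const BASE64_ALPHABET =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Encodes raw bytes as standard base64 (RFC 4648) with '=' padding.
function bytesToBase64(bytes: Uint8Array): string {
  let out = '';
  for (let i = 0; i < bytes.length; i += 3) {
    const hasSecond = i + 1 < bytes.length;
    const hasThird = i + 2 < bytes.length;
    const b0 = bytes[i];
    const b1 = hasSecond ? bytes[i + 1] : 0;
    const b2 = hasThird ? bytes[i + 2] : 0;

    // Each group of 3 bytes becomes 4 characters of 6 bits each.
    out += BASE64_ALPHABET.charAt(b0 >> 2);
    out += BASE64_ALPHABET.charAt(((b0 & 0x03) << 4) | (b1 >> 4));
    out += hasSecond
      ? BASE64_ALPHABET.charAt(((b1 & 0x0f) << 2) | (b2 >> 6))
      : '=';
    out += hasThird ? BASE64_ALPHABET.charAt(b2 & 0x3f) : '=';
  }
  return out;
}

// Builds a data URL of the form the API accepts in image_url.url.
function toDataUrl(bytes: Uint8Array, mimeType: string): string {
  return `data:${mimeType};base64,${bytesToBase64(bytes)}`;
}

// The bytes of the ASCII string "Man" encode to "TWFu".
const sample = toDataUrl(new Uint8Array([77, 97, 110]), 'image/png');
// sample === 'data:image/png;base64,TWFu'
```

Base64 output is about a third larger than the raw bytes, which is one reason storing encoded images in localStorage can hit the quota quickly.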

OldKrab · 2024-01-19T07:43:44Z

yarn run build now fails with type errors.

Ahmet-Dedeler · 2024-02-17T11:36:21Z

Have there been any updates on this?

lectrician1 · 2024-02-17T15:09:35Z

Uhhh, well, I haven't had time to finish this up, but maybe I'll try this weekend...

Ahmet-Dedeler · 2024-02-18T11:25:03Z

@lectrician1 That would be amazing, this feature would literally be the revolutionary feature for this repo.

Lately the repo hasn't been very active, so this might just change the future of how active and innovative it becomes.

Please keep us updated here, will try to help :)

lectrician1 · 2024-02-19T08:30:38Z

@Ahmet-Dedeler it's almost ready.

I just need help from @ztjhz @akira0245 or @ayaka14732 to write the migration code in migrate.ts and chat.ts to migrate the old data structure to the new one for ChatInterface and associated interfaces (see the PR description). I tried doing this and it's really confusing. I'm not sure how the whole migrate process works.
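
For reference, the per-message part of such a migration might look roughly like the sketch below. It only covers converting a message whose content is a plain string into the new array form. The type names and `migrateMessage` are hypothetical, `Role` is an assumed definition, and how this would plug into the existing versioned logic in migrate.ts and chat.ts is not shown because that code is not part of this thread.

```typescript
type Role = 'user' | 'assistant' | 'system'; // assumed definition of Role

// Old shape: content is a plain string.
interface OldMessage {
  role: Role;
  content: string;
}

// New shape: content is a list of typed parts.
interface TextPart {
  type: 'text';
  text: string;
}
interface ImagePart {
  type: 'image_url';
  image_url: { url: string };
}
interface NewMessage {
  role: Role;
  content: Array<TextPart | ImagePart>;
}

// Hypothetical helper: wraps string content in a single text part.
// Messages already in the new shape pass through unchanged, so running
// the migration twice is harmless.
function migrateMessage(message: OldMessage | NewMessage): NewMessage {
  if (typeof message.content === 'string') {
    return {
      role: message.role,
      content: [{ type: 'text', text: message.content }],
    };
  }
  return { role: message.role, content: message.content };
}

// Applies the per-message migration to every message of a chat.
function migrateMessages(
  messages: Array<OldMessage | NewMessage>
): NewMessage[] {
  return messages.map(migrateMessage);
}
```

Checking the runtime type of content, rather than relying only on a stored version number, keeps the function safe to call on data that has already been migrated.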

ztjhz · 2024-02-25T09:38:09Z

I'll take a look at this.

Ahmet-Dedeler · 2024-04-21T11:14:21Z

Sorry for tagging, but did any of you get a chance to look at this, @ztjhz @akira0245 @ayaka14732?

first try that works

313cfc8

Ahmet-Dedeler mentioned this pull request Feb 17, 2024

[Major Feature Request]: GPT-4 Vision Support, Code Interpreter (via Assistants) #541

Open

lectrician1 added 4 commits February 19, 2024 02:05

switch message data structure over to mimicing the data structure of the openai request

95d4405

Merge branch 'main' into add-vision

3a91653

add modelTypes object and check that image is supported for that model when adding image selector

a58c791

add comment about modelTypes

9bdaaea

lectrician1 marked this pull request as ready for review February 19, 2024 08:27
