Embed media like images, audio, 3d, video or etc? #79

fire · 2024-02-26T16:49:29Z

Hi,

I was wondering if it was in scope to embed media?

emrgnt-cmplxty · 2024-02-26T19:31:47Z

That's definitely in scope. The best way to approach this would be to introduce the necessary embedding providers and to modify or create a new pipeline that shows an example of this in action.

I'm happy to team up on this.

fire · 2024-02-26T19:50:41Z

I have two primary use cases:

1. The basic use case is taking an image and turning it into an embedding for use, as with Stable Diffusion or the various combined vision-text models. A few models can also do video.
2. My pet emerging-technologies use case is to take a 3D mesh from https://github.com/lucidrains/meshgpt-pytorch and have it autocomplete vertices, or search a database of other embedded meshes using the mesh token embedding.
3. Someday, maybe: audio and speech. I am not familiar with these at all.
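
To make the "search a database of other embedded meshes" idea concrete, here is a minimal sketch of nearest-neighbour search over stored embeddings using cosine similarity. Everything in it is an assumption for illustration: the `search` helper, the item ids, and the three-dimensional vectors are made up and do not come from meshgpt-pytorch or R2R. Real embeddings would come from whichever model is chosen.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def search(
    query: Sequence[float],
    index: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k (item_id, score) pairs, best match first."""
    scored = [(item_id, cosine_similarity(query, vec)) for item_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical, hand-written vectors standing in for real mesh embeddings.
index = {
    "mesh_chair": [0.9, 0.1, 0.0],
    "mesh_table": [0.8, 0.3, 0.1],
    "mesh_tree": [0.0, 0.2, 0.9],
}
results = search([1.0, 0.0, 0.0], index, top_k=2)
```

A brute-force scan like this is only reasonable for small collections; a vector database would replace the loop at scale, but the comparison itself stays the same.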

emrgnt-cmplxty · 2024-02-27T20:51:19Z

For image embedding, do you think we can fit it into the pipeline here [https://github.com/SciPhi-AI/R2R/blob/main/r2r/pipelines/basic/ingestion.py] with a specific embedding provider, or do you think we need to fundamentally rework the structure of the codebase in some way?
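
As a rough sketch of the "specific embedding provider" option, the ingestion step could pick a provider per media type. The names below (`EmbeddingProvider`, `ToyImageEmbeddingProvider`, `ingest`) are hypothetical and are not R2R's actual classes or pipeline API. The toy provider just builds a byte histogram so the example runs without a model; a real one would call a vision model instead.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple


class EmbeddingProvider(ABC):
    """Hypothetical interface, not R2R's actual provider class."""

    @abstractmethod
    def embed(self, data: bytes) -> List[float]:
        """Turn raw media bytes into a fixed-length vector."""


class ToyImageEmbeddingProvider(EmbeddingProvider):
    """Stand-in for a real vision model: a normalized 4-bin byte histogram."""

    def embed(self, data: bytes) -> List[float]:
        bins = [0.0, 0.0, 0.0, 0.0]
        for byte in data:
            bins[byte // 64] += 1.0  # byte values 0..255 map to bins 0..3
        total = float(len(data)) or 1.0  # avoid division by zero on empty input
        return [count / total for count in bins]


def ingest(
    documents: Dict[str, Tuple[str, bytes]],
    providers: Dict[str, EmbeddingProvider],
) -> Dict[str, List[float]]:
    """Embed each (media_type, data) document with the provider for its type."""
    store: Dict[str, List[float]] = {}
    for doc_id, (media_type, data) in documents.items():
        provider = providers.get(media_type)
        if provider is None:
            raise ValueError(f"no embedding provider registered for {media_type!r}")
        store[doc_id] = provider.embed(data)
    return store


store = ingest(
    {"img-1": ("image", bytes([0, 10, 70, 200]))},
    {"image": ToyImageEmbeddingProvider()},
)
```

If something along these lines works, supporting a new modality would mean registering one more provider rather than reworking the pipeline structure.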

I think multi-modal is an important use case and I am very interested in figuring out how to best support this.

fire · 2024-02-29T01:40:56Z

I don't think I can drive multi-modal too much, but I'll see what spare time I can gather.

fire · 2024-02-29T01:41:53Z

The obvious question is what happens when we have two different embedding models (like token integers): how do we sync them?
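
One possible way to handle this, sketched here as an assumption rather than anything decided in this thread, is to not sync the models at all: keep each model's vectors in its own namespace and only compare vectors that came from the same model. The `MultiModelStore` class and the model names are hypothetical.

```python
from typing import Dict, List, Sequence


class MultiModelStore:
    """Hypothetical store that keeps one embedding space per model name."""

    def __init__(self) -> None:
        self._spaces: Dict[str, Dict[str, List[float]]] = {}
        self._dims: Dict[str, int] = {}

    def add(self, model: str, item_id: str, vector: Sequence[float]) -> None:
        """Store a vector under its model, enforcing one dimension per model."""
        dim = self._dims.setdefault(model, len(vector))
        if len(vector) != dim:
            raise ValueError(
                f"{model!r} expects dimension {dim}, got {len(vector)}"
            )
        self._spaces.setdefault(model, {})[item_id] = list(vector)

    def items(self, model: str) -> Dict[str, List[float]]:
        """Return a copy of every vector stored for a single model."""
        return dict(self._spaces.get(model, {}))


store = MultiModelStore()
store.add("image-model", "img-1", [0.1, 0.2, 0.3])
store.add("mesh-model", "mesh-1", [0.5, 0.5])  # different dimension, different space
```

A query would then be embedded with one model and searched only within that model's space, so vectors with different dimensions or meanings never get compared.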

fire changed the title ~~In scope or out of scope to embed media like images, audio, 3d, video or etc?~~ Embed media like images, audio, 3d, video or etc? Feb 26, 2024
