First, thank you for the great work!
I would like to know whether this tool can also produce text-to-image attention views for large vision-language models such as MiniGPT-4, LLaVA, and InstructBLIP.
Thanks!
Thanks so much for checking out AttentionViz! We have not tried visualizing text-to-image attention yet, but I think our tool and technique could feasibly be extended to vision-language models, and this is definitely a great direction for future work.