
Can you provide the subset for ViP only, excluding the original LLaVA mix? #7

Open
lucasjinreal opened this issue Mar 1, 2024 · 5 comments
@lucasjinreal

That way, the model cannot be misleadingly trained on the original LLaVA dataset.

Meanwhile, it looks like some images are missing:

```json
    "id": "vcr-52941",
    "image": "vcr1images/lsmdc_3034_IDES_OF_MARCH/3034_IDES_OF_MARCH_01.27.04.788-01.27.10.308@2.jpg",
    "meta_dir": "./dataset/vcr1images/lsmdc_3034_IDES_OF_MARCH/3034_IDES_OF_MARCH_01.27.04.788-01.27.10.308@2.json",
    "class_names": [
        "person",
```
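For anyone hitting the same missing-image problem, a quick way to audit the annotation file is to check each referenced path against the local dataset directory. This is a minimal sketch (the annotation filename and image root below are assumptions, adjust to your layout):

```python
# Hypothetical helper: list every "image" path referenced in the
# annotation JSON that does not exist under the local dataset root.
import json
import os

def find_missing_images(annotation_file, image_root="."):
    """Return the image paths from `annotation_file` that are absent on disk."""
    with open(annotation_file) as f:
        records = json.load(f)  # expects a list of dicts with an "image" key
    return [
        r["image"]
        for r in records
        if "image" in r and not os.path.exists(os.path.join(image_root, r["image"]))
    ]

# Usage (paths are illustrative):
# missing = find_missing_images("vip_llava_annotations.json", image_root="./dataset")
# print(f"{len(missing)} images missing")
```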
@mu-cai
Collaborator

mu-cai commented Mar 1, 2024

Hello,

Thanks for your interest in our work!

@lucasjinreal
Author

Hi, is the metadata needed to prepare the training images?

@mu-cai
Collaborator

mu-cai commented Mar 2, 2024

The metadata is included in the vcr_images directory, so there is no need to worry.

@lucasjinreal
Author

I want to use the official LLaVA base. I just need to add a ViP processor to process the images, right?

@mu-cai
Collaborator

mu-cai commented Mar 2, 2024

Correct. Visual prompt blending is all you need.
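For reference, the blending step amounts to alpha-compositing the visual prompt (e.g., a circle or box marking a region) directly onto the image pixels before the standard image processor sees it. A minimal sketch using Pillow, where the function name, bounding box, color, and alpha values are all illustrative assumptions rather than the official implementation:

```python
# Hypothetical sketch of visual prompt blending: alpha-composite a
# semi-transparent red ellipse onto the image, then feed the result
# to the usual LLaVA image preprocessing pipeline.
from PIL import Image, ImageDraw

def blend_visual_prompt(image, bbox, color=(255, 0, 0), alpha=0.6, width=4):
    """Draw a semi-transparent ellipse outline around `bbox` onto `image`."""
    overlay = Image.new("RGBA", image.size, (0, 0, 0, 0))
    draw = ImageDraw.Draw(overlay)
    rgba = color + (int(alpha * 255),)  # add alpha channel to the RGB color
    draw.ellipse(bbox, outline=rgba, width=width)
    # Composite the overlay onto the original image, then drop the alpha channel.
    return Image.alpha_composite(image.convert("RGBA"), overlay).convert("RGB")

# Example: highlight a region in a blank test image.
img = Image.new("RGB", (224, 224), "white")
prompted = blend_visual_prompt(img, bbox=(50, 50, 150, 150))
```

The key point is that no architectural change is required: the prompt lives in pixel space, so the blended image goes through the unmodified LLaVA processor.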
