Not able to train daclip on my dataset #30

Open · xuridongshengpxd opened this issue Feb 27, 2024 · 10 comments

@xuridongshengpxd

This is outstanding work!!
When I use 256×256×3 images to train DA-CLIP, the following error occurs:

File "main.py", line 495, in
main(sys.argv[1:])
File "main.py", line 423, in main
train_one_epoch(model, data, loss, epoch, optimizer, scaler, scheduler, dist_model, args, tb_writer=writer)
File "/data_160TB/2022/panxudong/code/daclip-uir-main/da-clip/src/training/train.py", line 106, in train_one_epoch
losses = loss(**model_out, output_dict=True)
File "/data_160TB/2022/panxudong/.conda/envs/py8/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/data_160TB/2022/panxudong/code/daclip-uir-main/da-clip/src/open_clip/loss.py", line 190, in forward
clip_loss = super().forward(image_features, text_features, logit_scale)
File "/data_160TB/2022/panxudong/code/daclip-uir-main/da-clip/src/open_clip/loss.py", line 122, in forward
logits_per_image, logits_per_text = self.get_logits(image_features, text_features, logit_scale)
File "/data_160TB/2022/panxudong/code/daclip-uir-main/da-clip/src/open_clip/loss.py", line 115, in get_logits
logits_per_image = logit_scale * image_features @ text_features.T
RuntimeError: The size of tensor a (4) must match the size of tensor b (512) at non-singleton dimension 1
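
For anyone hitting the same thing: this PyTorch message comes from elementwise broadcasting rather than the matrix product itself, so one of the tensors on that line most likely carries a last dimension of 4 where 512 is expected. A minimal sketch that reproduces the message (the shapes are illustrative assumptions, not DA-CLIP's actual ones):

import torch

# An elementwise op between trailing dimensions 4 and 512 cannot broadcast;
# this raises exactly the error quoted in the traceback above.
a = torch.randn(8, 4)
b = torch.randn(8, 512)
try:
    _ = a * b
except RuntimeError as e:
    print(e)

# A quick diagnostic in loss.py is to print the shapes right before the failing line:
# print(image_features.shape, text_features.shape, logit_scale.shape)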

@xuridongshengpxd (Author)

Do I need to modify da-clip/src/open_clip/model_configs/daclip_ViT-B-32.json? If so, how do I modify it?

@Algolzw (Owner) commented Feb 28, 2024

Maybe you don't need to change the code. To train on your own dataset, the only update needed is to generate a CSV file that contains the input image paths, captions, and degradations in the format: LQ image path \t caption: degradation. Please refer to this script for more details.
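
For concreteness, here is a minimal sketch of generating such a file, assuming the usual open_clip two-column layout (filepath, title) with the title packed as "caption: degradation"; the column names, paths, captions, and degradation labels are illustrative assumptions:

import csv

# Write a tab-separated CSV with one row per LQ image; the title column
# packs the caption and the degradation type as "caption: degradation".
rows = [
    ("datasets/universal/train/LQ/0001.png", "a man riding a bike", "noisy"),
    ("datasets/universal/train/LQ/0002.png", "a dog lying on the grass", "hazy"),
]
with open("daclip_train.csv", "w", newline="") as f:
    writer = csv.writer(f, delimiter="\t")
    writer.writerow(["filepath", "title"])
    for lq_path, caption, degradation in rows:
        writer.writerow([lq_path, f"{caption}: {degradation}"])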

@xuridongshengpxd (Author)

I have generated a CSV file that contains the input image paths, captions, and degradations (see screenshot). When encoding the text, the caption and degradation are encoded together into 77 tokens in total and then split into 38 and 39 tokens, but the positional encoding expects 77 tokens (see screenshots).
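
For reference, open_clip's stock tokenizer already pads or truncates every string to the full 77-token context length, so the caption and the degradation can each be tokenized separately rather than carving one 77-token sequence into 38 + 39 positions. A minimal sketch (the title string is made up):

import open_clip

# Each call returns a [1, 77] tensor, so the positional embedding always
# sees the full context length regardless of how the title is split.
title = "a man riding a bike: noisy"
caption, degradation = title.rsplit(":", 1)
caption_tokens = open_clip.tokenize([caption.strip()])    # shape [1, 77]
degra_tokens = open_clip.tokenize([degradation.strip()])  # shape [1, 77]
print(caption_tokens.shape, degra_tokens.shape)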


@Algolzw (Owner) commented Feb 28, 2024

Aha, I haven't encountered this error yet. BTW, have you modified the dataset loader? And can you print the dimensions of the tokenized text here?
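
For that check, a small helper like this could be called right where the batch is unpacked in train_one_epoch (the names here are hypothetical; adapt them to the actual loader output):

def debug_batch_shapes(batch):
    # Hypothetical helper: print the tensor shapes coming out of the loader.
    images, texts = batch
    print("images:", tuple(images.shape))  # expected (batch_size, 3, 224, 224)
    print("texts:", tuple(texts.shape))    # expected (batch_size, 77) for ViT-B-32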

@xuridongshengpxd (Author)

At this location (see screenshot), I used your pretrained model. Would that cause the error mentioned above?


@Algolzw (Owner) commented Feb 28, 2024

This pretrained model is only for the original CLIP model.

Actually, we haven't provided the code for finetuning on our DA-CLIP weights. You can easily retrain the model on your dataset from scratch (maybe ~10 hours, depending on your dataset).

But adding a finetuning option to the training code is a good suggestion; we will add that later.

@xuridongshengpxd (Author) commented Feb 29, 2024

Due to network issues, I am unable to use this parameter to download the weights, so I downloaded them offline and set the path manually. However, the weights do not match the model (see screenshots). The weight URL is https://huggingface.co/laion/CLIP-ViT-B-32-laion2B-s34B-b79K/resolve/main/open_clip_pytorch_model.bin and the parameter is --pretrained="/data_160TB/2022/panxudong/code/daclip-uir-main/pretrained/open_clip_pytorch_model.bin".


@Algolzw (Owner) commented Feb 29, 2024

Can you use the official open_clip to load that weight?
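
As a hedged sketch of that check: create_model_and_transforms in open_clip accepts a local checkpoint path for its pretrained argument, so the offline file from the previous comment can be loaded directly:

import open_clip

# Load the locally downloaded checkpoint with the official open_clip API;
# the path is the one given in the earlier comment.
model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32",
    pretrained="/data_160TB/2022/panxudong/code/daclip-uir-main/pretrained/open_clip_pytorch_model.bin",
)
print(sum(p.numel() for p in model.parameters()))  # parameter count, if the weights load cleanly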

@wjkbigface


Hello, I would like to ask how to generate the CSV file.

@Algolzw (Owner) commented Mar 26, 2024


Hello, the script is generate_captions.py.
