QualiCLIP

Quality-aware Image-Text Alignment for Real-World Image Quality Assessment

This is the official repository of the paper "Quality-aware Image-Text Alignment for Real-World Image Quality Assessment".

Overview

Abstract

No-Reference Image Quality Assessment (NR-IQA) focuses on designing methods to measure image quality in alignment with human perception when a high-quality reference image is unavailable. The reliance on annotated Mean Opinion Scores (MOS) in the majority of state-of-the-art NR-IQA approaches limits their scalability and broader applicability to real-world scenarios. To overcome this limitation, we propose QualiCLIP (Quality-aware CLIP), a CLIP-based self-supervised opinion-unaware method that does not require labeled MOS. In particular, we introduce a quality-aware image-text alignment strategy to make CLIP generate representations that correlate with the inherent quality of the images. Starting from pristine images, we synthetically degrade them with increasing levels of intensity. Then, we train CLIP to rank these degraded images based on their similarity to quality-related antonym text prompts, while guaranteeing consistent representations for images with comparable quality. Our method achieves state-of-the-art performance on several datasets with authentic distortions. Moreover, despite not requiring MOS, QualiCLIP outperforms supervised methods when their training dataset differs from the testing one, thus proving to be more suitable for real-world scenarios. Furthermore, our approach demonstrates greater robustness and improved explainability than competing methods.
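As a rough illustration of "synthetically degrade them with increasing levels of intensity", the sketch below produces several versions of one image with progressively stronger additive Gaussian noise. The function name `degrade_levels`, the `sigmas` values, and the choice of noise as the only distortion are assumptions made for this example. They are not taken from the paper or from the unreleased training code.

```python
import random


def degrade_levels(image, sigmas=(0.02, 0.05, 0.1, 0.2, 0.4), seed=0):
    """Return one degraded copy of `image` per noise level, mildest first.

    `image` is a grayscale image given as a list of rows with pixel values
    in [0, 1]. Gaussian noise is only a stand-in distortion (assumption).
    """
    rng = random.Random(seed)
    degraded = []
    for sigma in sigmas:
        noisy = [
            # Add zero-mean noise, then clip back to the valid pixel range.
            [min(1.0, max(0.0, pixel + rng.gauss(0.0, sigma))) for pixel in row]
            for row in image
        ]
        degraded.append(noisy)
    return degraded


def mean_abs_error(a, b):
    """Average absolute pixel difference between two equally sized images."""
    total = sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return total / (len(a) * len(a[0]))
```

Calling `degrade_levels` on a pristine image yields a list whose entries drift further from the original as the level index grows, which is the ordering the ranking objective relies on.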

Overview of the proposed quality-aware image-text alignment strategy. Starting from a pair of random overlapping crops from a pristine image, we synthetically degrade them with $L$ increasing levels of intensity, resulting in $L$ pairs. Then, given two quality-related antonym prompts, we fine-tune the CLIP image encoder by ranking the similarity between the prompts and the images according to their corresponding level of degradation. At the same time, for each pair of equally distorted crops, we force the similarity between the crops and the prompts to be comparable.
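The two objectives described above (ranking across degradation levels, consistency within each pair of equally distorted crops) can be sketched with plain hinge losses. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the margin values, the hinge form of both terms, and the choice to compare each crop only with its own more degraded versions are all hypothetical. The inputs are precomputed image-prompt similarities, so no CLIP model is needed to run it.

```python
from itertools import combinations


def margin_ranking_loss(higher, lower, margin):
    """Hinge penalty that is zero once `higher` exceeds `lower` by `margin`."""
    return max(0.0, lower - higher + margin)


def alignment_losses(sim_pos, sim_neg, rank_margin=0.1, cons_margin=0.0):
    """Return (ranking_loss, consistency_loss) for one training sample.

    sim_pos[i][k] is the similarity between crop k (0 or 1) at degradation
    level i and the positive prompt; sim_neg is the same for the negative
    prompt. Level 0 is the least degraded. Margins are assumed values.
    """
    num_levels = len(sim_pos)
    rank = 0.0
    for i, j in combinations(range(num_levels), 2):  # i < j: i is less degraded
        for k in range(2):
            # A less degraded crop should be closer to the positive prompt...
            rank += margin_ranking_loss(sim_pos[i][k], sim_pos[j][k], rank_margin)
            # ...and a more degraded crop closer to the negative prompt.
            rank += margin_ranking_loss(sim_neg[j][k], sim_neg[i][k], rank_margin)

    cons = 0.0
    for i in range(num_levels):
        for sims in (sim_pos, sim_neg):
            # Equally distorted crops should have comparable similarities.
            cons += max(0.0, abs(sims[i][0] - sims[i][1]) - cons_margin)
    return rank, cons
```

With similarities that already decrease toward the positive prompt and increase toward the negative prompt as the level grows, the ranking term is zero. It becomes positive as soon as the order is violated, and the consistency term grows with the gap between the two crops of a pair.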

Citation

@article{agnolucci2024qualityaware,
      title={Quality-Aware Image-Text Alignment for Real-World Image Quality Assessment}, 
      author={Agnolucci, Lorenzo and Galteri, Leonardo and Bertini, Marco},
      journal={arXiv preprint arXiv:2403.11176},
      year={2024}
}

To be released

- Pre-trained model
- Testing code
- Training code

Authors

Lorenzo Agnolucci, Leonardo Galteri, Marco Bertini

Acknowledgements

This work was partially supported by the European Commission under the European Horizon 2020 Programme, grant number 951911 - AI4Media.

LICENSE

All material is made available under Creative Commons BY-NC 4.0. You can use, redistribute, and adapt the material for non-commercial purposes, as long as you give appropriate credit by citing our paper and indicate any changes that you've made.
