
Towards Explainable Video Quality Assessment

New! Use DOVER++ with the merged DIVIDE-MaxWell dataset!

Official repository for the ACM MM 2023 paper "Towards Explainable In-the-wild Video Quality Assessment: A Database and a Language-Prompted Approach". Paper link: arXiv.

Dataset link: Hugging Face.

Welcome to visit sibling repositories from our team:

- FAST-VQA
- DOVER
- Zero-shot BVQI

The training part of the MaxWell database has been released.

The code, demo, and pre-trained weights of MaxVQA are released in this repository.

Installation

Install and modify OpenCLIP:

git clone https://github.com/mlfoundations/open_clip.git
cd open_clip
sed -i '92s/return x\[0\]/return x/' src/open_clip/modified_resnet.py 
pip install -e .
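
The sed line patches modified_resnet.py so that the attention pooling returns the full token sequence instead of only the pooled global token, letting MaxVQA read spatially dense features. Its effect (note that the targeted line number, 92, may drift across open_clip versions) is:

# src/open_clip/modified_resnet.py, line 92 (end of the attention-pooling forward pass)
# before: return x[0]   # only the pooled (global) token
# after:  return x      # the full token sequence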

Install DOVER (used for pre-processing) and download its pre-trained weights (built on FAST-VQA):

git clone https://github.com/vqassessment/DOVER.git
cd DOVER
pip install -e .
mkdir pretrained_weights 
cd pretrained_weights 
wget https://github.com/VQAssessment/DOVER/releases/download/v0.1.0/DOVER.pth 
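
As a quick sanity check, here is a minimal sketch (assuming the download finished and the file is an ordinary PyTorch checkpoint) to confirm the weights load:

import torch

# Load the checkpoint on CPU just to verify the downloaded file is intact.
state = torch.load("pretrained_weights/DOVER.pth", map_location="cpu")
print(f"Loaded a checkpoint with {len(state)} top-level entries.")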

MaxVQA

Gradio Demo

demo_maxvqa.py

You can maintain a custom service for multi-dimensional VQA.
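
To launch the demo locally (any options are configured inside the script itself):

python demo_maxvqa.py

This starts a Gradio interface that scores a video along multiple quality dimensions at once.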

Inference from Videos

infer_from_videos.py
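
A typical invocation (check the script for where the input video paths are specified):

python infer_from_videos.py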

Inference from Pre-extracted Features

infer_from_feats.py

On the first run, the script will extract features from the videos and save them for reuse.
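
A typical invocation (the first call also triggers the feature extraction mentioned above):

python infer_from_feats.py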

Training on Mixed Existing VQA Databases

The default setting trains on LIVE-VQC, KoNViD-1k, and YouTube-UGC:

python train_multi_existing.py -o LKY.yml

You can also modify the YAML file to include more datasets for training, as sketched below.
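
For illustration only, an added dataset entry might look like the following; the key names here (datasets, anno_file, data_prefix) and the dataset name are hypothetical, so mirror the actual schema of the existing entries in LKY.yml:

# hypothetical sketch -- copy the structure of an existing entry in LKY.yml
datasets:
  MyNewDataset:                        # hypothetical dataset name
    anno_file: path/to/labels.csv      # per-video quality annotations
    data_prefix: path/to/videos        # directory containing the video files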

Citation

Please feel free to cite our paper if you use this method or the MaxWell database (with explanation-level scores):

% explainable
@inproceedings{wu2023explainable,
  title={Towards Explainable Video Quality Assessment: A Database and a Language-Prompted Approach},
  author={Wu, Haoning and Zhang, Erli and Liao, Liang and Chen, Chaofeng and Hou, Jingwen and Wang, Annan and Sun, Wenxiu and Yan, Qiong and Lin, Weisi},
  year={2023},
  booktitle={ACM MM},
}

This dataset is built upon the original DIVIDE-3K dataset (with perspective scores) proposed in our ICCV 2023 paper:

% dover and divide
@inproceedings{wu2023dover,
  title={Exploring Video Quality Assessment on User Generated Contents from Aesthetic and Technical Perspectives},
  author={Wu, Haoning and Zhang, Erli and Liao, Liang and Chen, Chaofeng and Hou, Jingwen and Wang, Annan and Sun, Wenxiu and Yan, Qiong and Lin, Weisi},
  year={2023},
  booktitle={ICCV},
}