Luigi Sigillo, Christian Bianchi, Aurelio Uncini, and Danilo Comminiello
ISPAMM Lab, Sapienza University of Rome
- [2025.07.05] Presented the work at IJCNN 2025 in Rome!
- [2025.06.05] Checkpoints and code are released!
- [2025.05.05] The paper has been published on arXiv 🎉. The PDF version is available here!
- [2025.03.31] The paper has been accepted for presentation at IJCNN 2025 🎉!
Our work introduces ResQu, a novel approach that significantly advances image super-resolution by leveraging the power of quaternion wavelet embeddings. This allows for superior feature representation, leading to high-fidelity reconstructions and enhanced perceptual quality, a crucial step in various computer vision applications.
We propose a streamlined framework that conditions a latent diffusion model (built upon the StableSR baseline) using quaternion wavelet embeddings. Through extensive experimentation, ResQu demonstrates a +15% PSNR improvement over traditional super-resolution models, showcasing its state-of-the-art capabilities in capturing intricate texture details.
Unlike existing methods that demand heavy preprocessing, complex architectures, and additional components such as captioning models, our approach is efficient and straightforward, making it practical for real-world super-resolution pipelines.
For further evaluations and details, please refer to our paper.
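As a rough illustration of the conditioning signal, the four subbands of a single-level 2D wavelet decomposition can be stacked into a four-component, quaternion-style feature map. The sketch below uses a plain Haar transform in NumPy as a simplified stand-in — it is not the actual quaternion wavelet transform from the paper, and the function names are hypothetical:

```python
import numpy as np

def haar_dwt2(img: np.ndarray):
    """One level of a 2D Haar wavelet transform (a simplified stand-in
    for the quaternion wavelet transform used in the paper)."""
    # Pairwise averages (low-pass) and differences (high-pass) along rows...
    lo = (img[:, 0::2] + img[:, 1::2]) / 2.0
    hi = (img[:, 0::2] - img[:, 1::2]) / 2.0
    # ...then along columns, yielding the four subbands.
    ll = (lo[0::2, :] + lo[1::2, :]) / 2.0
    lh = (lo[0::2, :] - lo[1::2, :]) / 2.0
    hl = (hi[0::2, :] + hi[1::2, :]) / 2.0
    hh = (hi[0::2, :] - hi[1::2, :]) / 2.0
    return ll, lh, hl, hh

def quaternion_embedding(img: np.ndarray) -> np.ndarray:
    """Stack the four subbands as the four components of a
    quaternion-valued feature map, shape (4, H/2, W/2)."""
    return np.stack(haar_dwt2(img))

img = np.random.rand(512, 512).astype(np.float32)
emb = quaternion_embedding(img)
print(emb.shape)  # (4, 256, 256)
```

The low-frequency LL subband carries coarse structure while LH/HL/HH capture the directional detail that super-resolution must reconstruct, which is why subband-aware conditioning helps preserve texture.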
```shell
conda create --name=resqu python=3.9
conda activate resqu
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install diffusers transformers accelerate xformers==0.0.16 wandb numpy==1.26.4 datasets scikit-learn torchmetrics==1.4.1 scikit-image pytorch_fid
```

To launch the training of the model, use the following command. Change `output_dir` and set the GPU number you want to use; currently only a single GPU is supported:
```shell
CUDA_VISIBLE_DEVICES=N accelerate launch src/resqu/train_resqu.py \
  --pretrained_model_name_or_path=stabilityai/stable-diffusion-2-1-base \
  --output_dir=output/resqu_model_out \
  --dataset_name=your_huggingface_dataset_name \
  --image_column=image \
  --conditioning_column=quaternion_wavelet_embedding \
  --resolution=512 \
  --learning_rate=1e-5 \
  --train_batch_size=8 \
  --num_train_epochs=50 \
  --tracker_project_name=resqu \
  --enable_xformers_memory_efficient_attention \
  --checkpointing_steps=1000 \
  --validation_steps=500 \
  --report_to wandb
```

Request access to the pretrained models from Google Drive.
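The `--image_column` and `--conditioning_column` flags suggest the training script reads a Hugging Face dataset pairing each image with a precomputed quaternion wavelet embedding. One plausible layout is an image folder with a `metadata.jsonl` index; the file names and `.npy` storage below are assumptions for illustration, not the repository's actual schema:

```python
import json

# Hypothetical metadata.jsonl rows: each row pairs a training image with
# its precomputed quaternion wavelet embedding (column names match the
# flags passed to train_resqu.py above).
rows = [
    {"image": "images/0001.png",
     "quaternion_wavelet_embedding": "embeddings/0001.npy"},
    {"image": "images/0002.png",
     "quaternion_wavelet_embedding": "embeddings/0002.npy"},
]

with open("metadata.jsonl", "w") as f:
    for row in rows:
        f.write(json.dumps(row) + "\n")
```

Precomputing the embeddings once and referencing them from the metadata keeps the training loop free of repeated wavelet transforms.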
To generate images with the model, use the following command:

```shell
CUDA_VISIBLE_DEVICES=N python src/resqu/generate_resqu.py \
  --model_path=output/resqu_model_out/checkpoint-XXXXX/ \
  --input_low_res_image_path=path/to/your/low_res_image.png \
  --output_dir=generated_images/
```
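For a quick sanity check on generated outputs before running the full evaluation, PSNR — one of the standard super-resolution metrics — is just a log-scaled mean squared error. A self-contained NumPy sketch (not the project's evaluation code):

```python
import numpy as np

def psnr(ref: np.ndarray, test: np.ndarray, max_val: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB for images scaled to [0, max_val]."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10((max_val ** 2) / mse)

gt = np.zeros((64, 64))
sr = np.full((64, 64), 0.1)  # constant 0.1 error -> MSE = 0.01
print(round(psnr(gt, sr), 2))  # 20.0
```

Higher is better; every halving of the RMS error adds about 6 dB, so even a few dB of improvement reflects a substantial reduction in reconstruction error.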
To launch the testing of the model, use the following command; change the paths to match your setup:

```shell
CUDA_VISIBLE_DEVICES=N python src/resqu/evaluation/evaluate.py \
  --generated_images_path=generated_images/ \
  --ground_truth_images_path=path/to/your/ground_truth_images/
```

Please cite our work if you found it useful:
```bibtex
@INPROCEEDINGS{11228578,
  author={Sigillo, Luigi and Bianchi, Christian and Uncini, Aurelio and Comminiello, Danilo},
  booktitle={2025 International Joint Conference on Neural Networks (IJCNN)},
  title={Quaternion Wavelet-Conditioned Diffusion Models for Image Super-Resolution},
  year={2025},
  pages={1-8},
  keywords={Wavelet transforms;Measurement;Deep learning;Satellites;Quaternions;Superresolution;Diffusion models;Robustness;Image reconstruction;Standards;Generative Deep Learning;Image Super resolution;Diffusion Models},
  doi={10.1109/IJCNN64981.2025.11228578}}
```
This project is based on the StableSR baseline. Thanks for their awesome work.


