
MMSSL: Multi-Modal Self-Supervised Learning for Recommendation

PyTorch implementation for WWW 2023 paper Multi-Modal Self-Supervised Learning for Recommendation.


MMSSL is a new multimedia recommender system that integrates generative modality-aware collaborative self-augmentation with contrastive cross-modality dependency encoding. It outperforms existing state-of-the-art multi-modal recommenders.

Dependencies
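This is a PyTorch implementation; the README does not pin exact package versions. As an indicative setup (the package list below is an assumption, adjust it to your Python/CUDA environment): Python 3 with PyTorch, NumPy, and SciPy, e.g. `pip install torch numpy scipy`.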

Usage

Start training and inference as:

cd MMSSL
python ./main.py --dataset {DATASET}

Supported datasets: Amazon-Baby, Amazon-Sports, Tiktok, Allrecipes
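For example, to train and evaluate on Tiktok (the dataset string should match the corresponding folder name under `data/`, e.g. lowercase `tiktok` as laid out below): `python ./main.py --dataset tiktok`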

Datasets

├─ MMSSL/ 
    ├── data/
      ├── tiktok/
      ...
| Dataset | Amazon-Sports | Amazon-Baby | Tiktok | Allrecipes |
|---|---|---|---|---|
| Modality | V / T | V / T | V / A / T | V / T |
| Embed Dim | 4096 / 1024 | 4096 / 1024 | 128 / 128 / 768 | 2048 / 20 |
| User | 35598 | 19445 | 9319 | 19805 |
| Item | 18357 | 7050 | 6710 | 10067 |
| Interactions | 256308 | 139110 | 59541 | 58922 |
| Sparsity | 99.961% | 99.899% | 99.904% | 99.970% |
  • 2024.3.20 baselines LATTICE and MICRO uploaded: 📢📢📢📢🌹🔥🔥🚀🚀 Because the baselines LATTICE and MICRO require some minor modifications, we provide code that can be run easily by simply changing the dataset path.

  • 2023.11.1 new multi-modal datasets uploaded: 📢📢🔥🔥🌹🌹🌹🌹 We provide the new multi-modal datasets Netflix and MovieLens (i.e., CF training data and multi-modal data including item text and posters) from our new multi-modal work LLMRec on Google Drive. 🌹 We hope to contribute to our community and facilitate your research~

  • 2023.3.23 update(all datasets uploaded): We provide the processed data at Google Drive.

  • 2023.3.24 update: The official website of the Tiktok dataset has been closed, so we also provide several other versions of the preprocessed Tiktok data. We spent a lot of time pre-processing this dataset; if you use our preprocessed Tiktok in your work, please cite our paper.

🚀🚀 The provided datasets include (1) basic user-item interactions and (2) multi-modal features, and are compatible with multi-modal recommender models such as MMSSL, LATTICE, and MICRO without any additional data preprocessing.

Multi-modal Datasets

🌹🌹 Please cite our paper if you use the 'netflix' dataset~ ❤️

We collected a multi-modal dataset using the original Netflix Prize Data released on the Kaggle website. The data format is directly compatible with state-of-the-art multi-modal recommendation models like LLMRec, MMSSL, LATTICE, MICRO, and others, without requiring any additional data preprocessing.

Textual Modality: We have released the item information curated from the original dataset in the "item_attribute.csv" file. Additionally, we have incorporated textual information enhanced by LLM into the "augmented_item_attribute_agg.csv" file. (The following three images represent (1) information about Netflix as described on the Kaggle website, (2) textual information from the original Netflix Prize Data, and (3) textual information augmented by LLMs.)

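As a minimal sketch of how the textual files might be inspected (the column layout is not documented here, so the snippet only loads and previews the CSVs; file names follow the description above):

```python
# Minimal sketch: preview the released textual-modality files.
# The column layout is not documented in this README, so we only
# load the CSVs and inspect their structure.
import pandas as pd

orig_text = pd.read_csv("item_attribute.csv")                # original item text
aug_text = pd.read_csv("augmented_item_attribute_agg.csv")   # LLM-augmented text

print(orig_text.shape, list(orig_text.columns))
print(aug_text.shape, list(aug_text.columns))
print(orig_text.head())
```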

Visual Modality: We have released the visual information obtained from web crawling in the "Netflix_Posters" folder. (The following image displays the poster acquired by web crawling using item information from the Netflix Prize Data.)


Original Multi-modal Datasets & Augmented Datasets


Download the Netflix dataset.

🚀🚀 We provide the processed data (i.e., CF training data & basic user-item interactions, original multi-modal data including images and text of items, encoded visual/textual features and LLM-augmented text/embeddings). 🌹 We hope to contribute to our community and facilitate your research 🚀🚀 ~

Encoding the Multi-modal Content.

We use CLIP-ViT and Sentence-BERT as the encoders for visual side information and textual side information, respectively.
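A minimal sketch of how such features could be extracted with Hugging Face checkpoints; the specific checkpoints ("openai/clip-vit-base-patch32", "all-MiniLM-L6-v2") and file paths below are illustrative assumptions, not necessarily the exact models used for the released features.

```python
# Minimal sketch: encode item posters with CLIP-ViT and item text with
# Sentence-BERT. Checkpoint names and file paths are illustrative
# assumptions, not the exact ones used for the released features.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor
from sentence_transformers import SentenceTransformer

# Visual side information: CLIP-ViT image features.
clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
clip_proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
image = Image.open("poster.jpg").convert("RGB")        # hypothetical poster file
inputs = clip_proc(images=image, return_tensors="pt")
with torch.no_grad():
    visual_feat = clip.get_image_features(**inputs)    # shape: (1, 512)

# Textual side information: Sentence-BERT sentence embeddings.
sbert = SentenceTransformer("all-MiniLM-L6-v2")
texts = ["A drama series about ..."]                   # hypothetical item text
textual_feat = sbert.encode(texts)                     # shape: (1, 384)

print(visual_feat.shape, textual_feat.shape)
```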

Experimental Results

Performance comparison of baselines on different datasets in terms of Recall@20, Precision@20 and NDCG@20:

| Baseline | Tiktok R@20 | Tiktok P@20 | Tiktok N@20 | Amazon-Baby R@20 | Amazon-Baby P@20 | Amazon-Baby N@20 | Amazon-Sports R@20 | Amazon-Sports P@20 | Amazon-Sports N@20 | Allrecipes R@20 | Allrecipes P@20 | Allrecipes N@20 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MF-BPR | 0.0346 | 0.0017 | 0.0130 | 0.0440 | 0.0024 | 0.0200 | 0.0430 | 0.0023 | 0.0202 | 0.0137 | 0.0007 | 0.0053 |
| NGCF | 0.0604 | 0.0030 | 0.0238 | 0.0591 | 0.0032 | 0.0261 | 0.0695 | 0.0037 | 0.0318 | 0.0165 | 0.0008 | 0.0059 |
| LightGCN | 0.0653 | 0.0033 | 0.0282 | 0.0698 | 0.0037 | 0.0319 | 0.0782 | 0.0042 | 0.0369 | 0.0212 | 0.0010 | 0.0076 |
| SGL | 0.0603 | 0.0030 | 0.0238 | 0.0678 | 0.0036 | 0.0296 | 0.0779 | 0.0041 | 0.0361 | 0.0191 | 0.0010 | 0.0069 |
| NCL | 0.0658 | 0.0034 | 0.0269 | 0.0703 | 0.0038 | 0.0311 | 0.0765 | 0.0040 | 0.0349 | 0.0224 | 0.0010 | 0.0077 |
| HCCF | 0.0662 | 0.0029 | 0.0267 | 0.0705 | 0.0037 | 0.0308 | 0.0779 | 0.0041 | 0.0361 | 0.0225 | 0.0011 | 0.0082 |
| VBPR | 0.0380 | 0.0018 | 0.0134 | 0.0486 | 0.0026 | 0.0213 | 0.0582 | 0.0031 | 0.0265 | 0.0159 | 0.0008 | 0.0056 |
| LightGCN-M | 0.0679 | 0.0034 | 0.0273 | 0.0726 | 0.0038 | 0.0329 | 0.0705 | 0.0035 | 0.0324 | 0.0235 | 0.0011 | 0.0081 |
| MMGCN | 0.0730 | 0.0036 | 0.0307 | 0.0640 | 0.0032 | 0.0284 | 0.0638 | 0.0034 | 0.0279 | 0.0261 | 0.0013 | 0.0101 |
| GRCN | 0.0804 | 0.0036 | 0.0350 | 0.0754 | 0.0040 | 0.0336 | 0.0833 | 0.0044 | 0.0377 | 0.0299 | 0.0015 | 0.0110 |
| LATTICE | 0.0843 | 0.0042 | 0.0367 | 0.0829 | 0.0044 | 0.0368 | 0.0915 | 0.0048 | 0.0424 | 0.0268 | 0.0014 | 0.0103 |
| CLCRec | 0.0621 | 0.0032 | 0.0264 | 0.0610 | 0.0032 | 0.0284 | 0.0651 | 0.0035 | 0.0301 | 0.0231 | 0.0010 | 0.0093 |
| MMGCL | 0.0799 | 0.0037 | 0.0326 | 0.0758 | 0.0041 | 0.0331 | 0.0875 | 0.0046 | 0.0409 | 0.0272 | 0.0014 | 0.0102 |
| SLMRec | 0.0845 | 0.0042 | 0.0353 | 0.0765 | 0.0043 | 0.0325 | 0.0829 | 0.0043 | 0.0376 | 0.0317 | 0.0016 | 0.0118 |
| MMSSL | 0.0921 | 0.0046 | 0.0392 | 0.0962 | 0.0051 | 0.0422 | 0.0998 | 0.0052 | 0.0470 | 0.0367 | 0.0018 | 0.0135 |
| p-value | 1.28e-5 | 7.12e-6 | 6.55e-6 | 2.23e-6 | 7.69e-6 | 8.65e-7 | 7.75e-6 | 6.48e-6 | 6.78e-7 | 3.94e-4 | 5.06e-6 | 4.31e-5 |
| Improv. | 8.99% | 9.52% | 6.81% | 16.04% | 15.91% | 14.67% | 9.07% | 8.33% | 10.85% | 15.77% | 12.50% | 14.40% |

Citing

If you find this work helpful to your research, please kindly consider citing our paper.

@inproceedings{wei2023multi,
  title={Multi-Modal Self-Supervised Learning for Recommendation},
  author={Wei, Wei and Huang, Chao and Xia, Lianghao and Zhang, Chuxu},
  booktitle={Proceedings of the ACM Web Conference 2023},
  pages={790--800},
  year={2023}
}

Acknowledgement

The structure of this code is largely based on LATTICE and MICRO. Thanks to the authors for their work.