Multimodal and Unimodal Recommendation Paperlist

This is the summarized paper list mentioned in our multimodal recommendation survey paper:

xxx

Moreover, we also include unimodal recommendation papers and will keep the paper list updated.

The framework code is available at: https://github.com/enoche/MMRec

Background

Traditional recommender systems require a large number of user-item interactions to make accurate recommendations. Multimodal information has been used to alleviate the data sparsity and cold-start problems. Unimodal information has been used to enrich user and item representations, and more recently fused multimodal information has also been leveraged to improve recommendation accuracy.

We classify the papers by the modality information they use and list them in order of publication date.

Survey

  • Recommender Systems Leveraging Multimedia Content [ACM Computing Surveys Sep 2021] [pdf]

Multimodal Rec

  • Collaborative Knowledge Base Embedding for Recommender Systems [KDD Aug 2016] [pdf]
  • Joint Representation Learning for Top-N Recommendation with Heterogeneous Information Sources [CIKM Nov 2017] [pdf] github
  • User-Video Co-Attention Network for Personalized Micro-video Recommendation [WWW May 2019] [pdf]
  • Personalized Fashion Recommendation with Visual Explanations based on Multimodal Attention Network [SIGIR Jul 2019] [pdf]
  • Multimodal Representation Learning for Recommendation in Internet of Things [IEEE Internet of Things Journal Sep 2019] [pdf]
  • User Diverse Preference Modeling by Multimodal Attentive Metric Learning [MM Oct 2019] [pdf] github
  • MMGCN: Multi-modal Graph Convolution Network for Personalized Recommendation of Micro-video [MM Oct 2019] [pdf] github
  • Adversarial Training Towards Robust Multimedia Recommender System [TKDE May 2020] [pdf] github
  • MGAT: Multimodal Graph Attention Network for Recommendation [Elsevier Apr 2020] [pdf] github
  • Multi-modal Knowledge Graphs for Recommender Systems [CIKM Oct 2020] [pdf]
  • Graph-Refined Convolutional Network for Multimedia Recommendation with Implicit Feedback [MM Oct 2020] [pdf] github
  • Recommendation by Users’ Multimodal Preferences for Smart City Applications [IEEE Transactions on Industrial Informatics Jun 2021] [pdf] github
  • Multimodal Disentangled Representation for Recommendation [ICME Jul 2021] [pdf]
  • MM-Rec: Multimodal News Recommendation [SIGIR Jul 2021] [pdf]
  • Multi-Modal Variational Graph Auto-Encoder for Recommendation Systems [TMM Sep 2021] [pdf] github
  • Why Do We Click: Visual Impression-aware News Recommendation [MM Oct 2021] [pdf] github
  • Pre-training Graph Transformer with Multimodal Side Information for Recommendation [MM Oct 2021] [pdf] github
  • Mining Latent Structures for Multimedia Recommendation [MM Oct 2021] [pdf] github
  • DualGNN: Dual Graph Neural Network for Multimedia Recommendation [TMM Dec 2021] [pdf] github
  • A two-stage embedding model for recommendation with multimodal auxiliary information [Elsevier Jan 2022] [pdf]
  • Disentangled Multimodal Representation Learning for Recommendation [TMM Oct 2022] [pdf]
  • Latent Structure Mining with Contrastive Modality Fusion for Multimedia Recommendation [TKDE Nov 2022] [pdf] github
  • MEGCF: Multimodal Entity Graph Collaborative Filtering for Personalized Recommendation [TOIS May 2022] [pdf] github
  • Self-supervised Learning for Multimedia Recommendation [TMM Jun 2022] [pdf] github
  • Hierarchical User Intent Graph Network for Multimedia Recommendation [TMM Jun 2021] [pdf] github
  • Multi-Modal Contrastive Pre-training for Recommendation [ICMR Jun 2022] [pdf]

Textual Based Rec

Title Abstract Tag

  • Tag2Word: Using Tags to Generate Words for Content Based Tag Recommendation [CIKM Oct 2016] [pdf]
  • Interactive resource recommendation algorithm based on tag information [WWW Feb 2018] [pdf]
  • Leveraging Title-Abstract Attentive Semantics for Paper Recommendation [AAAI Apr 2020] [pdf]
  • Graph Neural Network for Tag Ranking in Tag-enhanced Video Recommendation [CIKM Oct 2020] [pdf] github

Reviews Description

  • Convolutional Matrix Factorization for Document Context-Aware Recommendation [RecSys Sep 2016] [pdf] github
  • Joint Deep Modeling of Users and Items Using Reviews for Recommendation [WSDM Feb 2017] [pdf] github
  • Interpretable Convolutional Neural Networks with Dual Local and Global Attention for Review Rating Prediction [RecSys Aug 2017] [pdf] github
  • Coevolutionary Recommendation Model: Mutual Learning between Ratings and Reviews [WWW Apr 2018] [pdf]
  • Neural Attentional Rating Regression with Review-level Explanations [WWW Apr 2018] [pdf] github
  • Multi-Pointer Co-Attention Networks for Recommendation [KDD Jul 2018] [pdf] github
  • PARL: Let Strangers Speak Out What You Like [CIKM Oct 2018] [pdf] github
  • ANR: Aspect-based Neural Recommender [CIKM Oct 2018] [pdf] github
  • A Context-Aware User-Item Representation Learning for Item Recommendation [TOIS Jan 2019] [pdf] github
  • Recommendation Based on Review Texts and Social Communities: A Hybrid Model [IEEE Access Feb 2019] [pdf] github
  • Attentive Aspect Modeling for Review-Aware Recommendation [TOIS Mar 2019] [pdf]
  • NRPA: Neural Recommendation with Personalized Attention [SIGIR Jul 2019] [pdf] github
  • DAML: Dual Attention Mutual Learning between Ratings and Reviews for Item Recommendation [KDD Aug 2019] [pdf]
  • Reviews Meet Graphs: Enhancing User and Item Representations for Recommendation with Hierarchical Attentive Graph Neural Network [EMNLP | IJCNLP Nov 2019] [pdf] github
  • Neural Unified Review Recommendation with Cross Attention [SIGIR Jul 2020] [pdf]
  • Learning Hierarchical Review Graph Representations for Recommendation [TKDE Apr 2021] [pdf] github
  • Improving Explainable Recommendations by Deep Review-Based Explanations [IEEE Access Apr 2021] [pdf]
  • Counterfactual Review-based Recommendation [CIKM Nov 2021] [pdf] github
  • Review-Aware Neural Recommendation with Cross-Modality Mutual Attention [CIKM Nov 2021] [pdf]
  • Aligning Dual Disentangled User Representations from Ratings and Textual Content [KDD Aug 2022] [pdf] github

Visual Based Rec

  • VBPR: Visual Bayesian Personalized Ranking from Implicit Feedback [AAAI Feb 2016] [pdf] github
  • Do "Also-Viewed" Products Help User Rating Prediction? [WWW Apr 2017] [pdf]
  • DeepStyle: Learning User Preferences for Visual Recommendation [SIGIR Aug 2017] [pdf]
  • Attentive Collaborative Filtering: Multimedia Recommendation with Item- and Component-Level Attention [SIGIR Aug 2017] [pdf] github
  • Visually-aware fashion recommendation and design with generative image models [ICDM Nov 2017] [pdf] github
  • Visually-Aware Personalized Recommendation using Interpretable Image Representations [arXiv 2018] [pdf]
  • Exploring the Power of Visual Features for the Recommendation of Movies [UMAP Jun 2019] [pdf]
  • Image and Video Understanding for Recommendation and Spam Detection Systems [KDD Aug 2020] [pdf]
  • CausalRec: Causal Inference for Visual Debiasing in Visually-Aware Recommendation [MM Oct 2021] [pdf]

Evaluating the SOTA Models

We validate the effectiveness and efficiency of state-of-the-art multimodal recommendation models through extensive experiments on four public datasets. We also investigate the principal determinants of model performance, including the impact of different modality information and data split methods.

Statistics of the evaluated datasets.

| Dataset | # Users | # Items | # Interactions | Sparsity |
|---|---|---|---|---|
| Baby | 19,445 | 7,050 | 160,792 | 99.8827% |
| Sports | 35,598 | 18,357 | 296,337 | 99.9547% |
| FoodRec | 61,668 | 21,874 | 1,654,456 | 99.8774% |
| Elec | 192,403 | 63,001 | 1,689,188 | 99.9861% |
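
The Sparsity column is consistent with the usual definition, the share of user-item pairs with no observed interaction: 1 - #Interactions / (#Users × #Items). A minimal sketch that recomputes it from the figures in the table; the function and variable names are illustrative and not part of the MMRec framework.

```python
def sparsity_percent(n_users: int, n_items: int, n_interactions: int) -> float:
    """Percentage of user-item pairs that have no observed interaction."""
    density = n_interactions / (n_users * n_items)
    return 100.0 * (1.0 - density)


# (users, items, interactions), copied from the statistics table.
DATASETS = {
    "Baby": (19445, 7050, 160792),
    "Sports": (35598, 18357, 296337),
    "FoodRec": (61668, 21874, 1654456),
    "Elec": (192403, 63001, 1689188),
}

for name, stats in DATASETS.items():
    print(f"{name}: {sparsity_percent(*stats):.4f}%")
```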

Experimental Results

Comparison of performance for different models in terms of Recall and NDCG.

| Dataset | Model | Recall@10 | Recall@20 | Recall@50 | NDCG@10 | NDCG@20 | NDCG@50 |
|---|---|---|---|---|---|---|---|
| Baby | BPR | 0.0357 | 0.0575 | 0.1054 | 0.0192 | 0.0249 | 0.0345 |
| Baby | LightGCN | 0.0479 | 0.0754 | 0.1333 | 0.0257 | 0.0328 | 0.0445 |
| Baby | VBPR | 0.0423 | 0.0663 | 0.1212 | 0.0223 | 0.0284 | 0.0396 |
| Baby | MMGCN | 0.0378 | 0.0615 | 0.1100 | 0.0200 | 0.0261 | 0.0359 |
| Baby | DualGNN | 0.0448 | 0.0716 | 0.1288 | 0.0240 | 0.0309 | 0.0424 |
| Baby | GRCN | 0.0539 | 0.0833 | 0.1464 | 0.0288 | 0.0363 | 0.0490 |
| Baby | LATTICE | 0.0547 | 0.0850 | 0.1477 | 0.0292 | 0.0370 | 0.0497 |
| Baby | BM3 | 0.0564 | 0.0883 | 0.1477 | 0.0301 | 0.0383 | 0.0502 |
| Baby | SLMRec | 0.0529 | 0.0775 | 0.1252 | 0.0290 | 0.0353 | 0.0450 |
| Baby | ADDVAE | 0.0598 | 0.091 | 0.1508 | 0.0323 | 0.0404 | 0.0525 |
| Baby | FREEDOM | 0.0627 | 0.0992 | 0.1655 | 0.0330 | 0.0424 | 0.0558 |
| Sports | BPR | 0.0432 | 0.0653 | 0.1083 | 0.0241 | 0.0298 | 0.0385 |
| Sports | LightGCN | 0.0569 | 0.0864 | 0.1414 | 0.0311 | 0.0387 | 0.0498 |
| Sports | VBPR | 0.0558 | 0.0856 | 0.1391 | 0.0307 | 0.0384 | 0.0492 |
| Sports | MMGCN | 0.0370 | 0.0605 | 0.1078 | 0.0193 | 0.0254 | 0.0350 |
| Sports | DualGNN | 0.0568 | 0.0859 | 0.1392 | 0.0310 | 0.0385 | 0.0493 |
| Sports | GRCN | 0.0598 | 0.0915 | 0.1509 | 0.0332 | 0.0414 | 0.0535 |
| Sports | LATTICE | 0.0620 | 0.0953 | 0.1561 | 0.0335 | 0.0421 | 0.0544 |
| Sports | BM3 | 0.0656 | 0.0980 | 0.1581 | 0.0355 | 0.0438 | 0.0561 |
| Sports | SLMRec | 0.0663 | 0.0990 | 0.1543 | 0.0365 | 0.0450 | 0.0562 |
| Sports | ADDVAE | 0.0709 | 0.1035 | 0.1663 | 0.0389 | 0.0473 | 0.0600 |
| Sports | FREEDOM | 0.0717 | 0.1089 | 0.1768 | 0.0385 | 0.0481 | 0.0618 |
| FoodRec | BPR | 0.0303 | 0.0511 | 0.0948 | 0.0188 | 0.0250 | 0.0356 |
| FoodRec | LightGCN | 0.0331 | 0.0546 | 0.1003 | 0.0210 | 0.0274 | 0.0386 |
| FoodRec | VBPR | 0.0306 | 0.0516 | 0.0972 | 0.0191 | 0.0254 | 0.0365 |
| FoodRec | MMGCN | 0.0307 | 0.0510 | 0.0943 | 0.0192 | 0.0253 | 0.0359 |
| FoodRec | DualGNN | 0.0338 | 0.0559 | 0.1027 | 0.0214 | 0.0280 | 0.0394 |
| FoodRec | GRCN | 0.0356 | 0.0578 | 0.1063 | 0.0226 | 0.0295 | 0.0411 |
| FoodRec | LATTICE | 0.0336 | 0.0560 | 0.1012 | 0.0211 | 0.0277 | 0.0388 |
| FoodRec | BM3 | 0.0334 | 0.0553 | 0.0994 | 0.0208 | 0.0274 | 0.0381 |
| FoodRec | SLMRec | 0.0323 | 0.0515 | 0.0907 | 0.0208 | 0.0266 | 0.0362 |
| FoodRec | ADDVAE | 0.0309 | 0.0508 | 0.093 | 0.0186 | 0.0247 | 0.035 |
| FoodRec | FREEDOM | 0.0333 | 0.0556 | 0.1009 | 0.0212 | 0.0279 | 0.0389 |
| Elec | BPR | 0.0235 | 0.0367 | 0.0621 | 0.0127 | 0.0161 | 0.0212 |
| Elec | LightGCN | 0.0363 | 0.0540 | 0.0879 | 0.0204 | 0.0250 | 0.0318 |
| Elec | VBPR | 0.0293 | 0.0458 | 0.0778 | 0.0159 | 0.0202 | 0.0267 |
| Elec | MMGCN | 0.0213 | 0.0343 | 0.0610 | 0.0112 | 0.0146 | 0.0200 |
| Elec | DualGNN | 0.0365 | 0.0542 | 0.0875 | 0.0206 | 0.0252 | 0.0319 |
| Elec | GRCN | 0.0389 | 0.0590 | 0.0970 | 0.0216 | 0.0268 | 0.0345 |
| Elec | LATTICE | - | - | - | - | - | - |
| Elec | BM3 | 0.0437 | 0.0648 | 0.1021 | 0.0247 | 0.0302 | 0.0378 |
| Elec | SLMRec | 0.0443 | 0.0651 | 0.1038 | 0.0249 | 0.0303 | 0.0382 |
| Elec | ADDVAE | 0.0451 | 0.0665 | 0.1066 | 0.0253 | 0.0308 | 0.0390 |
| Elec | FREEDOM | 0.0396 | 0.0601 | 0.0998 | 0.0220 | 0.0273 | 0.0353 |
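
Recall@K and NDCG@K in the results above are standard top-K ranking metrics. The following is a minimal per-user sketch, assuming binary relevance (an item is either in the user's held-out set or not) and that per-user values are averaged afterwards. The function names are illustrative, and this is not necessarily the exact implementation used to produce the numbers above.

```python
import math


def recall_at_k(ranked_items, relevant_items, k):
    """Fraction of the user's held-out items that appear in the top-k list."""
    if not relevant_items:
        return 0.0
    hits = sum(1 for item in ranked_items[:k] if item in relevant_items)
    return hits / len(relevant_items)


def ndcg_at_k(ranked_items, relevant_items, k):
    """DCG of the top-k list divided by the DCG of an ideal ordering."""
    dcg = sum(
        1.0 / math.log2(pos + 2)  # pos is 0-based, so the rank is pos + 1
        for pos, item in enumerate(ranked_items[:k])
        if item in relevant_items
    )
    ideal_hits = min(len(relevant_items), k)
    idcg = sum(1.0 / math.log2(pos + 2) for pos in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Toy example: the model ranks four items, three items were held out.
ranked = ["a", "b", "c", "d"]
held_out = {"b", "d", "x"}
print(recall_at_k(ranked, held_out, 2), ndcg_at_k(ranked, held_out, 2))
```

In the toy example only "b" is retrieved within the top 2, so Recall@2 is 1/3, and NDCG@2 discounts that hit because it sits at rank 2 rather than rank 1.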

Ablation Study

Recommendation performance comparison using different data split methods

We evaluate the performance of various recommendation models under different data splitting methods. Offline evaluation is based on historical item ratings or implicit item feedback. Because this method relies on user-item interactions and all models learn from these supervised signals, the interactions must be split into training, validation, and test sets. We apply three main split strategies to compare performance:

• Random split: As the name suggests, this strategy randomly selects the train/test boundary for each user, splitting that user's interactions according to the ratio. Its disadvantages are that the split cannot be reproduced unless the authors publish how the data was split, and that it is not a realistic scenario because it ignores time.

• User time split: This temporal strategy splits each user's historical interactions by interaction timestamp according to the ratio (e.g., train:validation:test = 8:1:1), taking the last portion of each user's interactions as the test set. Although it considers timestamps, it is still not a realistic scenario, because the train/test boundary is set within each user's own interactions without considering global time.

• Global time split: This strategy fixes time points shared by all users according to the splitting ratio. The interactions after the last time point form the test set. Additionally, the users of interactions after the global temporal boundary must appear in the training set, which follows the most realistic and strict setting. The limitation of this strategy is that the number of users is reduced, because users who do not appear in the training set are removed.
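
The three strategies above can be sketched as follows. This is a minimal illustration, assuming interactions are `(user, item, timestamp)` tuples and an 8:1:1 ratio. All function names are hypothetical, and the code is not taken from the MMRec framework.

```python
import random
from collections import defaultdict


def _cut(rows, ratios):
    """Cut an ordered list into train/valid/test parts by integer ratios."""
    total, n = sum(ratios), len(rows)
    a = n * ratios[0] // total
    b = n * (ratios[0] + ratios[1]) // total
    return rows[:a], rows[a:b], rows[b:]


def _split_per_user(interactions, ratios, reorder):
    """Apply `reorder` to each user's interactions, then cut by ratio."""
    by_user = defaultdict(list)
    for row in interactions:
        by_user[row[0]].append(row)
    parts = ([], [], [])
    for rows in by_user.values():
        for part, chunk in zip(parts, _cut(reorder(rows), ratios)):
            part.extend(chunk)
    return parts


def random_split(interactions, ratios=(8, 1, 1), seed=0):
    """Random split: shuffle each user's interactions, ignoring time."""
    rng = random.Random(seed)  # publishing the seed makes the split reproducible

    def shuffled(rows):
        rows = list(rows)
        rng.shuffle(rows)
        return rows

    return _split_per_user(interactions, ratios, shuffled)


def user_time_split(interactions, ratios=(8, 1, 1)):
    """User time split: each user's latest interactions go to valid/test."""
    return _split_per_user(
        interactions, ratios, lambda rows: sorted(rows, key=lambda r: r[2])
    )


def global_time_split(interactions, ratios=(8, 1, 1)):
    """Global time split: one set of time boundaries shared by all users."""
    rows = sorted(interactions, key=lambda r: r[2])
    train, valid, test = _cut(rows, ratios)
    known = {r[0] for r in train}  # users seen before the boundary
    # Users that never appear in the training set are removed.
    valid = [r for r in valid if r[0] in known]
    test = [r for r in test if r[0] in known]
    return train, valid, test
```

Note how `global_time_split` can return fewer validation and test rows than the ratio suggests, which mirrors the reduction in users described for that strategy.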

Our experiments on the Sports dataset, using these three splitting strategies, provide insights into their impact on recommendation performance. The first two tables below present the performance comparison in terms of Recall@k and NDCG@k for k = 10 and 20, and the subsequent tables show the ranking of models based on Recall@20 and NDCG@20.

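| Dataset | Model | Recall@10 Random | Recall@10 User Time | Recall@10 Global Time | Recall@20 Random | Recall@20 User Time | Recall@20 Global Time |
|---|---|---|---|---|---|---|---|
| Sports | MMGCN | 0.0384 | 0.0266 | 0.0140 | 0.0611 | 0.0446 | 0.0245 |
| Sports | BPR | 0.0444 | 0.0322 | 0.0152 | 0.0663 | 0.0509 | 0.0258 |
| Sports | VBPR | 0.0563 | 0.0385 | 0.0176 | 0.0851 | 0.0620 | 0.0298 |
| Sports | DualGNN | 0.0576 | 0.0403 | 0.0181 | 0.0859 | 0.0611 | 0.0297 |
| Sports | GRCN | 0.0604 | 0.0418 | 0.0167 | 0.0915 | 0.0666 | 0.0286 |
| Sports | LightGCN | 0.0568 | 0.0405 | 0.0205 | 0.0863 | 0.0663 | 0.0336 |
| Sports | LATTICE | 0.0641 | 0.0450 | 0.0207 | 0.0964 | 0.0699 | 0.0337 |
| Sports | BM3 | 0.0646 | 0.0447 | 0.0213 | 0.0955 | 0.0724 | 0.0336 |
| Sports | SLMRec | 0.0651 | 0.0470 | 0.0220 | 0.0985 | 0.0733 | 0.0350 |
| Sports | FREEDOM | 0.0708 | 0.0490 | 0.0226 | 0.1080 | 0.0782 | 0.0372 |

| Dataset | Model | NDCG@10 Random | NDCG@10 User Time | NDCG@10 Global Time | NDCG@20 Random | NDCG@20 User Time | NDCG@20 Global Time |
|---|---|---|---|---|---|---|---|
| Sports | MMGCN | 0.0202 | 0.0134 | 0.0091 | 0.0261 | 0.0180 | 0.0125 |
| Sports | BPR | 0.0245 | 0.0169 | 0.0102 | 0.0302 | 0.0218 | 0.0135 |
| Sports | VBPR | 0.0304 | 0.0204 | 0.0115 | 0.0378 | 0.0265 | 0.0153 |
| Sports | DualGNN | 0.0321 | 0.0214 | 0.0118 | 0.0394 | 0.0268 | 0.0155 |
| Sports | GRCN | 0.0332 | 0.0219 | 0.0101 | 0.0412 | 0.0282 | 0.0138 |
| Sports | LightGCN | 0.0315 | 0.0220 | 0.0139 | 0.0391 | 0.0286 | 0.0180 |
| Sports | LATTICE | 0.0351 | 0.0238 | 0.0138 | 0.0434 | 0.0302 | 0.0177 |
| Sports | BM3 | 0.0356 | 0.0237 | 0.0144 | 0.0436 | 0.0308 | 0.0182 |
| Sports | SLMRec | 0.0364 | 0.0253 | 0.0148 | 0.0450 | 0.0321 | 0.0189 |
| Sports | FREEDOM | 0.0388 | 0.0255 | 0.0151 | 0.0485 | 0.0330 | 0.0197 |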

As demonstrated above, different data splitting strategies lead to varied performance outcomes for the same dataset and evaluation metrics. This variability presents a challenge in comparing the effectiveness of different models when they are based on different data split strategies.

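Sports, NDCG@20

| Model | Random | User Time | Global Time |
|---|---|---|---|
| MMGCN | 10 | 10 | 10 |
| BPR | 9 | 9 | 8↑1 |
| VBPR | 8 | 8 | 7↑1 |
| LightGCN | 7 | 5↑2 | 4↑3 |
| DualGNN | 6 | 7↓1 | 6 |
| GRCN | 5 | 6↓1 | 9↓4 |
| LATTICE | 4 | 4 | 5↓1 |
| BM3 | 3 | 3 | 3 |
| SLMRec | 2 | 2 | 2 |
| FREEDOM | 1 | 1 | 1 |

Sports, Recall@20

| Model | Random | User Time | Global Time |
|---|---|---|---|
| MMGCN | 10 | 10 | 10 |
| BPR | 9 | 9 | 9 |
| VBPR | 8 | 7↑1 | 6↑2 |
| DualGNN | 7 | 8↓1 | 7 |
| LightGCN | 6 | 6 | 5↑1 |
| GRCN | 5 | 5 | 8↓3 |
| BM3 | 4 | 3↑1 | 4 |
| LATTICE | 3 | 4↓1 | 3 |
| SLMRec | 2 | 2 | 2 |
| FREEDOM | 1 | 1 | 1 |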

The tables above report the ranks of SOTA models under each splitting strategy. Rows are sorted by model performance under the random splitting strategy, and the up and down arrows indicate rank changes relative to random splitting. As shown, model rankings change across splitting strategies.
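
The rank-and-arrow notation can be derived mechanically from the metric values. Below is a minimal sketch, using the Sports Recall@20 values reported above for the random and user time splits. The helper names are illustrative, and ties (none occur in these two columns) would be broken by insertion order.

```python
def rank_models(scores):
    """Map each model to its rank (1 = best) by descending score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {model: pos + 1 for pos, model in enumerate(ordered)}


def rank_shifts(base_scores, other_scores):
    """Rank under the other split, with an arrow relative to the base split."""
    base, other = rank_models(base_scores), rank_models(other_scores)
    shifts = {}
    for model in base:
        delta = base[model] - other[model]  # positive means the model moved up
        arrow = f"↑{delta}" if delta > 0 else f"↓{-delta}" if delta < 0 else ""
        shifts[model] = f"{other[model]}{arrow}"
    return shifts


# Sports, Recall@20, copied from the split comparison table.
random_r20 = {
    "MMGCN": 0.0611, "BPR": 0.0663, "VBPR": 0.0851, "DualGNN": 0.0859,
    "GRCN": 0.0915, "LightGCN": 0.0863, "LATTICE": 0.0964, "BM3": 0.0955,
    "SLMRec": 0.0985, "FREEDOM": 0.1080,
}
user_time_r20 = {
    "MMGCN": 0.0446, "BPR": 0.0509, "VBPR": 0.0620, "DualGNN": 0.0611,
    "GRCN": 0.0666, "LightGCN": 0.0663, "LATTICE": 0.0699, "BM3": 0.0724,
    "SLMRec": 0.0733, "FREEDOM": 0.0782,
}

for model, label in rank_shifts(random_r20, user_time_r20).items():
    print(model, label)
```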

Recommendation performance comparison using different modalities

We are interested in how modality information benefits recommendation and which modality contributes more. We aim to understand the specific benefits of different modalities in recommender systems and to provide guidelines for researchers on selecting appropriate modalities. To evaluate this, we feed each model a single modality and compare its performance with that obtained using both modalities.
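
As a rough illustration of this setup, the sketch below builds the item feature input for a visual-only, textual-only, or combined run. It assumes per-item feature vectors stored in dictionaries and uses plain concatenation for the combined case. Both are simplifying assumptions: the evaluated models each fuse modalities in their own way, and `build_item_features` is a hypothetical helper, not an MMRec function.

```python
def build_item_features(visual, textual, modalities=("visual", "textual")):
    """Concatenate the selected per-item feature vectors (hypothetical helper)."""
    sources = {"visual": visual, "textual": textual}
    features = {}
    for item in visual:
        vector = []
        for name in modalities:
            vector.extend(sources[name][item])
        features[item] = vector
    return features


# Toy features: 2-dimensional visual and 1-dimensional textual vectors.
visual = {"item1": [0.1, 0.2], "item2": [0.3, 0.4]}
textual = {"item1": [0.5], "item2": [0.6]}

both = build_item_features(visual, textual)
visual_only = build_item_features(visual, textual, modalities=("visual",))
textual_only = build_item_features(visual, textual, modalities=("textual",))
print(both["item1"], visual_only["item1"], textual_only["item1"])
```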

The following figures use Recall@20 to summarize the impact of different modalities on various models and the trend across modalities. The orange point represents multimodal performance, the green point represents textual-modality performance, and the blue point represents visual-modality performance. The specific numerical values will be made available in our GitHub repository.

[Figures: Recall@20 of each model using multimodal, textual-only, and visual-only inputs]

Please consider citing our paper if it helps you. Thanks!

@article{zhou2023comprehensive,
  title={A Comprehensive Survey on Multimodal Recommender Systems: Taxonomy, Evaluation, and Future Directions},
  author={Zhou, Hongyu and Zhou, Xin and Zeng, Zhiwei and Zhang, Lingzi and Shen, Zhiqi},
  journal={arXiv preprint arXiv:2302.04473},
  year={2023}
}
