Skip to content

The 1st place solution for SIGIR 2020 E-Commerce Workshop Multimodal Product Classification Challenge

Notifications You must be signed in to change notification settings

Wang-Shuo/SIGIR2020Challenge

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

SIGIR 2020 E-Commerce Workshop Data Challenge (Multi-modal Classification Task)

Task Intro

This challenge focuses on the topic of large-scale multi-modal (text and image) classification, where the goal is to predict each product’s type code as defined in the catalog of Rakuten France. To be specific, given a training set of products and their product type codes, predict the corresponding product type codes for an unseen held out test set of products. The systems are free to use the available textual titles and/or descriptions whenever available and additionally the images to allow for true multi-modal learning.

Dataset

The organizer released approximately 99K product listings in tsv format, including 84,916 samples for training, 937 samples for phase 1 testing and 8435 samples for phase 2 testing. The training dataset consists of product titles, descriptions, images and their corresponding product type codes. There are 27 product categories in the training dataset and the number of product samples in each category ranges from 764 to 10,209. The frequency distribution of categories in the training dataset is shown in the following figure. category frequency distribution

Overview

We employed a decision-level fusion scheme to leverage mulitmodal product information. Specifically, the modal-specific classifiers are built from textual and image modal data respectively in the first stage. Then the late fusion strategy is learned from the class probabilities predicted by each modal classifier in the second stage. The overview of the proposed method is shown in the following figure. model overview

Reference

About

The 1st place solution for SIGIR 2020 E-Commerce Workshop Multimodal Product Classification Challenge

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published