Few-Shot-Classification (FSL)

The objective of this project is to evaluate the feasibility of using pre-trained feature extractors to quickly categorize products from images when only limited data is available. To do that, we explore techniques based on metric learning (siamese and prototypical networks) and on meta-learning (model-agnostic meta-learning). Our results are presented in our defense presentation and report.

Dataset

We use real images from seven luxury brands. The dataset, provided by Navee, contains 3,967 classes across those brands; each class represents a fashion article and contains about 5 images. Below is an example of the images available for three different articles.

Image Retrieval task

We approach FSL as a retrieval task evaluated with mean average precision (mAP): our systems embed images in a space where similar articles should be projected close to one another. During training, mAP is computed to monitor the performance of our networks.
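As a rough illustration, the sketch below computes mAP for a batch of query embeddings against a gallery of embeddings, assuming L2-normalised embeddings and one integer article label per image; the function and variable names are ours, not the project's.

```python
# Minimal retrieval-mAP sketch (illustrative, not the project's exact evaluation code).
import torch

def mean_average_precision(query_emb, gallery_emb, query_labels, gallery_labels):
    """Rank the gallery by cosine similarity for each query and average the
    precision at every rank where a relevant (same-article) image appears."""
    sims = query_emb @ gallery_emb.T                     # (Q, G) cosine similarities
    ap_list = []
    for i in range(sims.size(0)):
        order = sims[i].argsort(descending=True)
        relevant = (gallery_labels[order] == query_labels[i]).float()
        if relevant.sum() == 0:
            continue                                     # no positives for this query
        ranks = torch.arange(1, relevant.numel() + 1, dtype=torch.float)
        precision_at_k = relevant.cumsum(0) / ranks
        ap_list.append((precision_at_k * relevant).sum() / relevant.sum())
    return torch.stack(ap_list).mean()
```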

Siamese Networks

Siamese networks consist of two or more identical sub-networks that share the same architecture and parameters and undergo the same updates during training. Two main losses have been used to train our models: the contrastive loss [1] and the triplet loss, notably developed in [2]. The former operates on pairs of images, the latter on triplets. Both losses aim to pull similar images close together and push dissimilar images far from one another in the embedding space.
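The sketch below shows minimal versions of both losses, assuming the embeddings are already L2-normalised; the margin values are illustrative defaults rather than the settings used in this project.

```python
# Minimal contrastive / triplet loss sketch (illustrative hyper-parameters).
import torch
import torch.nn.functional as F

def contrastive_loss(z1, z2, same_class, margin=1.0):
    """Pull pairs of the same article together and push different articles
    at least `margin` apart in the embedding space [1]."""
    d = F.pairwise_distance(z1, z2)
    pos = same_class * d.pow(2)
    neg = (1 - same_class) * F.relu(margin - d).pow(2)
    return 0.5 * (pos + neg).mean()

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Keep the anchor closer to the positive than to the negative by a margin [2];
    equivalent in spirit to torch.nn.TripletMarginLoss."""
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return F.relu(d_pos - d_neg + margin).mean()
```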

Prototypical Networks

Prototypical networks are a metric-learning technique which, in our implementation, uses a ResNet-50 backbone to map fashion images into a metric space. Classification is then performed by computing a prototype (the mean embedding) for each category and measuring the distance from the query image to each prototype. This simple method performs well in the limited-data regime.
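As an illustration, a single few-shot episode could look like the sketch below, assuming `backbone` is a feature extractor (e.g. a ResNet-50 with its classification head removed) that returns one embedding per image; the names are illustrative, not the repository's API.

```python
# Prototypical-network episode sketch (illustrative names and shapes).
import torch

def prototypical_logits(backbone, support_images, support_labels, query_images, n_classes):
    """Compute one prototype (mean embedding) per class from the support set,
    then score each query by negative squared Euclidean distance to the prototypes."""
    support_emb = backbone(support_images)               # (N_support, D)
    query_emb = backbone(query_images)                   # (N_query, D)
    prototypes = torch.stack([
        support_emb[support_labels == c].mean(dim=0) for c in range(n_classes)
    ])                                                    # (n_classes, D)
    distances = torch.cdist(query_emb, prototypes)        # (N_query, n_classes)
    return -distances.pow(2)                              # higher score = closer prototype
```

The returned scores can be fed directly to a cross-entropy loss during training or arg-maxed at inference time.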

Model-Agnostic Meta-Learning

This is an implementation of the paper by Finn et al. [3], which uses meta-learning to train a model on batches of tasks for few-shot image classification. Although the method is model-agnostic (it applies to any model trained with gradient descent), we use a convolutional network. Each meta-training step first adapts the model to every individual task, then minimizes the sum of the post-adaptation losses. This implementation is heavily inspired by this one.
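The sketch below outlines one meta-training step, assuming each task provides a support/query split and using `torch.func.functional_call` for the adapted forward pass; the learning rates and function names are illustrative, not the exact setup of this repository.

```python
# Compact second-order MAML step sketch (illustrative, assumes PyTorch >= 2.0).
import torch
import torch.nn.functional as F
from torch.func import functional_call

def maml_step(model, tasks, meta_optimizer, inner_lr=0.01):
    """One meta-update: adapt to each task on its support set (inner loop),
    then update the shared initialisation on the summed query losses (outer loop)."""
    meta_loss = 0.0
    params = dict(model.named_parameters())
    for support_x, support_y, query_x, query_y in tasks:
        # Inner loop: one gradient step specialised to this task only.
        support_loss = F.cross_entropy(functional_call(model, params, (support_x,)), support_y)
        grads = torch.autograd.grad(support_loss, params.values(), create_graph=True)
        adapted = {name: p - inner_lr * g for (name, p), g in zip(params.items(), grads)}
        # Outer-loop term: loss of the adapted parameters on the query set.
        meta_loss = meta_loss + F.cross_entropy(functional_call(model, adapted, (query_x,)), query_y)
    meta_optimizer.zero_grad()
    meta_loss.backward()
    meta_optimizer.step()
    return meta_loss.detach()
```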

References

[1] Hadsell et al. (2006). Dimensionality Reduction by Learning an Invariant Mapping

[2] Schroff et al. (2015). FaceNet: A Unified Embedding for Face Recognition and Clustering

[3] Finn et al. (2017). Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
