Kaggle - Hotel-ID to Combat Human Trafficking 2021 (FGVC8) - image similarity

michal-nahlik/kaggle-hotel-id-2021

Hotel-ID to Combat Human Trafficking 2021 (FGVC8) - 8th place solution

Code used for the Hotel-ID to Combat Human Trafficking 2021 - FGVC8 Kaggle competition. The task was to identify the hotel to which a given image of a room belongs.

Detailed description: https://www.kaggle.com/c/hotel-id-2021-fgvc8/discussion/242207

Data

For training I used only the competition data, rescaled and padded to 512x512 pixels; including external data (such as the Hotels-50K dataset) can improve the score significantly. The following augmentations were used during training: HorizontalFlip, VerticalFlip, ShiftScaleRotate, OpticalDistortion, IAAPerspective, CoarseDropout, RandomBrightness.
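As a rough illustration of the "rescaled and padded" step, the sketch below computes the resized dimensions and the padding needed to fit an image into a square canvas while keeping its aspect ratio. The function name `pad_geometry` and the choice of centered padding are assumptions made for this example; the preprocessing notebook may do this differently.

```python
def pad_geometry(width, height, target=512):
    """Return the resized size and per-side padding for a square canvas.

    Hypothetical helper: centered padding is an assumption, not
    necessarily what the preprocessing notebook does.
    """
    # Scale so that the longer side exactly matches the target size.
    scale = target / max(width, height)
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))

    # Split the leftover space evenly; any odd pixel goes to right/bottom.
    pad_left = (target - new_w) // 2
    pad_top = (target - new_h) // 2
    pad_right = target - new_w - pad_left
    pad_bottom = target - new_h - pad_top
    return (new_w, new_h), (pad_left, pad_top, pad_right, pad_bottom)


size, padding = pad_geometry(1024, 768)
print(size, padding)  # (512, 384) (0, 64, 0, 64)
```

The returned values can be passed to whichever image library does the actual resize and pad.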

EDA: src/hotel-id-eda-with-plotly.ipynb (nbviewer)

Image preprocessing notebook: src/hotel-id-preprocess-images.ipynb
512x512 dataset: https://www.kaggle.com/michaln/hotelid-images-512x512-padded
256x256 dataset: https://www.kaggle.com/michaln/hotelid-images-256x256-padded
Notebook to download Hotels-50K dataset: src/download-hotels-50K.ipynb

Description

Trained 3 types of models with different backbones:
ArcMargin model: src/training/hotel-id-arcmargin-training.ipynb
CosFace model: src/training/hotel-id-cosface-training.ipynb
Classification model: src/training/hotel-id-classification-training.ipynb

Parameters: Lookahead (k=3) + AdamW optimizer, OneCycleLR scheduler, CrossEntropyLoss/CosFace loss
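To make the difference between the margin-based heads concrete, here is a minimal pure-Python sketch of how a CosFace-style and an ArcFace-style margin change the target-class logit before cross-entropy. It assumes that "ArcMargin" refers to the ArcFace additive angular margin; the scale `s` and margin `m` values are placeholders, not the settings used in the training notebooks.

```python
import math


def margin_logits(cosines, target, s=30.0, m=0.35, kind="cosface"):
    """Apply a margin to the target-class cosine, then scale all logits.

    cosines: cosine similarity between the embedding and each class weight.
    s, m: illustrative values only (assumptions, not from the notebooks).
    """
    logits = []
    for j, c in enumerate(cosines):
        if j == target:
            if kind == "cosface":
                # CosFace: subtract the margin from the cosine itself.
                c = c - m
            elif kind == "arcface":
                # ArcFace: add the margin to the angle. Real implementations
                # also handle theta + m > pi; this sketch does not.
                theta = math.acos(max(-1.0, min(1.0, c)))
                c = math.cos(theta + m)
            else:
                raise ValueError(f"unknown margin kind: {kind}")
        logits.append(s * c)
    return logits


def cross_entropy(logits, target):
    """Numerically stable softmax cross-entropy for one sample."""
    mx = max(logits)
    log_sum = mx + math.log(sum(math.exp(z - mx) for z in logits))
    return log_sum - logits[target]


cosines = [0.8, 0.1, -0.2]
for kind in ("cosface", "arcface"):
    loss = cross_entropy(margin_logits(cosines, 0, kind=kind), 0)
    print(kind, round(loss, 4))
```

Both margins penalize the correct class during training, which pushes embeddings of the same hotel closer together than plain classification would.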

These models were used to generate an embedding for each image; the embeddings were then used to calculate the cosine similarity of the test images to the train dataset. The product of the similarities was used to ensemble the output from different models and to find the top 5 most similar images from different hotels.
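The retrieval step can be sketched roughly as follows. This is a small pure-Python illustration, not the inference notebook's code: the function names, the toy embeddings, and the hotel IDs are all made up for the example, and a real pipeline would use batched matrix operations on the full embedding matrices.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_hotels(test_embs, train_embs_per_model, train_hotel_ids, k=5):
    """Rank hotels for one test image by the product of per-model similarities.

    test_embs: one embedding of the test image per model.
    train_embs_per_model: for each model, the embeddings of all train images.
    """
    scores = [1.0] * len(train_hotel_ids)
    for test_emb, train_embs in zip(test_embs, train_embs_per_model):
        for i, train_emb in enumerate(train_embs):
            # Ensemble by multiplying similarities across models.
            # Assumption: negative similarities get no special handling here.
            scores[i] *= cosine(test_emb, train_emb)

    # Walk train images from most to least similar, keeping each hotel once.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    picked = []
    for i in order:
        if train_hotel_ids[i] not in picked:
            picked.append(train_hotel_ids[i])
        if len(picked) == k:
            break
    return picked


hotel_ids = ["A", "A", "B", "C"]
model_1_train = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [1.0, 1.0]]
model_2_train = [[1.0, 0.0], [1.0, 0.2], [0.1, 1.0], [1.0, 1.0]]
test_image = [[1.0, 0.0], [1.0, 0.1]]  # one embedding per model

print(top_hotels(test_image, [model_1_train, model_2_train], hotel_ids))
```

Deduplicating by hotel matters because several train images of the same hotel can occupy the top ranks, while the submission needs five distinct hotel candidates.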

Trained models: https://www.kaggle.com/michaln/hotelid-trained-models
Inference notebook: src/hotel-id-inference.ipynb

Results

Evaluation metric: Mean Average Precision @5
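For reference, a minimal sketch of this metric, assuming each test image has exactly one correct hotel (so the average precision of an image reduces to 1/rank of the correct hotel within the top 5, or 0 if it is missing). The function name `map_at_5` is chosen for this example.

```python
def map_at_5(predictions, truths):
    """Mean Average Precision @5 with a single correct label per image.

    predictions: for each image, a ranked list of predicted hotel IDs.
    truths: the correct hotel ID for each image.
    """
    total = 0.0
    for preds, truth in zip(predictions, truths):
        # Only the first five predictions count towards the score.
        for rank, pred in enumerate(preds[:5], start=1):
            if pred == truth:
                total += 1.0 / rank
                break
    return total / len(truths)


preds = [["A", "B"], ["B", "A"], ["C"]]
truth = ["A", "A", "D"]
print(map_at_5(preds, truth))  # 0.5
```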

| Type | Backbone | Embed size | Public LB | Private LB | Epochs |
|---|---|---|---|---|---|
| ArcMargin | eca_nfnet_l0 | 1024 | 0.6564 | 0.6704 | 6/6 |
| ArcMargin | efficientnet_b1 | 4096 | 0.6780 | 0.6962 | 9/9 |
| Classification | eca_nfnet_l0 | 4096 | 0.6691 | 0.6875 | 6/9 |
| CosFace | ecaresnet50d_pruned | 4096 | 0.6702 | 0.6796 | 9/9 |
| Ensemble | | | 0.7273 | 0.7446 | |

Instructions

  1. Prepare data: download the preprocessed dataset or run the hotel-id-preprocess-images notebook to generate the images
  2. Train models: run the hotel-id-arcmargin-training, hotel-id-cosface-training and hotel-id-classification-training notebooks, or use the trained models
  3. Inference: edit the models and paths in the inference notebook and run it on Kaggle
