Experiments with convolutional neural networks for emotion recognition

Introduction

This work is based on a Kaggle competition. I tried various strategies for data preparation and model training to reach the maximum score on the private dataset.

Versions of the libraries used

Python 3.8.5

  • tensorflow 2.4.1
  • numpy 1.19.2
  • tensorflow-hub 0.12.0

Preparation

Before starting, complete these steps:

  • Download the dataset files from the Kaggle competition and unzip them into the ./Data folder. After that you will get a folder structure like this:
     ./dataset
       /train
         /anger
         /contempt
         /disgust
         etc.
       /test_kaggle
         <unstructured images>
    
     Alternatively, you can do this step inside EDA.ipynb.
  • Download the OpenCV face detector files (a loading sketch follows this list):
    • the model file, and rename it to opencv_face_detector.caffemodel
    • the config file
    Place both files in the ./Data folder.
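
The detector can then be loaded with OpenCV's DNN module. A minimal sketch, assuming the config file is saved as deploy.prototxt (the README does not fix its name, so treat that path as an assumption):

    import cv2
    import numpy as np

    # Files placed in ./Data as described above; the .prototxt name is an assumption.
    face_net = cv2.dnn.readNetFromCaffe("./Data/deploy.prototxt",
                                        "./Data/opencv_face_detector.caffemodel")

    def detect_faces(image, conf_threshold=0.5):
        """Return (x1, y1, x2, y2) boxes for faces found in a BGR image."""
        h, w = image.shape[:2]
        # This model expects a 300x300 blob with its standard mean values.
        blob = cv2.dnn.blobFromImage(cv2.resize(image, (300, 300)), 1.0,
                                     (300, 300), (104.0, 177.0, 123.0))
        face_net.setInput(blob)
        detections = face_net.forward()
        boxes = []
        for i in range(detections.shape[2]):
            if detections[0, 0, i, 2] >= conf_threshold:
                box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
                boxes.append(box.astype(int))
        return boxes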

Data analysis

Data analysis is in EDA.ipynb. In this part I explore the data, its structure, and basic statistics. I cleaned the data of outliers and prepared two datasets:

  • a full-image dataset
  • a dataset with faces cropped from the source images (see the cropping sketch after this list).
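
A rough sketch of how the cropped-face dataset could be produced from the full images. It reuses the detect_faces helper sketched in the Preparation section; the destination folder name is illustrative:

    import os
    import cv2

    def build_cropped_dataset(src_root="./dataset/train", dst_root="./dataset/train_cropped"):
        """Crop the largest detected face from every image, mirroring the class folders."""
        for class_name in os.listdir(src_root):
            src_dir = os.path.join(src_root, class_name)
            dst_dir = os.path.join(dst_root, class_name)
            os.makedirs(dst_dir, exist_ok=True)
            for fname in os.listdir(src_dir):
                image = cv2.imread(os.path.join(src_dir, fname))
                if image is None:
                    continue
                boxes = detect_faces(image)
                if not boxes:
                    continue                      # no face found, skip the image
                # Keep the largest box and clip it to the image borders.
                x1, y1, x2, y2 = max(boxes, key=lambda b: (b[2] - b[0]) * (b[3] - b[1]))
                h, w = image.shape[:2]
                x1, y1, x2, y2 = max(0, x1), max(0, y1), min(w, x2), min(h, y2)
                cv2.imwrite(os.path.join(dst_dir, fname), image[y1:y2, x1:x2])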

Model training

In the notebook Models_Training.ipynb I prepared a few filters for additional data augmentation and a function that splits the data into train and validation sets while keeping the class proportions (a sketch of such a split follows this paragraph). After that I built four models and trained them on both datasets.
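
A minimal sketch of such a class-preserving split; the actual function in the notebook may differ, and the names here are illustrative:

    import numpy as np

    def stratified_split(file_paths, labels, val_fraction=0.2, seed=42):
        """Split samples into train/validation sets, keeping per-class proportions."""
        rng = np.random.default_rng(seed)
        file_paths, labels = np.asarray(file_paths), np.asarray(labels)
        train_idx, val_idx = [], []
        for cls in np.unique(labels):
            cls_idx = np.where(labels == cls)[0]
            rng.shuffle(cls_idx)
            n_val = int(round(len(cls_idx) * val_fraction))
            val_idx.extend(cls_idx[:n_val])
            train_idx.extend(cls_idx[n_val:])
        return (file_paths[train_idx], labels[train_idx],
                file_paths[val_idx], labels[val_idx])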

The best private score for a single model, 0.56840, was achieved by BiT-M r50x1 on the full-image dataset. After that, I built a committee of the four models, headed by VGGFace. The final decision is made by majority vote; if the votes are split, the decision is made by VGGFace. This technique allowed the committee to reach 0.58600 on the private dataset.
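A sketch of the voting rule described above, assuming each model exposes a Keras-style predict returning class probabilities and that the first model in the list is VGGFace:

    import numpy as np

    def committee_predict(models, batch):
        """Majority vote over the models; ties are resolved by the first model (VGGFace)."""
        # Each row holds one model's predicted class indices for the whole batch.
        votes = np.stack([np.argmax(m.predict(batch), axis=-1) for m in models])
        tie_breaker = votes[0]                    # VGGFace heads the committee
        final = np.empty(votes.shape[1], dtype=int)
        for i in range(votes.shape[1]):
            classes, counts = np.unique(votes[:, i], return_counts=True)
            winners = classes[counts == counts.max()]
            final[i] = winners[0] if len(winners) == 1 else tie_breaker[i]
        return final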

Valence-Arousal model

The notebook Valence_Arousal_model.ipynb is exploratory and educational in nature, because the original dataset has no coordinates in the Valence-Arousal space; I had to generate them manually with an algorithm. VGG was chosen as the two-headed model: it has two Dense output layers, one for each Valence-Arousal coordinate. This model reached 0.42280 on the private dataset.
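A minimal Keras sketch of the two-headed idea, using the VGG16 backbone from tf.keras.applications; the backbone variant, head sizes, and losses here are assumptions rather than the exact notebook configuration:

    import tensorflow as tf

    def build_valence_arousal_model(input_shape=(224, 224, 3)):
        """VGG backbone with two regression heads, one per Valence-Arousal coordinate."""
        backbone = tf.keras.applications.VGG16(include_top=False, weights="imagenet",
                                               input_shape=input_shape)
        x = tf.keras.layers.GlobalAveragePooling2D()(backbone.output)
        x = tf.keras.layers.Dense(256, activation="relu")(x)
        valence = tf.keras.layers.Dense(1, name="valence")(x)   # first output head
        arousal = tf.keras.layers.Dense(1, name="arousal")(x)   # second output head
        model = tf.keras.Model(inputs=backbone.input, outputs=[valence, arousal])
        model.compile(optimizer="adam", loss={"valence": "mse", "arousal": "mse"})
        return model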

Web-camera implementation

In this part I prepared two notebooks for predicting emotions from a web-camera stream: one using a single neural network model and one using the committee. Face detection is done with the OpenCV facedetector from the Preparation section.
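
A rough sketch of such a web-camera loop, combining the face detector sketched in the Preparation section with a trained emotion model; the checkpoint path, input size, and preprocessing are assumptions and should match the training notebook:

    import os
    import cv2
    import numpy as np
    import tensorflow as tf

    EMOTIONS = sorted(os.listdir("./dataset/train"))            # class names from the folders
    model = tf.keras.models.load_model("./checkpoints/model")   # illustrative checkpoint path
    cap = cv2.VideoCapture(0)

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        for (x1, y1, x2, y2) in detect_faces(frame):            # helper from the Preparation section
            h, w = frame.shape[:2]
            x1, y1, x2, y2 = max(0, x1), max(0, y1), min(w, x2), min(h, y2)
            face = cv2.resize(frame[y1:y2, x1:x2], (224, 224))
            probs = model.predict(face[np.newaxis] / 255.0)[0]  # preprocessing is an assumption
            label = EMOTIONS[int(np.argmax(probs))]
            cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
            cv2.putText(frame, label, (x1, y1 - 10),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
        cv2.imshow("Emotion recognition", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break

    cap.release()
    cv2.destroyAllWindows()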

Model checkpoints are available here: