Experiments with convolutional neural networks for emotion recognition

Introduction

This work is based on a Kaggle competition. I tried various strategies for data preparation and model training to reach the maximum score on the private dataset.

Versions of the libraries used

Python 3.8.5

  • tensorflow 2.4.1
  • numpy 1.19.2
  • tensorflow-hub 0.12.0

Preparation

Before starting, complete these steps:

  • Download the dataset files from the Kaggle competition and unzip them into the ./Data folder. After that you will get a folder structure like this:
     ./dataset
       /train
         /anger
         /contempt
         /disgust
         etc.
       /test_kaggle
         <unstructured images>
    
     Alternatively, you can do this step inside EDA.ipynb.
  • Download the OpenCV face detector files (a loading sketch follows this list):
    • the model file, and rename it to opencv_face_detector.caffemodel
    • the config file
    Place both files in the ./Data folder.
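
The detector can then be loaded with OpenCV's DNN module. A minimal sketch, assuming the config file is saved as deploy.prototxt (the README does not fix its name, so treat that path as an assumption):

    import cv2
    import numpy as np

    # Files placed in ./Data as described above; the .prototxt name is an assumption.
    face_net = cv2.dnn.readNetFromCaffe("./Data/deploy.prototxt",
                                        "./Data/opencv_face_detector.caffemodel")

    def detect_faces(image, conf_threshold=0.5):
        """Return (x1, y1, x2, y2) boxes for faces found in a BGR image."""
        h, w = image.shape[:2]
        # This model expects a 300x300 blob with its standard mean values.
        blob = cv2.dnn.blobFromImage(cv2.resize(image, (300, 300)), 1.0,
                                     (300, 300), (104.0, 177.0, 123.0))
        face_net.setInput(blob)
        detections = face_net.forward()
        boxes = []
        for i in range(detections.shape[2]):
            if detections[0, 0, i, 2] >= conf_threshold:
                box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
                boxes.append(box.astype(int))
        return boxes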

Data analysis

Data analysis is in EDA.ipynb. In this part I explore the data, its structure, and basic statistics. I cleaned the data of outliers and prepared two datasets:

  • a full-image dataset
  • a dataset with faces cropped from the source images (see the cropping sketch after this list).
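
A rough sketch of how the cropped-face dataset could be produced from the full images. It reuses the detect_faces helper sketched in the Preparation section; the destination folder name is illustrative:

    import os
    import cv2

    def build_cropped_dataset(src_root="./dataset/train", dst_root="./dataset/train_cropped"):
        """Crop the largest detected face from every image, mirroring the class folders."""
        for class_name in os.listdir(src_root):
            src_dir = os.path.join(src_root, class_name)
            dst_dir = os.path.join(dst_root, class_name)
            os.makedirs(dst_dir, exist_ok=True)
            for fname in os.listdir(src_dir):
                image = cv2.imread(os.path.join(src_dir, fname))
                if image is None:
                    continue
                boxes = detect_faces(image)
                if not boxes:
                    continue                      # no face found, skip the image
                # Keep the largest box and clip it to the image borders.
                x1, y1, x2, y2 = max(boxes, key=lambda b: (b[2] - b[0]) * (b[3] - b[1]))
                h, w = image.shape[:2]
                x1, y1, x2, y2 = max(0, x1), max(0, y1), min(w, x2), min(h, y2)
                cv2.imwrite(os.path.join(dst_dir, fname), image[y1:y2, x1:x2])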

Model training

In the notebook Models_Training.ipynb I prepared a few filters for additional data augmentation and a function that splits the data into train and validation sets while keeping the class proportions (a sketch of such a split follows this paragraph). After that I built four models and trained them on both datasets.
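
A minimal sketch of such a class-preserving split; the actual function in the notebook may differ, and the names here are illustrative:

    import numpy as np

    def stratified_split(file_paths, labels, val_fraction=0.2, seed=42):
        """Split samples into train/validation sets, keeping per-class proportions."""
        rng = np.random.default_rng(seed)
        file_paths, labels = np.asarray(file_paths), np.asarray(labels)
        train_idx, val_idx = [], []
        for cls in np.unique(labels):
            cls_idx = np.where(labels == cls)[0]
            rng.shuffle(cls_idx)
            n_val = int(round(len(cls_idx) * val_fraction))
            val_idx.extend(cls_idx[:n_val])
            train_idx.extend(cls_idx[n_val:])
        return (file_paths[train_idx], labels[train_idx],
                file_paths[val_idx], labels[val_idx])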

The best private score for a single model, 0.56840, was achieved by BiT-M r50x1 on the full-image dataset. After that, I built a committee of the four models, headed by VGGFace. The final decision is made by majority vote; if the votes are split, the decision is made by VGGFace. This technique allowed the committee to reach 0.58600 on the private dataset.
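A sketch of the voting rule described above, assuming each model exposes a Keras-style predict returning class probabilities and that the first model in the list is VGGFace:

    import numpy as np

    def committee_predict(models, batch):
        """Majority vote over the models; ties are resolved by the first model (VGGFace)."""
        # Each row holds one model's predicted class indices for the whole batch.
        votes = np.stack([np.argmax(m.predict(batch), axis=-1) for m in models])
        tie_breaker = votes[0]                    # VGGFace heads the committee
        final = np.empty(votes.shape[1], dtype=int)
        for i in range(votes.shape[1]):
            classes, counts = np.unique(votes[:, i], return_counts=True)
            winners = classes[counts == counts.max()]
            final[i] = winners[0] if len(winners) == 1 else tie_breaker[i]
        return final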

Valence-Arousal model

The notebook Valence_Arousal_model.ipynb is exploratory and educational in nature, because the original dataset has no coordinates in the Valence-Arousal space; I had to generate them manually with an algorithm. VGG was chosen as the two-headed model: it has two Dense output layers, one for each Valence-Arousal coordinate. This model reached 0.42280 on the private dataset.
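A minimal Keras sketch of the two-headed idea, using the VGG16 backbone from tf.keras.applications; the backbone variant, head sizes, and losses here are assumptions rather than the exact notebook configuration:

    import tensorflow as tf

    def build_valence_arousal_model(input_shape=(224, 224, 3)):
        """VGG backbone with two regression heads, one per Valence-Arousal coordinate."""
        backbone = tf.keras.applications.VGG16(include_top=False, weights="imagenet",
                                               input_shape=input_shape)
        x = tf.keras.layers.GlobalAveragePooling2D()(backbone.output)
        x = tf.keras.layers.Dense(256, activation="relu")(x)
        valence = tf.keras.layers.Dense(1, name="valence")(x)   # first output head
        arousal = tf.keras.layers.Dense(1, name="arousal")(x)   # second output head
        model = tf.keras.Model(inputs=backbone.input, outputs=[valence, arousal])
        model.compile(optimizer="adam", loss={"valence": "mse", "arousal": "mse"})
        return model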

Web-camera implementation

In this part I prepared two notebooks for predicting emotions from a web-camera stream: one using a single neural network model and one using the committee. Face detection is done with the OpenCV facedetector from the Preparation section.
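
A rough sketch of such a web-camera loop, combining the face detector sketched in the Preparation section with a trained emotion model; the checkpoint path, input size, and preprocessing are assumptions and should match the training notebook:

    import os
    import cv2
    import numpy as np
    import tensorflow as tf

    EMOTIONS = sorted(os.listdir("./dataset/train"))            # class names from the folders
    model = tf.keras.models.load_model("./checkpoints/model")   # illustrative checkpoint path
    cap = cv2.VideoCapture(0)

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        for (x1, y1, x2, y2) in detect_faces(frame):            # helper from the Preparation section
            h, w = frame.shape[:2]
            x1, y1, x2, y2 = max(0, x1), max(0, y1), min(w, x2), min(h, y2)
            face = cv2.resize(frame[y1:y2, x1:x2], (224, 224))
            probs = model.predict(face[np.newaxis] / 255.0)[0]  # preprocessing is an assumption
            label = EMOTIONS[int(np.argmax(probs))]
            cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
            cv2.putText(frame, label, (x1, y1 - 10),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
        cv2.imshow("Emotion recognition", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break

    cap.release()
    cv2.destroyAllWindows()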

Model checkpoints are available here: