GitHub - jonathanlawhh/emoji-pixelator: ML that pixelates emoji faces

Emoji Pixelation

Using sliding window and CNN to identify emoji and pixelate it.

Introduction

A small project that uses Tensorflow and an existing model (ResNet50) to build an emoji pixelater (if this is an actual word). Not using the most efficient computer vision technique, it still does the job pretty well :p

In this project, a box sliding through every 200x200 pixel was used to get that region and passes it to Tensorflow for predicting if that region consist of an emoji.

Running this system through the whole emoji movie, some emoji's were not detected as our training dataset does not include that specific emoji, for example an emoji wearing glasses. With enough time, patience and perseverance to collect all the emoji dataset, the system would be good to go.

Dataset

With all ML trainings comes with a miserable time collecting data. Few emojis were downloaded from https://emojipedia.org/people/ using a chrome extension that downloads images on a webpage.

Future enhancements?!?!

Future enhancement could include proper image segmentation (contour, blob detection etc) to reduce computational steps. Hough circle detection worked pretty well initially, however the poop emoji is not a circle, therefore falling back to the sliding window technique would make more sense for inclusiveness.

Usage

Setup

Git clone this repository and run

pip install -R requirements.txt

Built With

Parameters

In the main.py file, 3 parameters can be customized.

WIN_SIZE = 200
Window size for sliding window. The smaller the window size, the more iterations it goes through.
STEP_SIZE = WIN_SIZE - 50
How many pixels the sliding window should move/skip to the next window
CAPTURE_FRAMES = 120
How many frames the system will record on your desktop

generate_video.py generates video based on the frames in the output directory

transfer_learning.py will train and create emoji.h5 model based on the ./dataset/ directory

FAQ

Q: What model is it based on
A: Transfer learning using ResNet50 as the base model.

Q: Why sliding window and not RCNN/FRCNN
A: This project is just built for entertainment, and this method is by far the quickest one I found. It also serves as an introduction to basic computer vision applications. Although, I agree this method would not be the best to accomplish the goal for a production system as it is computational expensive.

Q: Why transfer learning and not building the layers from scratch
A: Due to the small dataset ( < 100 images initially ) and lack of expertise, it would be quicker with higher accuracy using existing weightages which is proven to be reliable.

Q: So why ResNet50 and not other models
A: Tbh there is no consideration here, I just took the first one I saw.

Q: Why emoji's
A: Why not? Although, it is a proof of concept it can be further improved for various use cases such as

Body parts pixelation in AV
Certain logo/trademark detection in videos etc
Your imagination

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
.idea		.idea
Resource		Resource
dataset		dataset
output		output
.gitignore		.gitignore
README.MD		README.MD
emoji.h5		emoji.h5
generate_video.py		generate_video.py
helpers.py		helpers.py
main.py		main.py
requirements.txt		requirements.txt
transfer_learning.py		transfer_learning.py

jonathanlawhh/emoji-pixelator

Folders and files

Latest commit

History

Repository files navigation

Emoji Pixelation

Introduction

Dataset

Future enhancements?!?!

Usage

Setup

Built With

Parameters

FAQ

About

Resources

Stars

Watchers

Forks

Languages