Gesture Arithmetic - Landmark Extraction and Support Vector Machine for Arithmetic Calculations

This is an open-source gesture detection model that reads the user's webcam feed and classifies hand gestures in order to perform basic arithmetic such as addition and multiplication.

  • It currently supports 14 gestures: the numbers 0-10, the plus symbol (made by closing the fist and raising the pinky finger), the multiplication symbol (made by crossing the two index fingers), and the calculate gesture (made by closing the fist and extending the thumb sideways), which triggers evaluation of the result.

  • models/ contains two pickled SVM models. One was trained without data augmentation and performs poorly on right-hand gestures (57% test accuracy); after augmenting the landmarks with a horizontal flip (applied to both the raw and flattened landmark vectors), test accuracy reaches 100% on both the left and the right hand (shown in train.ipynb). A sketch of this flip is shown after the demo videos below.

Demo videos: gesture_1.mp4, gesture_2.mp4
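As a rough illustration of the flip augmentation described above (the actual implementation lives in train.ipynb; the function and variable names here are hypothetical), mirroring MediaPipe's normalized landmarks only requires reflecting the x-coordinate:

import numpy as np

def flip_landmarks_horizontally(landmarks: np.ndarray) -> np.ndarray:
    """Mirror a (21, 3) array of MediaPipe hand landmarks.

    MediaPipe normalizes x and y to [0, 1], so a horizontal flip is
    x -> 1 - x; y and the relative depth z are left unchanged.
    """
    flipped = landmarks.copy()
    flipped[:, 0] = 1.0 - flipped[:, 0]
    return flipped

# Example: turn a left-hand sample into a synthetic right-hand sample.
left_hand = np.random.rand(21, 3)               # stand-in for real landmarks
right_hand = flip_landmarks_horizontally(left_hand)
flat_features = right_hand.flatten()            # flattened vector for the SVM
                                                # (exact feature layout may differ)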

Salient Features

  • Waits for 5 loop iterations before confirming an input
  • Respects PEMDAS order of operations when evaluating the expression
  • Supports both left- and right-hand gestures: the training data was recorded with the left hand and mirrored to cover the right hand
  • Uses Google's MediaPipe library for landmark extraction (21 landmarks per hand, 42 in total)
  • Processes the landmarks and passes them through an SVM for classification, an approach inspired by Nguyen et al.'s 2014 paper (a rough pipeline sketch follows this list)
  • Keeps earlier implementations of alternative approaches (YOLOv* + CNN and MediaPipe + NN)
  • Reasonably fast (> 30 FPS)
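The sketch below illustrates the MediaPipe-plus-SVM pipeline listed above. It is not the repository's actual code: the prediction step is left as a comment because the feature layout and model filename depend on train.ipynb, and the 5-iteration confirmation logic is omitted.

import cv2
import numpy as np
import mediapipe as mp

def extract_landmarks(frame_bgr, hands) -> np.ndarray | None:
    """Return a flattened (63,) landmark vector for the first detected hand, or None."""
    rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
    result = hands.process(rgb)
    if not result.multi_hand_landmarks:
        return None
    hand = result.multi_hand_landmarks[0]
    return np.array([[lm.x, lm.y, lm.z] for lm in hand.landmark]).flatten()

cap = cv2.VideoCapture(0)
with mp.solutions.hands.Hands(max_num_hands=2, min_detection_confidence=0.5) as hands:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        features = extract_landmarks(frame, hands)
        if features is not None:
            # prediction = svm.predict([features])[0]   # svm = a pickled model from models/
            pass
        cv2.imshow("gesture", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
cap.release()
cv2.destroyAllWindows()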

Training on custom gestures/data

More training data can be added by creating the nested folders new_train_data/data, placing images of any size inside with the naming convention "CLASS INDEX" (for example, "0 4.jpg" for image index 4 of class 0), and then running train.ipynb.

train.ipynb processes the images and automatically generates a test set. Adding a new gesture only requires extending the translation dictionary at the top of image_dataset_loader.py, following the same format; a hypothetical example is sketched below.
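As a hedged illustration of that format (the real dictionary lives in image_dataset_loader.py and its exact keys and labels may differ), the mapping from filename class prefixes to gesture labels, and the filename parsing it implies, could look something like this:

import os

# Hypothetical translation dictionary: class index (filename prefix) -> gesture label.
TRANSLATION = {
    0: "0", 1: "1", 2: "2", 3: "3", 4: "4", 5: "5",
    6: "6", 7: "7", 8: "8", 9: "9", 10: "10",
    11: "+", 12: "*", 13: "=",   # plus, multiply, calculate
}

def label_from_filename(path: str) -> str:
    """Parse 'CLASS INDEX.jpg' (e.g. '0 4.jpg') and return its gesture label."""
    name = os.path.splitext(os.path.basename(path))[0]
    class_idx = int(name.split(" ")[0])
    return TRANSLATION[class_idx]

print(label_from_filename("new_train_data/data/0 4.jpg"))  # -> "0"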

Setup

Step 1. Install everything in the requirements.txt file, preferably inside a virtual environment (venv). Python 3.11.4 was used.

Step 2. Run the train.ipynb notebook. Since the training data isn't provided in the GitHub repository, pickled versions of the two models are stored in models/ for ease of use; a sketch of loading one of them is shown below. The live webcam detection is in the final cell.
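A minimal sketch of using one of the pickled models outside the notebook (the filename and the feature-vector shape here are placeholders; check models/ and train.ipynb for the actual names and layout):

import pickle
import numpy as np

# Placeholder filename; see the models/ folder for the actual pickle names.
with open("models/svm_augmented.pkl", "rb") as f:
    classifier = pickle.load(f)

# The model expects a flattened landmark feature vector, as produced in train.ipynb.
dummy_features = np.random.rand(1, 63)   # shape is an assumption; match the training code
print(classifier.predict(dummy_features))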

Pitfalls

A pitfall of the project is that there was no training data containing rotated gestures/landmarks (i.e., my hand was always held upright), so the models perform slightly worse when the hand or fingers are tilted away from that orientation. I therefore plan to apply further data augmentation, such as small (roughly 15-degree) rotations, to both the images and the landmarks to make the SVM more robust; a possible rotation sketch is shown below.
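One possible way to implement that rotation augmentation on the landmarks (this is a sketch, not the repository's code; the choice of the wrist, landmark 0, as the pivot is an assumption):

import numpy as np

def rotate_landmarks(landmarks: np.ndarray, degrees: float) -> np.ndarray:
    """Rotate (21, 3) normalized landmarks in the image plane about the wrist (landmark 0)."""
    theta = np.radians(degrees)
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
    out = landmarks.copy()
    center = out[0, :2]                      # wrist as pivot (an assumption)
    out[:, :2] = (out[:, :2] - center) @ rot.T + center
    return out

# Generate a few rotated copies per sample, e.g. -15, 0, and +15 degrees.
sample = np.random.rand(21, 3)               # stand-in for real landmarks
augmented = [rotate_landmarks(sample, d) for d in (-15.0, 0.0, 15.0)]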
