
Unconventional Fitness Computer Vision - Mace Action Detection

Description

Most current, popular fitness modalities center on movements that are either up and down (squats, deadlifts) or front and back (bench press); few exercises train side-to-side movement. The steel mace, an offset weight on a long handle, opens up those unique movement patterns. Steel mace training provides enhanced grip strength, improved core strength and stabilization, increased range of motion, better shoulder health, and fun!

Because the weight is offset, momentum drives the swings, and the "float" phase means the steel mace swing can also be used to train strength-endurance. Endurance activities like running have technologies that combine GPS tracking and time, which yields accurate metrics for assessing training volume. For steel mace swings, volume requires both time and swing repetitions (swings per minute, swings per hour, etc.). For long sessions, however, keeping track of swing repetitions becomes nearly impossible.

This project was designed with this problem in mind: using computer vision techniques to count reps for steel mace swings.

Currently, the project uses OpenCV and MediaPipe pose estimation data to train an LSTM RNN model to detect swings.
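
For reference, the sketch below shows what such a model can look like in tensorflow.keras. The layer sizes, the 30-frame window, the 132-value pose vector (33 MediaPipe landmarks x four values each), and the two-class output are illustrative assumptions, not the project's exact architecture:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Each input sample: 30 frames, each a 132-value pose vector
# (33 MediaPipe landmarks x x/y/z/visibility).
model = Sequential([
    LSTM(64, return_sequences=True, activation='relu',
         input_shape=(30, 132)),
    LSTM(64, return_sequences=False, activation='relu'),
    Dense(32, activation='relu'),
    Dense(2, activation='softmax'),  # e.g. "hold" vs. "swing"
])
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['categorical_accuracy'])
```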

Table of Contents

  • Installation
  • Usage
  • Ideas for Future Implementations
  • Credits
  • License

Installation

This project requires access to Jupyter Notebook.

This project imports the following libraries:

  • cv2 (OpenCV)
  • mediapipe (MediaPipe)
  • tensorflow.keras (TensorFlow)
  • numpy
  • matplotlib
  • scipy
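
The dependencies can be installed with pip (assuming the standard PyPI package names):

```
pip install opencv-python mediapipe tensorflow numpy matplotlib scipy
```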

Usage

A video is just a series of images shown frame by frame; combined, these frames create the illusion of motion.

OpenCV handles the videos, and MediaPipe detects poses and generates landmarks. These landmarks are used to train the model and to detect actions.
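
A minimal sketch of that pipeline is shown below. The video path is a placeholder, and the confidence thresholds are common defaults rather than the project's settings:

```python
import cv2
import mediapipe as mp

mp_pose = mp.solutions.pose

cap = cv2.VideoCapture("mace_swings.mp4")  # placeholder path

with mp_pose.Pose(min_detection_confidence=0.5,
                  min_tracking_confidence=0.5) as pose:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB; OpenCV decodes frames as BGR
        results = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.pose_landmarks:
            # Each of the 33 landmarks carries x, y, z, and visibility
            nose = results.pose_landmarks.landmark[mp_pose.PoseLandmark.NOSE]
            print(nose.x, nose.y, nose.z, nose.visibility)

cap.release()
```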

This project uses the Pose solution from MediaPipe, which detects 33 body landmarks per frame.

Currently, training data is generated by using OpenCV and MediaPipe to capture categorized actions. Each action is a sequence of 30 frames.

For example, the captured frames are labeled with actions such as "hold" (the mace held in position) and "swing" (the mace in motion).

Each action has 30 sets of 30 frames, and each frame is transformed into an array using MediaPipe landmarks, as shown in the sketch below.
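
A minimal sketch of that transformation, assuming the 33-landmark Pose output and the 30-frame sequences described above (the helper name is illustrative, not the project's):

```python
import numpy as np

def landmarks_to_array(results):
    """Flatten one frame's pose landmarks into a 1-D vector:
    33 landmarks x (x, y, z, visibility) = 132 values.
    Returns zeros when no pose is detected, keeping shapes fixed."""
    if results.pose_landmarks:
        return np.array(
            [[lm.x, lm.y, lm.z, lm.visibility]
             for lm in results.pose_landmarks.landmark]
        ).flatten()
    return np.zeros(33 * 4)

# Stacking 30 frames yields one sequence of shape (30, 132);
# 30 sequences per action yields training data of shape (30, 30, 132).
```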

Ideas for Future Implementations

Creating an app using Kivy so that this application can be used anywhere.

Once that is done and the app can be used and tested in the field, other features and user stories can be addressed.

For example, one user story addresses the training workflow: instead of training the model with one specific video, any video could be used, with a specific frame identifying the "hold" action.

Credits

This project uses Nicholas Renotte's excellent video tutorials as a starting point.

AI Pose Estimation with Python and MediaPipe | Plus AI Gym Tracker Project

Sign Language Detection using ACTION RECOGNITION with Python | LSTM Deep Learning Model

License

GNU General Public License v3.0
