divnori/gesture_recognition


Overview

For my design exercise, I implemented a gesture recognition system that recognizes the five vowels - a, e, i, o, u - and the letter y. The system has two modes: teach mode and compare mode. In teach mode, the user can record a gesture, label it with the corresponding letter, and save it to the database. In compare mode, the user can record a gesture, and the system returns the closest saved letter.

I began by drafting an FSM diagram as shown below:

The system starts in IDLE_STATE, where the TFT display shows the user instructions. From there, the user decides whether to teach a gesture or compare one, with two options for the next step:

  • Bottom button - systemState = PRE_TEACH_STATE
  • Top button - systemState = PRE_MEASURE_STATE

In this way, the user progresses through the flow outlined in the FSM flowchart.
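
As a rough sketch of how this might be laid out in code (the state names come from the flowchart above; the enum, function structure, and button variables here are illustrative assumptions rather than my exact implementation):

//illustrative FSM skeleton -- state names from the flowchart, structure assumed
enum SystemState {IDLE_STATE, PRE_TEACH_STATE, RECORD_TEACH_STATE, LETTER_STATE,
                  POST_STATE, PRE_MEASURE_STATE, RECORD_MEASURE_STATE, GET_STATE};
SystemState systemState = IDLE_STATE;

void updateFSM(uint8_t topButton, uint8_t bottomButton) {
  switch (systemState) {
    case IDLE_STATE:
      //TFT shows instructions; buttons are active low
      if (bottomButton == 0) systemState = PRE_TEACH_STATE;      //teach a new gesture
      else if (topButton == 0) systemState = PRE_MEASURE_STATE;  //compare a gesture
      break;
    //...the remaining states follow the flowchart...
    default:
      break;
  }
}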

Most of my design thought went into RECORD_TEACH_STATE, RECORD_MEASURE_STATE, LETTER_STATE, GET_STATE, and POST_STATE, which I elaborate on below.

In RECORD_TEACH_STATE and RECORD_MEASURE_STATE, I needed to record the device's motion over a period of time. To do this, I sampled the IMU's acceleration values at a fixed interval (I chose one second), taking the z-axis reading each time and appending it to a char array holding the history of acceleration values. The code looked like this:

//get accels continuously
imu.readAccelData(imu.accelCount);
x = imu.accelCount[0] * imu.aRes; 
y = imu.accelCount[1] * imu.aRes; 
z = imu.accelCount[2] * imu.aRes; 

//just store z vals
sprintf(curr_z, "%f, ", z);
strcat(z_arr, curr_z);
Serial.printf("current z vec is %s \n", z_arr);

while (millis() - primary_timer < LOOP_PERIOD); //busy-wait until LOOP_PERIOD (one second) has elapsed since the last sample
primary_timer = millis();
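
For reference, the buffers and timing constant this snippet relies on are declared elsewhere; a plausible minimal setup (the names come from the snippet, the sizes are illustrative assumptions) would be:

//assumed declarations for the recording snippet -- sizes chosen for illustration
char z_arr[1000] = "";           //running history of z-acceleration samples as text
char curr_z[20];                 //scratch buffer for the most recent sample
const int LOOP_PERIOD = 1000;    //sample once per second (ms)
uint32_t primary_timer = 0;      //timestamp of the last sample
float x, y, z;                   //converted acceleration values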

In LETTER_STATE, I needed a way for the user to indicate which letter they just trained the system to recognize. I implemented a scheme where short and long presses of the three buttons select among the six possible letters. The code below shows the logic for the top button (a short press selects 'A', a long press selects 'E'):

if (topButtonState==UNPRESSED && topButton==0) { //button just pressed (active low)
    topButtonState = PRESSED;
    Serial.println("TOP BUTTON HAS BEEN PRESSED");
    button_start_time = millis(); //remember when the press started
}
else if (topButtonState==PRESSED && topButton==1) { //button released: classify the press
    topButtonState = UNPRESSED;
    systemState = POST_STATE;
    if (millis() - button_start_time >= long_press_duration) {
        Serial.println("long press detected");
        strcpy(letter, "E"); //long press selects 'E'
    }
    else {
        strcpy(letter, "A"); //short press selects 'A'
    }
    tft.fillScreen(TFT_BLACK); //fill background
}

In GET_STATE, the device makes a GET request to my server script, sending the newly recorded gesture's acceleration values; the server compares them to the stored database gestures and returns the closest one. To do this, I implemented the following in my server-side code:

if request["method"] == "GET":
    #only showing A for clarity, but did this for all letters
    try:
        accels = str(request['values']['accels'])
        stored_accels = getalldata()
        A_sig = list(stored_accels[0][0][0].split(", "))
        accel_sig = list(accels.split(", "))
        if len(A_sig)>10:
            A_sig = [float(x) for x in A_sig[0:len(A_sig)-1]]
        else:
            A_sig = [float(x) for x in A_sig]

        if len(accel_sig)>10:
            accel_sig = [float(x) for x in accel_sig[0:len(accel_sig)-1]]
        else:
            accel_sig = [float(x) for x in accel_sig]

        if len(accel_sig)<len(A_sig):
            A_sig = A_sig[0:len(accel_sig)]
            corrA = correlation(A_sig, accel_sig)
        else:
            corrA = correlation(A_sig, accel_sig[0:len(A_sig)])
        
        maxcoor = max(corrA, corrE, corrI, corrO, corrU, corrY)

        if maxcoor==corrA:
            return 'A'

The correlation() function being used in the code above was taken from an exercise that I previously completed.
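
For completeness, the device-side half of this exchange is not shown above. A minimal sketch of how the ESP32 could issue the GET request, assuming the Arduino HTTPClient library and a placeholder server URL, might look like:

//hypothetical device-side GET request -- the URL and query formatting are assumptions
#include <HTTPClient.h>

String getClosestLetter(const char* z_arr) {
  HTTPClient http;
  //a real request would URL-encode the comma-separated acceleration string
  String url = String("http://example-608-server/gesture_handler.py?accels=") + z_arr;
  http.begin(url);
  int code = http.GET();
  String letter = "";
  if (code > 0) {
    letter = http.getString();   //server responds with the closest letter, e.g. "A"
  }
  http.end();
  return letter;
}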

Lastly, in POST_STATE, I save the user-trained gesture and its letter label to the database. To do this, the device makes a POST request to my server script, which is handled by the following server-side code:

elif request["method"]=="POST":
    try:
        accels = str(request['form']['accels'])
        letter = str(request['form']['letter'])
        create_database()
        x = getalldata()
        if len(x[0])==0:
            initialize()
        else:
            update(accels,letter)

        return 'success'

    except Exception as e:
        return 'error: ' + str(e)
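
As with GET_STATE, the device-side request is not shown here; a minimal sketch, again assuming the Arduino HTTPClient library and a placeholder URL, could be:

//hypothetical device-side POST request -- the URL and body formatting are assumptions
#include <HTTPClient.h>

bool postGesture(const char* z_arr, const char* letter) {
  HTTPClient http;
  http.begin("http://example-608-server/gesture_handler.py");
  http.addHeader("Content-Type", "application/x-www-form-urlencoded");
  //field names match the 'form' keys read by the server handler above
  String body = String("accels=") + z_arr + "&letter=" + letter;
  int code = http.POST(body);    //server returns 'success' or an error message
  http.end();
  return code == 200;
}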

My database included two fields - acceleration and letter. Both are text fields, but before any processing or comparison is done with the acceleration data, the string is converted into a list of floats.

My demonstration video is shown below:

demonstration video

Summary

Overall, my system implements all of the specified components:

  • Allows teaching and long-term storage of 2D movements
  • Allows comparison of unknown user gesture inputs
  • Exists in a fully operational single script

The system can recognize six different movements and provides an intuitive interface for the user. Through the use of several 6.08 skills - IMU integration, correlation calculation, server-side work, database design, etc. - an accurate gesture recognition system comes together.

Thank you for reviewing my design exercise!

About

Alphabet Gesture Recognition System with Arduino, C++, and Python
