Affine Transformation of Virtual Object

A convolutional neural network (CNN) based thumb and index fingertip detection system is presented here for seamless interaction with a virtual 3D object in a virtual environment. First, a two-stage CNN detects the hand and the fingertips; then, using the fingertip positions, the scale, rotation, translation, and, in general, the affine transformation of the virtual object is performed.
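
As an illustration of the second step, here is a minimal sketch (not the repository's exact implementation) of how a 2D affine transformation could be derived from detected thumb and index fingertip positions: the pinch distance gives scale, the pinch angle gives rotation, and the fingertip midpoint gives translation. The function name and interface are hypothetical.

import numpy as np

def affine_from_fingertips(thumb0, index0, thumb1, index1):
    """Hypothetical helper: estimate scale, rotation, and translation from
    thumb/index fingertip (x, y) pixel positions in a reference frame (0)
    and the current frame (1)."""
    v0 = np.asarray(index0, float) - np.asarray(thumb0, float)
    v1 = np.asarray(index1, float) - np.asarray(thumb1, float)
    scale = np.linalg.norm(v1) / np.linalg.norm(v0)                 # pinch-distance ratio
    rotation = np.arctan2(v1[1], v1[0]) - np.arctan2(v0[1], v0[0])  # pinch-angle change
    c0 = (np.asarray(thumb0, float) + np.asarray(index0, float)) / 2.0
    c1 = (np.asarray(thumb1, float) + np.asarray(index1, float)) / 2.0
    tx, ty = c1 - c0                                                # midpoint shift
    cos_r, sin_r = np.cos(rotation), np.sin(rotation)
    # 2x3 affine matrix combining the three components
    A = np.array([[scale * cos_r, -scale * sin_r, tx],
                  [scale * sin_r,  scale * cos_r, ty]])
    return scale, rotation, (tx, ty), A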

Update

This is version 2.0, which includes a more generalized affine transformation of virtual objects in the virtual environment along with more experimentation and analysis. The previous version only included the geometric transformation of a virtual 3D object with respect to a finger gesture. To get the previous version, visit here.


Paper

The paper on the affine transformation of the virtual 3D object was published in Virtual Reality & Intelligent Hardware (Elsevier) in 2020; for more detail, please go through the paper. The paper on the geometric transformation of the virtual object (v1.0) has also been published; for more detail, please go through this paper. If you use the code or data from this project, please cite the following papers:


Affine transformation of virtual 3D object using 2D localization of fingertips 🔗

@article{alam2020affine,
  title={Affine transformation of virtual 3D object using 2D localization of fingertips},
  author={Alam, Mohammad Mahmudul and Rahman, SM Mahbubur},
  journal={Virtual Reality \& Intelligent Hardware},
  volume={2},
  number={6},
  pages={534--555},
  year={2020},
  publisher={Elsevier}
}


Detection and Tracking of Fingertips for Geometric Transformation of Objects in Virtual Environment 🔗

@inproceedings{alam2019detection,
  title={Detection and Tracking of Fingertips for Geometric Transformation of Objects in Virtual Environment},
  author={Alam, Mohammad Mahmudul and Rahman, SM Mahbubur},
  booktitle={2019 IEEE/ACS 16th International Conference on Computer Systems and Applications (AICCSA)},
  address={Abu Dhabi, United Arab Emirates},
  pages={1--8},
  year={2019},
  organization={IEEE}
}

System Overview

Here is the real-time demo of the scale, rotation, translation, and overall affine transformation of the virtual object using finger interaction.

Dataset

To train the hand and fingertip detection model, two datasets are used. One is the self-made, publicly released TI1K dataset, which contains 1000 images with annotations of hand and fingertip positions; the other is the SCUT-Ego-Gesture dataset.
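
The annotation layout of the TI1K dataset is not described here, so the following loader is purely a hypothetical sketch assuming a simple CSV layout (image name, hand bounding box, thumb and index fingertip coordinates); adapt it to the dataset's real format.

import csv

def load_annotations(csv_path):
    """Yield (image_name, hand_box, thumb_xy, index_xy) tuples from a
    hypothetical CSV with columns:
    image, x1, y1, x2, y2, thumb_x, thumb_y, index_x, index_y."""
    with open(csv_path, newline="") as f:
        for row in csv.reader(f):
            name = row[0]
            vals = list(map(float, row[1:]))
            hand_box = vals[0:4]          # hand bounding box (x1, y1, x2, y2)
            thumb = tuple(vals[4:6])      # thumb fingertip (x, y)
            index = tuple(vals[6:8])      # index fingertip (x, y)
            yield name, hand_box, thumb, index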

Requirements

  • TensorFlow-GPU==1.15.0
  • OpenCV==4.2.0
  • Cython==0.29.2
  • imgaug==0.2.6
  • Weights: download the trained weights files for both the hand and fingertip detection models and put the weights folder in the working directory. A quick version check for the pinned dependencies is sketched after this list.
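
After installing the dependencies, a sanity check such as the following (a convenience sketch, not part of the repository) can confirm the pinned versions:

import tensorflow as tf
import cv2
import cython
import imgaug

print("TensorFlow:", tf.__version__)   # expected 1.15.0
print("OpenCV:", cv2.__version__)      # expected 4.2.0
print("Cython:", cython.__version__)   # expected 0.29.2
print("imgaug:", imgaug.__version__)   # expected 0.2.6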


Experimental Setup

The experimental setup has a server side and a client side. Fingertip detection, tracking, and all other machine learning computation are implemented on the server side in Python. On the client side, the virtual environment is created in Unity along with the Vuforia software development kit (SDK). To locate and track the virtual object using the webcam, Vuforia needs marker assistance; for that purpose, a marker is designed that works as an image target. The marker/ folder contains a PDF of the designed marker. To use the system, print a copy of the marker.
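
For orientation, here is a minimal sketch of what the server side of such a setup can look like, assuming a simple length-prefixed JPEG stream over TCP; the repository's actual server.py, port, and protocol may differ.

import socket
import struct
import numpy as np
import cv2

HOST, PORT = "127.0.0.1", 5050   # assumed address; must match the Unity client

def recv_exact(conn, n):
    """Read exactly n bytes from the socket."""
    buf = b""
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("client disconnected")
        buf += chunk
    return buf

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
    srv.bind((HOST, PORT))
    srv.listen(1)
    conn, _ = srv.accept()                  # wait for the Unity client
    with conn:
        while True:
            size = struct.unpack(">I", recv_exact(conn, 4))[0]   # 4-byte length header
            jpeg = recv_exact(conn, size)                        # JPEG payload
            frame = cv2.imdecode(np.frombuffer(jpeg, np.uint8), cv2.IMREAD_COLOR)
            # ... run hand and fingertip detection on `frame`, then send the
            # estimated transformation parameters back to Unity ...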

How to Use

First, run 'server.py' to start the server side. It will wait until the client side (Unity) starts sending images to the server.

directory > python server.py

Open the 'Unity Affine Transformation' environment using Unity and hit the play button. Make sure a webcam is connected.

Bring your hand in front of the webcam and interact with the virtual object using finger gestures.