
Intel-Edge-AI-Scholarship-Foundation-Course-Nanodegree-Program-Show-Case-Group-Project

Participation in the Intel Edge AI Udacity Scholarship Show Case Group Project

Collaborators:

Name               | Slack Name
Sarah Majors       | Sarah Majors
Harkirat Singh     | Harkirat
Hsin Wen Chang     | Bearbear
Halwai Aftab Hasan | aftab
Anshu Trivedi      | Anshu Trivedi
Frida Rode         | Frida

Our story starts from trying to mitigate traffic jams in cities around the globe. In the early stage, we perform statistical inference on detected objects in the video stream to report traffic status in real time. In this Intel® Edge AI Group showcase project, our goal is to extend our learning experience from the Intel® Edge AI Scholarship Foundation Course Nanodegree with a hands-on implementation as follows:

  • Import the OpenVINO Toolkit and build and compile it successfully on Google Colab.
  • Load pre-trained models.
  • Perform model optimization.
  • Integrate with the TensorFlow Object Counting (TOC) API. The TOC API is an object detection implementation that runs and infers using TensorFlow models at the backend, and it is optimized to run at the edge using Intel's OpenVINO Toolkit.
  • Perform statistical inference on detected objects in the video stream.

Load Pre-trained Models

  • You can use public or pre-trained models. To download the pre-trained models, use the TensorFlow Model Downloader or go to PreTrain Model Download (a download sketch follows this list).
  • Before running the demo with a trained model, make sure the model is converted to the Inference Engine format (*.xml + *.bin) using the Model Optimizer.
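
As a concrete example, the SSD MobileNet V2 (COCO) archive can be fetched and unpacked in Python. This is a minimal sketch; the URL and file names are taken from the publicly listed TF1 detection model zoo and may change over time:

```python
import tarfile
import urllib.request

# Sketch: download and unpack SSD MobileNet V2 (COCO) from the TensorFlow
# detection model zoo. The URL and archive name may change over time.
URL = ("http://download.tensorflow.org/models/object_detection/"
       "ssd_mobilenet_v2_coco_2018_03_29.tar.gz")
urllib.request.urlretrieve(URL, "ssd_mobilenet_v2_coco.tar.gz")

with tarfile.open("ssd_mobilenet_v2_coco.tar.gz") as tar:
    tar.extractall()  # yields frozen_inference_graph.pb and pipeline.config
```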

The Intel OpenVINO Toolkit Model Optimizer Flow Chart

In this project, we feed the model into the Model Optimizer, and get the Intermediate Representation. The frozen models will need TensorFlow-specific parameters like --tensorflow_use_custom_operations_config and --tensorflow_object_detection_api_pipeline_config. Also, --reverse_input_channels is usually needed, as TF model zoo models are trained on RGB images, while OpenCV usually loads as BGR. Certain models, like YOLO, DeepSpeech, and more, have their own separate pages.
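
A minimal sketch of that conversion, assuming a default OpenVINO installation under /opt/intel/openvino and the SSD MobileNet V2 frozen graph from the model zoo (the paths and the ssd_v2_support.json file name depend on the toolkit version and model, so treat them as placeholders):

```python
import subprocess

# Sketch: convert a frozen TF Object Detection API graph to IR with the
# Model Optimizer. Paths assume a default OpenVINO install and an SSD V2 model.
MO = "/opt/intel/openvino/deployment_tools/model_optimizer/mo_tf.py"
SUPPORT = ("/opt/intel/openvino/deployment_tools/model_optimizer/"
           "extensions/front/tf/ssd_v2_support.json")

subprocess.run([
    "python", MO,
    "--input_model", "ssd_mobilenet_v2_coco_2018_03_29/frozen_inference_graph.pb",
    "--tensorflow_object_detection_api_pipeline_config",
    "ssd_mobilenet_v2_coco_2018_03_29/pipeline.config",
    "--tensorflow_use_custom_operations_config", SUPPORT,
    "--reverse_input_channels",  # TF zoo models are trained on RGB, OpenCV loads BGR
], check=True)
```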

Quantization

  • Quantization is the process of reducing the precision of a model. In the deep learning research field, the predominant numerical format used for research and for deployment has so far been 32-bit floating point, or FP32. However, the desire for reduced bandwidth and computational energy consumption of deep learning models has driven research into using lower-precision numerical formats. It has been extensively demonstrated that weights and activations can be represented using INT8 without incurring significant loss in accuracy. The use of even lower bit-widths, such as 4/2/1-bits, is an active field of research that has also shown great progress.
  • In the OpenVINO™ Toolkit, models usually default to FP32, or 32-bit floating-point values, while FP16 and INT8 (16-bit floating-point and 8-bit integer values) are also available (INT8 is currently available only in the Pre-Trained Models; the Model Optimizer does not yet support that level of precision). FP16 and INT8 lose some accuracy, but the model becomes smaller in memory and faster to compute. Quantization is therefore a common method for running models at the edge (see the toy sketch after the table below).
INT8 Operation | Energy Saving vs FP32 | Area Saving vs FP32
Add            | 30x                   | 116x
Multiply       | 18.5x                 | 27x
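
To make the idea concrete, here is a toy NumPy sketch of symmetric post-training quantization of FP32 weights to INT8; it illustrates the arithmetic and the 4x memory reduction only, and is not OpenVINO code:

```python
import numpy as np

# Toy illustration: symmetric per-tensor quantization of FP32 weights to INT8.
weights_fp32 = np.random.randn(1000).astype(np.float32)
scale = np.abs(weights_fp32).max() / 127.0
weights_int8 = np.clip(np.round(weights_fp32 / scale), -128, 127).astype(np.int8)
weights_dequant = weights_int8.astype(np.float32) * scale  # what inference "sees"

print("mean absolute quantization error:",
      np.abs(weights_fp32 - weights_dequant).mean())
print("memory: %d bytes -> %d bytes" % (weights_fp32.nbytes, weights_int8.nbytes))
```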

Supported Devices

The following Intel® hardware devices are supported for optimal performance with the OpenVINO™ Toolkit’s Inference Engine (a quick availability check is sketched after the list):

Device Types
CPUs
GPUs
VPUs
FPGAs
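
On a given machine, a quick way to check which of these devices the Inference Engine can actually see is to query the available devices (a sketch assuming the openvino.inference_engine Python API):

```python
from openvino.inference_engine import IECore

ie = IECore()
# Prints device names visible to the Inference Engine,
# e.g. ['CPU'] or ['CPU', 'GPU', 'MYRIAD'] depending on the hardware.
print(ie.available_devices)
```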

Integrate With TensorFlow Object Counting API


The TensorFlow Object Counting API is used as the base for object counting in this project; more info can be found in this repo, along with a modified version of the repo adapted to run on Google Colab.


TensorFlow Object Counting API Work Flow

In this section, we further enhance real-time object counting by using the TensorFlow Object Counting API to process the input video stream and perform real-time object detection, tracking, and counting.
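
As a rough illustration of the counting step only (this is a conceptual sketch, not the TOC API's own code), an object can be counted when the centre of its tracked bounding box crosses a virtual counting line between consecutive frames; LINE_Y and the track-id dictionaries below are hypothetical:

```python
# Conceptual sketch of line-crossing counting; not the TOC API implementation.
LINE_Y = 300  # hypothetical horizontal counting line, in pixels

def update_count(prev_centres, curr_centres, count):
    """prev_centres / curr_centres map a track id to its (x, y) box centre."""
    for track_id, (_, y) in curr_centres.items():
        if track_id in prev_centres:
            _, prev_y = prev_centres[track_id]
            if prev_y < LINE_Y <= y:  # centre crossed the line downwards
                count += 1
    return count

# Example: track 7 moved from y=290 to y=305, so the count increases by one.
print(update_count({7: (100, 290)}, {7: (102, 305)}, 0))  # -> 1
```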

System Architecture

Object Tracking Flow Chart

In this section, we apply what we learned from the Intel® Edge AI Scholarship Foundation Course Nanodegree, Lesson 5 (Deploying an Edge App), Section 4 (Handling Input Streams): we implement the 'cv2.VideoCapture' lifecycle and use 'cv2.VideoWriter' to fulfil the object tracking and detected-object counting workflow, as sketched below.
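
A minimal sketch of that input-stream pattern, assuming placeholder file names ('input.mp4', 'output.avi') and an MJPG-encoded output:

```python
import cv2

# Open the input stream, process frame by frame, and write an annotated output.
cap = cv2.VideoCapture("input.mp4")
fps = cap.get(cv2.CAP_PROP_FPS) or 30
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
out = cv2.VideoWriter("output.avi", cv2.VideoWriter_fourcc(*"MJPG"),
                      fps, (width, height))

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    # ... run inference on `frame`, draw boxes, and update the object count here ...
    out.write(frame)

cap.release()
out.release()
```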

Model Architecture

Here, we use the Single Shot Detector (SSD) with MobileNet from the TensorFlow Detection Model Zoo. SSD is designed for real-time object detection: it speeds up the process by eliminating the need for a region proposal network, and it applies multi-scale feature maps and default boxes to recover the resulting drop in accuracy. These improvements allow SSD to match Faster R-CNN’s accuracy with lower-resolution images, which pushes the speed higher.
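
For reference, here is a sketch of running a single frame through the converted SSD IR with the OpenVINO Python API; the file names, the 2020.x-era API calls, and the 0.5 confidence threshold are assumptions:

```python
import cv2
from openvino.inference_engine import IECore

# Sketch: single-frame inference with an SSD MobileNet IR (placeholder file names).
ie = IECore()
net = ie.read_network(model="frozen_inference_graph.xml",
                      weights="frozen_inference_graph.bin")
exec_net = ie.load_network(network=net, device_name="CPU")
input_blob = next(iter(net.input_info))
n, c, h, w = net.input_info[input_blob].input_data.shape

frame = cv2.imread("street.jpg")                       # placeholder test image
blob = cv2.resize(frame, (w, h)).transpose((2, 0, 1))  # HWC -> CHW
blob = blob.reshape((n, c, h, w))

result = exec_net.infer({input_blob: blob})
# SSD output rows: [image_id, label, confidence, xmin, ymin, xmax, ymax]
detections = next(iter(result.values()))[0][0]
print("objects above 0.5 confidence:",
      sum(1 for det in detections if det[2] > 0.5))
```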

Citation

@ONLINE{tfocapi,
    author = "Ahmet Özlü",
    title  = "TensorFlow Object Counting API",
    year   = "2018",
    url    = "https://github.com/ahmetozlu/tensorflow_object_counting_api"
}

License

This system is available under the GNU - 3.0 license. See the LICENSE file for more info.

Statistical Inference on Video Stream: Results

The following shows our current progress in performing statistical inference on the video stream.

Result images, showing the general count and the multi-class count side by side, for Barcelona, Taipei City, and India.

Future Work

In this project, we successfully perform statistical inference on objects detected in the video stream to report traffic status in real time. As a next step, we will implement car accident detection through surveillance camera systems and raise real-time alerts, so that traffic can be rerouted to mitigate congestion in cities around the globe.

References