matlab-deep-learning/mtcnn-face-detection

Face Detection and Alignment MTCNN

This repository implements a deep-learning-based face detection and facial landmark localization model using multi-task cascaded convolutional neural networks (MTCNNs).

Note: This code supports inference using a pretrained model. Training from scratch is not supported. Weights are imported from the original MTCNN model trained in Caffe.

Installation

  • Face Detection and Alignment MTCNN requires the following products:
    • MATLAB R2019a or later
    • Deep Learning Toolbox
    • Computer Vision Toolbox
    • Image Processing Toolbox
  • Download the latest release of the Face Detection and Alignment MTCNN. To install, open the .mltbx file in MATLAB.

Getting Started

To get started using the pretrained face detector, import an image and use the mtcnn.detectFaces function:

im = imread("visionteam.jpg");
[bboxes, scores, landmarks] = mtcnn.detectFaces(im);

This returns the bounding boxes, probabilities, and five-point facial landmarks for each face detected in the image.
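To check the results visually, the outputs can be drawn on top of the image. The following is a minimal sketch rather than part of the toolbox API: it assumes that bboxes is an N-by-4 matrix of [x y width height] rows and that landmarks is an N-by-5-by-2 array holding x coordinates in (:, :, 1) and y coordinates in (:, :, 2). Confirm both shapes with help mtcnn.detectFaces before relying on them.

```matlab
% Detect faces in the sample image used above.
im = imread("visionteam.jpg");
[bboxes, scores, landmarks] = mtcnn.detectFaces(im);

% Draw one labelled rectangle per face, using the score as the label.
% Assumption: bboxes rows are [x y width height].
annotated = insertObjectAnnotation(im, "rectangle", bboxes, scores, "LineWidth", 2);

imshow(annotated)
hold on
% Overlay the five landmark points of each face.
% Assumption: landmarks is N-by-5-by-2 (x in page 1, y in page 2).
for iFace = 1:numel(scores)
    scatter(landmarks(iFace, :, 1), landmarks(iFace, :, 2), "filled");
end
hold off
```

insertObjectAnnotation is part of Computer Vision Toolbox, which is already listed as a requirement.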

Usage

The detectFaces function supports various optional arguments. For more details, refer to the help documentation for this function by typing help mtcnn.detectFaces in the Command Window.

To get the best speed from the detector, first create an mtcnn.Detector object, then call its detect method on your image. Doing so ensures that the pretrained weights and options are loaded before detect is called:

detector = mtcnn.Detector();
[bboxes, scores, landmarks] = detector.detect(im);

The detector object accepts the same optional arguments as the mtcnn.detectFaces function.
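To see what reusing the object buys you on your own hardware, you can time repeated calls to detect. This is a rough sketch for illustration only: the number of runs and the warm-up call are arbitrary choices, and the figures it prints depend on the image, the options, and the machine.

```matlab
% Create the detector once so the pretrained weights are loaded up front.
detector = mtcnn.Detector();
im = imread("visionteam.jpg");

% Warm-up call so one-off initialization costs are excluded from the timing.
detector.detect(im);

% Time several detections on the same image and report the average.
nRuns = 10;  % arbitrary choice for this sketch
tic
for iRun = 1:nRuns
    [bboxes, scores, landmarks] = detector.detect(im);
end
elapsed = toc;

fprintf("Mean time per detection: %.3f s (%.1f fps)\n", ...
    elapsed / nRuns, nRuns / elapsed);
```

The same pattern applies when processing a folder of images or video frames: construct the detector outside the loop and call detect inside it.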

Refer to the MATLAB toolbox documentation for a complete example.

About

The MTCNN face detector is fast and accurate. Evaluation on the WIDER face benchmark shows significant performance gains over non-deep-learning face detection methods. Prediction speed depends on the image and its dimensions, the pyramid scales, and the hardware (i.e. CPU or GPU). On a typical CPU, a frame rate of ~10 fps should be achievable for VGA-resolution images.

In comparison to MATLAB's built-in vision.CascadeObjectDetector, the MTCNN detector is more robust to facial pose, as demonstrated in the image below.

Face detections from MTCNN in yellow, detections from the built-in vision.CascadeObjectDetector in teal.
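A comparison along these lines can be reproduced on your own images. The sketch below is not the script used to produce the figure: it runs both detectors with their default settings, assumes the MTCNN boxes use the same [x y width height] layout as the cascade detector, and uses cyan as a stand-in for teal.

```matlab
im = imread("visionteam.jpg");

% MTCNN detections (first output only: the bounding boxes).
mtcnnBoxes = mtcnn.detectFaces(im);

% Built-in cascade detector; its default model detects frontal faces.
cascadeDetector = vision.CascadeObjectDetector();
cascadeBoxes = cascadeDetector(im);

% Draw MTCNN boxes in yellow and cascade boxes in cyan.
% Assumption: both sets of boxes are [x y width height] rows.
overlay = insertShape(im, "Rectangle", mtcnnBoxes, "Color", "yellow", "LineWidth", 3);
overlay = insertShape(overlay, "Rectangle", cascadeBoxes, "Color", "cyan", "LineWidth", 3);
imshow(overlay)
```

Images with rotated or profile faces are the ones most likely to show a difference between the two detectors.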

Contribute

Please file any bug reports or feature requests as GitHub issues. In particular, if you are interested in training your own MTCNN network, comment on the following issue: Support training MTCNN

Copyright 2019 The MathWorks, Inc.