Semantic Segmentation

In this project I trained a Fully Convolutional Network (FCN) to classify each pixel of an image as ROAD or NOT ROAD.

I used the KITTI Dataset, available at http://www.cvlibs.net/datasets/kitti/eval_road.php

The dataset consists of 289 training and 290 test images. It contains three different categories of road scenes (training/test image counts in parentheses):

uu - urban unmarked (98/100)
um - urban marked (95/96)
umm - urban multiple marked lanes (96/94)
urban - combination of the three above
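
As a quick sanity check, reading each parenthesized pair as (training/test) image counts, the per-category numbers add up to the 289 and 290 totals quoted above. The dictionary below only restates the list; the variable names are illustrative.

```python
# Per-category image counts, read as (training, test) from the list above.
counts = {
    "uu": (98, 100),
    "um": (95, 96),
    "umm": (96, 94),
}

train_total = sum(train for train, _ in counts.values())
test_total = sum(test for _, test in counts.values())

print(train_total, test_total)  # 289 290
```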

Ground truth has been generated by manual annotation of the images and is available for two different road terrain types:

road - the road area, i.e., the composition of all lanes, and
lane - the ego-lane, i.e., the lane the vehicle is currently driving on (only available for category "um").

Ground truth is provided for training images only.
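
For training, each annotation image has to be turned into per-pixel ROAD / NOT ROAD labels. The sketch below shows one way to do that in plain Python. It assumes that non-road pixels are pure red (255, 0, 0) in the annotation image, which is my understanding of the KITTI road ground truth but is not stated in this README; the function name and the toy image are hypothetical.

```python
# Assumption: background (NOT ROAD) pixels are pure red in the annotation image.
BACKGROUND_COLOR = (255, 0, 0)

def to_one_hot(gt_image):
    """Convert rows of RGB tuples into per-pixel [not_road, road] labels."""
    labels = []
    for row in gt_image:
        label_row = []
        for pixel in row:
            is_background = tuple(pixel) == BACKGROUND_COLOR
            label_row.append([int(is_background), int(not is_background)])
        labels.append(label_row)
    return labels

# Tiny 2x2 stand-in for an annotation image; magenta stands for road here.
toy_gt = [
    [(255, 0, 0), (255, 0, 255)],
    [(255, 0, 255), (255, 0, 0)],
]
labels = to_one_hot(toy_gt)
print(labels[0])  # [[1, 0], [0, 1]]
```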

The original paper by Jannik Fritsch et al. that made the KITTI Dataset available can be found at http://www.cvlibs.net/publications/Fritsch2013ITSC.pdf

The FCN is based on the paper by Jonathan Long et al.: https://people.eecs.berkeley.edu/~jonlong/long_shelhamer_fcn.pdf

Predictions

Urban Multiple Marked Lanes

Urban Unmarked Lanes

Urban Marked Lanes

Misses

Model

Architecture

Following the paper by Jonathan Long et al., the model uses the original VGG-16 network and replaces the fully connected layers with 1x1 convolutions, one each for layers 7, 4 and 3, adding skip connections between them.
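
To make the skip structure concrete, the sketch below traces feature-map sizes through an FCN-8-style decoder. It is shape arithmetic only, not the project's code: the 160x576 input size, the helper names, and the x2, x2, x8 upsampling factors are assumptions taken from the common FCN-8 layout, in which VGG layers 3, 4 and 7 sit at 1/8, 1/16 and 1/32 of the input resolution.

```python
def downsampled(size, stride):
    """Spatial size of a feature map at the given cumulative stride."""
    height, width = size
    return (height // stride, width // stride)

def upsampled(size, factor):
    """Spatial size after a transposed convolution with 'same' padding."""
    height, width = size
    return (height * factor, width * factor)

input_size = (160, 576)  # hypothetical input size, divisible by 32

# Encoder outputs, each reduced to class scores by a 1x1 convolution
# (a 1x1 convolution does not change the spatial size).
layer3 = downsampled(input_size, 8)    # (20, 72)
layer4 = downsampled(input_size, 16)   # (10, 36)
layer7 = downsampled(input_size, 32)   # (5, 18)

# Decoder: upsample, then add the skip connection of matching size.
up7 = upsampled(layer7, 2)
assert up7 == layer4          # element-wise add with layer 4 is valid
up4 = upsampled(up7, 2)
assert up4 == layer3          # element-wise add with layer 3 is valid
output = upsampled(up4, 8)

print(output)  # (160, 576)
```

The two assertions are the point of the exercise: a skip connection is an element-wise addition, so the upsampled map must match the encoder map it is added to.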

Parameters

keep_prob: 0.5
learning_rate: 0.0005
epochs: 30
batch_size: 8

After several trials, a keep probability of 0.5, a learning rate of 0.0005 and 30 epochs in batches of 8 images gave good results. The loss decreased steadily and by the 30th epoch it was between 0.0200 and 0.0300.

  ...
  - loss   0.0242 (images: 8, labels: 8)
  - loss   0.0289 (images: 8, labels: 8)
  - loss   0.0181 (images: 1, labels: 1)
Running epoch 30/100
  ...
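
The one-image batch at the end of the log excerpt above is what one would expect if all 289 training images are fed in batches of 8 each epoch, since 289 = 36 x 8 + 1. That is an assumption: this README does not say whether a validation split was held out. A minimal sketch with a hypothetical helper:

```python
def batch_sizes(num_images, batch_size):
    """Yield the number of images in each batch of one epoch."""
    for start in range(0, num_images, batch_size):
        yield min(batch_size, num_images - start)

sizes = list(batch_sizes(289, 8))

print(len(sizes))   # 37 batches per epoch
print(sizes[-3:])   # [8, 8, 1]
```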

I ran the final model for 100 epochs in batches of 8 images. Training took 50 minutes on a GTX 1080 and reached a final loss of about 0.0100.

Saving the final network produced the following TensorFlow model files:

SIZE   NAME
----------------------------------------
513M - model_01.pb
513M - model_01.meta
513M - model_01.ckpt.meta
4.8K - model_01.ckpt.index
1.6G - model_01.ckpt.data-00000-of-00001

Original

Resized

Softmax

Final Overlay
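
The last two stages named above (softmax, final overlay) can be sketched for a single pixel as follows. This is an illustration, not the project's code: the 0.5 threshold, the green overlay color and the 50% blend are assumptions, and the function names are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    peak = max(logits)
    exps = [math.exp(score - peak) for score in logits]
    total = sum(exps)
    return [value / total for value in exps]

def overlay_pixel(rgb, road_probability, threshold=0.5,
                  color=(0, 255, 0), alpha=0.5):
    """Blend `color` into `rgb` when the pixel is classified as ROAD."""
    if road_probability <= threshold:
        return tuple(rgb)
    return tuple(round((1 - alpha) * c + alpha * o) for c, o in zip(rgb, color))

# Hypothetical logits for one pixel, ordered [not_road, road].
probabilities = softmax([0.2, 2.3])
road_probability = probabilities[1]
pixel = overlay_pixel((100, 101, 100), road_probability)
print(pixel)  # (50, 178, 50)
```

Pixels whose road probability stays at or below the threshold are returned unchanged, so only the predicted road area is tinted.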

Model Prediction on Test Images

A few examples from the best model run:

Urban Marked Lanes

Urban Multiple Marked Lanes

Urban Unmarked Lanes

Videos

I ran the final model on some videos from my dashcam. The results are remarkably good on portions of the route with characteristics similar to the KITTI dataset.

Considering that none of these images were used for training and the video was completely different, the predictions look good:

Complete videos:

Notes

When building the initial model, I didn't set the kernel_initializer parameter of the layers, so they used the default initializer. That caused the model to generate segmentations with noisy borders:

kernel_initializer with default values	kernel_initializer with truncated normal values
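
For reference, a truncated normal initializer draws weights from a normal distribution and re-draws any value that falls more than two standard deviations from the mean, so no weight starts far from zero. The sketch below reproduces that sampling rule in plain Python; the stddev of 0.01 and the function name are assumptions for illustration, not values taken from the project.

```python
import random

def truncated_normal(count, mean=0.0, stddev=0.01, seed=None):
    """Sample `count` values, re-drawing any that lie beyond 2 stddev."""
    rng = random.Random(seed)
    samples = []
    while len(samples) < count:
        value = rng.gauss(mean, stddev)
        if abs(value - mean) <= 2 * stddev:
            samples.append(value)
    return samples

weights = truncated_normal(1000, stddev=0.01, seed=0)
print(max(abs(w) for w in weights) <= 0.02)  # True
```

In TensorFlow 1.x the equivalent would presumably be passed to each layer as kernel_initializer=tf.truncated_normal_initializer(stddev=...).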
