Traffic Sign Recognition

README

This is the repository for the Traffic Sign Recognition project of the Udacity Self-Driving Car Nanodegree. The original repo can be found here.

Build a Traffic Sign Recognition Project

The goals / steps of this project are the following:

Load the data set (see below for links to the project data set)
Explore, summarize and visualize the data set
Design, train and test a model architecture
Use the model to make predictions on new images
Analyze the softmax probabilities of the new images
Summarize the results with a written report

Here is a link to my project code

Data Set Summary & Exploration

1. Summary of the data set:

The size of the training set is 34799
The size of the validation set is 4410
The size of the test set is 12630
The shape of a traffic sign image is (32, 32, 3)
The number of unique classes/labels in the data set is 43
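
The figures above can be derived directly from the loaded arrays. The following is a minimal sketch, assuming the data is held in NumPy arrays of shape (N, 32, 32, 3) with integer labels; the function name `summarize` and the synthetic demo arrays are hypothetical and only stand in for the real data set.

```python
import numpy as np

def summarize(X_train, y_train, X_valid, X_test):
    """Return the basic statistics listed in the data set summary."""
    return {
        "n_train": X_train.shape[0],
        "n_valid": X_valid.shape[0],
        "n_test": X_test.shape[0],
        "image_shape": X_train.shape[1:],
        "n_classes": int(np.unique(y_train).size),
    }

# Tiny synthetic stand-in for the real arrays (hypothetical data).
rng = np.random.default_rng(0)
X_tr = rng.integers(0, 256, size=(86, 32, 32, 3), dtype=np.uint8)
y_tr = np.arange(86) % 43          # every one of the 43 class IDs appears
stats = summarize(X_tr, y_tr, X_tr[:10], X_tr[:20])
print(stats)
```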

2. Exploratory visualization of the dataset.

Here is an exploratory visualization of the data set. It is a bar chart showing the number of images of each class in the training dataset. The class distribution is quite unbalanced: several classes have more than 700 examples (e.g. Speed limit (50km/h), Speed limit (30km/h), Yield), whereas several have fewer than 100 examples (e.g. Dangerous curve to the left, End of no passing, Speed limit (20km/h)).
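
The per-class counts behind such a bar chart can be computed as sketched below. This assumes integer class IDs from 0 to 42; the helper names and the demo labels are hypothetical, and the thresholds 100 and 700 are simply the ones mentioned in the text.

```python
import numpy as np

def class_counts(labels, n_classes=43):
    """Number of examples per class ID (array index = class ID)."""
    return np.bincount(labels, minlength=n_classes)

def rare_and_common(counts, low=100, high=700):
    """Class IDs with fewer than `low` or more than `high` examples."""
    rare = np.flatnonzero(counts < low)
    common = np.flatnonzero(counts > high)
    return rare, common

# Hypothetical labels for three classes, only to exercise the helpers.
y_demo = np.array([0] * 50 + [1] * 800 + [2] * 300)
counts = class_counts(y_demo, n_classes=3)
rare, common = rare_and_common(counts)
print(counts, rare, common)
```

The `counts` array can be passed to any bar-chart routine to reproduce the visualization.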

Design and Test a Model Architecture

1. Data preprocessing

As a first step, I converted the images to grayscale so that the network focuses on the signs' patterns, and to speed up training.

Here is an example of a traffic sign image before and after grayscaling.

As a last step, I standardized the image data by scaling pixel values to have a zero mean and unit variance.
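
The two preprocessing steps can be sketched as follows. This is an illustration under assumptions, not the notebook's exact code: the luminance weights are the common ITU-R BT.601 values, and the mean and standard deviation are taken over the whole array (in practice they would be computed on the training set and reused for validation and test data).

```python
import numpy as np

def to_grayscale(images):
    """Convert (N, H, W, 3) RGB images to (N, H, W, 1) grayscale."""
    # Assumed luminance weights (ITU-R BT.601).
    weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)
    gray = images.astype(np.float32) @ weights
    return gray[..., np.newaxis]

def standardize(images, mean=None, std=None):
    """Scale pixel values to zero mean and unit variance.

    Pass the training-set mean/std when transforming other splits.
    """
    mean = images.mean() if mean is None else mean
    std = images.std() if std is None else std
    return (images - mean) / std, mean, std

# Hypothetical random images, only to exercise the functions.
rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, size=(8, 32, 32, 3), dtype=np.uint8)
gray = to_grayscale(rgb)
norm, train_mean, train_std = standardize(gray)
print(gray.shape, float(norm.mean()), float(norm.std()))
```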

Because the training dataset is very unbalanced across classes, I artificially augmented the image data so that the training dataset has a balanced class distribution.

To add more data to the data set, I used random image transformations combining horizontal/vertical shift, zoom, shear and rotation.
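
A minimal NumPy sketch of this idea is given below. It is an assumption-laden illustration: the transformation ranges are hypothetical, the warp uses nearest-neighbour sampling with edge replication, and every class is topped up to the size of the largest class, which may differ from the target used in the notebook.

```python
import numpy as np

def random_affine(image, rng, max_shift=2.0, max_rot_deg=15.0,
                  zoom_range=(0.9, 1.1), max_shear=0.1):
    """Apply a random shift/zoom/shear/rotation (hypothetical ranges)."""
    h, w = image.shape[:2]
    theta = np.deg2rad(rng.uniform(-max_rot_deg, max_rot_deg))
    zoom = rng.uniform(*zoom_range)
    shear = rng.uniform(-max_shear, max_shear)
    ty, tx = rng.uniform(-max_shift, max_shift, size=2)
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta), np.cos(theta)]])
    shr = np.array([[1.0, shear], [0.0, 1.0]])
    # Matrix mapping output pixel coordinates back to input coordinates.
    A = (rot @ shr) / zoom
    cy, cx = (h - 1) / 2, (w - 1) / 2
    ys, xs = np.mgrid[0:h, 0:w]
    coords = np.stack([ys - cy, xs - cx]).reshape(2, -1)
    src = np.rint(A @ coords + np.array([[cy + ty], [cx + tx]])).astype(int)
    sy = np.clip(src[0], 0, h - 1)   # replicate edge pixels
    sx = np.clip(src[1], 0, w - 1)
    return image[sy, sx].reshape(image.shape)

def balance(X, y, rng, n_classes=43):
    """Add augmented copies until every present class matches the largest."""
    counts = np.bincount(y, minlength=n_classes)
    target = counts.max()
    new_X, new_y = [X], [y]
    for c in np.flatnonzero(counts):
        deficit = target - counts[c]
        if deficit == 0:
            continue
        idx = rng.choice(np.flatnonzero(y == c), size=deficit)
        new_X.append(np.stack([random_affine(X[i], rng) for i in idx]))
        new_y.append(np.full(deficit, c, dtype=y.dtype))
    return np.concatenate(new_X), np.concatenate(new_y)

# Hypothetical unbalanced data with three classes.
rng = np.random.default_rng(0)
X_demo = rng.random((30, 32, 32, 1))
y_demo = np.array([0] * 20 + [1] * 7 + [2] * 3)
X_bal, y_bal = balance(X_demo, y_demo, rng, n_classes=3)
print(X_bal.shape, np.bincount(y_bal))
```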

Here is an example of original image samples:

And for augmented image samples:

Finally, some of the images in the training dataset were taken in low-contrast conditions, which prevents the network from seeing the pixel information hidden in the darkness. I therefore applied adaptive histogram equalization to the images suffering from low contrast. The following is an example of the effect of histogram equalization:
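
To illustrate the principle, the sketch below implements plain global histogram equalization in NumPy. Note that this is a simpler relative of the adaptive variant used in the project, which equalizes local tiles rather than the whole image. The low-contrast test and its threshold are hypothetical.

```python
import numpy as np

def is_low_contrast(gray, threshold=0.35):
    """Hypothetical check: is the 1st-99th percentile range narrow?"""
    lo, hi = np.percentile(gray, [1, 99])
    return (hi - lo) / 255.0 < threshold

def equalize_hist(gray):
    """Global (not adaptive) histogram equalization of a uint8 image."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]
    denom = cdf[-1] - cdf_min
    if denom == 0:                      # constant image, nothing to stretch
        return gray.copy()
    lut = np.clip(np.round((cdf - cdf_min) / denom * 255), 0, 255)
    return lut.astype(np.uint8)[gray]

# Hypothetical dark image with pixel values squeezed into [10, 40].
rng = np.random.default_rng(0)
dark = rng.integers(10, 41, size=(32, 32), dtype=np.uint8)
eq = equalize_hist(dark) if is_low_contrast(dark) else dark
print(dark.min(), dark.max(), eq.min(), eq.max())
```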

2. Model Architecture

I used the LeNet architecture and my final model consisted of the following layers:

Layer	Description
Input	32x32x1 grayscale image
Convolution 3x3	1x1 stride, valid padding, outputs 28x28x32
Batch Normalization
RELU
Max pooling	2x2 stride, outputs 14x14x32
Dropout	0.2
Convolution 3x3	1x1 stride, valid padding, outputs 10x10x64
RELU
Batch Normalization
Max pooling	2x2 stride, valid padding, outputs 5x5x64
Dropout	0.2
Fully connected	output 120
RELU
Fully connected	output 84
RELU
Fully connected	output 43
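
As a sanity check on the table, the spatial sizes can be traced with the usual formula for valid padding, output = (input - kernel) / stride + 1. The sketch below is only arithmetic, not the model code. The listed shapes (28x28, then 10x10) correspond to a 5x5 kernel, as in the original LeNet-5, whereas a 3x3 kernel under the same settings would give 30x30 and 13x13; the kernel size is therefore treated as an assumption here.

```python
def conv_out(size, kernel, stride=1):
    """Output width/height of a convolution with valid padding."""
    return (size - kernel) // stride + 1

def pool_out(size, window=2, stride=2):
    """Output width/height of a max-pooling layer with valid padding."""
    return (size - window) // stride + 1

def trace_shapes(kernel, input_size=32):
    """Spatial size after conv1, pool1, conv2, pool2."""
    c1 = conv_out(input_size, kernel)
    p1 = pool_out(c1)
    c2 = conv_out(p1, kernel)
    p2 = pool_out(c2)
    return c1, p1, c2, p2

shapes_5x5 = trace_shapes(5)   # matches the output sizes in the table
shapes_3x3 = trace_shapes(3)   # what a 3x3 kernel would produce
flat_features = shapes_5x5[-1] ** 2 * 64   # inputs to the first dense layer
print(shapes_5x5, shapes_3x3, flat_features)
```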

3. Model Training

To train the model, I used the Adam optimizer to minimize the cross-entropy cost function with the following hyperparameter settings:

batch size = 512
epochs = 20
learning rate = 0.002
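
For readers unfamiliar with these two ingredients, here is a small NumPy sketch of the softmax cross-entropy cost and a single Adam update. It is illustrative only: the project relies on the deep learning framework's built-in implementations, and the `beta1`, `beta2` and `eps` values below are the commonly used defaults, assumed rather than taken from the notebook.

```python
import numpy as np

# Settings from the list above.
HYPERPARAMS = {"batch_size": 512, "epochs": 20, "learning_rate": 0.002}

def cross_entropy(logits, labels):
    """Mean softmax cross-entropy for integer class labels."""
    z = logits - logits.max(axis=1, keepdims=True)      # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def adam_step(w, grad, m, v, t, lr=HYPERPARAMS["learning_rate"],
              beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based step count (assumed defaults)."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)        # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)        # bias-corrected second moment
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# An untrained model with uniform predictions over 43 classes.
loss = cross_entropy(np.zeros((4, 43)), np.array([0, 1, 2, 3]))
w_new, m_new, v_new = adam_step(w=0.0, grad=1.0, m=0.0, v=0.0, t=1)
print(loss, w_new)
```

With uniform predictions the loss equals ln(43), which is a useful reference value for the first training iterations.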

4. Solution Approach

My final model results were:

training set accuracy of 99.6%
validation set accuracy of 98.5%
test set accuracy of 95.4%

The model selection was done through an iterative approach:

The first architecture I tried was the original LeNet-5 model from the class, as it is a well-built and efficient architecture for image classification.

The first architecture was fast to train but did not reach a satisfactory accuracy after several iterations of hyperparameter tuning. The results showed overfitting: the model had low bias, but there was a large gap between training loss and validation loss.

Because the inputs from prior layers can shift after each weight update, I first added batch normalization right after the convolution layer to standardize the inputs fed to the activation unit. This technique helps stabilize and accelerate training, and offers a secondary regularization benefit. I also added dropout to help the model generalize better (it breaks potential dependencies between the training data and specific nodes). At the same time, I tried not to degrade the bias level too much, so I increased the filter depth to capture more features. My hyperparameter tuning focused mainly on the dropout probability and the depth of the convolution filters to manage the overfitting/underfitting trade-off. I then slightly decreased the learning rate and increased the number of epochs to reach the final model results.

The final model achieves low bias and keeps a reasonable gap between training and validation loss.
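
The two regularization techniques discussed in this section can be sketched in NumPy as follows. This is a training-mode illustration under assumptions, not the notebook's code: batch normalization is applied per channel without learned running statistics, and the value 0.2 is interpreted as the probability of dropping a unit.

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Training-mode batch norm: normalize each channel (last axis)."""
    axes = tuple(range(x.ndim - 1))          # batch and spatial axes
    mean = x.mean(axis=axes, keepdims=True)
    var = x.var(axis=axes, keepdims=True)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

def dropout(x, rate, rng, training=True):
    """Inverted dropout; `rate` is assumed to be the drop probability."""
    if not training or rate == 0.0:
        return x
    keep = 1.0 - rate
    mask = rng.random(x.shape) < keep
    return x * mask / keep                   # rescale to keep the expectation

# Hypothetical activations shaped like the first pooling output.
rng = np.random.default_rng(0)
acts = rng.normal(3.0, 2.0, size=(64, 14, 14, 32))
normed = batch_norm(acts)
dropped = dropout(np.ones((100, 100)), 0.2, rng)
print(normed.mean(), normed.std(), (dropped == 0).mean())
```

At inference time dropout is disabled (`training=False`), and a real batch normalization layer would use running averages of the mean and variance instead of batch statistics.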

Test a Model on New Images

1. Acquiring New Images

Here are ten German traffic signs that I found on the web:

The images were all taken in good lighting conditions with a normal contrast level, and they are sharp. The 5th image, a "Dangerous curve to the right" sign, is slightly rotated. The 7th image, "Slippery road", contains a watermark, which may make it harder for the model to predict correctly.

2. Predictions on these new traffic signs

Here are the results of the prediction:

The model correctly classified 10 of the 10 traffic signs, which gives an accuracy of 100%. This compares favorably to the test set accuracy of 95.4%.

3. Softmax probabilities analysis

The code for making predictions with my final model is located at line #208 of the IPython notebook.

For each of the images, the top five softmax probabilities were:

The model is quite sure (probability close to 100%) about its predictions for most images. For the "Slippery road" and "No entry" images, the model is less certain but still predicts correctly.
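
Extracting the top five probabilities from the network's raw outputs can be sketched as below. This is a NumPy illustration with hypothetical logits; the notebook itself presumably uses the framework's built-in top-k operation.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def top_k(probs, k=5):
    """Class IDs and probabilities of the k most likely classes per row."""
    idx = np.argsort(probs, axis=1)[:, ::-1][:, :k]
    return idx, np.take_along_axis(probs, idx, axis=1)

# Hypothetical logits for one image and six classes.
logits = np.array([[2.0, 1.0, 0.1, -1.0, 0.5, 3.0]])
probs = softmax(logits)
top_idx, top_p = top_k(probs, k=5)
print(top_idx[0], np.round(top_p[0], 3))
```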

Visualizing the confusion matrix of the test dataset gives a clearer idea of how reliably the model predicts each class:

The model often sees the "Speed limit (60km/h)" sign as "Speed limit (80km/h)" and the "Speed limit (100km/h)" sign as "Speed limit (120km/h)". The model also has trouble predicting "Pedestrians", "Beware of ice/snow" and "Bumpy road":

ClassId	SignName	Precision	Recall
27	Pedestrians	0.882353	0.500000
30	Beware of ice/snow	0.840909	0.740000
22	Bumpy road	0.959184	0.783333
3	Speed limit (60km/h)	0.975069	0.782222
7	Speed limit (100km/h)	0.990000	0.880000

The model's precision for the "Double curve" and "Speed limit (20km/h)" signs is still limited:

ClassId	SignName	Precision	Recall
21	Double curve	0.566434	0.900000
0	Speed limit (20km/h)	0.674699	0.933333
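
Per-class precision and recall such as those in the tables above can be derived from a confusion matrix. The following is a minimal sketch with hypothetical labels: low recall means that signs of the class are often missed, while low precision means that other signs are often mistaken for the class.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """cm[i, j] = number of samples of true class i predicted as class j."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)
    return cm

def precision_recall(cm):
    """Per-class precision and recall (0 where a class never occurs)."""
    tp = np.diag(cm).astype(float)
    predicted = cm.sum(axis=0)          # column sums: times class was predicted
    actual = cm.sum(axis=1)             # row sums: true examples per class
    precision = np.divide(tp, predicted, out=np.zeros_like(tp),
                          where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    return precision, recall

# Hypothetical predictions: half of class 0 is mistaken for class 1.
y_true = np.array([0, 0, 0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 0, 1, 1, 1, 1, 2, 2])
cm = confusion_matrix(y_true, y_pred, n_classes=3)
precision, recall = precision_recall(cm)
print(precision, recall)
```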

Further work on image preprocessing and model architecture is still needed to improve the model's generalization on those classes.

Visualizing the Neural Network (see Step 4 of the IPython notebook for more details)

1. Visual output of the network's feature maps.

The model has high precision and recall scores for the "Speed limit (30km/h)" sign. Here are feature map visualizations of the first convolution layer's output for two "Speed limit (30km/h)" samples:

First image:

1st conv layer output:

Second image:

1st conv layer output:

Filters 3, 4, 5, 8, 22, 26, 27 and 31 mainly focus on the shape of the number, filter 20 reads the outline of the sign, and the red circle is a common feature for the majority of filters.
