
In this repository you will find everything you need to know about Convolutional Neural Network, and how to implement the most famous CNN architectures in both Keras and PyTorch. (I'm working on implementing those Architectures using MxNet and Caffe)

AnasBrital98/CNN-From-Scratch


Convolutional Neural Network From Scratch

This repository contains an explanation and an implementation of Convolutional Neural Networks using Keras and PyTorch.

In This Repository you'll see :

  • Introduction to CNN .

  • Convolutional Neural Network vs Multilayer Perceptron .

  • Convolutional Neural Network Layers .

    • Kernels or Filters .

    • Convolutional layer .

    • Activation Layer .

    • Pooling Layer .

    • Fully Connected Layer .

  • Different Layers in Keras and PyTorch .

  • Most Common Architectures of CNN and their Implementation .

  • References .


Introduction :

The Convolutional Neural Network (CNN) is a deep learning algorithm that evolved from the Multilayer Perceptron (MLP) and is designed to process data in the form of a matrix (image, sound ...).

Convolutional Neural Networks are used in many fields, but we will just be interested in the application of CNNs to Images.

The question now is, what is an Image?

An image is just a matrix of pixels.

Coding Modes of an Image:


Convolutional Neural Network vs Multilayer Perceptron :

Imagine that we have an image classification problem to solve and our only choice is a Multilayer Perceptron (a plain neural network). The images are 240 pixels high and 240 pixels wide, and they are encoded in RGB.

We would need a neural network with 240 * 240 * 3 = 172 800 inputs, which is a very large network and would be very hard for us to train.

Can we find a solution that reduces the size of the images and preserves their characteristics ?

This is exactly what a CNN can do.
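To make the size argument above concrete, here is a small sketch that counts the inputs of the 240 × 240 RGB image and compares the number of trainable parameters in a first dense layer with those in a first convolutional layer. The layer sizes (`hidden_units`, `n_filters`, `kernel`) are hypothetical values picked only for the comparison; they do not come from this repository.

```python
# Image size taken from the text: 240 x 240 pixels, 3 colour channels (RGB).
height, width, channels = 240, 240, 3
n_inputs = height * width * channels

# Hypothetical layer sizes, chosen only for this comparison.
hidden_units = 1000          # units in the first dense layer of an MLP
n_filters, kernel = 32, 3    # 32 filters of size 3x3 in a first conv layer

# Dense layer: one weight per (input, unit) pair plus one bias per unit.
mlp_params = n_inputs * hidden_units + hidden_units

# Conv layer: each filter has kernel*kernel*channels weights plus one bias,
# and the same filter is reused at every position of the image.
conv_params = n_filters * (kernel * kernel * channels + 1)

print(n_inputs)     # 172800
print(mlp_params)   # 172801000
print(conv_params)  # 896
```

Under these assumptions the dense layer needs roughly 170 million parameters, while the convolutional layer needs fewer than a thousand, because its weights are shared across the whole image.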

In General :

CNN = Convolutional Layers + Activation Layers + Pooling Layers + Fully Connected Layers .


Convolutional Neural Network Layers :

Kernels or Filters in The Convolutional layer :

In a convolutional neural network, a kernel is nothing more than a filter used to extract features from images. The kernel is a small matrix that slides over the input data, computes the dot product with the sub-region of the input it currently covers, and writes each result into an output matrix. The kernel moves over the input by the stride value.

There are many kernels, each one responsible for extracting a specific feature.
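The sliding dot product described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used by Keras or PyTorch: the function name `conv2d`, the toy image, and the kernel values are all made up for the example, and no padding is applied. As in most deep learning libraries, the kernel is not flipped before it is applied.

```python
import numpy as np

def conv2d(image, kernel, stride=1):
    """Slide `kernel` over `image` (no padding) and return the feature map."""
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Sub-region of the input currently covered by the kernel.
            top, left = i * stride, j * stride
            patch = image[top:top + kh, left:left + kw]
            # Element-wise product followed by a sum (the "dot product").
            out[i, j] = np.sum(patch * kernel)
    return out

# Toy 4x4 image: dark left half (0), bright right half (1).
image = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1]], dtype=float)

# Hypothetical kernel that responds to a dark-to-bright step from left to right.
kernel = np.array([[-1, 1],
                   [-1, 1]], dtype=float)

feature_map = conv2d(image, kernel, stride=1)
print(feature_map.shape)  # (3, 3)
print(feature_map)        # the middle column is 2, the other columns are 0
```

The only non-zero responses sit where the kernel straddles the boundary between the dark and bright halves, which is the feature this particular kernel extracts.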

Convolutional Layers :

The convolution layer extracts the characteristics of the image by performing this operation on the input image :

The convolutional layer produces an output image whose size is given by this formula :

The convolutional layer needs two parameters to work :

  • Padding : the number of pixels added around the border of an image before it is processed by the kernel.
  • Stride : the number of pixels by which the kernel shifts over the input matrix at each step.
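A quick way to see what the two parameters do is to pad a tiny image with NumPy and list the positions a kernel would visit for different strides. The 3×3 image and the 2×2 kernel size are arbitrary choices for this sketch, and zero padding is assumed.

```python
import numpy as np

image = np.arange(1, 10).reshape(3, 3)   # 3x3 toy image with values 1..9

# Padding = 1: add one ring of zeros around the image.
padded = np.pad(image, pad_width=1, mode="constant", constant_values=0)
print(padded.shape)   # (5, 5)

# Stride: top-left positions that a 2x2 kernel visits along one axis
# of the padded image, for stride 1 and for stride 2.
kernel_size = 2
last_start = padded.shape[0] - kernel_size
positions_s1 = list(range(0, last_start + 1, 1))
positions_s2 = list(range(0, last_start + 1, 2))
print(positions_s1)   # [0, 1, 2, 3]
print(positions_s2)   # [0, 2]
```

A larger stride means fewer positions and therefore a smaller output, while padding enlarges the input so that border pixels are covered by the kernel more often.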

Example 1 : Stride = 1 , Padding = 0 :

If we apply our formula (in the picture above), we get the same result :

output width = (input_width - kernel_width + 2 * padding) / stride_width + 1

output height = (input_height - kernel_height + 2 * padding) / stride_height + 1

input Image : 6*6
Kernel Size : 2*2

output width = (6 - 2 + 2 * 0) / 1 + 1 = 5
output height = (6 - 2 + 2 * 0) / 1 + 1 = 5

Example 2 : Stride = 2 , Padding = 0 :

input Image : 6*6
Kernel Size : 2*2

output width = (6 - 2 + 2 * 0) / 2 + 1 = 3
output height = (6 - 2 + 2 * 0) / 2 + 1 = 3

Example 3 : Stride = 2 , Padding = 1 :

input Image : 6*6
Kernel Size : 2*2

output width = (6 - 2 + 2 * 1) / 2 + 1 = 4
output height = (6 - 2 + 2 * 1) / 2 + 1 = 4
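The three examples above can be checked with a small helper. The function name `conv_output_size` is hypothetical, and the use of floor division is an assumption that matters only when the division is not exact (the three examples divide evenly).

```python
def conv_output_size(input_size, kernel_size, padding=0, stride=1):
    """Output length along one axis: (input - kernel + 2 * padding) / stride + 1."""
    # Floor division is assumed for strides that do not divide evenly.
    return (input_size - kernel_size + 2 * padding) // stride + 1

# Example 1: stride = 1, padding = 0
print(conv_output_size(6, 2, padding=0, stride=1))  # 5
# Example 2: stride = 2, padding = 0
print(conv_output_size(6, 2, padding=0, stride=2))  # 3
# Example 3: stride = 2, padding = 1
print(conv_output_size(6, 2, padding=1, stride=2))  # 4
```

Because the input and the kernel are square in these examples, the same call gives both the output width and the output height.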

All the examples above used a single-channel 2D input. Now let us see the general case, often pictured as a convolution over a 3D volume, where the input has a depth and the layer applies several filters :

Input image : W1×H1×D1 .
Number of filters : K (each of size F×F, spanning the full input depth D1) .
Stride : S .
Padding : P .
Output :
    W2 = (W1 − F + 2P) / S + 1 .
    H2 = (H1 − F + 2P) / S + 1 .
    D2 = K .
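The shape rule above can be verified with a naive NumPy sketch. The function `conv_volume` is hypothetical and written for readability, not speed; it assumes zero padding, an array layout of (height, width, depth), and filters stored as (K, F, F, D1), so that each filter covers the full depth of the input.

```python
import numpy as np

def conv_volume(x, filters, stride=1, padding=0):
    """x: (H1, W1, D1); filters: (K, F, F, D1). Returns an (H2, W2, K) array."""
    k, f, _, d1 = filters.shape
    assert x.shape[2] == d1, "filter depth must match input depth"
    # Zero-pad height and width only, never the depth axis.
    x = np.pad(x, ((padding, padding), (padding, padding), (0, 0)))
    h2 = (x.shape[0] - f) // stride + 1
    w2 = (x.shape[1] - f) // stride + 1
    out = np.zeros((h2, w2, k))
    for n in range(k):                      # one output channel per filter
        for i in range(h2):
            for j in range(w2):
                top, left = i * stride, j * stride
                patch = x[top:top + f, left:left + f, :]
                out[i, j, n] = np.sum(patch * filters[n])
    return out

# 6x6 input with depth 3, K = 4 filters of size 2x2, stride 2, padding 1.
x = np.ones((6, 6, 3))
filters = np.ones((4, 2, 2, 3))
out = conv_volume(x, filters, stride=2, padding=1)
print(out.shape)  # (4, 4, 4)
```

With W1 = H1 = 6, F = 2, P = 1 and S = 2 the formula gives W2 = H2 = 4, and the output depth equals the number of filters, D2 = K = 4.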


Activation Function in The Convolutional layer :

The activation function most commonly used in CNNs is ReLU, defined as follows:

RELU (z) = max (0, z)
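As a minimal sketch, the definition translates directly into NumPy, applied element by element to a feature map. The input values are arbitrary.

```python
import numpy as np

def relu(z):
    """Element-wise max(0, z): negative values become 0, others pass through."""
    return np.maximum(0, z)

z = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
print(relu(z).tolist())  # [0.0, 0.0, 0.0, 1.5, 3.0]
```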

Pooling Layer :

The pooling layer reduces the size of the image. There are two common types of pooling :

  • Max Pooling .
  • AVG Pooling .

The output of the pooling layer can be calculated using this formula :

Max Pooling :

AVG Pooling :
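Both pooling types can be illustrated with one small NumPy function. This is a sketch under assumptions: the helper `pool2d` is hypothetical, no padding is used, and the output size is taken to follow the same rule as the convolution without padding, (input − window) / stride + 1.

```python
import numpy as np

def pool2d(image, size=2, stride=2, mode="max"):
    """Reduce each size x size window to its maximum or its average."""
    out_h = (image.shape[0] - size) // stride + 1
    out_w = (image.shape[1] - size) // stride + 1
    reduce = np.max if mode == "max" else np.mean
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            top, left = i * stride, j * stride
            window = image[top:top + size, left:left + size]
            out[i, j] = reduce(window)
    return out

image = np.array([[1, 3, 2, 4],
                  [5, 7, 6, 8],
                  [9, 2, 1, 0],
                  [3, 4, 5, 6]], dtype=float)

print(pool2d(image, mode="max").tolist())  # [[7.0, 8.0], [9.0, 6.0]]
print(pool2d(image, mode="avg").tolist())  # [[4.0, 5.0], [4.5, 3.0]]
```

With a 2×2 window and stride 2, the 4×4 image shrinks to 2×2: max pooling keeps the strongest value of each window, while average pooling keeps its mean.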


Fully Connected Layer :

The fully connected layer can be seen as one layer of a simple neural network : every output is connected to every input.
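In practice the feature maps are flattened into a vector before they reach the fully connected layer, which then computes a weighted sum plus a bias for each output. The sketch below uses random values; the feature-map shape (4, 4, 8) and the number of outputs (10) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical output of the last pooling layer: 4x4 spatial size, 8 channels.
feature_maps = rng.standard_normal((4, 4, 8))

# Flatten the volume into a single vector of 4 * 4 * 8 = 128 values.
x = feature_maps.reshape(-1)

# Fully connected layer: one row of weights and one bias per output.
n_outputs = 10
W = rng.standard_normal((n_outputs, x.size)) * 0.01
b = np.zeros(n_outputs)

scores = W @ x + b
print(x.shape)       # (128,)
print(scores.shape)  # (10,)
```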


Different Layers in Keras and PyTorch :

Keras :

Keras is an open-source software library that provides a Python interface for artificial neural networks. Keras acts as an interface for the TensorFlow library.

  • Convolution Layer :
tf.keras.layers.Conv2D(
    filters,
    kernel_size,
    strides=(1, 1),
    padding="valid",
    data_format=None,
    dilation_rate=(1, 1),
    groups=1,
    activation=None,
    use_bias=True,
    kernel_initializer="glorot_uniform",
    bias_initializer="zeros",
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    **kwargs
)
  • Activation Layer :
tf.keras.activations.relu(x, alpha=0.0, max_value=None, threshold=0)
  • Pooling Layer :

    • Max-Pooling :
tf.keras.layers.MaxPooling2D(
    pool_size=(2, 2), strides=None, padding="valid", data_format=None, **kwargs
)
    • Avg-Pooling :
tf.keras.layers.AveragePooling2D(
    pool_size=(2, 2), strides=None, padding="valid", data_format=None, **kwargs
)
  • Dropout Layer :

tf.keras.layers.Dropout(rate, noise_shape=None, seed=None, **kwargs)
  • Dense Layer or Fully Connected Layer :
tf.keras.layers.Dense(
    units,
    activation=None,
    use_bias=True,
    kernel_initializer="glorot_uniform",
    bias_initializer="zeros",
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    **kwargs
)
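To show how the Keras layers listed above fit together, here is a minimal sketch of a tiny model. The architecture, the 28×28×1 input shape, and the 10 output classes are assumptions made for the example, not an architecture from this repository. `Flatten` is an additional Keras layer, not listed above, used to turn the feature maps into a vector before the dense layer.

```python
import tensorflow as tf

# Tiny illustrative CNN: conv -> ReLU -> max pooling -> dropout -> dense.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28, 1)),                      # H, W, channels
    tf.keras.layers.Conv2D(filters=8, kernel_size=3,
                           activation="relu"),              # 28x28 -> 26x26
    tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),         # 26x26 -> 13x13
    tf.keras.layers.Flatten(),                              # 13 * 13 * 8 = 1352
    tf.keras.layers.Dropout(rate=0.5),
    tf.keras.layers.Dense(units=10, activation="softmax"),  # class scores
])

print(model.output_shape)  # (None, 10)
```

Here the activation is passed through the `activation` argument of `Conv2D` instead of being added as a separate layer; both styles are possible in Keras.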

PyTorch :

PyTorch is an open source machine learning library based on the Torch library, used for applications such as computer vision and natural language processing, primarily developed by Facebook's AI Research lab. It is free and open-source software released under the Modified BSD license.

  • Convolution Layer :
torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros', device=None, dtype=None)
  • Activation Layer :
torch.nn.ReLU(inplace=False)
  • Pooling Layer :

    • Max-Pooling :
    torch.nn.MaxPool2d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False)
    • Avg-Pooling :
    torch.nn.AvgPool2d(kernel_size, stride=None, padding=0, ceil_mode=False, count_include_pad=True, divisor_override=None)
  • Dropout Layer :

torch.nn.Dropout(p=0.5, inplace=False)
  • Dense Layer or Fully Connected Layer :
torch.nn.Linear(in_features, out_features, bias=True, device=None, dtype=None)
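The PyTorch layers listed above can be combined in the same way. The sketch below is only illustrative: the architecture, the 1×28×28 input, and the 10 outputs are assumptions, and `nn.Flatten` is an additional module, not listed above, that reshapes the feature maps into a vector. Note that PyTorch expects inputs in (batch, channels, height, width) order.

```python
import torch
from torch import nn

# Tiny illustrative CNN: conv -> ReLU -> max pooling -> dropout -> linear.
model = nn.Sequential(
    nn.Conv2d(in_channels=1, out_channels=8, kernel_size=3),  # 28x28 -> 26x26
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2),                              # 26x26 -> 13x13
    nn.Flatten(),                                             # 8 * 13 * 13 = 1352
    nn.Dropout(p=0.5),
    nn.Linear(in_features=8 * 13 * 13, out_features=10),
)

# Batch of 4 random single-channel 28x28 "images".
x = torch.randn(4, 1, 28, 28)
logits = model(x)
print(logits.shape)  # torch.Size([4, 10])
```

Unlike `tf.keras.layers.Conv2D`, `torch.nn.Conv2d` requires the number of input channels explicitly, and the input size of `nn.Linear` has to be worked out from the output-size formula.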

Most Common Architectures of CNN and their Implementation :


References :
