8.4 Convolutional neural networks

What is Convolutional Neural Network?

A convolutional neural network, also know as CNN or ConvNet, is a feed-forward neural network that is generally used to analyze viusal images by processing data with grid-like topology. A CNN is used to detect and classify objects in an image. In CNN, every image is represented in the form of an array of pixel values.

The convoluion operation forms the basis of any CNN. In convolution operation, the arrays are multiplied element-wise, and the dot product is summed to create a new array, which represents Wx.

Layers in a Convolutional Neural Network

A Convolution neural network has multiple hidden layers that help in extracting information from an image. The four important layers in CNN are:

Convolution layer
ReLU layer
Pooling layer
Fully connected layer (also called Dense layer)

Convolution layer

This is the first step in the process of extracting valuable freatues from an image. A convolution layer has several filters that perform the convolution operation. Every image is considered as a matrix of pixel values.

Consider a black and white image of 5x5 size whose pixel values are either 0 or 1 and also a filter matrix with a dimension of 3x3. Next, slide the filter matrix over the image and compute the dot product to get the convolved feature matrix.

ReLU layer

Once the feature maps are extracted, the next step is to move them to a ReLU layer. ReLU (Rectified Linear Unit) is an activation function which performs an element-wise operation and sets all the negative pixels to 0. It introduces non-linearity to the network, and the generated output is a rectified feature map. The relu function is: f(x) = max(0,x).

Pooling layer

Pooling is a down-sampling operation that reduces the dimensionality of the feature map. The rectified feature map goes through a pooling layer to generate a pooled feature map.

Imagine a rectified feature map of size 4x4 goes through a max pooling filter of 2x2 size with stride of 2. In this case, the resultant pooled feature map will have a pooled feature map of 2x2 size where each value will represent the maximum value of each stride.

The pooling layer uses various filters to identify different parts of the image like edges, shapes etc.

Fully Connected layer

The next step in the process is called flattening. Flattening is used to convert all the resultant 2D arrays from pooled feature maps into a single linear vector. This flattened vector is then fed as input to the fully connected layer to classify the image.

Convolutional Neural Networks in a nutshell

The pixels from the image are fed to the convolutional layer that performs the convolution operation
It results in a convolved map
The convolved map is applied to a ReLU function to generate a rectified feature map
The image is processed with multiple convolutions and ReLU layers for locating the features
Different pooling layers with various filters are used to identify specific parts of the image
The pooled feature map is flattened and fed to a fully connected layer to get the final output

Links:

Learn CNN in the browser

Notes

Add notes from the video (PRs are welcome)

convolutional neural networks (CNN) consist of different types of layers
convolutional layer as filters (i.e. 5 x 5)
dense layers
sliding the filter across the cells of the entire image
calculating similarity scores of the different positions of the filter
creating the feature map, one feature map per filter
chaining layers of simple and complex filters allows the CNN to "learn"
resulting in vector representation of image
activation functions sigmoid for binary classification and softmax for multiclass

⚠️	The notes are written by the community. If you see an error here, please create a PR with a fix.

Navigation

Machine Learning Zoomcamp course
Session 8: Neural Networks and Deep Learning
Previous: Pre-trained convolutional neural networks
Next: Tranfser learning

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

04-conv-neural-nets.md

04-conv-neural-nets.md

8.4 Convolutional neural networks

What is Convolutional Neural Network?

Layers in a Convolutional Neural Network

Notes

Navigation

Files

04-conv-neural-nets.md

Latest commit

History

04-conv-neural-nets.md

File metadata and controls

8.4 Convolutional neural networks

What is Convolutional Neural Network?

Layers in a Convolutional Neural Network

Notes

Navigation