A convolutional neural network takes a -dimensional input.

2D input

Given a 2D input , a sliding window is multiplied by the kernel , to get the feature map.

Without any hidden pixels, and stride = , if is and is , the feature map would be of size .

Padding

Adding hidden pixels to control the output shape.

Stride

Amount of movement of the sliding window.

Pooling

Downsamples feature maps

Aggregation methods: