Most of us who have studied CNNs already know that the convolution operation extracts features by exploiting spatial relationships in the input. Compared with a fully connected network, it offers weight sharing and translation invariance. There are many different kinds of convolution. Recently, I found a very good article that summarizes this topic. I translated it into English and combined it with my own understanding. If you want to read the original one, you can go here.
1. Standard Convolution
1.1 Single channel
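The filter is multiplied element-wise with the patch of the image it covers, and the products are summed into a single output value. The convolutional filter then slides across the picture one element at a time. Here we set padding = 0 and stride = 1. This single-channel case is the one that applies to grayscale pictures.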
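The multiply-then-sum step above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `conv2d_single`, the 5×5 input and the all-ones 3×3 kernel are made up for the example, and the kernel is not flipped, which is the convention deep learning frameworks usually follow (strictly speaking a cross-correlation).

```python
import numpy as np

def conv2d_single(image, kernel):
    """Slide `kernel` over a 2D `image` with padding = 0 and stride = 1."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Patch of the image currently covered by the kernel
            patch = image[i:i + kh, j:j + kw]
            # Element-wise multiply, then sum into one output value
            out[i, j] = np.sum(patch * kernel)
    return out

# Hypothetical grayscale input: a 5x5 ramp of values 0..24
image = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.ones((3, 3))  # illustrative kernel, not a learned one

result = conv2d_single(image, kernel)
print(result.shape)  # (3, 3)
```

With no padding and stride 1, a 5×5 input and a 3×3 kernel give a 3×3 output, because the kernel fits in only three positions along each axis.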
1.2 Multiple channels
Color pictures are made of 3 channels: red, green and blue. We create a 3×3×3 convolution filter, which contains 3 convolutional kernels, one for each channel. Each kernel is applied to its own channel, and then we sum the three results together into a single-channel 2D array.
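As a rough sketch of the multi-channel case, the NumPy snippet below applies one 3×3×3 filter to a 3-channel input and sums over the channels to get a single 2D array. The function name `conv2d_multi`, the channel-first layout `(C, H, W)`, and the all-ones input and filter are assumptions chosen for illustration. Padding is 0 and stride is 1, as before.

```python
import numpy as np

def conv2d_multi(image, filt):
    """image: (C, H, W), filt: (C, kh, kw). Returns one 2D feature map."""
    c, kh, kw = filt.shape
    assert image.shape[0] == c, "filter needs one kernel per input channel"
    out_h = image.shape[1] - kh + 1
    out_w = image.shape[2] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Patch spans all channels at this spatial position
            patch = image[:, i:i + kh, j:j + kw]
            # Per-channel multiply, then sum over channels and positions
            out[i, j] = np.sum(patch * filt)
    return out

# Hypothetical RGB input (3 channels, 5x5) and a 3x3x3 filter
rgb = np.ones((3, 5, 5))
filt = np.ones((3, 3, 3))

out = conv2d_multi(rgb, filt)
print(out.shape)  # (3, 3)
```

Summing over the whole 3×3×3 patch in one step is equivalent to convolving each channel with its own kernel and then adding the three 2D results.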