2D Smoothing Using CUDA

The symmetric convolution kernel is decomposed into a column matrix and a row matrix, so the 2D convolution can be computed as two 1D passes. Two memory buffers are used to store the results of the convolution and the subsequent summation.
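
The decomposition can be illustrated with a minimal sketch. It is not the repository's actual code, only one way such a scheme might look under a few assumptions: a 3-tap symmetric filter (0.25, 0.5, 0.25), clamped borders, and a reading of the two memories as two device buffers (`d_tmp` for the row pass, `d_out` for the column pass). The names `rowPass`, `colPass`, `d_taps`, and `RADIUS` are hypothetical.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

#define RADIUS 1  // assumption: 3-tap filter, not taken from the original project

// 1D taps shared by both passes; a symmetric separable kernel is taps * taps^T.
__constant__ float d_taps[2 * RADIUS + 1];

// Row pass: filter each pixel along x, clamping reads at the image border.
__global__ void rowPass(const float* in, float* out, int w, int h) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= w || y >= h) return;
    float sum = 0.0f;
    for (int k = -RADIUS; k <= RADIUS; ++k) {
        int xx = min(max(x + k, 0), w - 1);
        sum += d_taps[k + RADIUS] * in[y * w + xx];
    }
    out[y * w + x] = sum;
}

// Column pass: filter the row-pass result along y with the same taps.
__global__ void colPass(const float* in, float* out, int w, int h) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= w || y >= h) return;
    float sum = 0.0f;
    for (int k = -RADIUS; k <= RADIUS; ++k) {
        int yy = min(max(y + k, 0), h - 1);
        sum += d_taps[k + RADIUS] * in[yy * w + x];
    }
    out[y * w + x] = sum;
}

int main() {
    const int w = 8, h = 8;
    const size_t bytes = w * h * sizeof(float);
    const float taps[2 * RADIUS + 1] = {0.25f, 0.5f, 0.25f};

    // Test image: a single impulse, so the output shows the effective 2D kernel.
    std::vector<float> img(w * h, 0.0f);
    img[4 * w + 4] = 1.0f;

    // Error checking is omitted for brevity.
    float *d_in, *d_tmp, *d_out;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_tmp, bytes);   // first buffer: row-pass result
    cudaMalloc(&d_out, bytes);   // second buffer: final smoothed image
    cudaMemcpy(d_in, img.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpyToSymbol(d_taps, taps, sizeof(taps));

    dim3 block(16, 16);
    dim3 grid((w + block.x - 1) / block.x, (h + block.y - 1) / block.y);
    rowPass<<<grid, block>>>(d_in, d_tmp, w, h);
    colPass<<<grid, block>>>(d_tmp, d_out, w, h);
    cudaMemcpy(img.data(), d_out, bytes, cudaMemcpyDeviceToHost);

    // Print the 3x3 neighbourhood around the impulse.
    for (int y = 3; y <= 5; ++y) {
        for (int x = 3; x <= 5; ++x) printf("%.4f ", img[y * w + x]);
        printf("\n");
    }

    cudaFree(d_in);
    cudaFree(d_tmp);
    cudaFree(d_out);
    return 0;
}
```

Because the input is a single impulse, the printed 3x3 block should equal the outer product of the taps with themselves, which is the 2D kernel that the two 1D passes implement. For a filter of radius r, each pixel costs 2(2r + 1) multiply-adds in this form instead of (2r + 1)^2 for the direct 2D convolution.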