I have a 2D numpy array containing greyscale pixel values from 0 to 255. What I want to do is create a Gaussian filter from scratch. I have already written a function to generate a normalized Gaussian kernel:
```python
import math
import numpy as np

def gaussianKernel(size, sigma):
    kernel = np.fromfunction(lambda x, y: (1/(2*math.pi*sigma**2)) * math.e ** ((-1*((x-(size-1)/2)**2+(y-(size-1)/2)**2))/(2*sigma**2)), (size, size))
    return kernel / np.sum(kernel)
```
which works fine:
```python
>>> vision.gaussianKernel(5, 1.5)
array([[ 0.01441882,  0.02808402,  0.0350727 ,  0.02808402,  0.01441882],
       [ 0.02808402,  0.05470021,  0.06831229,  0.05470021,  0.02808402],
       [ 0.0350727 ,  0.06831229,  0.08531173,  0.06831229,  0.0350727 ],
       [ 0.02808402,  0.05470021,  0.06831229,  0.05470021,  0.02808402],
       [ 0.01441882,  0.02808402,  0.0350727 ,  0.02808402,  0.01441882]])
```
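For reference, here is a quick standalone sanity check (my own snippet, not part of the original code) that the returned kernel really is normalized and symmetric:

```python
import math
import numpy as np

def gaussianKernel(size, sigma):
    # same function as above, repeated so this snippet runs standalone
    kernel = np.fromfunction(lambda x, y: (1/(2*math.pi*sigma**2)) * math.e ** ((-1*((x-(size-1)/2)**2+(y-(size-1)/2)**2))/(2*sigma**2)), (size, size))
    return kernel / np.sum(kernel)

kernel = gaussianKernel(5, 1.5)
print(np.sum(kernel))                 # normalization: sums to 1.0
print(np.allclose(kernel, kernel.T))  # symmetric about the diagonal
```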
So then I created a basic convolution function that applies this kernel to each pixel to produce a Gaussian blur:
```python
def gaussianBlurOld(img, kSize, kSigma):
    kernel = gaussianKernel(kSize, kSigma)
    d = int((kSize-1)/2)
    gaussian = np.zeros((img.shape[0]-2*d, img.shape[1]-2*d))
    for y in range(d, img.shape[0]-d):
        for x in range(d, img.shape[1]-d):
            gaussian[y-d][x-d] = np.sum(np.multiply(img[y-d:y+d+1, x-d:x+d+1], kernel))
    return gaussian
```
This works fine and blurs an image. However, as this code will eventually be running on a Raspberry Pi, I need it to be efficient and much faster. So, thanks to this answer on a question I asked yesterday about how to speed up a Sobel edge detector, I tried to apply the same logic to the Gaussian filter. However, since the function accepts a variable size parameter for the kernel, things are slightly more complicated than with the Sobel kernel, which has a fixed size of 3x3.
If I understand the explanation correctly, I first need to separate the kernel into x and y components, which can be done by just taking the top row and left column of the original kernel (obviously they are the same, but I decided to keep them separate as I have the 2D kernel already calculated). Below is the matrix separated:
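In code, the separation described above is just slicing out the first row and first column (sketch using the same gaussianKernel function as above):

```python
import math
import numpy as np

def gaussianKernel(size, sigma):
    # same kernel function as above, repeated so this snippet is standalone
    kernel = np.fromfunction(lambda x, y: (1/(2*math.pi*sigma**2)) * math.e ** ((-1*((x-(size-1)/2)**2+(y-(size-1)/2)**2))/(2*sigma**2)), (size, size))
    return kernel / np.sum(kernel)

kernel = gaussianKernel(5, 1.5)
rowVec = kernel[0]     # top row, used for the x pass
colVec = kernel[:, 0]  # left column, used for the y pass
# by the symmetry of the Gaussian, the two vectors are identical
```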
From these row and column vectors, I need to go through each value and multiply that 'window' of the array by it element-wise, shifting the reduced-size window along the array to the right after each one. To show more clearly what I think I need to do, these are the 3 different 'windows' I am talking about for a small image with a kernel size of 3x3:
```
 _______3_______
|____ _2_______ |
|____|_1__|____|_|_|
|    |    |    | | |
|123,|213,|124,|114,|175|
|235,|161,|127,|215,|186|
|128,|215,|111,|141,|221|
|224,|171,|193,|127,|117|
|146,|245,|129,|213,|221|
|152,|131,|150,|112,|171|
```
So for each 'window', you multiply by the corresponding value in the kernel vector and add the result to a running total. Then, take the image which has had the x component of the Gaussian kernel applied to it and do the same for the y component.
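To make that concrete, here is a tiny standalone sketch of one horizontal pass using shifted slices (the numbers and the 1D kernel here are my own made-up example, not from the image above):

```python
import numpy as np

img = np.array([[1., 2., 3., 4.],
                [5., 6., 7., 8.]])
vec = np.array([0.25, 0.5, 0.25])  # hypothetical 1D kernel vector
k = len(vec)

# each kernel value multiplies a window of columns shifted one step right
out = np.zeros((img.shape[0], img.shape[1] - k + 1))
for i, v in enumerate(vec):
    out += v * img[:, i : img.shape[1] - k + i + 1]
# each output column is now a weighted sum of k neighbouring input columns
```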
These are the steps I think I can take to calculate the Gaussian blur much faster than with the nested for-loops above, and here is the code that I wrote to try and do it:
```python
def gaussianBlur(img, kSize, kSigma):
    kernel = gaussianKernel(kSize, kSigma)
    gausX = np.zeros((img.shape[0], img.shape[1] - kSize + 1))
    for i, v in enumerate(kernel[0]):
        gausX += v * img[:, i : img.shape[1] - kSize + i + 1]
    gausY = np.zeros((gausX.shape[0] - kSize + 1, gausX.shape[1]))
    for i, v in enumerate(kernel[:, 0]):
        gausY += v * gausX[i : img.shape[0] - kSize + i + 1]
    return gausY
```
My problem is that this function produces the right 'blurring effect', but for some reason the output values are all floats between 0 and 3. Luckily, for some other reason, matplotlib can still display the output fine, so I can check that it has blurred the image correctly.

The question is simply: why are the pixel values coming out between 0 and 3?

I have debugged for hours but cannot spot the reason. I am pretty sure that there is just a little scaling detail somewhere, but I just can't find it. Any help would be much appreciated!
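For reference, the effect is easy to reproduce standalone with a random test image (the test image here is made up; the functions are the same as above):

```python
import math
import numpy as np

def gaussianKernel(size, sigma):
    kernel = np.fromfunction(lambda x, y: (1/(2*math.pi*sigma**2)) * math.e ** ((-1*((x-(size-1)/2)**2+(y-(size-1)/2)**2))/(2*sigma**2)), (size, size))
    return kernel / np.sum(kernel)

def gaussianBlur(img, kSize, kSigma):
    kernel = gaussianKernel(kSize, kSigma)
    gausX = np.zeros((img.shape[0], img.shape[1] - kSize + 1))
    for i, v in enumerate(kernel[0]):
        gausX += v * img[:, i : img.shape[1] - kSize + i + 1]
    gausY = np.zeros((gausX.shape[0] - kSize + 1, gausX.shape[1]))
    for i, v in enumerate(kernel[:, 0]):
        gausY += v * gausX[i : img.shape[0] - kSize + i + 1]
    return gausY

rng = np.random.default_rng(0)
img = rng.integers(0, 256, (20, 20)).astype(float)  # greyscale values 0-255
out = gaussianBlur(img, 5, 1.5)
print(out.min(), out.max())  # comes out in roughly the 0-3 range, not 0-255
```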