I consider an RGB image as a discretization of a vector field f whose domain is the 2D grid of pixels and whose image is a subset of the RGB cube (each colour is a vector with three components).
I want to compute a scalar field g which is related to the gradient of f: for every pixel x the value of g(x) is the maximum Euclidean norm of f(y)-f(x) for y ranging on the pixels adjacent to x.
Finally givena threshold parameter t I want to compute another function h with h(x,t) = 1 if g(x) >= t and h(x,t) = 0 otherwise.
My intuition is that as t increases (starting for example from the average value of g over the entire 2D grid) the rendering of h as a greyscale image should approximate more and more the contours of objects in the original image.
I implemented the above algorithm in Python and noticed how it is slow compared to the performances of convolution operations implemented in Python libraries, even if the above algorithm is not very different from a convolution operation in terms of number of operations. Moreover the results I get are not as neat as I expected.
Here is my Python implementation. I look forward to see possible optimizations, maybe using vectorization techniques and the numpy library.
import numpy as np
import matplotlib.pyplot as plt
import os
directories = ['images']
def filter2(color, mu):
if (color >= mu):
return 1
else:
return 0
max = np.sqrt(3*255*255)
for directory in directories:
for filename in os.scandir(directory):
if filename.is_file():
print(filename.path)
image = plt.imread(filename.path)
plt.imshow(image, cmap='gray')
plt.show()
height = image.shape[0]
width = image.shape[1]
gradient_image = np.zeros((height, width), dtype=float)
for row in range(0, height):
for col in range(0, width):
r1 = np.max([0, row-1])
r2 = np.min([row+2, height])
c1 = np.max([0, col - 1])
c2 = np.min([col + 2, width])
neighborhood = image[r1:r2, c1:c2]
n = (r2-r1)*(c2-c1)
neighborhood = neighborhood.reshape(n, 3).astype('float')
differences = np.array([np.subtract(v, image[row][col]) for v in neighborhood])
dot_products = np.array([np.dot(v, v) for v in differences])
gradient_image[row][col] = np.sqrt(dot_products.max()) / max
mu = gradient_image.mean()
vfilter2 = np.vectorize(filter2)
gradient_image = vfilter2(gradient_image, mu)
plt.imshow(gradient_image, cmap='gray')
plt.show()