How to find basic shapes (brick, cylinder, sphere) in an image using the Sobel operator?

Question

I have calculated the Sobel gradient magnitude and direction. But I'm stuck on how to use this further for shape detection.

Image > grayscale > Sobel filter > gradient magnitude and direction > next?

The Sobel kernels used are:

Kx = ([[1, 0, -1],[2, 0, -2],[1, 0, -1]]) 
Ky = ([[1, 2, 1],[0, 0, 0],[-1, -2, -1]])

(I am restricted to Python with NumPy only; no other libraries are allowed.)

import numpy as np

def classify(im):
    # Convert to grayscale, scaling pixel values to [0, 1]
    gray = convert_to_grayscale(im / 255.)

    # Sobel kernels as NumPy arrays
    Kx = np.array([[1, 0, -1], [2, 0, -2], [1, 0, -1]])
    Ky = np.array([[1, 2, 1], [0, 0, 0], [-1, -2, -1]])

    Gx = filter_2d(gray, Kx)
    Gy = filter_2d(gray, Ky)

    # Gradient magnitude and direction
    G = np.sqrt(Gx**2 + Gy**2)
    G_direction = np.arctan2(Gy, Gx)

    labels = ['brick', 'ball', 'cylinder']
    # Let's guess randomly! Maybe we'll get lucky.
    random_integer = np.random.randint(low=0, high=3)

    return labels[random_integer]

def filter_2d(im, kernel):
    '''
    Filter an image by taking the dot product of each
    image neighborhood with the kernel matrix (no padding,
    so the output is smaller than the input).
    '''

    M = kernel.shape[0] 
    N = kernel.shape[1]
    H = im.shape[0]
    W = im.shape[1]

    filtered_image = np.zeros((H-M+1, W-N+1), dtype = 'float64')

    for i in range(filtered_image.shape[0]):
        for j in range(filtered_image.shape[1]):
            image_patch = im[i:i+M, j:j+N]
            filtered_image[i, j] = np.sum(np.multiply(image_patch, kernel))

    return filtered_image

def convert_to_grayscale(im):
    '''
    Convert color image to grayscale.
    '''
    return np.mean(im, axis = 2)

Why start with edge detection? You're pushing yourself into a difficult corner this way... If you want better hints, please share the input image. — Cris Luengo, Sep 10 '18 at 22:11
@Cris How do you suggest starting otherwise? I believe edge detection is a crucial intermediate step. — KshitijJaju, Sep 10 '18 at 22:59
Looking at your images, I'd say a simple threshold will give you a binary mask for the shape. You can then compute shape features to determine what shape you're looking at. — Cris Luengo, Sep 10 '18 at 23:08
@Cris Can you elaborate on the "compute shape features" part? That is where I am stuck: once I have the binary image from some threshold, how do I identify whether it's a brick or a sphere? — KshitijJaju, Sep 10 '18 at 23:11
@Yuves I mean recognizing the shape in the image: whether it is a circle/sphere for the ball, or a brick. — KshitijJaju, Sep 10 '18 at 23:13
Look around on [some of my old blog posts about measuring](https://www.crisluengo.net/index.php/archives/tag/measure) for how to compute basic features such as area, perimeter, Feret diameters, etc. of objects. You can then use some logic such as the ratio of area to squared perimeter (which is maximal for a circle), or the ratio of object area to minimum bounding box area (which is maximal for a rectangle). — Cris Luengo, Sep 10 '18 at 23:32
@CrisLuengo Your blog has some great information, but it is based on chain codes. Can you help me with a quick code snippet for finding area and perimeter that extends my code? — KshitijJaju, Sep 11 '18 at 01:21
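The features suggested in the comments above (area against squared perimeter, and object area against bounding box area) can be approximated with NumPy alone. The following is only a rough sketch, not the chain-code method from the linked blog: `shape_features` is a hypothetical helper, the perimeter is a crude count of boundary pixels, and an axis-aligned bounding box stands in for the minimum bounding box. It assumes a boolean mask containing a single object.

```python
import numpy as np

def shape_features(mask):
    """Rough shape features of one object in a boolean mask (hypothetical helper)."""
    mask = np.asarray(mask, dtype=bool)
    area = int(mask.sum())
    if area == 0:
        return {"area": 0, "perimeter": 0, "circularity": 0.0, "extent": 0.0}

    # Pad with background so objects touching the image border get a boundary.
    p = np.pad(mask, 1, mode="constant", constant_values=False)
    # A pixel is interior if it and its four direct neighbors are all set.
    interior = (p[1:-1, 1:-1] & p[:-2, 1:-1] & p[2:, 1:-1]
                & p[1:-1, :-2] & p[1:-1, 2:])
    boundary = mask & ~interior
    # Assumption: perimeter is estimated by counting boundary pixels.
    perimeter = int(boundary.sum())

    # Area over squared perimeter, scaled by 4*pi (largest for a circle).
    circularity = 4.0 * np.pi * area / perimeter ** 2

    # Assumption: axis-aligned bounding box instead of the minimum one.
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    bbox_area = (rows[-1] - rows[0] + 1) * (cols[-1] - cols[0] + 1)
    extent = area / bbox_area  # largest for an axis-aligned rectangle

    return {"area": area, "perimeter": perimeter,
            "circularity": circularity, "extent": extent}
```

A mask could come from something like `gray > 0.5`, where the threshold value and its polarity depend on the images. Because pixel counting underestimates the length of diagonal runs, the circularity value can exceed 1 and is only meaningful when compared between shapes. A rotated brick would also need a true minimum bounding box for the extent to be useful.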

score 1 · Answer 1 · 2018-09-11T13:52:24.397

You can use the following distinctive characteristics of your shapes:

a brick has several straight edges (from four to six, depending on the point of view);
a sphere has a single curved edge;
a cylinder has two curved edges and two straight edges (though they can be completely hidden).

Use binarization (based on luminance and/or saturation) and extract the outlines. Then find the straight sections, possibly using the Douglas-Peucker simplification algorithm. Finally, analyze the sequences of straight and curved edges.
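As an illustration of the simplification step, here is a minimal NumPy sketch of Douglas-Peucker for an open polyline. It assumes the outline has already been extracted as an ordered array of (x, y) points, which is not shown here; `douglas_peucker` and `epsilon` (the tolerance in pixels) are names chosen for this example.

```python
import numpy as np

def douglas_peucker(points, epsilon):
    """Simplify an ordered open polyline (N x 2 array) with tolerance epsilon."""
    points = np.asarray(points, dtype=float)
    if len(points) < 3:
        return points

    start, end = points[0], points[-1]
    chord = end - start
    norm = np.hypot(chord[0], chord[1])
    rel = points[1:-1] - start
    if norm == 0:
        # Degenerate chord: fall back to the distance from the start point.
        dist = np.hypot(rel[:, 0], rel[:, 1])
    else:
        # Perpendicular distance to the chord via the 2-D cross product.
        dist = np.abs(chord[0] * rel[:, 1] - chord[1] * rel[:, 0]) / norm

    k = int(np.argmax(dist)) + 1  # index of the farthest point in `points`
    if dist[k - 1] <= epsilon:
        # Every intermediate point is close enough: keep only the endpoints.
        return np.array([start, end])

    # Otherwise split at the farthest point and simplify both halves.
    left = douglas_peucker(points[:k + 1], epsilon)
    right = douglas_peucker(points[k:], epsilon)
    return np.vstack([left[:-1], right])  # drop the duplicated split point
```

For a closed outline, one option is to split it at two far-apart points and simplify each half separately. Sections that collapse to a few long segments are candidates for straight edges, while sections that keep many short segments are likely curved.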

A possible way to address the final classification task is to represent each outline as a string of chunks, either straight or curved, with a rough indication of length (short/medium/long). With imperfect segmentation, every shape will correspond to a set of patterns.

You can use a training phase to learn as many patterns as possible, then use string matching (where the strings are treated as loops, so any rotation of a pattern counts as a match). There will probably be ties to be arbitrated. Another option is approximate string matching.
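The matching idea can be sketched as follows, with loud assumptions: each outline is encoded as a string over a hypothetical two-letter alphabet (`S` for a straight chunk, `C` for a curved one, length classes omitted), the pattern table is made up for illustration rather than learned, and approximate matching uses the plain Levenshtein distance minimized over all rotations.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def cyclic_distance(a, b):
    """Smallest edit distance between any rotation of a and b."""
    if not a:
        return len(b)
    return min(edit_distance(a[k:] + a[:k], b) for k in range(len(a)))

def classify_outline(outline, patterns):
    """Return the label whose pattern set is closest to the outline string."""
    candidates = [(cyclic_distance(outline, p), label)
                  for label, strings in patterns.items()
                  for p in strings]
    # Ties are broken alphabetically by label here, which is arbitrary.
    return min(candidates)[1]

# Hypothetical pattern table; a real one would come from a training phase.
PATTERNS = {
    "brick": ["SSSS", "SSSSSS"],
    "ball": ["C"],
    "cylinder": ["CSCS"],
}
```

For example, `classify_outline("SCSC", PATTERNS)` matches the cylinder pattern exactly after one rotation. How ties should really be arbitrated is left open, as in the text above.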

“Finally, analyze the sequences of straight and curved edges.” It looks to me that you abstract away the most difficult bit... — Cris Luengo, Sep 11 '18 at 13:42
@CrisLuengo: no, the real challenge is the correct segmentation into lines and curves. The rest is peanuts in comparison. Anyway, I'll add a comment. — , Sep 11 '18 at 13:45
