I want to build a clothing classifier that takes a photo of an item of clothing and classifies it as 'jeans', 'dress', 'trainers' etc.
Some examples:
These images are from retailer websites, so they are typically taken from the same angle on a white or pale background, and they tend to look very similar.
I have a set of several thousand images whose category I already know, which I can use to train a machine-learning algorithm.
However, I'm struggling for ideas of what features I should use. The features I have so far:
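
    from PIL import Image

    def get_aspect_ratio(pil_image):
        # Use the image's own dimensions. getbbox() gives the bounding box
        # of the non-zero pixels and returns None for an all-black image.
        width, height = pil_image.size
        # float() keeps this a true division on Python 2 as well.
        return float(width) / height

    def get_greyscale_array(pil_image):
        """Convert the image to a 13x13 square grayscale image, and return a
        list of colour values 0-255.

        I've chosen 13x13 as it's very small but still allows you to
        distinguish the gap between legs on jeans in my testing.
        """
        grayscale_image = pil_image.convert('L')
        # Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the
        # same filter under its current name.
        small_image = grayscale_image.resize((13, 13), Image.LANCZOS)
        pixels = []
        # Row-major order: pixel0..pixel12 are the top row, left to right.
        for y in range(13):
            for x in range(13):
                pixels.append(small_image.getpixel((x, y)))
        return pixels

    def get_image_features(image_path):
        features = {}
        # The context manager closes the underlying file when we're done.
        with Image.open(image_path) as image:
            features['aspect_ratio'] = get_aspect_ratio(image)
            for index, pixel in enumerate(get_greyscale_array(image)):
                features["pixel%s" % index] = pixel
        return features
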
I'm extracting a simple 13x13 grayscale grid as a crude approximation of shape. However, using these features with nltk's NaiveBayesClassifier only gets me 34% accuracy.
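
For context, here is a minimal sketch of how feature dicts like these could be fed to nltk's NaiveBayesClassifier and scored on held-out images. It is an illustration under assumptions rather than my exact setup: `split_examples`, `train_and_score`, the 80/20 split and the `labelled_paths` list mentioned in the comments are all hypothetical names and choices.

```python
import random


def split_examples(examples, test_fraction=0.2, seed=0):
    """Shuffle (features, label) pairs and split them into train/test lists.

    Hypothetical helper; the 20% test fraction is an arbitrary choice.
    """
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def train_and_score(examples):
    """Train nltk's NaiveBayesClassifier and return held-out accuracy."""
    # Imported here so split_examples can be used without nltk installed.
    import nltk

    train_set, test_set = split_examples(examples)
    # train() expects a list of (feature_dict, label) pairs.
    classifier = nltk.NaiveBayesClassifier.train(train_set)
    return nltk.classify.accuracy(classifier, test_set)


# Assumed usage, where labelled_paths is a hypothetical list of
# (image_path, category) pairs and get_image_features is defined above:
#
#   examples = [(get_image_features(path), category)
#               for path, category in labelled_paths]
#   print(train_and_score(examples))
```

One thing to keep in mind with this setup is that nltk's NaiveBayesClassifier treats every feature value as a discrete label, so a pixel value of 200 and one of 201 are handled as unrelated categories rather than as nearby numbers.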
What features would work well here?