
I want to detect whether an event is happening on my screen. Every time it happens, a particular box/image with a very similar structure shows up in a specific screen region.

I have collected a bunch of 84x94 .png RGB images from that screen region and I'd like to build a classifier to tell me if the event is happening or not.

My idea was to create a pd.DataFrame (df) with 2 columns: df['np_array'] holds every picture as an np.array, and df['is_category'] holds boolean values indicating whether that image shows the event happening.

I have resized the images to 10x10 for training and converted them to greyscale. The structure looks like this (with a different size):

import random
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'np_array': [np.random.random((10, 10, 2)) for _ in range(10)],
     'is_category': [bool(random.getrandbits(1)) for _ in range(10)]
    })

My problem is that I can't fit a scikit-learn classifier by calling clf.fit(df['np_array'], df['is_category']).

I've never tried image recognition before, so thanks in advance for any help!

EduGord

1 Answer


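scikit-learn estimators expect a 2D array of shape (n_samples, n_features). If it's a 10x10 grayscale image, you can flatten it into a vector of 100 features: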

import numpy as np
from sklearn import ensemble

# generate 100 random 10x10 images; samples are on the first axis
image_data = np.random.rand(100, 10, 10)

# generate random binary labels, one per image
labels = np.random.randint(0, 2, 100)

# flatten each image into a row of 100 features: shape (100, 100)
X = image_data.reshape(100, -1)
y = labels

# then use any scikit-learn classification model
clf = ensemble.RandomForestClassifier()
clf.fit(X, y)
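
Applied to a DataFrame like the one in the question, the same flattening step might look like the sketch below. The random 10x10 images, the sample count of 40, the 25% test split, and the fixed random_state values are all assumptions made for illustration; only the column names df['np_array'] and df['is_category'] come from the question.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-in for the question's DataFrame (assumed data): 40 random
# 10x10 greyscale images with random boolean labels.
df = pd.DataFrame({
    "np_array": [rng.random((10, 10)) for _ in range(40)],
    "is_category": rng.integers(0, 2, size=40).astype(bool),
})

# Stack the per-row arrays into one (n_samples, 10, 10) array,
# then flatten each image into a row of 100 features.
images = np.stack(df["np_array"].tolist())
X = images.reshape(len(images), -1)
y = df["is_category"].to_numpy()

# Hold out part of the data so accuracy is measured on unseen images.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

clf = RandomForestClassifier(random_state=0)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

Passing df['np_array'] directly fails because the column is a 1D object Series whose elements are arrays, not a single 2D numeric array. Stacking first gives scikit-learn the (n_samples, n_features) layout it expects. With random labels, as here, the accuracy should hover around chance.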

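By the way, for image classification in general, convolutional neural networks tend to be the best-performing algorithms, although a simpler model may be enough for small, highly regular images like these.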

Abhishek Thakur
    Thank you very much for the solution and the algorithm tip, this solved my problem! As I said, the images were pretty similar; with fewer than 500 pictures I got 100% accuracy on the testing set. – EduGord Feb 15 '17 at 15:40