Python SVM setting an array element with a sequence error

Question

I'm trying to use the SVM from the sklearn library to perform some image recognition, but when I call the fit method I get "ValueError: setting an array element with a sequence." My code is as follows.

My testing.py file:

import matplotlib.pyplot as plt
import numpy as np
from sklearn import svm
from imageToNumberArray import imageToNumberArray

classAndValuesFile = "../Classes_Values.txt"
classesFiles = "../"

testImage = "ImageToPerformTestOn.png"

x = []
y = []

def main():
    i = 0
    with open(classAndValuesFile) as f:
        for line in f:
            splitter = line.split(",", 2)
            x.append(imageToNumberArray(classesFiles + splitter[0]))
            y.append(splitter[1].strip())

    clf = svm.SVC(gamma=0.001, C=100)
    clf.fit(x,y)
    #print clf.predict(testImage)

The imageToNumberArray file is:

from PIL import Image
from numpy import array


def imageToNumberArray(path):
    img = Image.open(path)
    arr = array(img)
    return arr

And I'm getting the following error:

Traceback (most recent call last):
  File "D:\Research\project\testing.py", line 30, in <module>
    main()
  File "D:\Research\project\testing.py", line 23, in main
    clf.fit(x,y)
  File "C:\Python27\lib\site-packages\sklearn\svm\base.py", line 139, in fit
    X = check_array(X, accept_sparse='csr', dtype=np.float64, order='C')
  File "C:\Python27\lib\site-packages\sklearn\utils\validation.py", line 344, in check_array
    array = np.array(array, dtype=dtype, order=order, copy=copy)
ValueError: setting an array element with a sequence.

If I comment out the clf.fit line, it works just fine.

Also, if I print the shapes of all the arrays in x, I get something like this (some are 2D, some are 3D):

(59, 58, 4)
(49, 27, 4)
(570, 400, 3)
(471, 364)
(967, 729)
(600, 600, 3)
(325, 325, 3)
(386, 292)
(86, 36, 4)
(49, 26, 4)
(578, 244, 3)
(300, 300)
(995, 557, 3)
(1495, 677)
(400, 400, 3)
(200, 230, 3)
(74, 67, 4)
(49, 34, 4)
(240, 217, 3)
(594, 546, 4)
(387, 230, 3)
(297, 273, 4)
(400, 400, 3)
(387, 230, 3)
(86, 62, 4)
(50, 22, 4)
(499, 245, 3)
(800, 566, 4)
(1050, 750, 3)
(400, 400, 3)
(499, 245, 3)
(74, 53, 4)
(47, 26, 4)
(592, 348, 4)
(1050, 750, 3)
(1600, 1600)
(320, 320)
(84, 54, 4)
(47, 25, 4)
(600, 294, 3)
(400, 400, 3)
(1050, 750, 3)
(1478, 761)
(504, 300, 3)
(53, 84, 4)
(36, 42, 4)
(315, 600, 4)
(223, 425, 3)
(194, 325, 3)

The first two numbers are the height and width of the image in pixels; the third, where present, is the number of channels.

What can I do to get rid of this error?

You almost definitely want to extract features from your images before doing any kind of machine learning (although I know KNN can work well for digit recognition). Check this out: http://www.codeproject.com/Articles/619039/Bag-of-Features-Descriptor-on-SIFT-Features-with-O — Ryan, Aug 24 '15 at 16:09
Perhaps [this](http://stackoverflow.com/questions/25485503/valueerror-setting-an-array-element-with-a-sequence-while-using-svm-in-scikit) can help you. — deborah-digges, Aug 24 '15 at 16:12

score 2 · Answer 1 · answered Aug 25 '15 at 22:53

You seem to be confused about how SVM works. In short, x has to be a single two-dimensional array with one row per image and the same number of columns in every row, while in your case it is a list of matrices of various shapes. NumPy cannot pack those into one rectangular array of floats, which is what raises the error, so SVM will never run on such data. First, find a meaningful (for your data) way to represent each image as a constant-size vector; this step is often called feature extraction. One basic approach is to represent each image as a histogram or as a bag of visual words.
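
As a rough illustration of the histogram idea, here is a minimal sketch that maps arrays of any size to vectors of the same length. The function name histogram_features and the choice of 32 bins are made up for this example, and it assumes 8-bit grayscale, RGB or RGBA pixel values in the range 0 to 255. Random arrays stand in for what imageToNumberArray returns. An intensity histogram is a very crude feature and is only meant to show the required shape of x, not to give good recognition results.

```python
import numpy as np


def histogram_features(image, bins=32):
    """Map an image array of any size to a fixed-length vector (hypothetical helper)."""
    arr = np.asarray(image, dtype=np.float64)
    if arr.ndim == 3:
        # Keep at most the first three channels (drops alpha) and average them.
        arr = arr[..., :3].mean(axis=2)
    # Assumes pixel values lie in 0..255.
    hist, _ = np.histogram(arr, bins=bins, range=(0, 255))
    # Normalise so that the image size does not affect the feature values.
    return hist / hist.sum()


# Stand-ins for real images: different sizes and channel counts, as in the question.
rng = np.random.RandomState(0)
images = [
    rng.randint(0, 256, size=(59, 58, 4)),
    rng.randint(0, 256, size=(471, 364)),
    rng.randint(0, 256, size=(570, 400, 3)),
]

# Every row now has the same length, so this is a proper 2D array.
X = np.array([histogram_features(img) for img in images])
print(X.shape)  # (3, 32)
```

With real files, something like Image.open(path).convert("L") from PIL should give a single-channel image whatever the file's original mode, and the resulting X can then be passed to clf.fit(X, y) together with the list of labels.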
