1

I have just begun learning Machine Learning using Python. I have written the following class which gives an error:

TypeError: can't multiply sequence by non-int of type 'float'

import numpy as np

class Perceptron(object):
    def __init__(self, eta=0.01, n_iter=10):
        self.eta = eta                          # Learning Rate
        self.n_iter = n_iter                    # Number of iteration over the training dataset

    def fit(self, x, y):
        # x = {array-like} : shape[no_of_samples, no_of_features]
        self.w_ = np.zeros(1 + x.shape[1])      # Initialize weights to zero
        self.errors_ = []                       # No errors in the beginning of the computation
        for _ in range(self.n_iter):
            errors = 0
            for xi, target in zip(x, y):
                update = self.eta * (target - self.predict(xi))
                self.w_[1:] += update * xi
                self.w_[0] += update
                errors += int(update != 0.0)
            self.errors_.append(errors)

        return self

    def net_input(self, x):
        return np.dot(x, self.w_[1:]) + self.w_[0]

    def predict(self, x):
        return np.where(self.net_input(x) >= 0.0, 1, -1) 

The error is raised in the net_input() method, at the np.dot() call. I am using the following dataset: https://raw.githubusercontent.com/uiuc-cse/data-fa14/gh-pages/data/iris.csv

ml4294
Satya Prakash

4 Answers

1

If you're reading the training data (data and predictions) from the file iris.csv

sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,setosa
4.9,3,1.4,0.2,setosa

with:

data = pd.read_csv("iris.csv")

make sure that you define X from the first four columns only; otherwise it will also contain the strings from the last column:

X = data.iloc[:,0:4]

And the target values (the species column, which is at index 4, not 5):

y = data.iloc[:,4]
y = y.values.reshape(150,1)
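
A minimal sketch of the loading step, under a few assumptions that are not part of the answer above: the inline sample rows stand in for iris.csv, the task is treated as setosa versus everything else, and the labels are mapped to +1 / -1 so that `target - self.predict(xi)` is a numeric subtraction. It converts to NumPy arrays because `fit()` iterates over rows with `zip(x, y)`.

```python
import io

import numpy as np
import pandas as pd

# Assumption: a tiny inline sample standing in for iris.csv (same header).
csv_text = """sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,setosa
4.9,3,1.4,0.2,setosa
7.0,3.2,4.7,1.4,versicolor
6.4,3.2,4.5,1.5,versicolor
"""
data = pd.read_csv(io.StringIO(csv_text))

# Features: the four numeric columns, as a float NumPy array.
X = data.iloc[:, 0:4].to_numpy(dtype=float)

# Targets: the species column is at index 4; map it to +1 / -1.
y = np.where(data.iloc[:, 4].to_numpy() == "setosa", 1, -1)

# Shape and dtype are worth checking before calling fit().
print(X.shape, X.dtype)
print(y)
```

With the real file, `io.StringIO(csv_text)` would be replaced by the path to iris.csv.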
user2314737
0

The following changes would help.

def fit(self, x, y):
    ...
    for xi, target in zip(x, y):
        update = self.eta * (target - self.predict(xi.reshape(1, x.shape[1])))
        ...

# Here if you want to implement perceptron, use matmul not dot product
def net_input(self, x):
    return np.matmul(x, self.w_[1:]) + self.w_[0]
Ishant Mrinal
  • `np.dot` is element-wise multiplication `np.matmul` is matrix multiplication. Here in this special case `(1, n)(n, 1)` it would be same; NOT OTHERWISE – Ishant Mrinal Aug 18 '17 at 13:17
  • For element-wise multiplication dimensions of the matrices have to be same. `np.multiply` is the element-wise multiplication. (Operator is `*`). For matrix multiplication result of `np.dot` and `np.matmul` (Operator is `@`) are same. – Ramesh-X Aug 18 '17 at 13:29
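
The point made in the second comment can be checked with a small sketch. The arrays below are made-up values chosen only for illustration; for a 2-D input and a 1-D weight vector, `np.dot` and `np.matmul` should agree, while `np.multiply` is the element-wise operation.

```python
import numpy as np

a = np.array([[1.0, 2.0, 3.0]])      # shape (1, 3), one sample
w = np.array([0.5, -1.0, 2.0])       # shape (3,), illustrative weights

dot_result = np.dot(a, w)            # matrix-vector product, shape (1,)
matmul_result = np.matmul(a, w)      # same product for these 2-D / 1-D inputs
elementwise = np.multiply(a, w)      # element-wise product, shape (1, 3)

print(dot_result, matmul_result)
print(elementwise)
```

Summing the element-wise products gives the same number as the dot product, which is why the two are easy to confuse.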
0

Check the shape of x.

If it is (a, 1) where a is a number, use this:

def net_input(self, x):
    return np.dot(x.T, self.w_[1:]) + self.w_[0]

If it is (1, a) use this:

def net_input(self, x):
    return np.dot(x, self.w_[1:].T) + self.w_[0]
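
A short sketch of the shape check suggested above, assuming x is already a 2-D float array (the two rows are taken from the sample shown in the first answer). Each row yielded by iterating over such an array is 1-D, so no transpose is needed in that case; a column vector of shape (a, 1) does need `.T`.

```python
import numpy as np

x = np.array([[5.1, 3.5, 1.4, 0.2],
              [4.9, 3.0, 1.4, 0.2]])   # shape (2, 4)
w = np.zeros(1 + x.shape[1])           # bias plus one weight per feature

for xi in x:
    # Each row is 1-D with shape (4,); .T would be a no-op here.
    row_shape = xi.shape
    net = np.dot(xi, w[1:]) + w[0]

# A column vector of shape (4, 1) must be transposed first.
column = x[0].reshape(-1, 1)
net_col = np.dot(column.T, w[1:]) + w[0]   # result has shape (1,)

print(row_shape, net, net_col.shape)
```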
Ramesh-X
0

My guess is that x is an object dtype array of lists.

If I define an object dtype array:

In [45]: x=np.empty((2,),object)
In [46]: x[:]=[[1,2],[3,4]]
In [49]: x
Out[49]: array([list([1, 2]), list([3, 4])], dtype=object)

I get the same error with a list (or array) of floats:

In [50]: np.dot(x, [1.2,4.5])
...
TypeError: can't multiply sequence by non-int of type 'float'

If instead I give it integers, it works, sort of:

In [51]: np.dot(x, [1,2])
Out[51]: [1, 2, 3, 4, 3, 4]

What it has actually done is [1,2]*1 and [3,4]*2, i.e. list replication, and then joined the two resulting lists. This is not numeric multiplication.
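
The same behaviour can be reproduced without np.dot at all. This is a minimal sketch in plain Python, with a NumPy array added for contrast; the variable names are illustrative.

```python
import numpy as np

row = [1, 2]

# An int multiplier replicates the list instead of scaling its values.
replicated = row * 2
print(replicated)

# A float multiplier is rejected, which is the error from the question.
try:
    row * 1.2
except TypeError as exc:
    message = str(exc)
    print(message)

# A float array multiplies element by element.
numeric = np.array(row, dtype=float) * 1.2
print(numeric)
```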

That's the only combination of variables that makes sense of the error message.

So you need to figure out why x is an object array. Often that's the result of building an array from lists that differ in length (NumPy 1.24 and later raise a ValueError here unless dtype=object is passed explicitly):

In [54]: x = np.array([[1,2],[3,4,5]])
In [55]: x
Out[55]: array([list([1, 2]), list([3, 4, 5])], dtype=object)

So the basic question when faced with an error like this is: what are the shape and dtype of the variables?
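
One way to answer that question in practice is sketched below. The `describe` helper is hypothetical, written only for this illustration, and the conversion assumes every row has the same length.

```python
import numpy as np

def describe(name, arr):
    """Hypothetical helper: summarize the shape and dtype of a variable."""
    arr = np.asarray(arr)
    return f"{name}: shape={arr.shape}, dtype={arr.dtype}"

# Rebuild the object array of lists from the example above.
x_bad = np.empty((2,), dtype=object)
x_bad[0] = [1, 2]
x_bad[1] = [3, 4]
print(describe("x_bad", x_bad))

# Convert to a proper 2-D float array; this requires equal-length rows.
x_good = np.array(x_bad.tolist(), dtype=float)
print(describe("x_good", x_good))

# With a numeric dtype, np.dot performs real multiplication.
result = np.dot(x_good, [1.2, 4.5])
print(result)
```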

hpaulj