If I use this function

import numpy as np
from numba import jit

@jit(nopython=True)
def diss_matrix(data):
    n = data.shape[0]
    diss = np.zeros((n, n))  # zeros rather than empty: the loop never writes the diagonal
    for i in range(n):
        for j in range(i):
            dist = np.absolute(data[i] - data[j]).sum()
            diss[i, j] = dist
            diss[j, i] = dist
    return diss

x = np.random.randn(100)
print(diss_matrix(x))

I get this error

numba.errors.UntypedAttributeError: Failed at nopython (nopython frontend)
Unknown attribute 'sum' of type float64
File "test_numba.py", line 11
[1] During: typing of get attribute at 
c:/Users/matte/Dropbox/Università/SDS/Thesis/source/test_numba.py (11)

I have been trying to understand what this means. The line that triggers the error is

dist = np.absolute(data[i] - data[j]).sum()

but I think the problem is that numba somehow infers that data[i] and data[j] are float64 scalars rather than arrays. In fact, the following code

@jit(nopython=True)
def diss_matrix3():
    vec1 = np.array([1, 2, 3])
    vec2 = np.array([2, 3, 4])
    dist = np.absolute(vec1 - vec2).sum()
    return dist

works flawlessly.

I'm using numba 0.35 and I'm trying to find a way to make that function work. I know about scipy.spatial.distance.pdist, but I need to write my own implementation. Furthermore, the same error may come up again in the future.

Any suggestions?

Matteo Silvestro

1 Answer

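The shape of np.random.randn(100) is (100,), so data[i] is indeed a scalar, not an array. In nopython mode numba therefore types np.absolute(data[i] - data[j]) as a plain float64, which has no sum attribute, hence the error. If you pass a 2-D array instead, for example np.random.randn(100, 100), each data[i] is a 1-D row and the function should work. Have a look at the docs for randn for a more detailed explanation of how its arguments determine the output shape.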
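As a minimal sketch of the above: if the data really is one-dimensional, one option is to reshape it to one column before calling the function, so that each row is a length-1 array. The name x2d and the no-op fallback for jit (used only if numba is not installed) are assumptions added for illustration and are not part of the original question.

```python
import numpy as np

try:
    from numba import jit
except ImportError:
    # Assumption: no-op stand-in so the sketch still runs without numba.
    def jit(*args, **kwargs):
        def wrap(func):
            return func
        return wrap


@jit(nopython=True)
def diss_matrix(data):
    # data is expected to be 2-D: one row per observation.
    n = data.shape[0]
    diss = np.zeros((n, n))  # zeros so the diagonal is defined
    for i in range(n):
        for j in range(i):
            # data[i] and data[j] are 1-D rows here, so .sum() is available.
            dist = np.absolute(data[i] - data[j]).sum()
            diss[i, j] = dist
            diss[j, i] = dist
    return diss


x = np.random.randn(100)
print(x.shape, x[0].ndim)       # 1-D input: indexing yields a scalar

x2d = x.reshape(-1, 1)          # 100 observations, 1 feature each
print(x2d.shape, x2d[0].shape)  # indexing now yields a 1-D row

d = diss_matrix(x2d)
print(d.shape)
```

In plain NumPy, and so in numba's object-mode fallback, the scalar result is an np.float64, which does have a .sum() method. That is why the original version only fails with nopython=True.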

JoshAdel
  • Thank you! I was desperate and didn't check such a basic thing. What misled me was the fact that it compiled with `nopython=False`. – Matteo Silvestro Oct 07 '17 at 13:43