I am looking for an efficient way to apply a function to each row of a numpy array. I know that I can loop over the rows with a for loop, and that I can use numpy.apply_along_axis for a function that returns a value, but is there a way to do something like apply_along_axis with a function that returns nothing?

For example, if I have:

import numpy

a_list_of_strings = []

A = numpy.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

def funct(row):
    # Appends to the module-level list as a side effect; returns nothing.
    a_list_of_strings.append(some_other_complicated_function(row))

Is there an analogous way to use numpy.apply_along_axis with A and this function?

Edit: The reason I want to do this is so that I can write a better implementation of the following code (which I didn't originally post because I thought it was hard to read).

def entropy (array):
    """calculates the entropy of a static array"""
    values = []
    for i in range(3,len(types)+3):
        p_s = sum(array[:,0]==i)/array.shape[0]
        values.append(p_s)
    return(sum(list(map(helper_ent,values))))

def helper_ent (p):
    return (-p*np.log2(p+eps))

def calc_entropies (array):
    entropies = []
    eap = entropies.append
    for i in range(1,len(array)+1):
        split1 = array[array[:,i]==1.]
        split2 = array[array[:,i]==0.]
        if split1.shape[0]>0 and split2.shape[0]>0:
            E = split1.shape[0]/array.shape[0]*entropy(split1)+split2.shape[0]/array.shape[0]*entropy(split2)
            eap(E)
        else:
            eap(1)
    return(np.array(entropies))
Sara
  • Just to check, is there a reason you can't have a function that also takes a_list_of_strings as an argument? That way it wouldn't have to be global, and you can have it return the list of strings as the output. – neophlegm Jan 19 '18 at 03:02
  • Well, I tried to come up with a toy example, since my actual code is pretty involved. Basically, I'm trying to use each column as an indicator function to split the array into 2 sub-arrays, and then calculate values based on the sub-arrays. I'll add that into the question. – Sara Jan 19 '18 at 03:29
  • I think you're asking if you can do an in-place mapping, if I'm not mistaken. – jxramos Jan 19 '18 at 03:41
  • I'm not sure I know what you mean by that. For each column, I want to use the values of that column to split the array into 2 smaller arrays composed of rows from the original array. I then have a function that I apply to those smaller arrays that yields a float, and I end up with a 1D array of those floats from doing that for every column. – Sara Jan 19 '18 at 03:48
  • You don't want the empty () when defining `eap` do you? – hpaulj Jan 19 '18 at 04:23
  • You're right, I don't. I copied over old code by mistake. – Sara Jan 19 '18 at 04:25
  • Have you tried iterating on `A` as though it was a list of lists? – hpaulj Jan 19 '18 at 04:25
  • Yes, but it's very very slow. I was hoping there was something matrix-y I could do to speed it up. – Sara Jan 19 '18 at 04:27
  • If you're working in numpy, you don't need `helper` in your inner loop. Can I use what you defined as `A` for the example? – Paul H Jan 19 '18 at 04:30
  • Sure. I know I can apply algebra to arrays, but `helper_ent` is just given a small 1D array, so I don't think it's a major contributor to the time issue. – Sara Jan 19 '18 at 04:34
  • `apply_along_axis` doesn't speed up row iteration; it's just a convenience function, and is most useful when working with 3d or higher arrays. The way to get more speed is to figure out how to use `numpy` functions to work with the whole array, not just row by row. As a simple example, `np.sum(a, axis=1)` sums each row. – hpaulj Jan 19 '18 at 05:49
  • https://stackoverflow.com/q/44239498 - for a fuller discussion of applying a generic function over rows. – hpaulj Jan 19 '18 at 14:51
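
To illustrate the whole-array approach suggested in the comments, here is a minimal sketch of how the per-column split entropy from the question might be computed without a Python loop over columns. It assumes that column 0 holds the class label and every other column is a 0/1 indicator. The names `calc_entropies_vectorized` and `class_values` and the default `eps` are hypothetical stand-ins (the question's `types` and `eps` are not shown), so treat this as one possible reading of the intent rather than a drop-in replacement.

```python
import numpy as np

def calc_entropies_vectorized(array, class_values, eps=1e-12):
    """Weighted split entropy for every indicator column at once.

    Assumed layout: array[:, 0] is the class label, array[:, 1:] are 0/1 indicators.
    """
    labels = array[:, 0]
    features = array[:, 1:]
    n_rows = array.shape[0]

    # One-hot encode the labels: shape (n_rows, n_classes).
    onehot = (labels[:, None] == np.asarray(class_values)[None, :]).astype(float)

    # Membership masks for the two splits: shape (n_rows, n_features).
    ones = (features == 1.0).astype(float)
    zeros = (features == 0.0).astype(float)

    # counts[c, j] = number of rows of class c that fall in the split of feature j.
    counts1 = onehot.T @ ones
    counts0 = onehot.T @ zeros
    n1 = ones.sum(axis=0)
    n0 = zeros.sum(axis=0)

    def split_entropy(counts, totals):
        # Class probabilities within each split; guard against empty splits.
        p = counts / np.maximum(totals, 1)
        return -(p * np.log2(p + eps)).sum(axis=0)

    weighted = (n1 / n_rows) * split_entropy(counts1, n1) \
             + (n0 / n_rows) * split_entropy(counts0, n0)

    # Mirror the question's fallback: use 1 when either split is empty.
    return np.where((n1 > 0) & (n0 > 0), weighted, 1.0)

example = np.array([
    [3, 1, 0, 1],
    [3, 1, 1, 1],
    [4, 0, 0, 1],
    [4, 0, 1, 1],
], dtype=float)
print(calc_entropies_vectorized(example, class_values=[3, 4]))
```

For this small example the three indicator columns come out at approximately 0, 1 and 1: the first column separates the classes perfectly, the second tells you nothing, and the third has an empty split and so takes the fallback value. The matrix products replace both the loop over columns and the loop over class values.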

0 Answers