
I would like to get the same result as Matlab's accumarray function in Python. I know that there are some other discussions which provide solutions to this problem but the case I consider seems to be more difficult.

It corresponds to the situation of this Matlab script which contains the following line:

binned_data = accumarray(bins(all(bins>0,2),:),1/nrows,M(ones(1,ncols)));

I tried to understand what this computation does by reading the official documentation of accumarray, but it is not clear to me.

Do you understand the meaning of this line and know how to get the same result with some Python library (numpy, scipy, pandas, ...)?

EDIT: as far as I understand, my question is different from this one. As you can notice, accumarray has 3 input parameters in my case whereas there are only 2 input parameters in the example from the other discussion. Furthermore, no example of usage of accumarray is provided in the other discussion: the reference to this function only appears in the title (only a very restrictive definition of the function is given by the author). In my case, there is a practical example which seems to be more general than the one considered in the other discussion.

Aleph
  • I edited my question to explain why it is not a duplicate as far as I understand. Actually, the title of the discussion you refer to should be modified because Matlab's accumarray function does not appear in the body of the question, except that it gives a definition of this function which seems to be wrong in the general case if we consider the description given in the official documentation. – Aleph Apr 25 '19 at 19:02
  • Makes sense. It doesn't look like there's a direct equivalent, but [here](https://scipy-cookbook.readthedocs.io/items/AccumarrayLike.html) it looks like someone built something scipy-based. Maybe that solution will work for your case? – G. Anderson Apr 25 '19 at 19:40
  • It does not work because the second argument is a scalar. Actually, I do not understand why it is a scalar, because the examples given in the official documentation of accumarray take a vector as the second argument. The error I get with the accum function you refer to is "ValueError: The initial dimensions of accmap must be the same as a.shape", which seems to mean that Matlab is able to interpret a scalar as a vector, but I do not understand how it works. – Aleph Apr 26 '19 at 13:48

1 Answer


I have not tested it in every detail, but this function should be what you are looking for. It handles a couple of the things the Matlab function does.

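import numpy as np
import pandas as pd


def accumarray(subs, vals, size=None, fun=np.sum):
    # subs: zero-based indices (Series for 1-D, two-column DataFrame for 2-D)
    # vals: Series of values to accumulate; fun: aggregation applied per bin
    if len(subs.shape) == 1:
        if size is None:
            size = [subs.values.max() + 1, 0]

        acc = vals.groupby(subs).agg(fun)
    else:
        if size is None:
            # default output size: largest index in each column, plus one
            size = (subs.max() + 1).tolist()

        subs = subs.copy().reset_index()
        by = subs.columns.tolist()[1:]
        # collect the row labels falling into each (row, col) bin, then aggregate
        acc = subs.groupby(by=by)['index'].agg(list).apply(lambda x: vals[x].agg(fun))
        acc = acc.to_frame().reset_index().pivot_table(index=0, columns=1, aggfunc='first')
        # keep the real column indices so that gaps are filled in the right place
        acc.columns = acc.columns.get_level_values(-1)
        acc = acc.reindex(range(size[1]), axis=1).fillna(0)

    # fill bins that received no value with 0
    id_x = range(size[0])
    acc = acc.reindex(id_x).fillna(0)

    return acc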

You can do easy calculations like:

val = pd.Series(np.arange(101, 105+1))
subs = pd.Series([1, 3, 4, 3, 4]) - 1
accumarray(subs, val)

or more complicated things like:

val = pd.Series(np.arange(101, 106+1))
subs = (pd.DataFrame([[1, 1], [2, 2], [3, 2], [1, 1], [2, 2], [4, 1]]) - 1)
accumarray(subs, val, [4, 4])

val = pd.Series(range(1, 10+1))
subs = pd.DataFrame([[1, 1], [1, 1], [1, 1], [1, 1], [2, 1], [2, 1], [2, 1], [2, 1], [2, 1], [2, 2]]) - 1
accumarray(subs, val, None, list)
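
For the specific line in the question, a plain NumPy sketch may be easier to follow. As far as I can tell, the scalar `1/nrows` is simply added once for every row of subscripts, so the result is an N-dimensional histogram of the valid rows divided by `nrows`. The sketch below assumes that `M` is a scalar (so `M(ones(1,ncols))` is the size vector `[M M ... M]`), that `nrows` and `ncols` are the dimensions of `bins`, and that `bins` holds Matlab-style 1-based indices. The function name `binned_fraction` is made up for this example.

```python
import numpy as np


def binned_fraction(bins, M):
    """Sketch of accumarray(bins(all(bins>0,2),:), 1/nrows, M(ones(1,ncols)))."""
    bins = np.asarray(bins, dtype=int)
    nrows, ncols = bins.shape  # assumption: nrows/ncols come from bins itself

    # Keep only rows whose indices are all positive (Matlab: all(bins > 0, 2)).
    valid = bins[(bins > 0).all(axis=1)]

    # Turn the 1-based subscripts into flat 0-based indices of an M x ... x M array.
    shape = (M,) * ncols
    flat = np.ravel_multi_index(tuple((valid - 1).T), shape)

    # Count the rows landing in each cell, then scale by the constant 1/nrows.
    counts = np.bincount(flat, minlength=M ** ncols)
    return counts.reshape(shape) / nrows


example = binned_fraction([[1, 1], [2, 2], [1, 1], [0, 2]], M=2)
print(example)
```

In this example the last row is dropped because it contains a 0, and the three remaining rows each contribute 1/4 to their cell. An index larger than `M` makes `np.ravel_multi_index` raise a `ValueError` instead of silently growing the output.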
JoergVanAken