numpy.random.multinomial bad outputs?

Question

I have this function:

import numpy as np 
def unhot(vec):
    """ takes a one-hot vector and returns the corresponding integer """
    assert np.sum(vec) == 1    # this assertion shouldn't fail, but it did...
    return list(vec).index(1)

that I call on the output of a call to:

numpy.random.multinomial(1, coe)

and I got an assertion error at some point when I ran it. How is this possible? Isn't the output of numpy.random.multinomial guaranteed to be a one-hot vector?

Then I removed the assertion, and now I get:

ValueError: 1 is not in list

Is there some fine print I am missing, or is this just broken?

@S.M.AlMamun, `vec` is the output from `np.random.multinomial`, as stated in the question. – askewchan, Apr 24 '14 at 01:20
I can't reproduce this behavior for any value of `coe` (I assume you have `coe.sum() == 1`). What version of numpy are you running? – askewchan, Apr 24 '14 at 01:27
Also, I'd recommend replacing `list(vec).index(1)` with `vec.argmax()`, which gives the same result whenever `1` is the max value of the array. For identical behavior at slower speed (but still much faster than converting to a list), use `np.where(vec==1)[0][0]`. – askewchan, Apr 24 '14 at 02:06
askewchan - thanks. I am trying to ensure that the sum is <= 1, but I am having problems there as well; see my other question: http://stackoverflow.com/questions/23257587/how-can-i-avoid-value-errors-when-using-numpy-random-multinomial. I guess I should try Python's sum rather than numpy's. – capybaralet, Apr 24 '14 at 16:08
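
For reference, here is a minimal sketch of `unhot` built on the `argmax` suggestion above. The explicit `ValueError` check and the example probabilities are my own assumptions, not something from the question; the point is only that a non-one-hot input fails with a readable message instead of a bare assertion.

```python
import numpy as np

def unhot(vec):
    """Return the index of the single 1 in a one-hot vector."""
    vec = np.asarray(vec)
    # Fail loudly if the input is not exactly one-hot
    # (a NaN or a garbage integer makes one of these checks fail).
    if vec.sum() != 1 or vec.max() != 1:
        raise ValueError(f"not a one-hot vector: {vec!r}")
    return int(vec.argmax())

# Example probabilities chosen only for illustration
sample = np.random.multinomial(1, [0.2, 0.3, 0.5])
idx = unhot(sample)
```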

capybaralet · Answer 1 · 2014-04-24T16:51:23.567

1

Well, this is the problem, and I should've realized, because I've encountered it before:

np.random.multinomial(1, np.array([0., 0., np.nan, 0.]))

returns

array([0, 0, -9223372036854775807, 0])

I was using an unstable softmax implementation that produced NaNs. I was also trying to ensure that the parameters I passed to multinomial had a sum <= 1, but I did it like this:

coe = softmax(coeffs)
while np.sum(coe) > 1-1e-9:
    coe /= (1+1e-5)

and with NaNs in there, the body of the while loop never runs, because the sum is NaN and any comparison with NaN evaluates to False.
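
A small sketch of the failure mode, for anyone who hits the same thing. The `naive_softmax` below is only an assumed form of an unstable softmax (the original implementation is not shown here), and `stable_softmax` is the usual shift-by-the-max variant, not necessarily what I ended up using.

```python
import numpy as np

def naive_softmax(x):
    # Assumed form of an unstable softmax: np.exp overflows for large inputs
    e = np.exp(x)
    return e / e.sum()

def stable_softmax(x):
    # Subtracting the max leaves the result unchanged mathematically
    # but keeps every exponent <= 0, so np.exp cannot overflow
    x = np.asarray(x, dtype=float)
    e = np.exp(x - x.max())
    return e / e.sum()

logits = np.array([10.0, 1000.0, 20.0])  # illustrative values only

with np.errstate(over="ignore", invalid="ignore"):
    bad = naive_softmax(logits)  # exp(1000) overflows to inf, and inf/inf is NaN

total = np.sum(bad)                      # NaN propagates through the sum
loop_would_run = bool(total > 1 - 1e-9)  # False: ordered comparisons with NaN are False
good = stable_softmax(logits)            # finite, sums to 1
```

Checking `np.isfinite(coe).all()` before calling `multinomial` would have caught this much earlier than the assertion in `unhot` did.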

edited Apr 24 '14 at 16:51

answered Apr 24 '14 at 16:46

1

That large negative number is the int representation of `np.nan`, which is only meaningful as a float. This still seems like a strange way to make `coe.sum() <= 1`. How do you want `np.nan` to be interpreted? Probably as `0`, or `1`? If `0`, you can use `np.nansum` to ignore the `nan`: `coe /= np.nansum(coe)`. If `1`, you'll have to say `coe[np.isnan(coe)] = 1` before `coe /= coe.sum()`. – askewchan Apr 25 '14 at 14:56
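
The two interpretations above can be sketched as a single helper. The function name `normalize` and its `nan_as` parameter are hypothetical, introduced only for illustration. Note that this version replaces the `nan` entries themselves before dividing, so no `nan` is left in the result that gets passed to `multinomial`.

```python
import numpy as np

def normalize(coe, nan_as=0.0):
    """Replace NaN entries with `nan_as`, then rescale so the entries sum to 1."""
    coe = np.asarray(coe, dtype=float)
    filled = np.where(np.isnan(coe), nan_as, coe)
    total = filled.sum()
    if total <= 0:
        raise ValueError("cannot normalize: entries sum to zero or less")
    return filled / total

# Illustrative input: treat the NaN as probability 0
p = normalize([0.2, np.nan, 0.6])
sample = np.random.multinomial(1, p)
```

Passing `nan_as=1.0` gives the other interpretation. Either way, silently replacing `nan` hides the upstream bug, so fixing the softmax is probably the better first step.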
