I'm tidying up some digit classification code. I feed in an image of a digit, say "7", and get out 10 probabilities (i.e. they sum to 1). If my algorithm is working well, the element corresponding to 7 should have the highest value.
An added complication is that I'm working with batches of 100 elements, so I actually have a ROWxCOL = 100x10 matrix where each row sums to 1.
Now I wish to sample from each of these 100 distributions, i.e. I need to produce a vector like [0 0 0 1 0 0 0 0 0 0] (that would be a 3) for each batch item according to my probability distribution.
The existing implementation is:
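samp = pd*0;                      % one-hot output, same size as pd
layers = cumsum( pd, 2 );         % cumulative distribution along each row
randoms = rand( batchSize, 1 );   % one uniform draw per batch item
for k = 1:batchSize
    % first column whose cumulative probability reaches the draw
    index = find( randoms(k) <= layers(k,:), 1 );
    samp( k, index ) = 1;
end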
However, I would prefer to avoid explicit looping, as I have read that it often causes poor performance.
Efficiency is key, as this routine gets executed in the tightest loops.
How can I accomplish this efficiently?
EDIT: I will attempt to answer my own question. I'm posting it in case someone can improve upon the answer (there is nearly always more than one way to skin a cat in MATLAB), and also because this may be a valuable snippet for somebody.
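One loop-free possibility, as a minimal sketch rather than a benchmarked solution: count, for each row, how many cumulative-probability entries the random draw exceeds, and use that count plus one as the sampled column. The example data, the name numClasses, and the clamping step are my own assumptions for illustration; pd, layers, randoms, samp and batchSize are as in the code above.

```matlab
% Hypothetical example data: 100 batch items (rows), 10 digit classes (columns).
batchSize  = 100;
numClasses = 10;                              % assumed name, not in the original code
pd = rand( batchSize, numClasses );
pd = bsxfun( @rdivide, pd, sum( pd, 2 ) );    % normalise so each row sums to 1

layers  = cumsum( pd, 2 );                    % cumulative distribution along each row
randoms = rand( batchSize, 1 );               % one uniform draw per batch item

% Each row of layers is non-decreasing, so the number of entries that the
% draw exceeds, plus one, is the first column with randoms(k) <= layers(k,:).
index = sum( bsxfun( @gt, randoms, layers ), 2 ) + 1;

% Assumption: if round-off leaves the last cumulative value just below 1 and
% the draw lands above it, fall back to the last class. (The loop version
% would leave that row all zeros instead.)
index = min( index, numClasses );

% Set exactly one element per row using linear indices.
samp = zeros( size( pd ) );
samp( sub2ind( size( pd ), (1:batchSize)', index ) ) = 1;
```

Whether this actually beats the loop for a 100x10 matrix is something I would time on the target machine rather than assume, since the comparison builds a full 100x10 logical matrix on every call.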