I'm predicting one of roughly 100K possible outputs with an MXNet model, using a fairly standard softmax output. I want to compare the probability assigned to the true label with the probabilities of the model's top predictions. To get the former I'm using the pick operator; for the latter I've tried both the cheap version (the topk operator) and the expensive version (sort/argsort + slice_axis).
In both cases I'm getting contradictory results. Specifically, there are numerous examples where the probability of the true label (retrieved with pick) is significantly higher than the highest-probability output (retrieved with topk/sort). I think this means I'm doing something wrong, but I don't understand what. It doesn't happen for every prediction, but it does for a significant fraction.
Can anybody give me a hint as to what is going on?
Code follows:
import mxnet as mx
import numpy as np

for batch in data_iter:
    model.forward(batch, is_train=False)
    predictions = model.get_outputs()[0]
    labels = batch.label[0].as_in_context(predictions.context)
    # Cheap version:
    # scores = mx.nd.topk(predictions, axis=1, k=6, ret_typ='value')
    # Expensive version:
    scores = mx.nd.sort(predictions, axis=1, is_ascend=0)
    scores = mx.nd.slice_axis(scores, axis=1, begin=0, end=6)
    # Probability assigned to the true label
    label_score = mx.nd.pick(predictions, labels, axis=1)
    # The true-label probability should never exceed the top-1 probability
    consistent = label_score.asnumpy() <= scores.asnumpy()[:, 0]
    if not np.all(consistent):
        # I think this should never happen, but it does frequently
        print('Inconsistent batch: true-label probability exceeds the top-1 score')
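For reference, here is a minimal standalone sanity check of the invariant I expect to hold (the batch size, class count, and synthetic data are made up for illustration): on a plain softmax matrix, the value picked at any label can never exceed the top-1 value returned by topk.

# Standalone sanity check on synthetic data (shapes are illustrative only).
import mxnet as mx
import numpy as np

batch_size, num_classes = 32, 100000
logits = mx.nd.random.normal(shape=(batch_size, num_classes))
probs = mx.nd.softmax(logits, axis=1)
labels = mx.nd.array(np.random.randint(0, num_classes, size=batch_size))

top1 = mx.nd.topk(probs, axis=1, k=1, ret_typ='value')   # shape (batch_size, 1)
label_prob = mx.nd.pick(probs, labels, axis=1)            # shape (batch_size,)

# Both values come from the same row of probs, so this should always hold.
assert np.all(label_prob.asnumpy() <= top1.asnumpy()[:, 0])

This check passes for me on synthetic data, which is why I suspect the problem is in how I'm handling the real predictions or labels rather than in the operators themselves.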