
I have a trained CRNN model which is supposed to recognise text from images. It does work, and so far so good.

My output is a CTC loss layer and I decode it with the TensorFlow function keras.backend.ctc_decode, which returns, as the documentation says (https://code.i-harness.com/en/docs/tensorflow~python/tf/keras/backend/ctc_decode), a tuple with the decoded result and a tensor with the log probability of the prediction.

Running some tests with the model, I get these results:

True value: test0, prediction: test0, log_p: 1.841524362564087
True value: test1, prediction: test1, log_p: 0.9661365151405334
True value: test2, prediction: test2, log_p: 1.0634151697158813
True value: test3, prediction: test3, log_p: 2.471940755844116
True value: test4, prediction: test4, log_p: 1.4866207838058472
True value: test5, prediction: test5, log_p: 0.7630811333656311
True value: test6, prediction: test6, log_p: 0.35642576217651367
True value: test7, prediction: test7, log_p: 1.5693446397781372
True value: test8, prediction: test8, log_p: 0.9700028896331787
True value: test9, prediction: test9, log_p: 1.4783780574798584

The prediction is always correct. However, what I think is the probability is not what I expect. The values look like completely random numbers, even greater than 1 or 2! What am I doing wrong?


2 Answers


Well, I guess you've mixed up probability and log probability. Your intuition is correct that a probability outside the 0-1 range would be weird; however, your function isn't giving you probabilities but log probabilities, which are nothing but probabilities on a logarithmic scale. So everything's fine with your model.

If you're wondering why we work with log probabilities instead of the probabilities themselves, it mostly has to do with the scaling issue; you can read more in the thread here.
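For intuition on that scaling issue, here is a minimal sketch (not part of the original answer): multiplying many small probabilities quickly underflows to zero in floating point, whereas summing their logarithms stays perfectly representable.

import numpy as np

# 1000 independent events, each with probability 0.01
n, p = 1000, 0.01

# Multiplying the raw probabilities underflows to exactly 0.0 in float64
print(np.prod(np.full(n, p)))         # 0.0

# Summing the log probabilities instead keeps a usable value
print(np.sum(np.log(np.full(n, p))))  # about -4605.17 (= 1000 * log(0.01))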

Example of converting log probabilities into actual probabilities:

import numpy as np

# some random log probabilities (all negative, i.e. genuine log probabilities)
log_probs = [-8.45855173, -7.45855173, -6.45855173, -5.45855173, -4.45855173, -3.45855173, -2.45855173, -1.45855173, -0.45855173]

# Let's turn these into actual probabilities
# (NOTE: if your values are positive, they are negative log probabilities,
# so negate them first: np.exp(-x))
probabilities = np.exp(log_probs)

print(probabilities)

# Output (everything is between 0 and 1):
# [2.12078996e-04 5.76490482e-04 1.56706360e-03 4.25972051e-03 1.15791209e-02
#  3.14753138e-02 8.55587737e-02 2.32572860e-01 6.32198578e-01]
Khalid Saifullah
  • Thank you so much. However, I don't understand how to interpret those numbers. I mean, is a log probability of 2.4 better than 0.97? What is the range of the log probability function? – Test Jan 10 '21 at 11:14
  • To answer your question about interpretation: the higher the log probability, the better. You can turn your log probabilities into actual probabilities simply by taking the exponent (`np.exp()`) of those log probabilities (I've added an example in the answer above). Let me know if it answers your question. – Khalid Saifullah Jan 10 '21 at 11:39
  • Thanks again, but in your case you used negative numbers. I always get positive ones, and with these numbers I get results outside the [0, 1] probability range when using np.exp() to reverse them. – Test Jan 10 '21 at 14:03
  • You're right. It seems to me that those are "negative" log probabilities, as all of your values are positive; if you look at the graph of `log(x)`, you'll see that on the [0, 1] interval the function is negative, so in your case it should be `-log(x)`. To turn these into probabilities, simply add a negative sign in the exponent (like `np.exp(-x)`); that should work. – Khalid Saifullah Jan 10 '21 at 17:38
  • Yes, I thought about that, but it makes no sense: that way my probabilities are definitely too low! Take my example: in `test3` my log prob is 2.4, so it would be -2.4, and exp(-2.4) gives a probability of only 0.09 that it is correct, yet it is the right answer! These probabilities are driving me crazy! – Test Jan 11 '21 at 11:34
  • 1
    I've been investigating this and it is an interesting phenomenon. For the greedy search, the log probability returned is a positive integer, always. For the beam search, say beam_width = 5, the numbers returned are all negative, hence np.exp(x) will easily return good probabilities, between 0 and 1. – Timbus Calin Mar 11 '21 at 06:57

Short example from my code:

predictions, log_probabilities = keras.backend.ctc_decode(pred, input_length=input_len, greedy=False, top_paths=5)

The log probabilities are:  tf.Tensor([-0.00242825 -6.6236324  -7.3623376  -9.540713   -9.54832   ], shape=(5,), dtype=float32)

probabilities = tf.exp(log_probabilities)

The probabilities are:  tf.Tensor([0.9975747  0.0013286  0.00063471 0.00007187 0.00007132], shape=(5,), dtype=float32)

What I deem important to outline here is that when using the parameter greedy=True, the returned log_probability is positive, hence the need to negate it.

In essence, beam search with a beam_width of 1 is equivalent to greedy search. Nevertheless, the results given by the following two approaches are different:

predictions_beam, log_probabilities_beam = keras.backend.ctc_decode(pred, input_length=input_len, greedy=False, top_paths=1)

vs

predictions_greedy, log_probabilities_greedy = keras.backend.ctc_decode(pred, input_length=input_len, greedy=True)

in the sense that the latter always returns a positive log probability; therefore, it is necessary to negate it before applying np.exp(log_probabilities) / tf.exp(log_probabilities).
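A minimal sketch of that conversion for the greedy case (assuming pred is the model's softmax output and input_len its sequence lengths, as in the snippets above):

import tensorflow as tf
from tensorflow import keras

# Greedy CTC decoding: the returned "log probability" is positive,
# i.e. effectively a negative log probability (a cost)
predictions_greedy, log_probabilities_greedy = keras.backend.ctc_decode(
    pred, input_length=input_len, greedy=True
)

# Negate before exponentiating to obtain a value in [0, 1]
probabilities_greedy = tf.exp(-log_probabilities_greedy)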

Timbus Calin