
I am a bit confused about what value I should be expecting from Keras's evaluate function.

Here is the evaluate function definition from Keras documentation:

evaluate(self, x=None, y=None, batch_size=None, 
         verbose=1, sample_weight=None, steps=None)

And here is the short description from the same page:

Returns the loss value & metrics values for the model in test mode.

If I have a large cross validation dataset which kind of requires me to call the evaluate function several times, does the evaluate function remember previous calls? Or does it only return, say, the loss value for the given mini-batch each time?

    The description doesn't say anything about remembering, so it should be quite obvious that it only considers the batch that you give through x and y – Dr. Snoopy Jul 10 '18 at 20:46

1 Answer


The evaluate() method evaluates the model on the whole data you pass to it and therefore the given loss value and the metric(s) value(s) are based on the performance of the model on the whole data.
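A minimal sketch of this behavior, assuming a tf.keras setup (the toy model and random data below are made up purely for illustration):

```python
import numpy as np
from tensorflow import keras

# Toy model, for illustration only.
model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(1),
])
model.compile(optimizer="sgd", loss="mse", metrics=["mae"])

# A "dev set" of 100 random samples.
x_val = np.random.rand(100, 4)
y_val = np.random.rand(100, 1)

# evaluate() iterates over ALL 100 samples (internally split into
# batches of batch_size) and returns the loss and metric values
# computed over the whole set, not over any single batch.
loss, mae = model.evaluate(x_val, y_val, batch_size=10, verbose=0)
```

There is no state carried between calls: each call to evaluate() reports results only for the data passed to that call.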

Further, there is another method called test_on_batch() which tests the model on a single batch of data and returns the corresponding loss value and metric(s) value(s) of the model on the given data batch.
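For comparison, a sketch of test_on_batch() under the same illustrative setup:

```python
import numpy as np
from tensorflow import keras

# Toy model, for illustration only.
model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(1),
])
model.compile(optimizer="sgd", loss="mse", metrics=["mae"])

x_batch = np.random.rand(10, 4)
y_batch = np.random.rand(10, 1)

# test_on_batch() evaluates exactly this one batch -- no internal
# splitting into smaller batches, no averaging over other data.
batch_loss, batch_mae = model.test_on_batch(x_batch, y_batch)
```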

However, I am not sure what you mean by saying "I have a large cross validation data set which kind of requires me to call the evaluate function several times..." (emphasis mine). Do you mean the whole validation data does not fit in memory? If that's the case, and you have stored the validation data in a file on disk (say, an HDF5 file readable with h5py), then you can define a generator and use the evaluate_generator() method to perform evaluation using the generator you have defined.
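A sketch of the generator approach (in-memory arrays stand in for data that would really be read lazily from disk; in current tf.keras versions, evaluate() accepts the generator directly, while older Keras used evaluate_generator()):

```python
import numpy as np
from tensorflow import keras

# Toy model, for illustration only.
model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(1),
])
model.compile(optimizer="sgd", loss="mse", metrics=["mae"])

x_val = np.random.rand(1000, 4)
y_val = np.random.rand(1000, 1)

def val_generator(x, y, batch_size):
    # Yield the validation set batch by batch, forever; the `steps`
    # argument below tells Keras when one full pass is complete.
    while True:
        for i in range(0, len(x), batch_size):
            yield x[i:i + batch_size], y[i:i + batch_size]

steps = len(x_val) // 100
loss, mae = model.evaluate(val_generator(x_val, y_val, 100),
                           steps=steps, verbose=0)
```

In a real out-of-memory scenario the generator body would read each slice from the file instead of indexing an in-memory array.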

  • (1) My understanding now: The dev set is provided via the x and y parameters when calling the evaluate function. Say there are 100 examples in the dev set. With the batch_size param, it is possible to configure how many examples will be evaluated at a time. Say batch_size is set to 10. Then the evaluation runs over 10 batches (within a single call to evaluate(..)) and the evaluate function returns the evaluation result for the entire dev set (all 100 examples). But say I actually have 1000 examples in the dev set and I can't pass the whole thing at once (because of lack of memory) and thereby... – edn Jul 10 '18 at 22:35
  • (2) need to call the evaluate function 10 times in this case (assuming each time I give 100 examples from the dev set). And each call to the evaluate function will then only give the evaluation result for the given 100 examples. To cope with this problem, I can use evaluate_generator() so that I can get the evaluation result for all 1000 examples. Is my understanding correct now? Also: does the same apply when training? E.g., to my understanding, one can't use a generator with the fit function but needs to use fit_generator instead. – edn Jul 10 '18 at 22:37
  • @edn Yes, you are right. Or you can do it manually: load batches of 100 samples one by one, each time call `evaluate` on the batch and store the outputs in a list. After going over all the batches and computing the loss and metrics values on them, to compute the final loss and metric values simply take the average of batch losses and metric values which you have stored. – today Jul 10 '18 at 22:50
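The manual approach described in the last comment can be sketched as follows (toy model and random data are illustrative; note that a plain average of per-chunk results is exact only when every chunk has the same size):

```python
import numpy as np
from tensorflow import keras

# Toy model, for illustration only.
model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(1),
])
model.compile(optimizer="sgd", loss="mse", metrics=["mae"])

x_val = np.random.rand(1000, 4)
y_val = np.random.rand(1000, 1)

# Evaluate 100 samples at a time and collect [loss, mae] per chunk.
results = []
for i in range(0, len(x_val), 100):
    results.append(model.evaluate(x_val[i:i + 100], y_val[i:i + 100],
                                  verbose=0))

# All chunks have equal size here, so the mean of per-chunk values
# equals the values over the full set; unequal chunks would need a
# size-weighted average instead.
mean_loss, mean_mae = np.mean(results, axis=0)
```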