
I am using the KNIME Doc2Vec Learner node to build a word embedding. I know how Doc2Vec works. In KNIME I have the option to set the following parameters:

  • Batch Size: The number of words to use for each batch.
  • Number of Epochs: The number of epochs to train.
  • Number of Training Iterations: The number of updates done for each batch.

From Neural Networks I know that (lazily copied from https://stats.stackexchange.com/questions/153531/what-is-batch-size-in-neural-network):

  • one epoch = one forward pass and one backward pass of all the training examples
  • batch size = the number of training examples in one forward/backward pass. The higher the batch size, the more memory space you'll need.
  • number of iterations = number of passes, each pass using [batch size] number of examples. To be clear, one pass = one forward pass + one backward pass (we do not count the forward pass and backward pass as two different passes).

As far as I understand, it makes little sense to set both the batch size and the number of iterations, because one is determined by the other (given the data size, which is fixed by the circumstances). So why can I change both parameters?
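
For concreteness, here is a small Python sketch of the arithmetic I mean (the data-set size and parameter values are made up for illustration, they are not from KNIME):

```python
import math

n_samples = 1000   # illustrative training-set size (made up)
batch_size = 120   # chosen batch size

# Iterations needed to pass over all samples once (one epoch):
iterations_per_epoch = math.ceil(n_samples / batch_size)   # 9: eight full batches plus one partial batch of 40

# Conversely, fixing the number of iterations per epoch implies the batch size:
iterations = 5
implied_batch_size = math.ceil(n_samples / iterations)     # 200

print(iterations_per_epoch, implied_batch_size)
```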

Make42

1 Answer


This is not necessarily the case. You can also train "half epochs". For example, in Google's Inception V3 pretraining script, you usually set the number of iterations and the batch size at the same time. This can lead to "partial epochs", which can be fine.
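
As a rough sketch of the arithmetic (the numbers are made up and not specific to Inception V3 or KNIME):

```python
n_samples = 1000      # illustrative training-set size (made up)
batch_size = 100      # set explicitly
iterations = 35       # set explicitly, independent of n_samples

samples_processed = batch_size * iterations        # 3500 samples seen in total
epochs_trained = samples_processed / n_samples     # 3.5 -> the last epoch is only half done

print(f"{epochs_trained:.1f} epochs trained")       # "3.5 epochs trained"
```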

Whether or not it is a good idea to train half epochs may depend on your data. There is a thread about this, but no conclusive answer.

I am not familiar with KNIME Doc2Vec, so I am not sure whether the meaning is somewhat different there. But from the definitions you gave, setting batch size + iterations seems fine. Setting the number of epochs as well, however, could cause conflicts, leading to situations where the numbers don't add up to a reasonable combination.
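
If all three parameters are set independently, a simple consistency check on the arithmetic could look like this (just a sketch of the bookkeeping, not anything KNIME or DL4J actually does):

```python
def check_settings(n_samples: int, batch_size: int, iterations: int, epochs: int) -> bool:
    """Return True if the three settings describe the same amount of training data."""
    samples_from_batches = batch_size * iterations   # data covered according to batch size and iterations
    samples_from_epochs = epochs * n_samples         # data expected according to the epoch count
    return samples_from_batches == samples_from_epochs

# B=100 and I=5 cover 500 samples, but E=1 over N=1000 expects 1000 -> conflict
print(check_settings(n_samples=1000, batch_size=100, iterations=5, epochs=1))  # False
```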

Gegenwind
  • I do not understand what the number of epochs `E` has to do with this. Let's say I have `N=1000` training samples. Then the number of epochs is how often I train on those 1000 samples. Let's say I have `I=5` iterations; then the batch size is `B = N/I = 200` - no `E` involved. Alternatively I can fix the batch size to `B=120` and get `I = N/B = 8.3`, so I have 8 full iterations with 120 samples and one iteration with 40 samples. Still no `E` involved. Now I set `I=5` and `B=100` - this cannot be reconciled, leading to my original question. – Make42 Mar 01 '18 at 10:16
  • Can you provide any code or more information about the NN you are using? Without it, it is impossible to say exactly why and how those parameters are used. – Gegenwind Mar 01 '18 at 13:58
  • KNIME falls back on the implementation from DL4J: https://deeplearning4j.org/doc2vec. You can also check out https://deeplearning4j.org/word2vec, which is rather similar. – Make42 Mar 01 '18 at 19:12