Is there a difference between the Keras layers Masking() and Embedding(mask_zero=True)?

Question

The documentation for the Embedding layer is here:

and the documentation for the Masking layer is here:

I can't find a difference there. Should one of the layers be preferred in certain situations?

score 1 · Accepted Answer · answered Aug 10 '17 at 14:27

I feel like Masking() is more a masking of time steps, while Embedding(mask_zero=True) is more of a data filter. From the Masking documentation:

If all values in the input tensor at that timestep are equal to mask_value, then the timestep will be masked (skipped) in all downstream layers

Here mask_value is arbitrary. Thus, you can decide to skip time steps in which there is no input, or which meet some other condition you can think of, based on your data.
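To make the quoted rule concrete, here is a minimal NumPy sketch that emulates it. This is not the Keras layer itself: the helper name `masking_mask` and the sample data are made up for illustration, and the function only mirrors the documented behaviour (a time step is masked only when all of its features equal mask_value).

```python
import numpy as np

def masking_mask(x, mask_value=0.0):
    """Hypothetical helper emulating the documented Masking rule.

    Returns True for time steps that are kept and False for time steps
    that are masked, i.e. where ALL features equal mask_value.
    """
    x = np.asarray(x)
    # A step is kept if at least one feature differs from mask_value.
    return np.any(x != mask_value, axis=-1)

# One made-up sequence: 4 time steps, 2 features per time step.
batch = np.array([[[1.0, 2.0],
                   [0.0, 3.0],   # only one feature is 0 -> kept
                   [0.0, 0.0],   # all features are 0   -> masked
                   [5.0, 0.0]]])

mask = masking_mask(batch)
print(mask.tolist())  # [[True, True, False, True]]
```

Note that the second time step is kept even though one of its features is 0; only the all-zero step is dropped.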

For Embedding, you overlay a mask on your input, skipping calculations for entries where the input is 0. This way you can, in a single time step, propagate the full data, part of the data, or no data through the network. This is not a masking of time step #3 or something like that; it is a masking of input data #i. Also, only the absence of input (input = 0) can be masked.
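The zero rule can be sketched the same way. Again, this is a NumPy emulation, not the Keras API. It assumes the input is a 2D array of integer indices with shape (batch, timesteps), which is what Embedding expects, so each entry is one time step of one sequence. The helper name `embedding_zero_mask` and the token ids are made up for illustration.

```python
import numpy as np

def embedding_zero_mask(token_ids):
    """Hypothetical helper emulating the mask_zero=True rule.

    Element-wise: an entry is kept (True) unless its index is exactly 0.
    """
    return np.asarray(token_ids) != 0

# Two made-up sequences of token ids, each of length 4.
token_ids = np.array([[4, 7, 0, 0],
                      [3, 0, 9, 2]])

zero_mask = embedding_zero_mask(token_ids)
print(zero_mask.tolist())
# [[True, True, False, False], [True, False, True, True]]
```

Unlike mask_value in Masking, the masked value is fixed at 0 here, so index 0 has to be reserved for padding and cannot be used for a real token.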

Thus, there are certainly cases I can think of where the two are completely equivalent (for example, when one input being 0 means all inputs are 0), but they operate at different resolutions.

So, with the Embedding layer, a timestep within my sequences will be ignored whether or not it is zero over all sequences? E.g., if timestep #3 should be ignored in every sequence, then I use the Embedding layer, and if just zeros should be ignored no matter which timestep they are in, the Masking layer is used? — Mimi Müller, Aug 11 '17 at 09:18
If time steps should be ignored when all inputs are equal to a certain value that you can specify, then use Masking. If zeros should be ignored (so if one input is zero and another has a value, the time step is still going strong), use Embedding. — Uvar, Aug 11 '17 at 09:23
