
PyTorch's EmbeddingBag allows for efficient lookup-plus-reduce operations over variable-length collections of embedding indices. There are three reduction modes: "sum", "mean", and "max". With "sum", you can also provide per_sample_weights, giving you a weighted sum.
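
For context, a minimal sketch of the weighted-sum case (the sizes, indices, offsets, and weights below are made up for illustration):

```python
import torch
import torch.nn as nn

# Illustrative sizes: 10 embeddings of dimension 4.
bag = nn.EmbeddingBag(num_embeddings=10, embedding_dim=4, mode='sum')

# Two variable-length "bags" packed into one 1D index tensor:
# bag 0 -> indices [1, 2, 4], bag 1 -> indices [4, 3].
indices = torch.tensor([1, 2, 4, 4, 3])
offsets = torch.tensor([0, 3])

# One weight per looked-up index; only supported with mode='sum'.
weights = torch.tensor([0.1, 0.2, 0.7, 0.5, 0.5])

out = bag(indices, offsets, per_sample_weights=weights)  # shape (2, 4): weighted sum per bag
```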

Why is per_sample_weights not allowed for the "max" operation? Looking at how it's implemented, I can only assume there is an issue with performing a "ReduceMean" or "ReduceMax" operation after a "Mul" operation. Could that have something to do with calculating gradients?


P.S.: It's easy enough to turn a weighted sum into a weighted average by dividing by the sum of the weights, but for "max" there is no analogous way to get a weighted equivalent.
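
For example, a weighted mean can be recovered from the weighted sum like this (a sketch reusing the bag, indices, offsets, and weights from the snippet above):

```python
# Weighted sum per bag, then divide by the total weight in each bag.
weighted_sum = bag(indices, offsets, per_sample_weights=weights)

# Total weight per bag: map each position to its bag id, then accumulate.
bag_ids = torch.bucketize(torch.arange(len(indices)), offsets, right=True) - 1
weight_totals = torch.zeros(len(offsets)).index_add_(0, bag_ids, weights)

weighted_mean = weighted_sum / weight_totals.unsqueeze(1)  # shape (2, 4)
```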

drevicko
  • It might be helpful if you explained what a weighted max would mean. – Matt F. Mar 30 '21 at 02:39
  • @MattF. Apply the max operation after applying the weights to the input vectors. The only way to do that while still using an EmbeddingBag would be to duplicate the embedding table (in GPU memory, which is infeasible in many cases), apply the weights to the selected embeddings, then use the bag. That would probably still be faster than doing the whole thing manually (sketched after these comments), though duplicating the table in memory would also be time-consuming, particularly if there are many embeddings. – drevicko May 10 '21 at 01:33
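
To make the comment above concrete, a manual (non-EmbeddingBag) weighted max could look like the sketch below; the per-bag loop is for clarity rather than speed, and all sizes and names are illustrative:

```python
import torch
import torch.nn as nn

emb = nn.Embedding(10, 4)  # plain embedding table, same illustrative sizes as above

indices = torch.tensor([1, 2, 4, 4, 3])
offsets = torch.tensor([0, 3])
weights = torch.tensor([0.1, 0.2, 0.7, 0.5, 0.5])

# Scale each looked-up vector by its weight, then take the elementwise max within each bag.
scaled = emb(indices) * weights.unsqueeze(1)
ends = torch.cat([offsets[1:], torch.tensor([len(indices)])])
weighted_max = torch.stack([
    scaled[start:end].max(dim=0).values
    for start, end in zip(offsets.tolist(), ends.tolist())
])  # shape (num_bags, embedding_dim)
```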

1 Answer


The argument per_sample_weights was only implemented for mode='sum', not due to technical limitations, but because the developers found no use cases for a "weighted max":

I haven't been able to find use cases for "weighted mean" (which can be emulated via weighted sum) and "weighted max".

iacob