
Imagine there are two vectors in GPU memory, a and b, which are in fact 2D float textures (one float value per pixel). The goal is to compute the dot product a·b.

If I create a third texture - c - which contains the element-wise product of a and b (i.e. c_{ij} = a_{ij} × b_{ij}), then the sum of all pixels' values is the dot product.

I thought I could let the GPU generate a 1px mipmap of c - let's call it d. AFAIK d is the average of all pixels in c, which, if multiplied by the appropriate factor for that mipmap LOD (i.e. the number of pixels in c), would result in the dot product.
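To make the idea concrete, here is a minimal CPU-side sketch in plain Python of what I have in mind. It is not GPU code, and it rests on assumptions: the textures are square with a power-of-two side, and each mipmap level is built with a 2×2 box filter (real drivers may filter differently, and texture formats may lose precision). The names `mip_reduce` and `dot_via_mipmap` are made up for illustration.

```python
def mip_reduce(tex):
    """Build the next mipmap level by averaging each 2x2 block (box filter, assumed)."""
    n = len(tex) // 2
    return [
        [
            (tex[2 * i][2 * j] + tex[2 * i][2 * j + 1]
             + tex[2 * i + 1][2 * j] + tex[2 * i + 1][2 * j + 1]) / 4.0
            for j in range(n)
        ]
        for i in range(n)
    ]


def dot_via_mipmap(a, b):
    """Dot product of two square power-of-two 'textures' via a 1px mipmap."""
    size = len(a)
    # c is the element-wise product texture: c[i][j] = a[i][j] * b[i][j]
    c = [[a[i][j] * b[i][j] for j in range(size)] for i in range(size)]
    level = c
    while len(level) > 1:  # reduce until a single pixel remains
        level = mip_reduce(level)
    d = level[0][0]  # d is the average of all pixels in c
    return d * size * size  # average times pixel count gives the sum


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
print(dot_via_mipmap(a, b))  # 1*5 + 2*6 + 3*7 + 4*8
```

In this sketch the scale factor is simply the pixel count of c (width × height), which is what I mean by the factor tied to the LOD.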

Questions

  • Is it possible to compute the dot product in the way described above?
  • Would such an approach be faster than computing the dot product with a compute kernel?

For a concrete API, let's consider Apple's Metal API or OpenGL/CUDA.

sarasvati
  • You'll probably lose a lot of precision from the mipmapping step. – user253751 Mar 20 '17 at 03:40
  • I highly doubt that this would be faster than executing a kernel. Simply the data arrangement and reordering might kill the performance. – BDL Mar 20 '17 at 14:53

0 Answers