I have found that the dot product takes the same number of cycles as vector add and vector mul (just one cycle per ALU per core), but not for mad. So I'm curious how many cycles the mad instruction takes.
1 Answer
I switched from mad to the dot product to try to improve OpenCL performance, but performance got worse. With mad, the kernel in my project takes 58 ms (averaged over multiple runs, on an Arm Mali G77 Bifrost); with the dot product it takes 68 ms. If you reach a different conclusion, please post it.
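For reference, here is a minimal sketch of the two formulations being compared, written in plain C rather than OpenCL C so it can be compiled anywhere. It is not the kernel from my project: the `vec4` type and the names `acc_mad`, `dot4`, and `acc_dot` are hypothetical stand-ins, and the multiply-add lines only model what `mad(a, b, c)` computes (`a * b + c`), not how the GPU schedules it.

```c
/* Hypothetical 4-component float vector, standing in for OpenCL's float4. */
typedef struct { float x, y, z, w; } vec4;

/* Variant A: accumulate with four multiply-adds.
   Each line models one mad(a, b, c) = a * b + c. */
static float acc_mad(vec4 a, vec4 b, float acc) {
    acc = a.x * b.x + acc;
    acc = a.y * b.y + acc;
    acc = a.z * b.z + acc;
    acc = a.w * b.w + acc;
    return acc;
}

/* Models dot(a, b) for 4-component vectors. */
static float dot4(vec4 a, vec4 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z + a.w * b.w;
}

/* Variant B: one dot product, then a single add into the accumulator. */
static float acc_dot(vec4 a, vec4 b, float acc) {
    return acc + dot4(a, b);
}
```

Both variants compute the same sum of products; they can differ in the last bits because of rounding order. Which one is faster depends on how the compiler maps them onto the hardware, so it has to be measured on the target GPU.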

冯剑龙
- G77 is Valhall, are you sure that's the GPU you meant? – Andrea Jan 13 '20 at 16:56
- @Andrea, yes, you are right: I used a G77, which is Valhall. – 冯剑龙 Jan 20 '20 at 02:03