Could anyone suggest a way of defining the computational complexity of a neural network after quantization?
I understand computational complexity as the amount of arithmetic “work” needed to evaluate the entire network or a single layer. However, once a network has been quantized, numbers are no longer represented in the same format (the new format depends on the quantization method used, as described here). For example, instead of multiplying two real (float32) numbers in every operation, we multiply pairs of int8, float16, etc. values. The latter operations are evidently “simpler” than multiplying two 32-bit floats.
This affects the time and memory needed to carry out the computations, so traditional metrics such as Big-O notation no longer capture the cost: Big-O counts operations but treats every multiplication as unit cost, regardless of the operands' bit-widths.
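One hedged way to answer my own question, as a starting point: a metric sometimes used for quantized networks is bit-operations (BOPs), where each multiply-accumulate is weighted by the product of its operand bit-widths, so lower-precision layers count as proportionally cheaper. The sketch below illustrates the idea; the function name and the example layer dimensions are my own assumptions, not from any particular library.

```python
def layer_bops(macs: int, weight_bits: int, act_bits: int) -> int:
    """Bit-operations for one layer: each multiply-accumulate (MAC)
    is weighted by the product of its operand bit-widths (assumption:
    this simple product model, ignoring accumulator width)."""
    return macs * weight_bits * act_bits

# Hypothetical 3x3 conv layer: 64 in-channels, 64 out-channels,
# producing a 56x56 output feature map.
macs = 64 * 64 * 3 * 3 * 56 * 56

fp32_cost = layer_bops(macs, 32, 32)  # unquantized float32 baseline
int8_cost = layer_bops(macs, 8, 8)    # int8 weights and activations

print(int8_cost / fp32_cost)  # -> 0.0625, i.e. a 16x reduction
```

Under this model the MAC count (the usual Big-O-style quantity) stays the same, but the bit-width factor makes the quantized network measurably cheaper, which seems closer to the notion of “work” I am after.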