I have a quantized network from a framework that I want to export as an ONNX file. The quantization scheme requires inserting intermediate layers that perform a bitwise right shift to avoid overflow, and these layers have to go in between some of the existing layers of the network.
I think I can use the bit-shift operator from PyTorch (torch.bitwise_right_shift) and somehow build the additional layers around it. My first question is: does such a layer already exist, or do I have to create it from scratch? For example, could I use a linear layer with a diagonal weight matrix and tell the framework not to change the weights?
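To make the question concrete, here is roughly the kind of layer I have in mind (a minimal sketch; the class name `RightShift` and the fixed shift amount are just placeholders for what I would actually need):

```python
import torch
import torch.nn as nn

class RightShift(nn.Module):
    """Right-shifts its integer input by a fixed number of bits."""
    def __init__(self, shift_bits: int):
        super().__init__()
        # Stored as a buffer so it is saved with the model but never trained.
        self.register_buffer("shift_bits", torch.tensor(shift_bits, dtype=torch.int32))

    def forward(self, x):
        # x is expected to be an integer tensor (e.g. int32 accumulators).
        return torch.bitwise_right_shift(x, self.shift_bits)
```

For example, `RightShift(4)(torch.tensor([256, 512], dtype=torch.int32))` gives `[16, 32]`. Is writing a small module like this the intended approach, or is there a built-in layer for this?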
Then, for the ONNX export, I believe such layers are not supported yet. Is there a simple way to do the export anyway?
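One idea I came across is registering a custom symbolic that maps the shift to the ONNX BitShift operator (available since opset 11). I have not verified this; the mapping below is my guess at how to wire it up, and I'm aware ONNX BitShift only accepts unsigned integer tensors, so dtypes may need adjusting:

```python
import torch
from torch.onnx import register_custom_op_symbolic

# Assumption: the exporter fails on aten::bitwise_right_shift,
# so map it manually to the ONNX BitShift op.
def right_shift_symbolic(g, input, other):
    return g.op("BitShift", input, other, direction_s="RIGHT")

register_custom_op_symbolic("aten::bitwise_right_shift", right_shift_symbolic, 11)

# then export as usual, e.g.
# torch.onnx.export(model, dummy_input, "model.onnx", opset_version=13)
```

Would something along these lines work, or is there a better-supported way to export an op the exporter does not handle?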
Thanks for your help.