Questions tagged [quantization]

Use this tag for questions related to quantization of any kind, such as vector quantization.

Quantization, in mathematics and digital signal processing, is the process of mapping a large set of input values to a (countable) smaller set.

For more, please read the Wikipedia article on quantization.

444 questions
0
votes
0 answers

How to convert a quantized ONNX model to a quantized TFLite model?

I have a quantized ONNX model (exported from PyTorch). Is there any way to convert it to a quantized TFLite model? It's important to apply the quantization on the PyTorch side.
0
votes
0 answers

What is the difference between TensorRT and AIMET Quantsim? Which is better and why?

I'm familiar with TensorRT, but recently I came to know about AIMET Quantsim, used for quantization-aware training along with ONNX. What is the difference, which is better, and why? I want to understand the need for AIMET Quantsim and its difference with…
0
votes
1 answer

torch Parameter grad returns None

I want to implement the learned step size quantization algorithm, and I create a quantized Linear layer: class QLinear(nn.Module): def __init__(self, input_dim, out_dim, bits=8): super(QLinear, self).__init__() # create a tensor…
zz z
  • 27
  • 5
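A common cause of a None grad in this setting is storing the step size as a plain tensor instead of an nn.Parameter, or detaching the graph around the rounding op. A minimal sketch (assumed shapes and names, not the asker's code) of a learned-step-size Linear layer whose step parameter does receive a gradient:

```python
# Minimal sketch: the step must be an nn.Parameter, and a straight-through
# estimator keeps the rounding differentiable so the step gets a gradient.
import torch
import torch.nn as nn
import torch.nn.functional as F

class QLinear(nn.Module):
    def __init__(self, input_dim, out_dim, bits=8):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_dim, input_dim) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_dim))
        self.step = nn.Parameter(torch.tensor(0.1))   # a plain tensor here would give grad == None
        self.qmax = 2 ** (bits - 1) - 1

    def forward(self, x):
        w = torch.clamp(self.weight / self.step, -self.qmax, self.qmax)
        w_q = (w.round() - w).detach() + w            # straight-through estimator
        return F.linear(x, w_q * self.step, self.bias)

layer = QLinear(16, 4)
layer(torch.randn(2, 16)).sum().backward()
print(layer.step.grad)                                # a tensor now, not None
```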
0
votes
1 answer

How do you find the quantization parameters inside an ONNX model that results from converting an already quantized tflite model to ONNX?

What I was trying to do is convert a quantized tflite model to ONNX with tf2onnx and run it directly with ORT. The network I'm trying to convert is YOLOv5, pre-trained with my own dataset + COCO, and the model has been converted successfully. I'm sure of that…
PesozuSejin
  • 61
  • 1
  • 9
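For context, one way to inspect such parameters is to walk the converted graph: scale and zero_point usually appear as initializer inputs of the QuantizeLinear/DequantizeLinear nodes. A rough sketch ("model.onnx" is a placeholder path):

```python
# Print the scale / zero_point initializers feeding Q/DQ nodes in an ONNX graph.
import onnx
from onnx import numpy_helper

model = onnx.load("model.onnx")
inits = {t.name: numpy_helper.to_array(t) for t in model.graph.initializer}

for node in model.graph.node:
    if node.op_type in ("QuantizeLinear", "DequantizeLinear"):
        scale = inits.get(node.input[1])
        zero_point = inits.get(node.input[2]) if len(node.input) > 2 else None
        print(node.op_type, node.name, "scale:", scale, "zero_point:", zero_point)
```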
0
votes
0 answers

Quantizing an OpenCV dnn model: calibration data error

I am attempting to run: net = cv2.dnn.readNetFromONNX(onnx_model_path) dummy_input = torch.randn(1, 3, 224, 224) net.quantize(dummy_input, 'CV_32F', 'CV_8S') and I get the error: OpenCV(4.6.0) :-1: error: (-5:Bad argument) in function 'quantize' >…
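A hedged note: as far as I know, the Net.quantize binding takes numpy calibration blobs plus the numeric cv2.CV_* constants, not a torch tensor and dtype names as strings. A sketch under that assumption (placeholder path, random calibration data):

```python
# Sketch: pass a list of numpy blobs and the cv2.CV_* dtype constants.
import cv2
import numpy as np

net = cv2.dnn.readNetFromONNX("model.onnx")
calib_blob = np.random.randn(1, 3, 224, 224).astype(np.float32)
qnet = net.quantize([calib_blob], cv2.CV_32F, cv2.CV_8S)   # calibration data as a list of blobs
```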
0
votes
1 answer

Is it possible to quantize individual layers' weights/activations post-training?

Quantization-aware training in TensorFlow allows me to quantize individual layers with different quantization configurations using tensorflow_model_optimization.quantization.keras.quantize_annotate_layer. I want to have a similar effect on an…
Tortellini Teusday
  • 1,335
  • 1
  • 12
  • 21
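For reference, the per-layer annotation mechanism the question mentions looks roughly like this in Keras (a minimal sketch; the toy model is an assumption for illustration, and it does not answer how to do the same outside TensorFlow):

```python
# Annotate only selected layers, then apply quantization to the annotated ones.
import tensorflow as tf
import tensorflow_model_optimization as tfmot

annotate = tfmot.quantization.keras.quantize_annotate_layer

model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28, 1)),
    annotate(tf.keras.layers.Conv2D(8, 3, activation="relu")),  # gets fake-quant ops
    tf.keras.layers.Flatten(),                                   # stays in float
    annotate(tf.keras.layers.Dense(10)),                         # gets fake-quant ops
])

qat_model = tfmot.quantization.keras.quantize_apply(model)       # applies only to annotated layers
```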
0
votes
1 answer

Why are some nn.Linear layers not quantized by PyTorch?

I'm quantizing the Swin transformer (static PTQ) using the following function: def static_quantize(m, data_loader): backend = 'qnnpack' torch.backends.quantized.engine = backend m.eval() m.qconfig =…
corazza
  • 31,222
  • 37
  • 115
  • 186
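For context, the usual eager-mode static PTQ flow is sketched below (assumed data loader; not the asker's exact function). Layers are typically left in float when their module type has no quantized counterpart for the chosen backend or when their qconfig resolves to None.

```python
# Standard eager-mode static post-training quantization: prepare -> calibrate -> convert.
import torch

def static_quantize(m, data_loader):
    backend = "qnnpack"
    torch.backends.quantized.engine = backend
    m.eval()
    m.qconfig = torch.quantization.get_default_qconfig(backend)
    prepared = torch.quantization.prepare(m)          # inserts observers
    with torch.no_grad():
        for x, _ in data_loader:                      # calibration pass
            prepared(x)
    return torch.quantization.convert(prepared)       # swaps in quantized modules
```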
0
votes
0 answers

TFLite int8: quantise a model with multiple inputs and variable output

Hi, I have a neural network that takes in 2 inputs using the Functional API: varyNetworkInput = tf.keras.Input(shape=(1,), batch_size=1) mainInput = tf.keras.Input(shape=(10, 10)) ... x = tf.keras.layers.Lambda(switcher)([varyNetworkInput,…
GILO
  • 2,444
  • 21
  • 47
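A hedged sketch of full-integer conversion for a two-input model: the representative dataset yields one array per input, in the model's input order. The toy model below (a Concatenate in place of the question's Lambda switcher) is an assumption for illustration:

```python
# Full-integer TFLite conversion of a two-input Keras model.
import numpy as np
import tensorflow as tf

vary_in = tf.keras.Input(shape=(1,), batch_size=1)
main_in = tf.keras.Input(shape=(10, 10), batch_size=1)
x = tf.keras.layers.Concatenate()([tf.keras.layers.Flatten()(main_in), vary_in])
model = tf.keras.Model([vary_in, main_in], tf.keras.layers.Dense(4)(x))

def representative_dataset():
    for _ in range(100):
        yield [np.random.rand(1, 1).astype(np.float32),       # one array per model input
               np.random.rand(1, 10, 10).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
tflite_model = converter.convert()
```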
0
votes
1 answer

Quantization of torch tensors to reduce storage size

I have a number (N) of vectors of size 192 x 1, currently stored as torch tensors. Each element in a tensor is a float number. These N vectors are compared against a reference vector by their similarity. Now, N can be very large, and hence…
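One common approach, sketched with an assumed scale and zero_point of 0, is to store the vectors as int8 via torch.quantize_per_tensor and dequantize only for the similarity comparison:

```python
# Shrink float32 vectors to int8 storage with per-tensor quantization.
import torch

vecs = torch.randn(1000, 192)                        # N float32 vectors
scale = vecs.abs().max().item() / 127.0
q = torch.quantize_per_tensor(vecs, scale=scale, zero_point=0, dtype=torch.qint8)

print(vecs.element_size() * vecs.nelement())         # bytes as float32
print(q.int_repr().element_size() * q.nelement())    # ~4x smaller as int8
approx = q.dequantize()                              # use this for the similarity comparison
```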
0
votes
1 answer

Method to quantize a range of values to keep precision when significant outliers are present in the data

Could you please tell me if there is a suitable quantization method for the following case (preferably implemented in Python)? There is an input range where the majority of values are within +-2 std of the mean, while some huge outliers are present. E.g.…
aaxx
  • 55
  • 1
  • 5
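One possible approach (a sketch, not a definitive answer): clip the range at percentiles before uniform quantization so a few outliers don't inflate the step size. The bit width and percentile thresholds below are assumptions:

```python
# Percentile clipping followed by uniform quantization.
import numpy as np

def quantize_clipped(x, bits=8, lo_pct=0.5, hi_pct=99.5):
    lo, hi = np.percentile(x, [lo_pct, hi_pct])
    levels = 2 ** bits - 1
    scale = (hi - lo) / levels
    q = np.clip(np.round((x - lo) / scale), 0, levels).astype(np.uint8)
    return q, lo, scale                               # dequantize as lo + q * scale

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 1, 10_000), [1e4, -1e4]])   # mostly within ±2 std, two huge outliers
q, lo, scale = quantize_clipped(x)
```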
0
votes
1 answer

PyTorch eager quantization - skipping modules

I am using eager mode quantization. However, I want to skip some layers from being quantized. I am following the tutorial here. However, when I test the model now, I get the following error: Could not run ‘aten::_slow_conv2d_forward’ with arguments…
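That particular error usually means a float module received a quantized tensor. A hedged sketch of the usual eager-mode fix (module names assumed): give the skipped module qconfig = None and place a DeQuantStub in front of it so it sees a float tensor after convert():

```python
# Skip conv2 from quantization and dequantize before it.
import torch
import torch.nn as nn

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()
        self.conv1 = nn.Conv2d(3, 8, 3)               # will be quantized
        self.dequant = torch.quantization.DeQuantStub()
        self.conv2 = nn.Conv2d(8, 8, 3)               # should stay in float

    def forward(self, x):
        x = self.dequant(self.conv1(self.quant(x)))   # back to float before the skipped layer
        return self.conv2(x)

m = Model().eval()
m.qconfig = torch.quantization.get_default_qconfig("fbgemm")
m.conv2.qconfig = None                                # exclude conv2
torch.quantization.prepare(m, inplace=True)
m(torch.randn(1, 3, 32, 32))                          # calibration
torch.quantization.convert(m, inplace=True)
m(torch.randn(1, 3, 32, 32))                          # runs without the backend error
```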
0
votes
0 answers

How to stop images from affecting other images in Python?

I'm writing a piece of code in Python to show image quantization (2, 4, 6, 8 levels). I'm using matplotlib, skimage and numpy. The code was running as expected for the individual levels, but when I tried to combine them all into one program, the image…
senSMEM8
  • 1
  • 3
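The symptom described is often caused by quantizing the same array in place for every level. A small sketch (using skimage's sample camera image as an assumption) that quantizes a copy per level:

```python
# Quantize a fresh copy of the image at each level so results don't interfere.
import numpy as np
import matplotlib.pyplot as plt
from skimage import data

img = data.camera()                                   # 8-bit grayscale image

fig, axes = plt.subplots(1, 4, figsize=(12, 3))
for ax, levels in zip(axes, (2, 4, 6, 8)):
    step = 256 / levels
    quantized = (np.floor(img.copy() / step) * step).astype(np.uint8)
    ax.imshow(quantized, cmap="gray", vmin=0, vmax=255)
    ax.set_title(f"{levels} levels")
    ax.axis("off")
plt.show()
```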
0
votes
1 answer

How to reduce model size in PyTorch post-training

I have created a PyTorch model and I want to reduce the model size. Defining the model architecture: import torch import torch.quantization import torch.nn as nn import copy import os import time import numpy as np import torch.autograd as…
Lalit Jain
  • 363
  • 2
  • 14
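For context, a compact sketch of one common answer: dynamic quantization of the Linear layers followed by a file-size comparison (the toy model and temp file name are assumptions, not the asker's architecture):

```python
# Dynamic quantization of Linear layers and a serialized size comparison.
import os
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

def size_kb(m, path="tmp.pt"):
    torch.save(m.state_dict(), path)
    kb = os.path.getsize(path) / 1024
    os.remove(path)
    return kb

print(f"float32: {size_kb(model):.0f} kB, int8: {size_kb(quantized):.0f} kB")
```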
0
votes
1 answer

Operation types in the full integer quantization method in TensorFlow Lite

I want to apply post-training quantization (full integer) using the TensorFlow Model Optimization package on a pre-trained model (LeNet5). https://www.tensorflow.org/model_optimization/guide/quantization/post_training model = Sequential() model._name =…
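As a hedged pointer: after full-integer conversion, the resulting tensor dtypes and quantization parameters can be inspected through the TFLite interpreter, which shows which operations ended up in int8. The file name below is a placeholder:

```python
# Inspect dtypes and (scale, zero_point) of all tensors in a converted model.
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="lenet5_int8.tflite")
interpreter.allocate_tensors()

for detail in interpreter.get_tensor_details():
    print(detail["name"], detail["dtype"], detail["quantization"])
```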
0
votes
1 answer

Network quantization: why do we need "zero_point"? Why doesn't symmetric quantization need a "zero point"?

I have Googled for days, but still can't find the answer I need. There must be some misunderstanding on my part. Could you please help me out? 1. Why do we need "zero_point"? Quantization: q = round(r / scale) + zero_point. I think that the…
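A small numeric illustration of the formula quoted in the question, q = round(r / scale) + zero_point, for an asymmetric range such as ReLU outputs in [0, 6] (the values are chosen only for illustration):

```python
# Asymmetric vs. symmetric int8 quantization of values in [0, 6].
import numpy as np

r = np.array([0.0, 1.5, 3.0, 6.0])

# Asymmetric (affine): the zero_point shifts the grid so the whole int8 range
# [-128, 127] covers [0, 6] and the real value 0.0 maps exactly to -128.
scale = 6.0 / 255
zero_point = -128
print(np.round(r / scale) + zero_point)   # [-128.  -64.    0.  127.]

# Symmetric: the range is forced to [-6, 6], so 0.0 maps to 0 and no
# zero_point is needed, but the negative half of the grid goes unused here.
scale_sym = 6.0 / 127
print(np.round(r / scale_sym))            # [  0.  32.  64. 127.]
```

In short: with a symmetric scheme the real value 0 always lands on integer 0, so no offset is needed; with an asymmetric range, the zero_point shifts the grid so the full integer range is used while 0.0 stays exactly representable.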