Questions tagged [quantization]

Use this tag for questions related to quantization of any kind, such as vector quantization.

Quantization, in mathematics and digital signal processing, is the process of mapping a large set of input values to a (countable) smaller set.
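For example, the simplest (uniform) quantizer rounds each input to the nearest multiple of a fixed step size; a minimal NumPy sketch, with the step size chosen arbitrarily for illustration:

```python
import numpy as np

def uniform_quantize(x, step=0.25):
    """Map each value to the nearest multiple of `step`."""
    return step * np.round(x / step)

x = np.array([0.11, 0.52, -0.37, 1.96])
print(uniform_quantize(x))  # [ 0.    0.5  -0.25  2.  ]
```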

For more, see the Wikipedia article.

444 questions
2
votes
0 answers

JPEG Image Compression: Quantization issue

While encoding a raw image to a JPEG image, an 8x8 data unit is level shifted, transformed using a 2-D DCT, quantized, and Huffman encoded. I have first performed the row DCT and then the column DCT, and I have rounded the result to the nearest…
Ram
  • 437
  • 3
  • 13
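Not the asker's code, but a minimal sketch of the pipeline the question describes — level shift, 2-D DCT, then element-wise division by a quantization table and rounding. The flat table is a placeholder assumption; real encoders use the standard JPEG luminance/chrominance tables:

```python
import numpy as np
from scipy.fft import dctn

block = np.random.randint(0, 256, (8, 8)).astype(np.float64)  # one 8x8 data unit
shifted = block - 128                     # level shift to [-128, 127]
coeffs = dctn(shifted, norm='ortho')      # 2-D DCT (row pass then column pass)
qtable = np.full((8, 8), 16.0)            # placeholder quantization table
quantized = np.round(coeffs / qtable).astype(np.int32)
```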
2
votes
1 answer

JPEG file quantization table definition

When I use Photoshop's Save As function and pick the JPEG file format, I get the following window: As you can see, I select the Baseline ("Standard") format and maximum picture quality. When I open this picture in a hex editor, I see several FF DB markers (which…
MrD
  • 2,423
  • 3
  • 33
  • 57
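For reference, each FF DB (DQT) segment carries a 2-byte length, a precision/table-ID byte, and 64 table entries in zig-zag order (for 8-bit precision). A minimal parser sketch; the file name is hypothetical, and a segment packing multiple tables is only partially handled:

```python
def find_dqt_tables(path):
    data = open(path, 'rb').read()
    tables = []
    i = 2                                    # skip the FF D8 (SOI) marker
    while i < len(data) - 4:
        if data[i] == 0xFF and data[i + 1] == 0xDB:
            length = int.from_bytes(data[i + 2:i + 4], 'big')
            segment = data[i + 4:i + 2 + length]
            pq_tq = segment[0]               # high nibble: precision, low: table ID
            tables.append((pq_tq & 0x0F, list(segment[1:65])))
            i += 2 + length                  # reads only the first table per segment
        else:
            i += 1
    return tables

for table_id, entries in find_dqt_tables('photo.jpg'):
    print(table_id, entries[:8])
```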
2
votes
0 answers

GGML vs GPTQ vs bitsandbytes

What are the core differences between how GGML, GPTQ and bitsandbytes (NF4) do quantisation? Which will perform best on: a) Mac (I'm guessing ggml) b) Windows c) T4 GPU d) A100 GPU So far, I've run GPTQ and bitsandbytes NF4 on a T4 GPU and…
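For the bitsandbytes/NF4 leg of the comparison, the usual entry point is Hugging Face transformers' BitsAndBytesConfig; a minimal sketch (the model id is just an example, not from the question):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # NormalFloat4 weight quantization
    bnb_4bit_compute_dtype=torch.float16,  # dtype used for matmuls at runtime
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",            # example model id
    quantization_config=bnb_config,
    device_map="auto",
)
```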
2
votes
2 answers

Quantizing normally distributed floats in Python and NumPy

Let the values in the array A be sampled from a Gaussian distribution. I want to replace every value in A with one of n_R "representatives" in R so that the total quantization error is minimized. Here is NumPy code that does linear…
Björn Lindqvist
  • 19,221
  • 20
  • 87
  • 122
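Choosing n_R representatives to minimize total squared quantization error is the one-dimensional k-means / Lloyd-Max problem; a minimal NumPy sketch of Lloyd's iteration (variable names follow the question; the quantile initialization is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=10_000)      # Gaussian samples, as in the question
n_R = 8

# Initialize representatives at evenly spaced quantiles of the data.
R = np.quantile(A, (np.arange(n_R) + 0.5) / n_R)

for _ in range(100):
    # Assign each sample to its nearest representative...
    idx = np.argmin(np.abs(A[:, None] - R[None, :]), axis=1)
    # ...then move each representative to the mean of its cell.
    R = np.array([A[idx == k].mean() if np.any(idx == k) else R[k]
                  for k in range(n_R)])

print(R, np.sum((A - R[idx]) ** 2))  # representatives and total error
```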
2
votes
2 answers

Quantize Image using PIL and numpy

I have an image that I want to perform quantization on. I want to use the PIL and NumPy libraries, and the original image is: and I expect the output for 2-level quantization to be like this: but the output is like this: What am I doing wrong? from…
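For comparison, 2-level quantization of a grayscale image is a simple threshold; a minimal PIL + NumPy sketch with hypothetical file names and a midpoint threshold of 128:

```python
import numpy as np
from PIL import Image

img = np.asarray(Image.open('input.png').convert('L'))   # grayscale array
binary = np.where(img >= 128, 255, 0).astype(np.uint8)   # two output levels
Image.fromarray(binary).save('quantized.png')
```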
2
votes
1 answer

TF Yamnet Transfer Learning and Quantization

TL;DR: Short term: trying to quantize a specific portion of a TF model (recreated from a TFLite model). Skip to the pictures below. Long term: transfer learn on Yamnet and compile for the Edge TPU. Source code to follow along is here. I've been trying to…
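For context, full-integer post-training quantization of a Keras model ahead of Edge TPU compilation usually follows this sketch; `model` and the representative samples are assumed, and the waveform shape is a YAMNet-style placeholder:

```python
import tensorflow as tf

def representative_dataset():
    for _ in range(100):
        # Placeholder: yield real preprocessed audio samples in practice.
        yield [tf.random.normal([1, 16000])]

converter = tf.lite.TFLiteConverter.from_keras_model(model)   # `model` assumed
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
tflite_model = converter.convert()
```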
2
votes
0 answers

How to quantize an ONNX model converted from a XGBoost classifier model?

I converted an XGBoost classifier model to an ONNX model with onnxmltools and quantized the ONNX model using ONNX's quantize_dynamic(). But I didn't get a quantized ONNX model with a smaller file size or faster inference time. I used Anaconda3,…
SC Chen
  • 23
  • 5
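The call itself is usually this one-liner sketch (paths hypothetical). One caveat, offered as an assumption rather than a confirmed answer: quantize_dynamic mainly shrinks large MatMul/Gemm weight initializers, which a converted tree ensemble does not have, so little size or speed change would be expected:

```python
from onnxruntime.quantization import quantize_dynamic, QuantType

quantize_dynamic(
    model_input="xgb_model.onnx",         # hypothetical input path
    model_output="xgb_model.quant.onnx",  # hypothetical output path
    weight_type=QuantType.QUInt8,
)
```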
2
votes
1 answer

Exclude Rescaling layer from TensorFlow quantization while preserving sparsity and clustering

I'm following this guide for performing quantization on my model stripped_clustered_model. Unfortunately, my model contains a layer that cannot be quantized (a Rescaling layer). To account for that, I'm using quantize_annotate_layer to…
mat.tho
  • 194
  • 8
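The pattern from the TFMOT guide is to annotate every layer except the one to skip, then apply quantization; a sketch under the assumption that stripped_clustered_model is a Keras model (on older TF versions, Rescaling lives under tf.keras.layers.experimental.preprocessing):

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot

def annotate(layer):
    # Leave the Rescaling layer unannotated so quantize_apply skips it.
    if isinstance(layer, tf.keras.layers.Rescaling):
        return layer
    return tfmot.quantization.keras.quantize_annotate_layer(layer)

annotated = tf.keras.models.clone_model(
    stripped_clustered_model, clone_function=annotate)
qat_model = tfmot.quantization.keras.quantize_apply(annotated)
```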
2
votes
0 answers

Jupyter Kernel Dies when trying to run tflite quantization debugger

Main issue: Every time I try to debug my quantized model, my Jupyter kernel dies and is restarted. I want to know what to fix, but more importantly, how to look for what to fix the next time this happens. What I've tried: Not sure about behavior of…
chr0nikler
  • 468
  • 4
  • 13
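One way to capture what actually kills the kernel is to run the debugger as a plain script so the native stderr survives the crash; a sketch using the documented debugger API, with `converter` and `representative_dataset` assumed to be defined as in the TF guide:

```python
# Run as: python debug_quant.py 2> crash.log  (outside Jupyter)
import tensorflow as tf

debugger = tf.lite.experimental.QuantizationDebugger(
    converter=converter,                    # assumed: configured TFLiteConverter
    debug_dataset=representative_dataset)   # assumed: sample-input generator
debugger.run()
with open('layer_stats.csv', 'w') as f:
    debugger.layer_statistics_dump(f)       # per-layer quantization error stats
```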
2
votes
1 answer

Reproducing arithmetic with pytorch's quantized tensors with numpy operations

I would like to know what exact arithmetic operations I have to do to reproduce the results of quantized operations in PyTorch. This is almost a duplicate of the question: I want to use NumPy to simulate the inference process of a quantized MobileNet V2…
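For reference, PyTorch's quantized tensors use the affine scheme q = clamp(round(x / scale) + zero_point); a NumPy sketch of the quantize/dequantize pair (scale and zero point are illustrative values):

```python
import numpy as np

def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    q = np.round(x / scale) + zero_point
    return np.clip(q, qmin, qmax).astype(np.int8)

def dequantize(q, scale, zero_point):
    return (q.astype(np.float32) - zero_point) * scale

x = np.array([0.5, -1.2, 0.03], dtype=np.float32)
q = quantize(x, scale=0.01, zero_point=0)
print(q, dequantize(q, 0.01, 0))   # int8 codes and their float reconstruction
```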
2
votes
1 answer

Scalar vs Vector signal Quantization

I am studying scalar vs. vector quantization, and I have an assignment to implement (in MATLAB) a scalar quantizer, using the Lloyd-Max algorithm, and a vector quantizer via k-means clustering. The vector quantizer works in the R2 vector…
Zeosleus
  • 51
  • 4
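In Python rather than MATLAB, the vector-quantizer half of such an assignment reduces to k-means over points in R^2; a minimal sketch with synthetic Gaussian data (the codebook size of 16 is an arbitrary choice):

```python
import numpy as np
from scipy.cluster.vq import kmeans2

rng = np.random.default_rng(0)
data = rng.normal(size=(5000, 2))      # training vectors in R^2

codebook, labels = kmeans2(data, 16, minit='points')
reconstructed = codebook[labels]       # each vector snapped to its centroid
mse = np.mean(np.sum((data - reconstructed) ** 2, axis=1))
print(mse)
```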
2
votes
1 answer

Fix batch size of a TensorFlow model after training

I am training a model in TensorFlow with a variable batch size (input: [None, 320, 240, 3]). The problem is that during post-training quantization I cannot have any dynamic input, thus no "None", and with the edgetpu compiler I cannot have batch sizes greater…
Jodo
  • 4,515
  • 6
  • 38
  • 50
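One common workaround, sketched here under the assumption that the trained Keras model is available as trained_model (not necessarily the accepted fix), is to wrap it in a new graph with a fixed batch size before conversion:

```python
import tensorflow as tf

# `trained_model` was built with input [None, 320, 240, 3].
fixed_input = tf.keras.Input(shape=(320, 240, 3), batch_size=1)
fixed_model = tf.keras.Model(fixed_input, trained_model(fixed_input))
# The wrapped layers share weights with trained_model, so nothing is copied.
tflite_model = tf.lite.TFLiteConverter.from_keras_model(fixed_model).convert()
```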
2
votes
1 answer

I want to use NumPy to simulate the inference process of a quantized MobileNet V2 network, but the outcome differs from the PyTorch one

Python version: 3.8. PyTorch version: 1.9.0+cpu. Platform: Anaconda Spyder 5.0. To reproduce this problem, just copy all the code below into a single file. The ILSVRC2012_val_00000293.jpg file used in this code is shown below; you also need to download it…
Johnson Z
  • 31
  • 3
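When reproducing such arithmetic, the scale and zero point of each PyTorch quantized tensor can be read off directly; a small sketch of the relevant accessors on a synthetic tensor:

```python
import torch

x = torch.randn(4)
qx = torch.quantize_per_tensor(x, scale=0.05, zero_point=0, dtype=torch.qint8)
print(qx.int_repr())                   # raw int8 codes
print(qx.q_scale(), qx.q_zero_point())
print(qx.dequantize())                 # (codes - zero_point) * scale, as floats
```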
2
votes
0 answers

TensorFlow's QAT does not seem to perform per-channel quantization with AllValuesQuantizer

I have created two QAT models with the AllValuesQuantizer, one with per-tensor and one with per-channel quantization. When inspecting their respective QuantizeWrapper layers, I note that both have scalar values for the variables kernel_min and…
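To illustrate the per-tensor vs. per-channel distinction being probed, TensorFlow's fake-quant ops expose both modes directly; a standalone sketch (not the tfmot internals) on a random conv kernel:

```python
import tensorflow as tf

kernel = tf.random.normal([3, 3, 8, 16])    # last axis = 16 output channels

# Per-tensor: one scalar min/max pair for the whole kernel.
per_tensor = tf.quantization.fake_quant_with_min_max_vars(
    kernel, tf.reduce_min(kernel), tf.reduce_max(kernel), num_bits=8)

# Per-channel: a min/max vector with one entry per output channel.
per_channel = tf.quantization.fake_quant_with_min_max_vars_per_channel(
    kernel,
    tf.reduce_min(kernel, axis=[0, 1, 2]),
    tf.reduce_max(kernel, axis=[0, 1, 2]),
    num_bits=8)
```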
2
votes
1 answer

How to get quantized weights from TensorFlow's quantization aware training with experimental quantization

I'm using TensorFlow's quantization aware training API and wish to deploy a model with arbitrary bit-width. As only 8-bit quantization is supported for TFLite deployment, I will deploy with a custom inference algorithm, but I still need to access the…
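One hedged approach is to read the learned kernel_min/kernel_max variables out of each QuantizeWrapper and fake-quantize the float kernel yourself; the variable-name matching below follows the usual wrapper naming, which may differ across tfmot versions:

```python
import numpy as np

def fake_quant(w, w_min, w_max, bits=8):
    """Quantize w to 2**bits levels between w_min and w_max."""
    levels = 2 ** bits - 1
    scale = (w_max - w_min) / levels
    codes = np.clip(np.round((w - w_min) / scale), 0, levels)
    return codes.astype(np.int32), scale

for layer in qat_model.layers:               # `qat_model` from the question
    v = {var.name: var.numpy() for var in layer.weights}
    kernel = next((x for n, x in v.items() if n.endswith('kernel:0')), None)
    w_min = next((x for n, x in v.items() if 'kernel_min' in n), None)
    w_max = next((x for n, x in v.items() if 'kernel_max' in n), None)
    if kernel is not None and w_min is not None:
        codes, scale = fake_quant(kernel, w_min, w_max)
        print(layer.name, codes.shape, np.ravel(scale)[:4])
```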