Questions tagged [quantization]

Use this tag for questions related to quantization of any kind, such as vector quantization.

Quantization, in mathematics and digital signal processing, is the process of mapping a large set of input values to a (countable) smaller set.

For more, please read the Wikipedia article on quantization.

444 questions
0 votes · 2 answers

How to find the size of a deep learning model?

I am working with different quantized implementations of the same model, the main difference being the precision of the weights, biases, and activations. So I'd like to know how I can find the difference between the size of a model in MBs that's in…
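
A minimal sketch of one way to measure both numbers in PyTorch, assuming an ordinary nn.Module: in-memory parameter size is element count times bytes per element, and the on-disk size can be read back after serializing the state_dict:

```python
import os
import tempfile

import torch
import torch.nn as nn

def param_size_mb(model: nn.Module) -> float:
    """In-memory bytes of parameters and buffers, reported in MB."""
    total = sum(t.numel() * t.element_size()
                for t in list(model.parameters()) + list(model.buffers()))
    return total / 1e6

def file_size_mb(model: nn.Module) -> float:
    """On-disk size of the serialized state_dict, in MB."""
    path = os.path.join(tempfile.mkdtemp(), "model.pt")
    torch.save(model.state_dict(), path)
    return os.path.getsize(path) / 1e6

model = nn.Linear(1024, 1024)   # stand-in for the real model
print(f"params: {param_size_mb(model):.2f} MB  file: {file_size_mb(model):.2f} MB")
```
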
0 votes · 1 answer

Algorithm for detecting voltage levels in voltage-vs-time data/waveform

I am analyzing a voltage output that I get from a SPICE simulator, and I want to quantize the time-sampled voltage data so that I can convert the given trapezoidal wave to a square wave. I have tried differentiation as a method to understand when the…
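
One differentiation-free approach, sketched here with assumed 0 V / 3.3 V logic levels: snap every sample to the nearest known level with NumPy, which turns a trapezoidal wave directly into a square wave:

```python
import numpy as np

def quantize_to_levels(v: np.ndarray, levels: np.ndarray) -> np.ndarray:
    """Snap every sample to the nearest entry in `levels`."""
    idx = np.argmin(np.abs(v[:, None] - levels[None, :]), axis=1)
    return levels[idx]

# Toy trapezoidal wave: a clipped sine stands in for the SPICE output.
t = np.linspace(0.0, 4.0, 2000)
trapezoid = np.clip(5.0 * np.sin(2.0 * np.pi * t), 0.0, 3.3)

square = quantize_to_levels(trapezoid, np.array([0.0, 3.3]))  # assumed logic levels
```
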
0 votes · 1 answer

Why did I get 'AssertionError: did not find fuser method for:' error while doing static quantization for a PyTorch model

I am getting the following error while trying to apply static quantization to a model. The error is in the fuse part of the code, torch.quantization.fuse_modules(model, modules_to_fuse): model = torch.quantization.fuse_modules(model,…
Celik • 2,311 • 2 • 32 • 54
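
For context, fuse_modules only recognizes a fixed set of layer patterns (Conv+BN, Conv+BN+ReLU, Linear+ReLU, and a few others), and the modules must be listed in the order they execute; anything else raises this AssertionError. A minimal sketch of a fusion that should succeed, on a toy model:

```python
import torch
import torch.nn as nn

class Toy(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3)
        self.bn = nn.BatchNorm2d(8)
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.relu(self.bn(self.conv(x)))

model = Toy().eval()   # fusion for static quantization expects eval mode
# Names must match the module attributes, in execution order:
fused = torch.quantization.fuse_modules(model, [["conv", "bn", "relu"]])
```
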
0 votes · 1 answer

XGBoost model quantization - Sklearn model quantization

I am looking for solutions to quantize sklearn models, specifically XGBoost models. I found solutions to quantize PyTorch and TensorFlow models, but nothing for sklearn. Solutions tried: converted the sklearn model to ONNX and then…
pratsbhatt • 1,498 • 10 • 20
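
A hedged sketch of one possible route: export the booster with onnxmltools, then apply onnxruntime's dynamic quantization. Note that tree ensembles store split thresholds rather than matmul weights, so the size reduction may be modest; file names and shapes below are made up:

```python
import numpy as np
import xgboost as xgb
from onnxmltools import convert_xgboost
from onnxmltools.convert.common.data_types import FloatTensorType
from onnxruntime.quantization import quantize_dynamic, QuantType

X = np.random.rand(200, 8).astype(np.float32)
y = (X.sum(axis=1) > 4).astype(int)
model = xgb.XGBClassifier(n_estimators=10).fit(X, y)

# Export the booster to ONNX.
onnx_model = convert_xgboost(model, initial_types=[("input", FloatTensorType([None, 8]))])
with open("xgb.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())

# Dynamic quantization rewrites eligible initializers as int8.
quantize_dynamic("xgb.onnx", "xgb_int8.onnx", weight_type=QuantType.QInt8)
```
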
0 votes · 1 answer

onnx.load() | ALBert throws DecodeError: Error parsing message

Goal: re-develop this BERT Notebook to use textattack/albert-base-v2-MRPC. Kernel: conda_pytorch_p36. PyTorch 1.8.1+cpu. I convert a PyTorch / HuggingFace Transformers model to ONNX and store it. DecodeError occurs on onnx.load(). Are my ONNX files…
DanielBell99 • 896 • 5 • 25 • 57
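
DecodeError from onnx.load() usually means the protobuf on disk is truncated, or the model was exported with external data files that are no longer next to it. A quick sanity check, with a hypothetical file name:

```python
import os
import onnx

path = "albert-base-v2.onnx"                     # hypothetical file name
print(f"{os.path.getsize(path) / 1e6:.1f} MB")   # ~0 MB suggests a truncated write

model = onnx.load(path)          # raises DecodeError on a corrupt/partial protobuf
onnx.checker.check_model(model)  # validates the graph once loading succeeds
```
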
0 votes · 1 answer

Converting PyTorch to ONNX model increases file size for ALBert

Goal: Use this Notebook to perform quantisation on the albert-base-v2 model. Kernel: conda_pytorch_p36. Outputs in Sections 1.2 & 2.2 show that converting vanilla BERT from PyTorch to ONNX keeps the file size the same, 417.6 MB. Quantization models are…
DanielBell99 • 896 • 5 • 25 • 57
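
One plausible cause worth checking: ALBERT shares a single set of transformer weights across all layers, and tracing-based export can materialize those shared tensors once per layer, so the ONNX file grows even though the checkpoint did not. A sketch of the export plus a direct size comparison (paths are made up):

```python
import os
import torch
from transformers import AlbertModel   # assumes transformers is installed

model = AlbertModel.from_pretrained("albert-base-v2", return_dict=False).eval()
dummy = torch.ones(1, 128, dtype=torch.long)      # fake input_ids

torch.onnx.export(model, (dummy,), "albert.onnx",
                  input_names=["input_ids"],
                  output_names=["last_hidden_state"],
                  opset_version=13)

torch.save(model.state_dict(), "albert.pt")
for path in ("albert.pt", "albert.onnx"):
    print(path, f"{os.path.getsize(path) / 1e6:.1f} MB")
```
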
0 votes · 1 answer

HuggingFace - 'optimum' ModuleNotFoundError

I want to run the 3 code snippets from this webpage. I've combined all 3 into one post, as I assume they all stem from the same problem of optimum not being imported correctly. Kernel: conda_pytorch_p36. Installations: pip install optimum OR ! pip…
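
In a multi-kernel Jupyter setup, a common culprit is pip installing into a different environment than the one the notebook kernel runs. A quick diagnostic sketch:

```python
import sys

print(sys.executable)          # the interpreter this kernel actually runs

# In a notebook cell, install into *this* interpreter's environment,
# not whatever `pip` happens to be first on PATH:
#   !{sys.executable} -m pip install optimum

import optimum                 # should resolve after the install above
print(optimum.__file__)        # confirms which site-packages it came from
```
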
0 votes · 1 answer

Quantizing the neural network weights and biases to int16 format

I am trying to quantize the weights and biases of my neural network to a 16-bit integer format. The reason for this is to use these arrays in CCS to program the network on an MCU. While I followed the process for post-training quantization using…
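
A minimal NumPy sketch of symmetric 16-bit post-training quantization, assuming a single per-tensor scale; the scale is kept alongside the array so the C code on the MCU can undo it:

```python
import numpy as np

def quantize_int16(w: np.ndarray):
    """Symmetric per-tensor quantization: w ≈ q * scale, q in [-32767, 32767]."""
    scale = np.abs(w).max() / 32767.0
    q = np.clip(np.round(w / scale), -32767, 32767).astype(np.int16)
    return q, scale

weights = np.random.randn(64, 32).astype(np.float32)  # stand-in layer weights
q, scale = quantize_int16(weights)
reconstructed = q.astype(np.float32) * scale
print("max abs error:", np.abs(weights - reconstructed).max())
```
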
0 votes · 1 answer

LeNet5 inference based on a quantized TFLite model. How to downscale int32 to int8 with the M parameter?

I trained a LeNet5 CNN with Keras/TensorFlow. I used TensorFlow Lite to quantize FP32 weights and activations to INT8. I extracted and visualized weights, biases, scales and zero-points thanks to Netron. I needed to design the LeNet5 CNN in the C language.…
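
For reference, TFLite's convention is M = (S_input · S_weight) / S_output, stored as M = M0 · 2^(−n) with M0 a Q31 fixed-point integer in [0.5, 1). A NumPy sketch of that decomposition and the int32 → int8 requantization, with made-up scales and zero point (gemmlowp's saturating rounding-doubling multiply is simplified to a plain shift here):

```python
import math
import numpy as np

def quantize_multiplier(M: float):
    """Decompose M (< 1) into M0 * 2**-n, M0 stored as a Q31 integer in [0.5, 1)."""
    M0, exp = math.frexp(M)                  # M = M0 * 2**exp, 0.5 <= M0 < 1
    return int(round(M0 * (1 << 31))), -exp  # (q31 multiplier, right shift n >= 0)

def requantize(acc: np.ndarray, M: float, zero_point: int) -> np.ndarray:
    q31, shift = quantize_multiplier(M)
    x = (acc.astype(np.int64) * q31) >> 31        # multiply by M0 in Q31
    if shift > 0:
        x = (x + (1 << (shift - 1))) >> shift     # rounding right shift by n
    return np.clip(x + zero_point, -128, 127).astype(np.int8)

M = (0.02 * 0.005) / 0.1                          # made-up S_in * S_w / S_out
print(requantize(np.array([12345, -9876], dtype=np.int32), M, zero_point=-3))
```
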
0 votes · 1 answer

How to visualize feature maps of a TensorFlow Lite model?

I used Keract to visualize the feature maps of a TensorFlow/Keras model. I have applied quantization with TensorFlow Lite. I would like to visualize the feature maps generated by the TensorFlow Lite model during an inference. Do you know a way to do…
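
One way that may work, sketched below: build the interpreter with experimental_preserve_all_tensors=True (available in TF 2.5+) so intermediate tensors survive invoke(), then read any of them back with get_tensor(); model path, input, and tensor index are assumptions:

```python
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(
    model_path="model_int8.tflite",            # hypothetical file
    experimental_preserve_all_tensors=True,    # keep intermediates readable
)
interpreter.allocate_tensors()

inp = interpreter.get_input_details()[0]
interpreter.set_tensor(inp["index"], np.zeros(inp["shape"], dtype=inp["dtype"]))
interpreter.invoke()

for t in interpreter.get_tensor_details():     # list every tensor in the graph
    print(t["index"], t["name"], t["shape"])

feature_map = interpreter.get_tensor(5)        # pick an index from the listing
```
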
0 votes · 0 answers

tensorflow.python.framework.errors_impl.InvalidArgumentError: Length for attr 'output_shapes' of 0 must be at least minimum 1

I am working on an image segmentation use case. So I created a Keras model with the ".h5" extension, and I am now trying to convert this Keras model to a TensorFlow Lite model, as the TensorFlow Lite model is required to run in the Android…
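
For reference, the conversion path itself is short; if loading the .h5 and converting as below still fails, allowing SELECT_TF_OPS sometimes gets past ops the converter cannot express as TFLite builtins. Paths are hypothetical:

```python
import tensorflow as tf

model = tf.keras.models.load_model("segmentation_model.h5")   # hypothetical path

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,     # normal TFLite kernels
    tf.lite.OpsSet.SELECT_TF_OPS,       # fall back to TF kernels when needed
]
tflite_model = converter.convert()

with open("segmentation_model.tflite", "wb") as f:
    f.write(tflite_model)
```
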
0 votes · 1 answer

Python sounddevice.rec() with dtype='int8' quantizes to zero

I am trying to plot my voice signal with different dtypes (the bits/sample, obviously). So I tried to capture my voice with dtype='int16' and the plot made sense. But when I spoke at the same sound level with dtype='int8', my plot is a…
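
That behavior is expected from the dynamics alone: int8 spans only −128…127, so a voice signal peaking at a few percent of full scale rounds to −1…1 and plots as near-zero. A sketch that normalizes both captures to full scale for comparison (device defaults assumed):

```python
import matplotlib.pyplot as plt
import numpy as np
import sounddevice as sd

fs, seconds = 44100, 3

for dtype in ("int16", "int8"):
    rec = sd.rec(int(fs * seconds), samplerate=fs, channels=1, dtype=dtype)
    sd.wait()
    full_scale = np.iinfo(dtype).max           # 32767 vs. 127
    plt.plot(rec[:, 0] / full_scale,
             label=f"{dtype}: peak {np.abs(rec).max()} of {full_scale}")

plt.legend()
plt.show()
```
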
0 votes · 1 answer

Quantization aware training worse than post-quantization

I want to use quantization aware training to quantize my model to int8. Unfortunately, I can't simply quantize the entire model, since my first layer is a batch normalization (after the InputLayer), so I need to use a custom quantizationConfig for that…
horsti • 113 • 1 • 7
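
With tensorflow_model_optimization, one common pattern is to annotate layers selectively and hand the unsupported layer a custom QuantizeConfig; the pass-through config below is a sketch (the stand-in model and names are assumptions, and the config body would be adapted to what the layer actually needs):

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot

quantize = tfmot.quantization.keras

class NoOpQuantizeConfig(quantize.QuantizeConfig):
    """Pass-through config: quantize nothing inside the annotated layer."""
    def get_weights_and_quantizers(self, layer): return []
    def get_activations_and_quantizers(self, layer): return []
    def set_quantize_weights(self, layer, quantize_weights): pass
    def set_quantize_activations(self, layer, quantize_activations): pass
    def get_output_quantizers(self, layer): return []
    def get_config(self): return {}

def annotate(layer):
    if isinstance(layer, tf.keras.layers.InputLayer):
        return layer
    if isinstance(layer, tf.keras.layers.BatchNormalization):
        return quantize.quantize_annotate_layer(
            layer, quantize_config=NoOpQuantizeConfig())
    return quantize.quantize_annotate_layer(layer)

model = tf.keras.Sequential([                  # stand-in with the same structure
    tf.keras.layers.InputLayer(input_shape=(16,)),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dense(10),
])

annotated = tf.keras.models.clone_model(model, clone_function=annotate)
with quantize.quantize_scope({"NoOpQuantizeConfig": NoOpQuantizeConfig}):
    qat_model = quantize.quantize_apply(annotated)
```
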
0 votes · 1 answer

INT8 quantization for matmul

Inspired by "Efficient 8-Bit Quantization of Transformer Neural Machine Language Translation Model", I decided to follow the paper through, caveats included. However, I get confused about setting the offset variables during quantization. INPUT :…
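
For reference on where the offsets enter: with x = S·(q − z), the product of two quantized matrices expands so that both zero points must be subtracted before (or algebraically corrected after) the integer accumulation. A small NumPy check with arbitrary data:

```python
import numpy as np

def quant_params(x, qmin=0, qmax=255):
    """Asymmetric uint8 parameters such that x ≈ scale * (q - zero_point)."""
    lo, hi = min(x.min(), 0.0), max(x.max(), 0.0)
    scale = (hi - lo) / (qmax - qmin)
    zero_point = int(round(qmin - lo / scale))
    return scale, zero_point

def quantize(x, scale, zp):
    return np.clip(np.round(x / scale) + zp, 0, 255).astype(np.uint8)

A = np.random.randn(4, 8).astype(np.float32)
B = np.random.randn(8, 5).astype(np.float32)
(sa, za), (sb, zb) = quant_params(A), quant_params(B)
qa, qb = quantize(A, sa, za), quantize(B, sb, zb)

# Subtract both offsets first, accumulate in int32, rescale once at the end.
acc = (qa.astype(np.int32) - za) @ (qb.astype(np.int32) - zb)
print(np.abs(sa * sb * acc - A @ B).max())   # small quantization error
```
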
0 votes · 1 answer

INT8 quantization for FP32 matrix multiplication

I tried to apply INT8 quantization before an FP32 matrix multiplication, then requantize the accumulated INT32 output back to INT8. I suspect there are a couple of mix-ups somewhere in the process, and I feel stuck in spotting those…
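
As a debugging reference, a compact end-to-end pipeline in NumPy, kept symmetric per-tensor for simplicity: quantize both operands to INT8, accumulate in INT32, then requantize with the combined scale S_A·S_B/S_out:

```python
import numpy as np

def sym_scale(x):                      # symmetric: x ≈ q * scale, q in [-127, 127]
    return np.abs(x).max() / 127.0

A = np.random.randn(4, 6).astype(np.float32)
B = np.random.randn(6, 3).astype(np.float32)
C_fp32 = A @ B

sa, sb = sym_scale(A), sym_scale(B)
qa = np.round(A / sa).astype(np.int8)          # |q| <= 127 by construction
qb = np.round(B / sb).astype(np.int8)

acc = qa.astype(np.int32) @ qb.astype(np.int32)   # exact INT32 accumulation

so = sym_scale(C_fp32)                 # output scale; needs calibration in practice
qc = np.clip(np.round(acc * (sa * sb / so)), -127, 127).astype(np.int8)

print(np.abs(qc * so - C_fp32).max())  # reconstruction error vs. FP32
```
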