Python sounddevice.rec(), dtype ='int8' quantizes to zero problem

Question

I am trying to plot my voice signal with different dtypes (that is, different numbers of bits per sample). When I captured my voice with dtype='int16', the plot made sense. But when I spoke at the same sound level with dtype='int8', the plot was a flat line at zero.

Why is this happening?

One thought that came to mind is that the 8-bit quantizer may have a larger dead zone, so for the same input voice level it maps the values to 0. Of course, this is only a hypothesis about the type of quantizer; I have not checked whether it is a uniform dead-zone quantizer. Below are my code and my plots.

import matplotlib.pyplot as plt
import numpy as np
import sounddevice as sd

Fs = 8000  # Sampling frequency
duration = 5  # Recording duration in seconds
voice = sd.rec(frames=duration * Fs, samplerate=Fs, channels=1, dtype='int16')  # Capture the voice
# frames sets the recording length indirectly (duration * Fs samples); dtype='int16' means 16 bits per sample.
sd.wait()  # block until the recording has finished
time = np.arange(len(voice)) / Fs  # time of each sample in seconds; samples are 1/Fs apart
print(voice)
plt.plot(time, voice)  # plot against time in seconds
plt.title("Voice Signal")
plt.xlabel("Time [seconds]")
plt.ylabel("Voice amplitude")
plt.show()

Here are my voice arrays https://www.mediafire.com/file/7bbfbz0jltszmb6/16bit.csv/file https://www.mediafire.com/file/k627wmbeayutc20/8bit.csv/file

Does `sd.rec` even support 8-bit recording? Most audio devices only support 16-bit and above. Have you tried converting to 8-bit instead of recording directly in it? Also, what is the amplitude and max sample values of the original audio? If it's really really quiet, then it's possible that when converted to 8-bit everything would be cut off. Seems unlikely though. — Random Davis, Dec 07 '21 at 18:28
Hello Random. Thanks for the immediate response. I am very new to this domain; I did not know that there are devices that only support 16-bit sampling and above. I have not tried any conversions, because the point of the exercise is just to record and plot some voice, but I can look into it. When you say the amplitude of the original audio, do you mean the numpy array 'voice' that stores the values? When I use 8 bits all elements are zero, and they are reasonable non-zero values when I use 16 bits. Cutoff seems an unlikely reason; maybe it is an incompatibility. — Panos, Dec 07 '21 at 18:39
Yes, that's what I meant; the array that stores the values, in 16-bit, might indicate what the issue is. 16-bit audio's values can go from -32768 to 32767, whereas 8-bit values can go from -128 to 127. If the values in the first plot really go from just -60 to 60, out of 32767, that's only 0.2% of the max volume. Meaning if you were to convert that to 8-bit, even a value of 60 would be converted to about 0.23 (60/256), which would get rounded to zero. So, it's possible that your recording is just far too quiet. The values in the 16-bit version of the array would be a clue as to what may be happening. — Random Davis, Dec 07 '21 at 18:43
I accidentally attached the first image, which shows a far too quiet signal. I have now attached a new plot, where I am speaking normally and very close to the microphone. The values now reach 20,000 out of 32,767, which is a fairly large fraction. I have also attached two CSV files containing the voice arrays from both experiments (16-bit and 8-bit). — Panos, Dec 07 '21 at 19:02
Also, my microphone device is called Conexant SmartAudio HD, and according to a resource about microphones, the HD indicates this: 'a sample rate and bit depth higher than 44.1kHz/16-bit is considered high definition (HD) audio.' — Panos, Dec 07 '21 at 19:27
Okay, so if you record in 16-bit and then convert the array to 8-bit, does that work? You'd have to normalize the values to cap out at 127 beforehand, but I don't see why that would be an issue. You haven't explained exactly why you're using 8 bits or why you want to record straight to that format rather than just converting from 16 to 8 after. — Random Davis, Dec 07 '21 at 19:34
I will give it a try. I am playing with this function and nothing more. Thanks — Panos, Dec 07 '21 at 19:41
So, we have results :). I did a manual conversion by simply scaling my 16-bit voice array: I multiply the voice elements by 256/65535. I posted the photo. — Panos, Dec 07 '21 at 19:50
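
The arithmetic discussed in the comments above can be sketched as follows. This is only an illustration, not the code Panos actually ran: the helper name `int16_to_int8` is hypothetical, and it divides by 256 (very close to the 256/65535 factor mentioned above), then rounds and clips to the int8 range. The sample values are made up to mirror the two cases in the thread, a quiet signal around ±60 and a loud one around ±20,000.

```python
import numpy as np

def int16_to_int8(samples):
    """Rescale 16-bit samples to the 8-bit range (hypothetical helper)."""
    samples = np.asarray(samples, dtype=np.int16)
    # Dividing by 256 maps [-32768, 32767] onto roughly [-128, 128).
    scaled = np.round(samples / 256)
    # Clip so that values that round up to 128 still fit into int8.
    return np.clip(scaled, -128, 127).astype(np.int8)

# Made-up sample values for illustration only.
quiet = np.array([60, -60, 127], dtype=np.int16)
loud = np.array([20000, -20000, 32767, -32768], dtype=np.int16)

print(int16_to_int8(quiet))  # every value is below half an 8-bit step, so all become 0
print(int16_to_int8(loud))   # a loud signal survives the conversion
```

With these inputs, the quiet values all collapse to 0, while 20,000 maps to 78, which is consistent with the observation that a sufficiently loud 16-bit recording converts to 8 bits without problems.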

score 1 · Answer 1 · answered Dec 08 '21 at 19:21

This is a problem of the underlying PortAudio library and/or the host API that's used. Depending on which device from the device list is selected, it might or might not work. For me, some devices create garbage values while others work fine. For me (on Linux), uint8 seems to work better than int8.

If you want the problem to be fixed, you should create an issue at https://github.com/PortAudio/portaudio.

As mentioned in the comments, 8-bit samples are rarely used anyway, which might explain why they are supported so badly. If you just want to play around, it's probably easiest to use the default dtype, which gives you floating point numbers in a range from -1.0 to 1.0.
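
If the goal is to compare bit depths, one option is to record with the default floating point dtype and quantize in software. The following is a minimal sketch under that assumption: the `quantize` function is a hypothetical helper (a plain uniform quantizer, not something sounddevice provides), and a synthetic sine wave stands in for the recorded array so that the snippet runs without a microphone.

```python
import numpy as np

def quantize(signal, bits):
    """Uniformly quantize floats in [-1.0, 1.0] to signed integers (hypothetical helper)."""
    levels = 2 ** (bits - 1)  # 128 for 8 bits, 32768 for 16 bits
    scaled = np.round(np.asarray(signal, dtype=np.float64) * levels)
    # Clip so that +1.0 maps to the largest representable value.
    return np.clip(scaled, -levels, levels - 1).astype(np.int32)

# Synthetic stand-in for a float recording (assumption: values lie in [-1.0, 1.0]).
Fs = 8000
t = np.arange(Fs) / Fs
signal = 0.5 * np.sin(2 * np.pi * 440 * t)

q8 = quantize(signal, 8)    # stays within -128..127
q16 = quantize(signal, 16)  # stays within -32768..32767
print(q8.min(), q8.max())
print(q16.min(), q16.max())
```

Plotting `q8` and `q16` against time shows the effect of the bit depth directly, without depending on how the device or host API handles 8-bit input.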
