pydub export is not accurate

Question

For testing purposes I wanted to create an mp3 file with pure sine waves so I can check that my FFT is working properly. The following code creates an array of sine-wave segments lasting 1 second and 0.5 seconds:

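import numpy as np

sample_rate = 48_000  # samples per second
# one second of sample times; endpoint=False keeps the spacing at exactly 1 / sample_rate
x = np.linspace(0, 1, sample_rate, endpoint=False)
y0 = np.sin(440 * 2 * np.pi * x)
y1 = np.sin(440 * 2 * np.pi * x) + np.sin(1337 * 2 * np.pi * x)
# the remaining segments use only the first half of x, i.e. 0.5 seconds each
y2 = np.sin(880 * 2 * np.pi * x[:len(x)//2])
y3 = np.sin(3200 * 2 * np.pi * x[:len(x)//2])
y4 = np.sin(200 * 2 * np.pi * x[:len(x)//2])
y5 = np.sin(20 * 2 * np.pi * x[:len(x)//2])
y = np.concatenate([y0, y1, y2, y3, y4, y5])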

When I run an FFT on this signal I correctly see 1 second of 440 Hz, 1 second of 440 Hz + 1337 Hz, and so on.
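The question does not show the FFT code, so the following is only a minimal sketch of how such a check could look, assuming NumPy's real FFT and a 48 kHz sample rate. The helper name `dominant_frequencies` and the `rel_threshold` parameter are made up for illustration and are not from the original post.

```python
import numpy as np

SAMPLE_RATE = 48_000  # Hz, matching frame_rate in the question


def dominant_frequencies(segment, sample_rate=SAMPLE_RATE, rel_threshold=0.5):
    """Return the frequencies (Hz) whose FFT magnitude is at least
    rel_threshold times the largest magnitude in the segment."""
    spectrum = np.abs(np.fft.rfft(segment))
    freqs = np.fft.rfftfreq(len(segment), d=1.0 / sample_rate)
    return freqs[spectrum >= rel_threshold * spectrum.max()]


# One second of 440 Hz + 1337 Hz (hypothetical test segment)
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
segment = np.sin(2 * np.pi * 440 * t) + np.sin(2 * np.pi * 1337 * t)
print(dominant_frequencies(segment))  # the two tone frequencies
```

With a one-second window at 48 kHz the FFT bins are 1 Hz apart, so both tones fall on exact bins and no windowing is needed. Segments of other lengths may show some leakage into neighbouring bins.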

Next I store the array as an mp3 using pydub and load it back with the following code:

import pydub

sound = pydub.AudioSegment(
    # raw audio data (bytes)
    data=y.astype(np.float16).tobytes(),

    # 2 byte (16 bit) samples
    sample_width=2,

    frame_rate=48_000,

    channels=1
)

sound.export("test.mp3", format="mp3", bitrate="192k")

a = pydub.AudioSegment.from_mp3("test.mp3")
y = np.array(a.get_array_of_samples())

When I run the same FFT as above on the new signal y, I get a lot of overtones in the spectrum.

Why is there such a big difference between the two?
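One thing that may be worth ruling out before blaming the codec: `sample_width=2` makes pydub read the bytes as 16-bit integer PCM, while `astype(np.float16)` writes half-precision floats. This is a reading of the code above, not something confirmed in the thread. The sketch below uses NumPy only; `to_int16_pcm` is a hypothetical helper name. It shows what a few float16 values look like when their bytes are read as int16, and one possible way to produce integer samples instead.

```python
import numpy as np

# Bit patterns of a few float16 values, read back as 16-bit integers
values = np.array([1.0, 0.5, -0.5, -1.0], dtype=np.float16)
reinterpreted = np.frombuffer(values.tobytes(), dtype=np.int16)
print(reinterpreted)  # values: 15360, 14336, -18432, -17408


def to_int16_pcm(y):
    """Scale a float signal so its peak maps to 32767 and return raw int16 bytes.
    (Hypothetical helper, not part of pydub.)"""
    peak = np.max(np.abs(y))
    scale = 32767 / peak if peak > 0 else 0.0
    return np.round(y * scale).astype(np.int16).tobytes()


t = np.arange(48_000) / 48_000
tone = np.sin(2 * np.pi * 440 * t)
pcm = np.frombuffer(to_int16_pcm(tone), dtype=np.int16)
print(pcm.min(), pcm.max())
```

Halving the amplitude from 1.0 to 0.5 changes the reinterpreted integer only slightly, and the negative half is not a mirror image of the positive half, so the stored waveform is no longer a sine. If that is what happens here, passing the bytes from `to_int16_pcm(y)` as `data=` with the same `sample_width=2` might remove most of the overtones. Normalizing by the peak also keeps the two-tone segment from exceeding full scale.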

Perhaps because of mp3? It's a lossy format based on a psychoacoustic model tuned for music and speech, and you're using a middling bitrate. I wouldn't expect such "pure" signals to go through the encoder unharmed. Have you tried other formats, e.g. lossless (WAV or compressed) or other lossy formats? — Masklinn, Jun 24 '20 at 13:18
Is it possible the audio is clipping? What does the mp3 sound like? — Jiaaro, Jun 26 '20 at 16:59
I do think that I can hear the overtones, but I am no expert and can't tell whether it is actually distorted. How can I check if it is clipping? I posted in my question how the audio is generated. — Jonathan R, Jun 26 '20 at 18:39
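On the question of how to check for clipping, a rough sketch (not from the thread) is to look at the peak value and at how many samples sit at or beyond full scale. The helper name `clipped_fraction` is hypothetical. Full scale is assumed to be 1.0 for the generated float signal and 32767 for 16-bit samples returned by `get_array_of_samples()`.

```python
import numpy as np


def clipped_fraction(samples, full_scale):
    """Fraction of samples whose magnitude reaches or exceeds full_scale."""
    samples = np.asarray(samples, dtype=np.float64)
    return float(np.mean(np.abs(samples) >= full_scale))


t = np.arange(48_000) / 48_000
one_tone = 0.5 * np.sin(2 * np.pi * 440 * t)
two_tones = np.sin(2 * np.pi * 440 * t) + np.sin(2 * np.pi * 1337 * t)

print(np.max(np.abs(two_tones)))         # above 1.0: two unit sines add up
print(clipped_fraction(two_tones, 1.0))  # nonzero, so this segment needs rescaling
print(clipped_fraction(one_tone, 1.0))   # 0.0
```

For the decoded mp3, something like `clipped_fraction(y, 32767)` on the loaded samples gives a first indication. Long runs of samples stuck at the extreme values are a typical sign of clipping, while a small nonzero fraction on its own is not conclusive.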
