What does each element in the NumPy array derived from the Pydub audio segment represent?

Question

I'm quite new to audio processing. I'm trying to use audio data to control a virtual lightbulb (implemented with a simple Python Turtle), on the assumption that each element in the NumPy array represents the amplitude of the audio at a specific time. However, the result was very different from what I expected, so I'm starting to wonder whether my assumption is correct.

Any help would be appreciated.
Thanks in advance, everyone.

Load the audio file using pydub:

from pydub import AudioSegment
import numpy as np

frame_rate = 16000

# Load the WAV file and resample it to the target frame rate
music = AudioSegment.from_wav("Music.wav").set_frame_rate(frame_rate)

Convert it to a NumPy array:

musicArr = np.array(music.get_array_of_samples())

Print the array:

print(musicArr)

The result looks something like this:

[ 11 -11  12 -20  13 -23  10 -24  13 -25  10 -19   7 -16   5  -4   8   2
  12   9  14  18  24  23  33  29  30  30  33  32  32  33  28  26  25  18
  24  15  21  10  21   2  12  -1  10  -8   1 -11   0 -10  -4  -6 -13  -1
 -19   1 -29   4 -31   6 -38   6 -41   4 -43   1 -47  -6 -48 -11 -49 -24
 -52 -27 -51 -28 -49 -33 -53 -35 -55 -33 -56 -36 -52 -37 -51 -36 -47 -33
 -45 -28 -44 -34 -44 -37 -44 -40 -43 -45 -43 -47 -42 -44 -42 -46 -40 -42
 -37 -31 -27 -28 -26 -23 -21 -15 -11 -17  -6 -14  -5 -12  -1  -9  -9  -6
  -6  -5  -9  -2 -12  -4 -15  -2 -18   1 -20  -1 -19  -1 -16  -3 -16  -9
 -13 -12  -6 -16  -8 -17  -5 -19  -1 -22  -3 -20   0 -18   3 -22   7 -23
   9 -24   8 -24   6 -25   8 -23   4 -23   2 -22   3 -21   8 -23   6 -24
   3 -24   0 -22  -1 -30  -4 -32  -3 -37  -8 -32 -12 -38 -20 -30 -16 -32
 -19 -32]
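
As a rough illustration of how output like the above can be read: pydub's `get_array_of_samples()` returns one signed integer per sample, and for multi-channel audio the channels are interleaved (left, right, left, right, ...). The sketch below uses the first eight printed values and assumes, purely for illustration, a stereo file with 16-bit samples; the real values can be read from `music.channels` and `music.sample_width`. The helper names `split_channels`, `frame_time` and `normalize` are hypothetical, not part of pydub.

```python
from array import array


def split_channels(samples, channels):
    """Split interleaved samples into one list per channel."""
    return [list(samples[c::channels]) for c in range(channels)]


def frame_time(frame_index, frame_rate):
    """Time in seconds of a frame (one sample per channel)."""
    return frame_index / frame_rate


def normalize(sample, sample_width):
    """Scale a signed integer sample to roughly [-1.0, 1.0)."""
    full_scale = 2 ** (8 * sample_width - 1)  # 32768 for 2-byte samples
    return sample / full_scale


# First eight values from the printed array
samples = array("h", [11, -11, 12, -20, 13, -23, 10, -24])

channels = 2        # assumption; check music.channels
sample_width = 2    # assumption (bytes per sample); check music.sample_width
frame_rate = 16000  # set via set_frame_rate() in the question

left, right = split_channels(samples, channels)
print(left)   # [11, 12, 13, 10]
print(right)  # [-11, -20, -23, -24]
```

If the file really is stereo, element `i` of the flat array belongs to frame `i // 2`, so its time is `frame_time(i // 2, frame_rate)` rather than `i / frame_rate`. Calling `music.set_channels(1)` before extracting the samples would avoid the interleaving altogether.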

You are correct in saying each element should be a sample value. In this case it appears to print the byte values interpreted as signed 8-bit ints, or the wav has a PCM bit-depth of 8 — fdcpp, Jan 19 '22 at 17:39
All the high values are near the end of the array, and I don't know why. The source music should have vocals starting about 4 seconds in. — SorawitC, Jan 21 '22 at 14:23
I'd sanity-check this against an audio editor like Audacity before making any assumptions about what you think you should see — fdcpp, Jan 21 '22 at 15:08
I have solved the problem now. The weird display happened because Python could not run as fast as I expected. Anyway, thank you very much — SorawitC, Jan 22 '22 at 13:06
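
The last comment attributes the odd display to Python not keeping up. One possible workaround, which the thread itself does not describe, is to update the lightbulb once per block of samples instead of once per sample, using the block's RMS level as the brightness. The sketch below is a minimal pure-Python version; `block_levels` and the block size are hypothetical choices, and the input is assumed to be a single channel of samples.

```python
import math


def block_levels(samples, block_size):
    """Return the RMS level of each consecutive block of samples."""
    levels = []
    for start in range(0, len(samples), block_size):
        block = samples[start:start + block_size]
        mean_square = sum(s * s for s in block) / len(block)
        levels.append(math.sqrt(mean_square))
    return levels


# Toy mono signal: one loud block followed by one silent block
mono = [3, -3, 3, -3, 0, 0, 0, 0]
levels = block_levels(mono, 4)
print(levels)  # [3.0, 0.0]
```

At a frame rate of 16000, a block size of 1600 would give ten brightness updates per second of audio, which is far fewer drawing calls than one per sample.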
