I'm quite new to audio processing. Lately I've been trying to use audio data to control a virtual lightbulb (implemented with a simple Python turtle graphic), under the assumption that each element in the NumPy array represents the amplitude of the audio at a specific point in time. However, the result came out very different from what I expected, so I've started to wonder whether that assumption is actually correct.
Please kindly help.
Thanks to everyone in advance.
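To make the assumption explicit, this is how I'm currently interpreting the array (just a small sketch using the frame rate from my code below and the first few values from the printout further down; the per-sample timing is exactly the part I'm unsure about):

# My mental model: with mono audio at 16 kHz, element i of the array is
# the amplitude at time i / 16000 seconds.
import numpy as np

frame_rate = 16000
samples = np.array([11, -11, 12, -20, 13])       # first few printed values
times = np.arange(len(samples)) / frame_rate     # 0.0, 0.0000625, 0.000125, ...
for t, a in zip(times, samples):
    print(f"t = {t:.6f} s  ->  amplitude = {a}")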
Load the audio file using pydub:

from pydub import AudioSegment
import numpy as np

frame_rate = 16000
# load the audio into a variable
music = AudioSegment.from_wav("Music.wav").set_frame_rate(frame_rate)

Change it to a NumPy array:

musicArr = np.array(music.get_array_of_samples())

Print the array out:

print(musicArr)
The result is something like this:
[ 11 -11 12 -20 13 -23 10 -24 13 -25 10 -19 7 -16 5 -4 8 2
12 9 14 18 24 23 33 29 30 30 33 32 32 33 28 26 25 18
24 15 21 10 21 2 12 -1 10 -8 1 -11 0 -10 -4 -6 -13 -1
-19 1 -29 4 -31 6 -38 6 -41 4 -43 1 -47 -6 -48 -11 -49 -24
-52 -27 -51 -28 -49 -33 -53 -35 -55 -33 -56 -36 -52 -37 -51 -36 -47 -33
-45 -28 -44 -34 -44 -37 -44 -40 -43 -45 -43 -47 -42 -44 -42 -46 -40 -42
-37 -31 -27 -28 -26 -23 -21 -15 -11 -17 -6 -14 -5 -12 -1 -9 -9 -6
-6 -5 -9 -2 -12 -4 -15 -2 -18 1 -20 -1 -19 -1 -16 -3 -16 -9
-13 -12 -6 -16 -8 -17 -5 -19 -1 -22 -3 -20 0 -18 3 -22 7 -23
9 -24 8 -24 6 -25 8 -23 4 -23 2 -22 3 -21 8 -23 6 -24
3 -24 0 -22 -1 -30 -4 -32 -3 -37 -8 -32 -12 -38 -20 -30 -16 -32
-19 -32]
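For reference, this is roughly how I'm driving the turtle "lightbulb" from the musicArr above (a simplified sketch, not my exact code; the circle shape, the 1000-sample step, and the yellow colour are just placeholders):

# Rough sketch: map the absolute sample value to a brightness and
# recolour a turtle "bulb" accordingly.
import turtle
import numpy as np

samples = musicArr.astype(np.float32)
peak = float(np.max(np.abs(samples))) or 1.0   # avoid dividing by zero

screen = turtle.Screen()
screen.colormode(255)
screen.tracer(0)                               # update the screen manually

bulb = turtle.Turtle()
bulb.shape("circle")
bulb.shapesize(5)
bulb.penup()

# one redraw per sample is far too slow, so step through coarsely
for sample in samples[::1000]:
    brightness = int(255 * abs(sample) / peak)
    bulb.color((brightness, brightness, 0))    # dim-to-bright yellow
    screen.update()

turtle.done()

With this, the bulb flickers in a way that doesn't seem to follow the music at all, which is why I'm questioning whether one array element really equals one amplitude value per point in time.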