
I am using the wave library in Python to attempt to reduce the speed of audio by 50%. I have been successful, but only in the right channel; the left channel is a whole bunch of static.

import wave,os,math
r=wave.open(r"C:\Users\A\My Documents\LiClipse Workspace\Audio 
compression\Audio compression\aha.wav","r")
w=wave.open(r"C:\Users\A\My Documents\LiClipse Workspace\Audio 
compression\Audio compression\ahaout.wav","w")
frames=r.readframes(r.getnframes())
newframes=bytearray()
w.setparams(r.getparams())
for i in range(0,len(frames)-1):
    newframes.append(frames[i])
    newframes.append(frames[i])
w.writeframesraw(newframes)

Why is this? Since I am just copying and pasting raw data, surely I can't generate static?

Edit: I've been looking for ages and I finally found a useful resource for the WAVE format: http://soundfile.sapp.org/doc/WaveFormat/ If I want to preserve stereo sound, it looks like I need to copy a full 4-byte frame at a time, twice. This is because there are two channels, and together they take up 4 bytes per frame instead of 2.

import wave
r=wave.open(r"C:\Users\A\My Documents\LiClipse Workspace\Audio compression\Audio compression\aha.wav","r")
w=wave.open(r"C:\Users\A\My Documents\LiClipse Workspace\Audio compression\Audio compression\ahaout.wav","w")
frames=r.readframes(r.getnframes())
newframes=bytearray()
w.setparams(r.getparams())
w.setframerate(r.getframerate())
print(r.getsampwidth())
for i in range(0,len(frames)-4,4):
    newframes.append(frames[i])
    newframes.append(frames[i+1])
    newframes.append(frames[i+2])
    newframes.append(frames[i+3])
    newframes.append(frames[i])
    newframes.append(frames[i+1])
    newframes.append(frames[i+2])
    newframes.append(frames[i+3])
w.writeframesraw(newframes)
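As a sanity check on that assumption, here is a minimal sketch (the path is a placeholder) that prints the numbers the loop above relies on; for a 16-bit stereo file it should report 2 channels of 2 bytes each, i.e. 4 bytes per frame:

import wave

r = wave.open("aha.wav", "rb")   # placeholder path, substitute your own file
print(r.getnchannels())          # 2 for a stereo file
print(r.getsampwidth())          # 2 bytes per sample for 16-bit PCM
print(r.getnchannels() * r.getsampwidth())  # bytes per frame: 4 here
r.close()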

Edit 2: Okay, I have no idea what drove me to do this, but I am already enjoying the freedom it gives me. I chose to read the wav file into memory, edit the copy directly, and write it to an output file. I am incredibly happy with the results. I can import a wav, repeat the audio once, and write it to an output file in only 0.2 seconds. Reducing the speed by half now takes only 9 seconds instead of the 30+ seconds of my old code using the wave library :) Here's the code, still somewhat un-optimized I guess, but it's better than what it was.

import struct
import time as t
t.clock()
r=open(r"C:/Users/apier/Documents/LiClipse Workspace/audio editing software/main/aha.wav","rb")
w=open(r"C:/Users/apier/Documents/LiClipse Workspace/audio editing software/main/output.wav","wb")
rbuff=bytearray(r.read())
def replacebytes(array,bites,stop):
    # Overwrite the bytes of `array` ending at index `stop` with `bites`.
    length=len(bites)
    start=stop-length
    for i in range(start,stop):
        array[i]=bites[i-start]
def write(audio):
    w.write(audio)
def repeat(audio,repeats):
    if(repeats==1):
        return(audio)
    if(repeats==0):
        return(audio[:44])
    # Scale the data-chunk size field (bytes 40:44 of the header) by the repeat count.
    replacebytes(audio, struct.pack('<I', struct.unpack('<I',audio[40:44])[0]*repeats), 44)
    return(audio+(audio[44:len(audio)-58]*(repeats-1)))
def slowhalf(audio):
    buff=bytearray()
    # Double the data-chunk size field, since every frame gets written twice.
    replacebytes(audio, struct.pack('<I', struct.unpack('<I',audio[40:44])[0]*2), 44)
    # Copy each 4-byte stereo frame twice, skipping the 44-byte header.
    for i in range(44,len(audio)-62,4):
        buff.append(audio[i])
        buff.append(audio[i+1])
        buff.append(audio[i+2])
        buff.append(audio[i+3])
        buff.append(audio[i])
        buff.append(audio[i+1])
        buff.append(audio[i+2])
        buff.append(audio[i+3])
    return(audio[:44]+buff)
rbuff=slowhalf(rbuff)
write(rbuff)
print(t.clock())

I am surprised at how small the code is.
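For context, here is a short sketch of how the header offsets the code above pokes at line up with the canonical 44-byte PCM header described at http://soundfile.sapp.org/doc/WaveFormat/ (assuming a plain file with a single fmt and data chunk; the path is a placeholder):

import struct

header = open("aha.wav", "rb").read(44)                  # placeholder path; first 44 bytes = canonical header

# All multi-byte fields are little-endian.
chunk_size      = struct.unpack('<I', header[4:8])[0]    # total file size minus 8
num_channels    = struct.unpack('<H', header[22:24])[0]  # 2 for stereo
sample_rate     = struct.unpack('<I', header[24:28])[0]
bits_per_sample = struct.unpack('<H', header[34:36])[0]  # 16 -> 2 bytes per sample
data_size       = struct.unpack('<I', header[40:44])[0]  # the field slowhalf() doubles
print(chunk_size, num_channels, sample_rate, bits_per_sample, data_size)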

  • My first thought is that it is an issue with the file itself. Have you tried muting the right channel and listening just to the left at full speed? You can check if it's the program by re-encoding your file in mono and running it through your program again. Have you tried using other audio files? – NuclearPeon Jul 13 '17 at 19:42
  • @NuclearPeon Doesn't appear to be an issue with the audio file. I downloaded a random wav file to test with and it does the same thing. – Wondercricket Jul 13 '17 at 19:43

1 Answer


Each of the elements you get when indexing the data returned by readframes is a single byte, even though the type is int. An audio sample is typically 2 bytes, so by doubling up each byte instead of each whole sample, you get noise.
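You can see this directly with a quick check (placeholder path), assuming a 16-bit file:

import wave

r = wave.open("aha.wav", "rb")     # placeholder path
print(r.getsampwidth())            # usually 2, i.e. each sample spans 2 bytes
frames = r.readframes(1)           # one frame = sampwidth * nchannels bytes
print(type(frames), list(frames))  # a bytes object; frames[i] is a single byte (0-255)
r.close()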

I have no idea why one channel would work; with the code shown in the question it should be all noise.

This is a partial fix. It still intermixes the left and right channels, but it will give you an idea of what will work.

for i in range(0,len(frames)-1,2):
    newframes.append(frames[i])
    newframes.append(frames[i+1])
    newframes.append(frames[i])
    newframes.append(frames[i+1])

Edit: here's the code that should work in stereo. It copies 4 bytes at a time, 2 for the left channel and 2 for the right, then does it again to double them up. This keeps the left and right channel data from getting mixed together.

for i in range(0, len(frames), 4):
    for _ in range(2):
        for j in range(4):
            newframes.append(frames[i+j])
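Put into a complete script (placeholder file names, assuming a 16-bit stereo input), the loop would look something like this:

import wave

r = wave.open("aha.wav", "rb")       # placeholder input path
w = wave.open("ahaout.wav", "wb")    # placeholder output path
w.setparams(r.getparams())           # same channels, sample width and rate as the input

frames = r.readframes(r.getnframes())
newframes = bytearray()
for i in range(0, len(frames), 4):   # 4 bytes = one 16-bit stereo frame
    for _ in range(2):               # write each frame twice to halve the speed
        for j in range(4):
            newframes.append(frames[i + j])

w.writeframes(newframes)             # writeframes also updates the frame count in the header
r.close()
w.close()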
Mark Ransom