So, let's say you want several chunk sizes per file.
In the simplest form, you'll need two things:
- a new for loop
- an array with all the chunk sizes
from pydub import AudioSegment
from pydub.utils import make_chunks

myaudio = AudioSegment.from_file('C:/Users/XY/Desktop/input/HouseSample.wav')
chunk_sizes = [10000]  # pydub calculates in milliseconds

for chunk_length_ms in chunk_sizes:
    chunks = make_chunks(myaudio, chunk_length_ms)  # make chunks of chunk_length_ms length
    for i, chunk in enumerate(chunks):
        # include the chunk size in the name, so different sizes don't overwrite each other
        chunk_name = '{0}_{1}.wav'.format(chunk_length_ms, i)
        print('exporting', chunk_name)
        chunk.export(chunk_name, format='wav')
For now, this code will actually produce the same split as you already have.
To add multiple splits, simply add more values to the chunk_sizes
array, e.g. chunk_sizes = [10000, 5000]
for 10- and 5-second splits.
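As a quick sanity check, you can predict how many files each size will produce: make_chunks yields ceil(duration / chunk_length) chunks, the last one possibly shorter. A minimal sketch, assuming a hypothetical 30-second file:

```python
import math

duration_ms = 30000  # hypothetical 30-second file
chunk_sizes = [10000, 5000]

for size in chunk_sizes:
    # number of chunks, counting a shorter trailing chunk if any
    n_chunks = math.ceil(duration_ms / size)
    print('{0} ms -> {1} chunks'.format(size, n_chunks))
```

For 30 seconds this prints 3 chunks for the 10 s size and 6 for the 5 s size, i.e. 9 files in total.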
If you want to add some randomness, you can rely on any pseudo-random generator, such as random or numpy.random.
A small example, with 5 different splits between 5 and 10 seconds:
import random

N_SPLIT = 5
chunk_sizes = []
for _ in range(N_SPLIT):
    chunk_sizes.append(random.randint(5000, 10000))
Beware: if you need these splits to be consistent across your dataset, you'll need to use the same randomized chunk_sizes array for each file, so it might be useful to set a seed here (e.g. random.seed(42)).
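To illustrate the seeding point, here is a small sketch (the seed value 42 and N_SPLIT = 5 are arbitrary choices): re-seeding before generating the sizes guarantees every file in the dataset is cut with the same random lengths.

```python
import random

N_SPLIT = 5

def make_chunk_sizes(seed=42):
    # same seed -> same pseudo-random sequence -> same chunk sizes
    random.seed(seed)
    return [random.randint(5000, 10000) for _ in range(N_SPLIT)]

# call this once per file: every call returns the identical list
sizes_for_file_a = make_chunk_sizes()
sizes_for_file_b = make_chunk_sizes()
print(sizes_for_file_a == sizes_for_file_b)  # True
```

Without the seed, each file would get a different set of chunk lengths, which makes the resulting chunks hard to compare across files.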