
I have a large sound file (150 MB) that I would like to split into smaller files of some more easily managed size, say, files with 5 minutes of audio. Clearly, the last segment is going to be <= 5 minutes, and that's OK. Is there a way to do this sort of task easily?

A small sample .mp3 file to be used for this problem can be downloaded using this link: download.linnrecords.com/test/mp3/recit.aspx.

Here is what I have tried so far. I imported the data using readMP3 from tuneR and was going to use the cutw function, but haven't found an efficient way of using it.

library(tuneR)
library(seewave)  # provides cutw()

sample<-readMP3("recit.mp3") 

# the file is only 9.04 seconds long (44.1 kHz, 16-bit, stereo),
# so for this example we can cut it into 0.5-second intervals
subsamp1<-cutw(sample, from=0, to=0.5, output="Wave")

# then I would have to do this for each interval up to:
subsampn<-cutw(sample, from=9, to=9.04, output="Wave") 
# where I have to explicitly state the maximum second (i.e. 9.04), 
# unless there is a way I don't know of to extract this information.

This approach is inefficient when intervals become small in comparison to the total file length. Also, sample was stereo, but subsamp1 is mono, and I'd prefer not to change anything about the data if possible.

In the way of improving efficiency, I tried inputting vectors to the from and to arguments, but I got an error (see below). Even if it had worked, though, it would not be a particularly nice solution. Anyone know of a more elegant way to approach this problem using R?

cutw(subsamp1,from=seq(0,9,0.5),to=c(seq(0.5,9.0,0.5),9.04))
# had to explicitly supply the max second (i.e. 9.04). 
# must be a better way to extract the maximum second

Error in wave[a:b, ] : subscript out of bounds
In addition: Warning messages:
1: In if (from > to) stop("'from' cannot be superior to 'to'") :
  the condition has length > 1 and only the first element will be used
2: In if (from == 0) { :
  the condition has length > 1 and only the first element will be used
3: In a:b : numerical expression has 19 elements: only the first used
Jota

3 Answers


Building on the excellent answer by @Jean V. Adams, I found a solution using indexing (i.e. [).

library(seewave)

# your audio file (using example file from seewave package)
data(tico)
audio <- tico
# the frequency of your audio file
freq <- 22050
# the length and duration of your audio file
totlen <- length(audio)
totsec <- totlen/freq

# the duration that you want to chop the file into
seglen <- 0.5

# defining the break points
breaks <- unique(c(seq(0, totsec, seglen), totsec))
index <- 1:(length(breaks)-1)
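# a list of all the segments; starting each one a sample after the previous
# segment ends keeps neighbouring segments from sharing a sample
subsamps <- lapply(index, function(i) audio[(round(breaks[i]*freq) + 1):round(breaks[i+1]*freq)])
# the indexing in the line above is the only difference between this code and
# the code provided by @Jean V. Adams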

The advantage here is that if your input audio object is stereo, the returned objects are stereo, as well. cutw changes output objects to mono, from what I can tell.
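
Since the question asks for separate files, here is a minimal sketch of how the indexing idea could be extended to write each segment to disk with tuneR's writeWave. The synthetic two-second stereo tone, the helper name split_wave, and the output file names are all assumptions made for illustration; a real recording read with readMP3 would take the place of the synthetic object.

```r
library(tuneR)

# Synthetic 2-second stereo Wave standing in for the real recording (assumption)
sr <- 8000
n <- 2 * sr
left <- round(10000 * sin(2 * pi * 440 * seq_len(n) / sr))
right <- round(10000 * sin(2 * pi * 660 * seq_len(n) / sr))
audio <- Wave(left = left, right = right, samp.rate = sr, bit = 16)

# Hypothetical helper: split a Wave into consecutive pieces of seg_sec seconds.
# Working in whole samples means no piece overlaps its neighbour, and the
# final piece is simply whatever is left over.
split_wave <- function(wave, seg_sec) {
  total <- length(wave@left)
  seg_n <- round(seg_sec * wave@samp.rate)
  starts <- seq(1, total, by = seg_n)
  ends <- pmin(starts + seg_n - 1, total)
  Map(function(a, b) wave[a:b], starts, ends)
}

segments <- split_wave(audio, 0.75)

# Write each piece to its own WAV file (a temporary directory is used here)
files <- file.path(tempdir(), sprintf("segment_%03d.wav", seq_along(segments)))
invisible(Map(writeWave, segments, files))
```

For the 5-minute case in the question, seg_sec would be 300. Because the pieces are taken with [, each one keeps both channels of the input.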

Jota

I don't have any experience working with audio files in R, but I was able to come up with an approach that might help you. Check out the code below.

library(seewave)

# your audio file (using example file from seewave package)
data(tico)
audio <- tico
# the frequency of your audio file
freq <- 22050
# the length and duration of your audio file
totlen <- length(audio)
totsec <- totlen/freq

# the duration that you want to chop the file into
seglen <- 0.5

# defining the break points
breaks <- unique(c(seq(0, totsec, seglen), totsec))
index <- 1:(length(breaks)-1)
# a list of all the segments
subsamps <- lapply(index, function(i) cutw(audio, f=freq, from=breaks[i], to=breaks[i+1]))
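
To make the break-point step concrete, the sketch below runs the same breaks expression on plain numbers, using the 9.04-second duration and 0.5-second segments from the question. It needs only base R; the variable names from and to, and the second 9-second case, are added here for illustration.

```r
# Duration and segment length taken from the question's example
totsec <- 9.04
seglen <- 0.5

# seq() stops at the last multiple of seglen that fits; appending totsec
# supplies the end of the final, shorter segment
breaks <- unique(c(seq(0, totsec, seglen), totsec))
from <- head(breaks, -1)
to <- tail(breaks, -1)
print(data.frame(from, to))

# When the duration is an exact multiple of seglen, seq() already ends at
# totsec, and unique() drops the duplicate break point
breaks_even <- unique(c(seq(0, 9, seglen), 9))
```

The last row of the printed table runs from 9.0 to 9.04, so the end of the file never has to be typed in by hand.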
Jean V. Adams

See https://github.com/schultzm/SliceAudio.py. I wrote this script to do something very similar to what this question asks, but in Python. I'm not sure whether it's still relevant, but here is my solution anyway. You could launch the Python script from within R if desired.

The Python script slices audio files (in batch, if desired) along the length of each file until it reaches the end. By default it slices a file into 2-second blocks, with each block starting at the end of the previous block, and writes each block to a separate file in the folder containing the input file; output names follow the input name, with the position in the original file appended. The default output format is 16-bit, 48 kHz, mono. The user can reduce the sample width to 8-bit, keep it at medium (16-bit), or use high quality (32-bit). The sample rate can be anywhere from low quality (11025 Hz) to high quality (48000 Hz). In fact, the sample rate can be whatever you want, but your computer may not know how to deal with non-standard rates (e.g., I tested it with 1 Hz, and iTunes died when trying to play it); see the help menu for the standard/accepted options [python SliceAudio.py -h].

The user can also alter the slice length and the overlap slide on the previous slice (e.g., you could slice into 10-second windows with each subsequent window sliding along 1 second to overlap the previous window by 1 second). NB: time is measured in milliseconds, so multiply the number of seconds by 1000 to get the desired slice length. There is an option for stereo output. The script can input and output any format that is supported by ffmpeg**.

Dependencies:
1. gcc
2. pydub (sudo pip install pydub), see github.com/jiaaro/pydub
3. ffmpeg (brew install libav --with-libvorbis --with-sdl --with-theora)
4. audioread (sudo pip install audioread)

Example usage:
python SliceAudio.py -i xyz.m4a -f m4a -b 2 -s 11025 -l 10000
python SliceAudio.py -h

**ffmpeg formats: trac.ffmpeg.org/wiki/audio%20types
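
As a rough sketch of launching the script from within R, base R's system2 can pass the same arguments as the example usage above. The script location, the input file name, and the assumption that a python executable is on the PATH are all placeholders; the call is guarded so that nothing runs when the script is not present.

```r
# Placeholder path to a local copy of the script (assumption)
script <- "SliceAudio.py"

# Same flags and values as the example usage in this answer
args <- c(script, "-i", "xyz.m4a", "-f", "m4a",
          "-b", "2", "-s", "11025", "-l", "10000")

if (file.exists(script) && nzchar(Sys.which("python"))) {
  # Exit status of the Python process; 0 conventionally means success
  status <- system2("python", args)
} else {
  message("Would run: python ", paste(args, collapse = " "))
}
```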

user3479780