
Disclaimer: Please forgive my ignorance of audio/sound processing; my background is web and mobile development, and this is a bespoke requirement for one of my clients!

I need to concatenate four audio files, with a background track playing behind all of them. The source audio files can be created in any format, or have any treatment applied to them, if that improves the processing time, but the output quality is still important. For clarity, the input files could be named as follows (.wav is only an example format):

  • background.wav
  • segment-a.wav
  • segment-b.wav
  • segment-c.wav
  • segment-d.wav

And would need to be structured something like this:

[------------------------------background.wav------------------------------]
[--segment-a.wav--][--segment-b.wav--][--segment-c.wav--][--segment-d.wav--]
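
Read as sample arithmetic, the layout above amounts to laying the segments end to end and adding a quieter copy of the background to each sample. Here is a minimal Python sketch of that idea, assuming mono, signed 16-bit PCM samples at a single shared sample rate. The names `mix_over_background` and `bg_gain` are illustrative only and are not part of SoX or any other tool.

```python
from array import array


def mix_over_background(segments, background, bg_gain=0.3):
    """Concatenate 16-bit PCM segments and mix a quieter background under them.

    Assumes every input is mono, signed 16-bit, at one shared sample rate.
    bg_gain is a hypothetical parameter chosen for this sketch.
    """
    # Concatenation: lay the segments end to end.
    voice = array("h")
    for seg in segments:
        voice.extend(seg)

    out = array("h")
    for i, sample in enumerate(voice):
        # The background stops when the segments end; if the background
        # is shorter than the segments, the remainder is treated as silence.
        bg = background[i] if i < len(background) else 0
        mixed = sample + int(bg * bg_gain)
        # Clamp to the 16-bit range so loud passages clip instead of wrapping.
        out.append(max(-32768, min(32767, mixed)))
    return out
```

A pure-Python loop like this is far too slow for hours of audio; it is only meant to show what "background behind all four segments" means at the sample level.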

I have managed to use the SoX tool to do the concatenation part of the above using MP3 files. On a reasonably fast computer, though, I am getting roughly an hour's worth of concatenated audio per minute of processing, which isn't fast enough for my requirements, and I haven't yet applied the background sound or any 'nice to haves' such as trimming/fading.

My questions are:

  • Is SoX the best/only tool for this kind of operation?
  • Is there any way to make the process faster without sacrificing (too much) quality?
  • Would changing the input file format result in improved performance? If so, which format is best?

Any suggestions from this excellent community would be much appreciated!

Jay
  • Try to find the bottleneck behind your performance issues. Most likely it is not SoX that is slow, but your hard disk or the encoding/decoding process. In both cases, I don't think that trimming/fading will add much overhead. – umläute Jul 19 '12 at 17:54
  • Thanks for responding, umlaeute. I am developing on the latest MacBook Air (SSD), so I suspect that hardware is not the issue. Can you elaborate on the encoding/decoding process you mentioned? I wonder if SoX has to do something to the sample MP3s I'm using before it can work with them, and this is adding overhead - this is why I asked whether changing the input file format would help. – Jay Jul 19 '12 at 18:21

1 Answer

  1. SoX may not be the best tool, but I doubt you will find anything much better without hand-coding.
  2. I would venture to guess that you are doing pretty well to process that much audio in that time. You might do better, but you'll have to experiment. You are right that the main way to improve speed is probably to change the file format.
  3. MP3 and OGG will probably give you similar performance, so first identify how MP3 compares to uncompressed audio, such as WAV or AIFF. If MP3/OGG is better, try different compression ratios and sample rates to see which goes faster. With WAV files, you can try lowering the sample rate (you can do this with MP3/OGG as well). If this is speech, you can probably go as low as 8 kHz, which should speed things up considerably. For music, I would say 32 kHz, but it depends on the requirements. Also, try mono instead of stereo, which should speed things up further.
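
To run the comparisons suggested above, a small timing harness may help. This is a rough Python sketch, assuming `sox` is installed and on the PATH. The helper names `concat_command` and `time_command` are made up for illustration. The only SoX behaviour relied on is that several input files followed by one output file are concatenated, and that `-r` and `-c` placed before the output file set its sample rate and channel count.

```python
import subprocess
import time


def concat_command(inputs, output, output_opts=()):
    """Build a SoX command line that concatenates inputs into output.

    output_opts are format options placed just before the output file,
    for example ["-r", "8000", "-c", "1"] for 8 kHz mono.
    """
    return ["sox", *inputs, *output_opts, output]


def time_command(cmd):
    """Run a command to completion and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start
```

For example, `time_command(concat_command(["segment-a.wav", "segment-b.wav"], "out.wav"))` can be compared against the same call with `"out.mp3"` as the output (which needs a SoX build with MP3 support), or with `output_opts=["-r", "8000", "-c", "1"]` to see how much a lower sample rate and mono output save.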
Bjorn Roche
  • Hi Bjorn, thanks for chiming in. As suspected, concatenating uncompressed WAV files was significantly faster than concatenating MP3s when outputting to an equally uncompressed WAV file. When I changed the output to MP3, the process slowed right back down again, so it's definitely the compression that's causing the performance issue. – Jay Jul 21 '12 at 12:22
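
Given that finding, one option is to keep every intermediate step uncompressed and encode to MP3 only once, at the very end. Joining WAV files that share the same format is little more than copying frames, as this sketch using Python's standard `wave` module suggests. The function name `concat_wavs` is hypothetical, and the sketch assumes all inputs have identical channel count, sample width, and sample rate.

```python
import wave


def concat_wavs(input_paths, output_path):
    """Join WAV files that share the same channels, sample width, and rate.

    Returns the (channels, sample_width_bytes, frame_rate) of the output.
    """
    if not input_paths:
        raise ValueError("need at least one input file")

    params = None
    with wave.open(output_path, "wb") as out:
        for path in input_paths:
            with wave.open(path, "rb") as src:
                fmt = (src.getnchannels(), src.getsampwidth(), src.getframerate())
                if params is None:
                    # The first file decides the output format.
                    params = fmt
                    out.setnchannels(fmt[0])
                    out.setsampwidth(fmt[1])
                    out.setframerate(fmt[2])
                elif fmt != params:
                    raise ValueError(f"{path} has a different format: {fmt} != {params}")
                # No decoding or encoding happens here, only a copy of raw frames.
                out.writeframes(src.readframes(src.getnframes()))
    return params
```

The single compressed encode of the final file is then the only place where the codec cost is paid.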