Which portable tools and libraries are recommended to transcode audio with time stretching/crunching while maintaining meta-information?

Question

I'm looking to implement a podcast transcoding tool/script that uses as many existing tools as possible. What combination of tools and libraries would you recommend?

Objectives:

Automate transcoding of audio subscriptions (spoken-word podcasts) to a smaller size (Ogg Vorbis and Speex) and make them available to sync to multiple devices (an Android phone and an iPod with Rockbox);
Time-crunch files with a tempo adjustment that maintains pitch (1.5x to 2.5x, in increments of 0.1x or finer);
Keep all meta-information (ID3 tags and images) in the transcoded file;
Preferably use small, portable Unix/Linux tools and compatible libraries (Cygwin on Windows or Wine-compatible calls are also a possibility);
Decoding, time crunching, and encoding in a single pass would be a bonus to save computation time.

SoX doesn't have Speex support. One possibility is MPlayer with the -speed argument for decoding and speed adjustment to WAV, followed by the standard Ogg Vorbis/Speex encoders, and ending with id3tool or some other meta-information manipulation tool. Are there alternative transcoding pipelines that fit the requirements?
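
The MPlayer-then-encoder pipeline described above could be sketched roughly as follows. This is only an illustration under assumptions: `build_pipeline` and its parameters are hypothetical names, the intermediate file name is arbitrary, MPlayer's `scaletempo` audio filter is assumed to be available for pitch-preserving speed changes, and only title/artist tags are passed on via `oggenc`; reading the tags from the source file and copying images is not covered. Paths containing `:` or `,` may need escaping in the `-ao pcm` suboption.

```python
def build_pipeline(src, speed, wav="tmp.wav", out="out.ogg",
                   quality=3, title=None, artist=None):
    """Return (decode_cmd, encode_cmd) as argument lists; nothing is executed."""
    # Range taken from the objectives in the question (1.5x to 2.5x).
    if not 1.5 <= speed <= 2.5:
        raise ValueError("speed must be between 1.5 and 2.5")

    decode = [
        "mplayer", "-really-quiet",
        "-vo", "null", "-vc", "null",    # ignore any video stream
        "-af", "scaletempo",             # change tempo while keeping pitch
        "-speed", f"{speed:g}",
        "-ao", f"pcm:fast:file={wav}",   # dump decoded audio to a WAV file
        src,
    ]

    encode = ["oggenc", "-q", str(quality), "-o", out]
    if title is not None:
        encode += ["-t", title]          # Vorbis comment: title
    if artist is not None:
        encode += ["-a", artist]         # Vorbis comment: artist
    encode.append(wav)
    return decode, encode


if __name__ == "__main__":
    for cmd in build_pipeline("episode.mp3", 1.8, title="Episode 1"):
        print(cmd)
```

Each returned list can be handed to `subprocess.run(cmd, check=True)`. Because the two steps communicate through a temporary WAV file, this is a two-pass approach and does not meet the single-pass bonus objective.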

Note the low number of followers on 2 of your 3 tags. The Linux tag might get you more viewers. Or you might want to rethink your absolute dependence on a *nix solution; there are probably more tools available on Windows. Finally, this barely passes the test for a programming question, which is the focus here on S.O., but your good formatting and clear indication that you have done some research earn you a +1 from me. Consider flagging to have it moved by a moderator to superuser.com. Good luck. — shellter, May 12 '12 at 16:27
Thanks for your input, shellter. Justin has already pointed out some programming libraries, which is the kind of answer I was expecting. [SoundTouch](http://www.surina.net/soundtouch/) is also a library to consider. I'm comfortable enough with C/C++ to combine these libraries. I was initially unsure whether to post the question here or on superuser, unix, or avp. Adding the Linux tag seems like good advice; I'll take it, thanks. Windows libraries and tools are also fine, as long as they can eventually be ported to other platforms. — Pedro Palhoto, May 13 '12 at 09:14

score 1 · Accepted Answer · answered May 12 '12 at 21:00

If you find you need to drop down to writing programs, some good starting points would be:

libsndfile for format conversion and access to properties
Dirac for time compression/expansion
and potentially a sample rate converter for your inputs

One problem with your question is that its input formats and file attributes don't appear to be bounded. For example, some formats are capable of defining regions. How should you handle this case? Omit that information? Leave it as-is (even though it will be incorrect once stretched)? Adapt the region based on the scale factor? The last option is the best, but you may need to get your hands dirty with C or C++ if this level of support is a requirement.
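
To illustrate the third option (adapting regions to the scale factor), here is a minimal sketch. It assumes regions are available as start/end times in seconds and that the speed-up is uniform across the file; `scale_region` is a hypothetical helper, not part of libsndfile or Dirac.

```python
def scale_region(start, end, speed):
    """Map a (start, end) region in seconds onto the time-compressed file.

    With a uniform speed factor, a point at time t in the source
    ends up at t / speed in the output.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    if end < start:
        raise ValueError("region end precedes its start")
    return start / speed, end / speed


if __name__ == "__main__":
    # A region from 30 s to 60 s in the source, played at 2x.
    print(scale_region(30.0, 60.0, 2.0))
```

If a format stores regions as sample offsets rather than seconds, the same division applies, followed by rounding to a whole sample and accounting for any sample rate change.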
