Scenario
I have a bunch of MP3 files: some have a constant bitrate and others a variable bitrate, some are encoded at 128 kbps and some at other bitrates, some are stereo and some are joint stereo. All of them have a sample rate of 44,100 Hz.
To automate a task with these thousands of MP3 files, I'm trying to develop an algorithm that inserts a silence of arbitrary duration at an arbitrary position in each file (e.g. insert 500 ms of silence into one MP3 file at position 00:02:30, then insert 750 ms of silence into another MP3 file at position 00:40:02).
Research
The only information I found is about inserting silence at the start or at the end of an MP3 file. This is not what I want, because I need to insert silence at an arbitrary position. For most of the files I will need to add the silence near the middle of the file, and only occasionally at the start. I will never need to add silence at the end of the file.
Some people suggest using the SoX or FFmpeg command-line applications to insert silence at the start or the end of an MP3 file. I don't know whether these applications could serve my purpose, but in any case my goal is to do this in C# or VB.NET without depending on any third-party application. That way I keep total control over the modifications made to the file and can programmatically handle the resulting file to perform other tasks with it (inserting a silence is just one of the things I need to do with these MP3 files).
I am fine, however, with depending on an external library. I remembered NAudio for .NET, a great library for audio manipulation, and I found this interesting snippet, which is not about inserting silence but about concatenating files:
https://markheath.net/post/concatenating-sample-providers-in-naudio
I think NAudio gives me a chance to develop an algorithm that inserts a silence of a specific duration at a specific position.
Approaches
Obviously, I don't have enough knowledge to understand how to do this task.
The first approach I came up with is to insert zero bytes at a specific position of the stream. I know how to do that, but how am I supposed to translate a zero byte into milliseconds in order to calculate the duration of the silence? I don't know whether inserting a sequence of zeros would play as silence at all; if it does, I don't know how to convert the length of that sequence into time; and I don't know whether this approach would be safe for every kind of MP3 file (CBR, VBR, ABR, mono or stereo, etc.).
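As far as I understand, MP3 data is not a sequence of raw samples but a sequence of compressed frames, so time would have to be counted in frames rather than in bytes. Assuming MPEG-1 Layer III (which is what 44,100 Hz MP3 files use), every frame decodes to 1152 samples per channel, about 26.12 ms at 44,100 Hz, so a silence could only be inserted in multiples of that. A minimal sketch of the arithmetic (the class and method names are my own, just for illustration):

```csharp
using System;

public static class SilenceMath
{
    // Samples per channel in one MPEG-1 Layer III frame.
    public const int SamplesPerFrame = 1152;

    // Number of whole frames that best approximates the requested duration.
    public static int FramesForDuration(TimeSpan duration, int sampleRate = 44100)
    {
        double samples = duration.TotalSeconds * sampleRate;
        return (int)Math.Round(samples / SamplesPerFrame);
    }

    // Real duration covered by a given number of frames.
    public static TimeSpan DurationOfFrames(int frameCount, int sampleRate = 44100)
    {
        return TimeSpan.FromSeconds((double)frameCount * SamplesPerFrame / sampleRate);
    }
}
```

With these numbers, `FramesForDuration(TimeSpan.FromMilliseconds(500))` gives 19 frames (roughly 496 ms of real silence) and 750 ms gives 29 frames (roughly 758 ms), so the requested duration can only be matched approximately.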
The second approach I thought of is to use an audio editor to generate an MP3 file consisting of 1 millisecond of silence, and then insert that silence as many times as required at a specific position of the MP3 file stream. I think I would need to generate this 1 ms MP3 file for every possible CBR bitrate, but what happens with VBR and ABR? I'm stuck with this idea.
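From what I have read, each MP3 frame starts with a 4-byte header that carries its own bitrate, so a VBR file would simply be a file whose frames have different bitrates. If that is right, a pre-made silent frame might not need to match the bitrate of its neighbours, only (I assume) the sample rate and channel mode. Here is a sketch of reading that header to find frame boundaries. It assumes MPEG-1 Layer III only, and the class and method names are hypothetical:

```csharp
using System;

public static class Mp3FrameHeader
{
    // Bitrates in kbps for MPEG-1 Layer III, indexed by the 4-bit bitrate field.
    // Index 0 ("free format") and index 15 (invalid) are treated as unsupported.
    private static readonly int[] BitratesKbps =
        { 0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, 0 };

    // Sample rates in Hz for MPEG-1, indexed by the 2-bit sample rate field.
    private static readonly int[] SampleRatesHz = { 44100, 48000, 32000, 0 };

    public static bool TryGetFrameLength(
        byte[] header, out int frameLength, out int bitrateKbps, out int sampleRate)
    {
        frameLength = 0;
        bitrateKbps = 0;
        sampleRate = 0;
        if (header == null || header.Length < 4) return false;

        // The header starts with 11 sync bits that are all set.
        if (header[0] != 0xFF || (header[1] & 0xE0) != 0xE0) return false;

        int version = (header[1] >> 3) & 0x03; // 3 means MPEG-1
        int layer = (header[1] >> 1) & 0x03;   // 1 means Layer III
        if (version != 3 || layer != 1) return false;

        bitrateKbps = BitratesKbps[(header[2] >> 4) & 0x0F];
        sampleRate = SampleRatesHz[(header[2] >> 2) & 0x03];
        int padding = (header[2] >> 1) & 0x01;
        if (bitrateKbps == 0 || sampleRate == 0) return false;

        // Frame length in bytes for MPEG-1 Layer III, header included.
        frameLength = 144 * bitrateKbps * 1000 / sampleRate + padding;
        return true;
    }
}
```

For the header bytes `FF FB 90 00` this reports 128 kbps, 44,100 Hz and a frame length of 417 bytes (418 when the padding bit is set). The duration of a frame does not depend on its bitrate, only its size in bytes does.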
In the end things will probably be much easier than I imagine, and I'm sure NAudio could help me accomplish this task, or at least a big part of it, with less effort.
Question
How can I insert a silence of a specific duration at a specific position of an MP3 file of undetermined format (it could be CBR, VBR or ABR, mono, stereo or joint stereo, 128 or 320 kbps, etc.) using C# or VB.NET, with or without the help of NAudio or another .NET library?
Requirements
NO THIRD-PARTY COMMAND-LINE APPLICATIONS, and no automating GUI apps either.
The file must be modified without audio loss, that is, without re-encoding it, in the same way as MP3DirectCut, for example, lets you insert silence or cut and paste without re-encoding.
Preferably, I would appreciate a reusable, general-purpose function like the one below, with a parameter list that I came up with to try to simplify things:
public static MemoryStream InsertSilence(
    Stream inputFile,          // pass the raw file stream data
    TimeSpan startPosition,    // eg: new TimeSpan(0, 2, 10)
    TimeSpan silenceDuration   // eg: TimeSpan.FromSeconds(10)
)
{
    // Do the work, save the data into a new stream and return it.
    return null;
}
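To make the intent of that prototype more concrete, here is a rough sketch of the frame-level splice I imagine. It is only a sketch under several assumptions: the input is MPEG-1 Layer III, the stream is already positioned at the first audio frame (no ID3v2 tag handling), and the caller supplies an already encoded silent frame through the extra `silentFrame` parameter, which is hypothetical and not part of my prototype. I also suspect that any Xing/VBR header frame would become stale, and that the bit reservoir could cause a glitch in the frame right after the inserted silence; this sketch ignores both.

```csharp
using System;
using System.IO;

public static class Mp3SilenceSplicer
{
    private const int SamplesPerFrame = 1152; // MPEG-1 Layer III
    private static readonly int[] BitratesKbps =
        { 0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, 0 };
    private static readonly int[] SampleRatesHz = { 44100, 48000, 32000, 0 };

    // Frame length in bytes, or 0 if the 4 bytes are not an MPEG-1 Layer III header.
    private static int FrameLength(byte[] h, out int sampleRate)
    {
        sampleRate = 0;
        if (h[0] != 0xFF || (h[1] & 0xFE) != 0xFA) return 0;
        int kbps = BitratesKbps[h[2] >> 4];
        sampleRate = SampleRatesHz[(h[2] >> 2) & 0x03];
        if (kbps == 0 || sampleRate == 0) return 0;
        return 144000 * kbps / sampleRate + ((h[2] >> 1) & 0x01);
    }

    // silentFrame is a hypothetical, pre-encoded silent MP3 frame supplied by the caller.
    public static MemoryStream InsertSilence(
        Stream input, TimeSpan startPosition, TimeSpan silenceDuration, byte[] silentFrame)
    {
        var output = new MemoryStream();
        var header = new byte[4];
        long samplesCopied = 0;
        int sampleRate = 0;
        bool inserted = false;

        while (true)
        {
            int read = ReadFully(input, header, 0, 4);
            int length = read == 4 ? FrameLength(header, out sampleRate) : 0;
            if (length == 0)
            {
                // End of audio (or unknown data such as a trailing tag): copy the rest as-is.
                output.Write(header, 0, read);
                input.CopyTo(output);
                break;
            }

            // Insert the silence at the first frame boundary at or after startPosition.
            if (!inserted && samplesCopied >= startPosition.TotalSeconds * sampleRate)
            {
                long count = (long)Math.Round(
                    silenceDuration.TotalSeconds * sampleRate / SamplesPerFrame);
                for (long i = 0; i < count; i++)
                    output.Write(silentFrame, 0, silentFrame.Length);
                inserted = true;
            }

            // Copy the original frame untouched, so nothing is re-encoded.
            var frame = new byte[length];
            Array.Copy(header, frame, 4);
            int got = ReadFully(input, frame, 4, length - 4);
            output.Write(frame, 0, 4 + got);
            samplesCopied += SamplesPerFrame;
        }

        output.Position = 0;
        return output;
    }

    // Stream.Read may return fewer bytes than requested, so loop until done or end of stream.
    private static int ReadFully(Stream s, byte[] buffer, int offset, int count)
    {
        int total = 0;
        while (total < count)
        {
            int n = s.Read(buffer, offset + total, count - total);
            if (n == 0) break;
            total += n;
        }
        return total;
    }
}
```

If the file is shorter than `startPosition`, this sketch inserts nothing and returns an unchanged copy. What I am still missing is how to obtain a valid silent frame for each channel mode and how to deal with the caveats listed above.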