I have a voice recording program that records sound from the microphone, splits it into single-second WAV fragments, and then converts every WAV to MP3.

I get a normal melody when I join all the WAV files together, but a crappy melody when I join all the MP3 files together.

What's wrong? I thought WAV -> MP3 conversion should not add or remove any chunks from the files. This is the code that creates the single-second fragments in WAV and MP3 versions:

    public void CreateWavAndMp3(string wav_path, string mp3_path, WaveFormat recordingFormat)
    {
        // Concatenate the raw PCM chunks (stored as List<byte[]>) into one buffer.
        List<byte> complete_chunk = new List<byte>();
        for (int i = 0; i < this.Chunks.Count; i++)
            complete_chunk.AddRange(this.Chunks[i]);

        // Write the WAV file; "using" closes the writer even if nothing is written.
        using (WaveFileWriter wav_writer = new WaveFileWriter(wav_path, recordingFormat))
        {
            long maxFileLength = recordingFormat.AverageBytesPerSecond * 60;
            var toWrite = (int)Math.Min(maxFileLength - wav_writer.Length, complete_chunk.Count);

            if (toWrite > 0)
                wav_writer.Write(complete_chunk.ToArray(), 0, toWrite); // write at most toWrite bytes
        }

        // Convert the finished WAV file to MP3
        WaveLib.WaveStream InStr = new WaveLib.WaveStream(wav_path);
        Yeti.MMedia.Mp3.Mp3Writer mp3Writer;
        Yeti.MMedia.Mp3.Mp3WriterConfig m_Config = new Yeti.MMedia.Mp3.Mp3WriterConfig(InStr.Format);

        FileStream Mp3FS = new FileStream(mp3_path, FileMode.Create, FileAccess.Write);
        mp3Writer = new Yeti.MMedia.Mp3.Mp3Writer(Mp3FS, m_Config);

        byte[] mp3buff = new byte[mp3Writer.OptimalBufferSize];
        int read = 0;

        while ((read = InStr.Read(mp3buff, 0, mp3buff.Length)) > 0)
            mp3Writer.Write(mp3buff, 0, read);

        InStr.Dispose();
        mp3Writer.Dispose();
    }

Test sound files: https://www.dropbox.com/s/e43hh4y3oli13f4/livestream.7z?dl=0 so you can hear it too. Try joining all the files in Movie Maker or a similar tool.

Kosmo零
  • I think the MP3 writer puts padding at the end after each `Write` method call. It may even put leading space at the start of the first chunk. In this kind of situation I think it's better to join all the wave chunks into one and then convert it all together. – Wutipong Wongsakuldej Dec 29 '15 at 12:55
  • @TaW - I don't know where I split... I have an `OnDataAvailable` event that gives me some raw bytes. I add these bytes to a `Single-second` class and check whether a second has already passed. If it has, I write the next bytes into a new `Single-second` class; the method you saw is this class's method for creating the WAV and MP3. – Kosmo零 Dec 29 '15 at 12:57
  • @WutipongWongsakuldej - Sadly, splitting into fragments is the main idea of how it all works. I have a livestreaming app that allows an admin to broadcast his voice to all connected browsers. It receives sound fragments and plays them one after another. – Kosmo零 Dec 29 '15 at 12:59
  • So you are receiving multiple waves and recording them to an MP3 file without knowing when it will end, right? I think if you can find a frame size at which the padding does not occur (assuming there actually is padding), then you are good to go. You could record them into separate files and then join them together after the session ends. – Wutipong Wongsakuldej Dec 29 '15 at 13:17
  • @WutipongWongsakuldej - You are right... Is there any way to determine what the padding is without brute force? – Kosmo零 Dec 29 '15 at 13:38
  • I think that's not very easy to do. Sorry, I can't think of any way to do that. Off-topic: my browser reported that the link is not safe. Can you find another place to put the file? – Wutipong Wongsakuldej Dec 29 '15 at 13:53
  • @WutipongWongsakuldej - Weird, but OK, I will replace it with a Dropbox link. – Kosmo零 Dec 29 '15 at 13:59

1 Answer

You're experiencing a problem related to the way MP3 does its encoding. Part of the codec itself adds padding at the start and end of every file. It is not avoidable. You will need to use a different format if you want to join them end to end.

Some music players get around this by calculating how much silence is added. But even this can vary depending on the codec. If you want to dive into the technical details, check out section 2 of this document: http://lame.sourceforge.net/tech-FAQ.txt

(tl;dr: that document says "576 samples", and 16-bit stereo is 4 bytes per sample.)
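To put that figure in concrete terms, here is a minimal sketch of the arithmetic. It takes the "576 samples" number from the document above at face value; the class and method names are made up for illustration, and the sample rate you pass in is whatever your recording actually uses.

```csharp
using System;

// Hypothetical helper, not part of NAudio or the Yeti MP3 library.
public static class Mp3PaddingEstimate
{
    // Bytes occupied by a number of PCM samples, where one "sample" covers
    // all channels (so 16-bit stereo is 4 bytes per sample).
    public static int BytesForSamples(int samples, int channels, int bitsPerSample)
    {
        return samples * channels * (bitsPerSample / 8);
    }

    // Duration of a number of samples in milliseconds at the given sample rate.
    public static double MillisecondsForSamples(int samples, int sampleRate)
    {
        return samples * 1000.0 / sampleRate;
    }
}
```

For example, `Mp3PaddingEstimate.BytesForSamples(576, 2, 16)` gives 2304 bytes, and at an assumed 44100 Hz sample rate `MillisecondsForSamples(576, 44100)` comes out at roughly 13 ms per fragment. The real amount can differ between encoders and decoders, so treat this as a starting estimate.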

One other lossy format that does not exhibit this problem is Ogg Vorbis. "Vorbis" is a NuGet package that is said to support this format.

A.Konzel
  • Ugh, MP3 is the only codec that is supported by the IE HTML5 `audio` tag, but thanks for the info. I will think about what to do with the padding. – Kosmo零 Dec 29 '15 at 19:30
  • Do you know how to calculate how much silence is added? – Kosmo零 Dec 29 '15 at 20:38
  • It can differ depending on the codec. I'm guessing you won't be able to decode it and look at the data yourself programmatically (if you're using HTML5), which means all you can do is make an educated guess. If you have a decent audio editor (Audacity, Audition, etc.) and you know the MP3s are all from the same codec, you can open up the resulting MP3 in the audio editor and zoom in enough to measure the amount of silence before the sound starts. – A.Konzel Dec 29 '15 at 20:44
  • Alternatively, if you need IE support, and don't mind making it IE9 minimum, you can provide both AAC and OGG formats. They are lossy formats, but should be sufficient. Between these two formats, you should have complete browser support. – A.Konzel Dec 29 '15 at 20:48
  • Thank you. Audacity is a nice solution. I will use a timer in JS and start playing the next fragment proactively. – Kosmo零 Dec 29 '15 at 21:01
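If the fragments can be decoded back to PCM on the server side, the measurement done by eye in an audio editor could also be approximated in code. The sketch below is only an illustration under stated assumptions: the input is already-decoded 16-bit little-endian PCM, the class and method names are hypothetical, and the threshold is a guess you would tune, since decoded padding is not necessarily exact digital silence.

```csharp
using System;

// Hypothetical helper; it works on raw decoded PCM bytes and uses no audio library.
public static class LeadingSilence
{
    // Counts the sample frames at the start of 16-bit little-endian PCM data
    // whose absolute amplitude is at or below the threshold on every channel.
    public static int CountSilentFrames(byte[] pcm, int channels, int threshold)
    {
        int bytesPerFrame = channels * 2;
        int frameCount = pcm.Length / bytesPerFrame;

        for (int frame = 0; frame < frameCount; frame++)
        {
            for (int ch = 0; ch < channels; ch++)
            {
                int offset = frame * bytesPerFrame + ch * 2;
                // Assemble the sample by hand so the result does not depend on CPU endianness.
                short sample = (short)(pcm[offset] | (pcm[offset + 1] << 8));
                if (Math.Abs((int)sample) > threshold)
                    return frame; // first frame that is louder than the threshold
            }
        }
        return frameCount; // the whole buffer is at or below the threshold
    }
}
```

Dividing the returned frame count by the sample rate gives the lead-in duration in seconds, which could then feed the timer offset used when scheduling the next fragment.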