
I need to encode a flac file with seektables. ffmpeg's flac encoder does not include seektables, so I need to use the flac CLI. I'm trying to make it possible to convert any arbitrary audio file to a seekable flac file by piping it first through ffmpeg and then into the flac encoder.

export const transcodeToFlac: AudioTranscoder<{}> = ({
  source,
  destination
}) => {
  return new Promise((resolve, reject) => {
    let totalSize = 0

    const { stdout: ffmpegOutput, stderr: ffmpegError } = spawn("ffmpeg", [
      "-i",
      source,
      "-f",
      "wav",
      "pipe:1"
    ])

    const { stdout: flacOutput, stdin: flacInput, stderr: flacError } = spawn(
      "flac",
      ["-"]
    )

    flacOutput.on("data", (buffer: Buffer) => {
      totalSize += buffer.byteLength
    })

    ffmpegError.on("data", error => {
      console.log(error.toString())
    })

    flacError.on("data", error => {
      console.log(error.toString())
    })

    //stream.on("error", reject)

    destination.on("finish", () => {
      resolve({
        mime: "audio/flac",
        size: totalSize,
        codec: "flac",
        bitdepth: 16,
        ext: "flac"
      })
    })

    ffmpegOutput.pipe(flacInput)
    flacOutput.pipe(destination)
  })
}

While this code runs, the resulting flac file is not correct. The source audio has a duration of 06:14, but the flac file reports a duration of 06:45:47. Encoding the flac manually from a file, without piping ffmpeg into it, works fine, but I cannot do that in a server environment where I need to use streams.

Here's what the flac encoder outputs when transcoding:

flac 1.3.2
Copyright (C) 2000-2009  Josh Coalson, 2011-2016  Xiph.Org Foundation
flac comes with ABSOLUTELY NO WARRANTY.  This is free software, and you are
welcome to redistribute it under certain conditions.  Type `flac' for details.

-: WARNING: skipping unknown chunk 'LIST' (use --keep-foreign-metadata to keep)
-: WARNING, cannot write back seekpoints when encoding to stdout
-: 0% complete, ratio=0.357
0% complete, ratio=0.432
0% complete, ratio=0.482
0% complete, ratio=0.527
0% complete, ratio=0.541
1% complete, ratio=0.554
1% complete, ratio=0.563
1% complete, ratio=0.571
size=   36297kB time=00:03:30.70 bitrate=1411.2kbits/s speed= 421x
1% complete, ratio=0.572
1% complete, ratio=0.570
1% complete, ratio=0.577
1% complete, ratio=0.583
1% complete, ratio=0.584
1% complete, ratio=0.590
1% complete, ratio=0.592
size=   64512kB time=00:06:14.49 bitrate=1411.2kbits/s speed= 421x
video:0kB audio:64512kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead:
0.000185%

-: WARNING: unexpected EOF; expected 1073741823 samples, got 16510976 samples
2% complete, ratio=0.579
Sebastian Olsen

1 Answer

First things first:

I need to encode a flac file with seektables,
-: WARNING, cannot write back seekpoints when encoding to stdout

From the output of flac -H:

A single INPUTFILE may be - for stdin.  No INPUTFILE implies stdin.  Use of
stdin implies -c (write to stdout).  Normally you should use:
   flac [options] -o outfilename  or  flac -d [options] -o outfilename
instead of:
   flac [options] > outfilename   or  flac -d [options] > outfilename
since the former allows flac to seek backwards to write the STREAMINFO or
WAVE/AIFF header contents when necessary.

Try flac - -o outfilename.flac instead of just flac -, so that flac writes to a real file it can seek back in.
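In terms of the Node.js code from the question, this suggestion could look roughly like the sketch below. It only covers writing to a file, not to a stream. The helper names `flacFileArgs` and `encodeToFlacFile` are made up for illustration, and the sketch assumes that `ffmpeg` and `flac` are on the PATH and that the output file does not already exist.

```typescript
import { spawn } from "child_process";

// Arguments for `flac - -o <outPath>`: read WAV from stdin, write to a real file.
function flacFileArgs(outPath: string): string[] {
  return ["-", "-o", outPath];
}

// Hypothetical helper: decode `source` with ffmpeg and let flac write `outPath`.
function encodeToFlacFile(source: string, outPath: string): Promise<void> {
  return new Promise((resolve, reject) => {
    const ffmpeg = spawn("ffmpeg", ["-nostdin", "-i", source, "-f", "wav", "pipe:1"]);
    const flac = spawn("flac", flacFileArgs(outPath));

    // Forward diagnostics so the stderr pipes never fill up and stall the tools.
    ffmpeg.stderr.on("data", chunk => process.stderr.write(chunk));
    flac.stderr.on("data", chunk => process.stderr.write(chunk));

    ffmpeg.on("error", reject);
    flac.on("error", reject);
    flac.stdin.on("error", reject);
    flac.on("close", code =>
      code === 0 ? resolve() : reject(new Error(`flac exited with code ${code}`))
    );

    // WAV data goes from ffmpeg's stdout straight into flac's stdin.
    ffmpeg.stdout.pipe(flac.stdin);
  });
}
```

A call such as `encodeToFlacFile("input.m4a", "out.flac")` would then resolve once flac has exited successfully and written the file.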

It seems to work for me, and the resulting audio has the correct length (my input file is different from yours, though):
$ rm out.flac; ffmpeg -nostdin -i ~/audio/asmr/ASMR\ _\ Camera\ Touching\ _\ No\ Mouthsounds\ _\ NO\ TALKING-lQlZJ82ebBk.m4a -f wav - | flac - -o out.flac


flac 1.3.2
Copyright (C) 2000-2009  Josh Coalson, 2011-2016  Xiph.Org Foundation
flac comes with ABSOLUTELY NO WARRANTY.  This is free software, and you are
welcome to redistribute it under certain conditions.  Type `flac' for details.

ffmpeg version n4.1.3 Copyright (c) 2000-2019 the FFmpeg developers
  built with gcc 8.2.1 (GCC) 20181127
  configuration: --prefix=/usr --disable-debug --disable-static --disable-stripping --enable-fontconfig --enable-gmp --enable-gnutls --enable-gpl --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libdrm --enable-libfreetype --enable-libfribidi --enable-libgsm --enable-libiec61883 --enable-libjack --enable-libmodplug --enable-libmp3lame --enable-libopencore_amrnb --enable-libopencore_amrwb --enable-libopenjpeg --enable-libopus --enable-libpulse --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libv4l2 --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxcb --enable-libxml2 --enable-libxvid --enable-nvdec --enable-nvenc --enable-omx --enable-shared --enable-version3
  libavutil      56. 22.100 / 56. 22.100
  libavcodec     58. 35.100 / 58. 35.100
  libavformat    58. 20.100 / 58. 20.100
  libavdevice    58.  5.100 / 58.  5.100
  libavfilter     7. 40.101 /  7. 40.101
  libswscale      5.  3.100 /  5.  3.100
  libswresample   3.  3.100 /  3.  3.100
  libpostproc    55.  3.100 / 55.  3.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/home/user/audio/asmr/ASMR _ Camera Touching _ No Mouthsounds _ NO TALKING-lQlZJ82ebBk.m4a':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2mp41
    encoder         : Lavf58.3.100
  Duration: 00:53:29.09, start: 0.000000, bitrate: 126 kb/s
    Stream #0:0(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 125 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
Stream mapping:
  Stream #0:0 -> #0:0 (aac (native) -> pcm_s16le (native))
Output #0, wav, to 'pipe:':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2mp41
    ISFT            : Lavf58.20.100
    Stream #0:0(und): Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, stereo, s16, 1411 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
      encoder         : Lavc58.35.100 pcm_s16le
-: WARNING: skipping unknown chunk 'LIST' (use --keep-foreign-metadata to keep)
13% complete, ratio=0.284size=  552816kB time=00:53:29.09 bitrate=1411.2kbits/s speed= 665x      
video:0kB audio:552816kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.000014%
-: WARNING: unexpected EOF; expected 1073741823 samples, got 141516800 samples
13% complete, ratio=0.283

$ ffprobe ./out.flac 
ffprobe version n4.1.3 Copyright (c) 2007-2019 the FFmpeg developers
  built with gcc 8.2.1 (GCC) 20181127
  configuration: --prefix=/usr --disable-debug --disable-static --disable-stripping --enable-fontconfig --enable-gmp --enable-gnutls --enable-gpl --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libdrm --enable-libfreetype --enable-libfribidi --enable-libgsm --enable-libiec61883 --enable-libjack --enable-libmodplug --enable-libmp3lame --enable-libopencore_amrnb --enable-libopencore_amrwb --enable-libopenjpeg --enable-libopus --enable-libpulse --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libv4l2 --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxcb --enable-libxml2 --enable-libxvid --enable-nvdec --enable-nvenc --enable-omx --enable-shared --enable-version3
  libavutil      56. 22.100 / 56. 22.100
  libavcodec     58. 35.100 / 58. 35.100
  libavformat    58. 20.100 / 58. 20.100
  libavdevice    58.  5.100 / 58.  5.100
  libavfilter     7. 40.101 /  7. 40.101
  libswscale      5.  3.100 /  5.  3.100
  libswresample   3.  3.100 /  3.  3.100
  libpostproc    55.  3.100 / 55.  3.100
Input #0, flac, from './out.flac':
  Duration: 00:53:29.09, start: 0.000000, bitrate: 399 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16

$ ffprobe ~/audio/asmr/ASMR\ _\ Camera\ Touching\ _\ No\ Mouthsounds\ _\ NO\ TALKING-lQlZJ82ebBk.m4a
ffprobe version n4.1.3 Copyright (c) 2007-2019 the FFmpeg developers
  built with gcc 8.2.1 (GCC) 20181127
  configuration: --prefix=/usr --disable-debug --disable-static --disable-stripping --enable-fontconfig --enable-gmp --enable-gnutls --enable-gpl --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libdrm --enable-libfreetype --enable-libfribidi --enable-libgsm --enable-libiec61883 --enable-libjack --enable-libmodplug --enable-libmp3lame --enable-libopencore_amrnb --enable-libopencore_amrwb --enable-libopenjpeg --enable-libopus --enable-libpulse --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libv4l2 --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxcb --enable-libxml2 --enable-libxvid --enable-nvdec --enable-nvenc --enable-omx --enable-shared --enable-version3
  libavutil      56. 22.100 / 56. 22.100
  libavcodec     58. 35.100 / 58. 35.100
  libavformat    58. 20.100 / 58. 20.100
  libavdevice    58.  5.100 / 58.  5.100
  libavfilter     7. 40.101 /  7. 40.101
  libswscale      5.  3.100 /  5.  3.100
  libswresample   3.  3.100 /  3.  3.100
  libpostproc    55.  3.100 / 55.  3.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/home/user/audio/asmr/ASMR _ Camera Touching _ No Mouthsounds _ NO TALKING-lQlZJ82ebBk.m4a':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2mp41
    encoder         : Lavf58.3.100
  Duration: 00:53:29.09, start: 0.000000, bitrate: 126 kb/s
    Stream #0:0(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 125 kb/s (default)
    Metadata:
      handler_name    : SoundHandler

$ vlc out.flac
#sounds ok, even at the end

This
-: WARNING: skipping unknown chunk 'LIST' (use --keep-foreign-metadata to keep)
can't be helped, because the suggested option is rejected for piped input:
ERROR: --keep-foreign-metadata cannot be used when encoding from stdin or to stdout

However, even though I'm getting
-: WARNING: unexpected EOF; expected 1073741823 samples, got 141516800 samples
which is why the progress stopped at 13% (141516800 * 100 / 1073741823 = 13.2%), the output seems fine and has the same length as the input!

UPDATE: This happens because ffmpeg cannot fill in the correct ChunkSize value of the output wav when it writes to a pipe instead of a file. ffmpeg initially writes four 0xFF bytes as a placeholder for ChunkSize. By the time the wav encoding is finished it knows the correct value, but it cannot seek back in the output pipe to update the field. When the output is a file, it can. The hexdump below shows that the size field of the data subchunk gets the same placeholder.
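The number 1073741823 in the warning is consistent with this. Assuming flac derives the expected sample count by dividing the 0xFFFFFFFF placeholder size by the 4 bytes per sample frame (16-bit stereo), the arithmetic can be sketched as follows. The function names are illustrative, and the assumption about how flac computes the value is mine, not taken from its documentation.

```typescript
// Assumption: expected frames = 32-bit size field / bytes per sample frame.
const PLACEHOLDER_SIZE = 0xffffffff; // written by ffmpeg when it cannot seek back
const BYTES_PER_FRAME = 2 * 2;       // 16-bit samples, 2 channels
const SAMPLE_RATE = 44100;           // Hz, as reported by ffmpeg above

function expectedFrames(sizeField: number, bytesPerFrame: number): number {
  return Math.floor(sizeField / bytesPerFrame);
}

// Format a frame count as hh:mm:ss, dropping fractional seconds.
function formatDuration(frames: number, sampleRate: number): string {
  const totalSeconds = Math.floor(frames / sampleRate);
  const pad = (n: number): string => (n < 10 ? "0" : "") + n;
  const hours = Math.floor(totalSeconds / 3600);
  const minutes = Math.floor((totalSeconds % 3600) / 60);
  const seconds = totalSeconds % 60;
  return `${pad(hours)}:${pad(minutes)}:${pad(seconds)}`;
}

const expectedCount = expectedFrames(PLACEHOLDER_SIZE, BYTES_PER_FRAME);
console.log(expectedCount);                                  // 1073741823
console.log(formatDuration(expectedCount, SAMPLE_RATE));     // 06:45:47
console.log(((141516800 * 100) / expectedCount).toFixed(1)); // 13.2
```

The 06:45:47 result matches the bogus duration reported in the question, which suggests that a flac file written to stdout keeps the placeholder-derived sample count in its header.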

The canonical WAVE format starts with the RIFF header:

0         4   ChunkID          Contains the letters "RIFF" in ASCII form
                               (0x52494646 big-endian form).
4         4   ChunkSize        36 + SubChunk2Size, or more precisely:
                               4 + (8 + SubChunk1Size) + (8 + SubChunk2Size)
                               This is the size of the rest of the chunk 
                               following this number.  This is the size of the 
                               entire file in bytes minus 8 bytes for the
                               two fields not included in this count:
                               ChunkID and ChunkSize.
8         4   Format           Contains the letters "WAVE"
                               (0x57415645 big-endian form).

Here's how ffmpeg wav output differs when the output is to a file, compared to when it is to a pipe:

(note: don't run this; it will take a few minutes at 100% CPU usage on one core, even on a fast processor)

$ colordiff -up <(hexdump -C toafile.wav) <(hexdump -C piped.wav)
--- /dev/fd/63  2019-05-19 21:28:20.621944056 +0200
+++ /dev/fd/62  2019-05-19 21:28:20.621944056 +0200
@@ -1,8 +1,8 @@
-00000000  52 49 46 46 46 c0 bd 21  57 41 56 45 66 6d 74 20  |RIFFF..!WAVEfmt |
+00000000  52 49 46 46 ff ff ff ff  57 41 56 45 66 6d 74 20  |RIFF....WAVEfmt |
 00000010  10 00 00 00 01 00 02 00  44 ac 00 00 10 b1 02 00  |........D.......|
 00000020  04 00 10 00 4c 49 53 54  1a 00 00 00 49 4e 46 4f  |....LIST....INFO|
 00000030  49 53 46 54 0e 00 00 00  4c 61 76 66 35 38 2e 32  |ISFT....Lavf58.2|
-00000040  37 2e 31 30 33 00 64 61  74 61 00 c0 bd 21 00 00  |7.103.data...!..|
+00000040  37 2e 31 30 33 00 64 61  74 61 ff ff ff ff 00 00  |7.103.data......|
 00000050  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
 *
 00003440  00 00 01 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
-----------
  1. The piped wav (generated by ffmpeg when the output was a pipe) will thus cause flac to give you 1 extra warning:

a. Generate piped wav: ffmpeg -nostdin -i ~/audio/asmr/ASMR\ _\ Camera\ Touching\ _\ No\ Mouthsounds\ _\ NO\ TALKING-lQlZJ82ebBk.m4a -f wav - > piped.wav

b. Pipe that to flac:

cat piped.wav | flac - -o out2.flac

flac 1.3.2
Copyright (C) 2000-2009  Josh Coalson, 2011-2016  Xiph.Org Foundation
flac comes with ABSOLUTELY NO WARRANTY.  This is free software, and you are
welcome to redistribute it under certain conditions.  Type `flac' for details.

-: WARNING: skipping unknown chunk 'LIST' (use --keep-foreign-metadata to keep)
-: 13% complete, ratio=0.284-: WARNING: unexpected EOF; expected 1073741823 samples, got 141516800 samples
13% complete, ratio=0.283

flac doesn't know the input's correct size (ChunkSize is 0xFFFFFFFF).

  2. But the wav file (generated by ffmpeg when the output was a file) will be alright:

a. Generate wav into a file: ffmpeg -nostdin -i ~/audio/asmr/ASMR\ _\ Camera\ Touching\ _\ No\ Mouthsounds\ _\ NO\ TALKING-lQlZJ82ebBk.m4a -f wav toafile.wav

b. Pipe that to flac:

$ rm out2.flac; cat toafile.wav | flac - -o out2.flac

flac 1.3.2
Copyright (C) 2000-2009  Josh Coalson, 2011-2016  Xiph.Org Foundation
flac comes with ABSOLUTELY NO WARRANTY.  This is free software, and you are
welcome to redistribute it under certain conditions.  Type `flac' for details.

-: WARNING: skipping unknown chunk 'LIST' (use --keep-foreign-metadata to keep)
-: wrote 160381819 bytes, ratio=0.283

This is because its ChunkSize value is set correctly. (ChunkSize is 0x21BDC046 = 566,083,654, which is 8 bytes less than the total size of toafile.wav, 566,083,662 bytes.)
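To check these numbers against the header layout and the hexdump above, here is a minimal TypeScript sketch that decodes the first 12 bytes of each file. `readRiffHeader` and the `RiffHeader` type are illustrative names, not part of any library. The byte values are copied from the two hexdump lines at offset 0.

```typescript
interface RiffHeader {
  chunkId: string;            // should be "RIFF"
  chunkSize: number;          // little-endian uint32 at offset 4
  format: string;             // should be "WAVE"
  sizeIsPlaceholder: boolean; // true when ChunkSize is 0xFFFFFFFF
}

function readRiffHeader(bytes: Uint8Array): RiffHeader {
  if (bytes.length < 12) throw new Error("need at least 12 bytes");
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const ascii = (start: number): string =>
    String.fromCharCode(...Array.from(bytes.subarray(start, start + 4)));
  const chunkSize = view.getUint32(4, true); // true = little-endian
  return {
    chunkId: ascii(0),
    chunkSize,
    format: ascii(8),
    sizeIsPlaceholder: chunkSize === 0xffffffff,
  };
}

// First 12 bytes of toafile.wav and piped.wav, taken from the hexdump.
const toFile = Uint8Array.from([
  0x52, 0x49, 0x46, 0x46, 0x46, 0xc0, 0xbd, 0x21, 0x57, 0x41, 0x56, 0x45,
]);
const piped = Uint8Array.from([
  0x52, 0x49, 0x46, 0x46, 0xff, 0xff, 0xff, 0xff, 0x57, 0x41, 0x56, 0x45,
]);

const fileHeader = readRiffHeader(toFile);
console.log(fileHeader.chunkSize);                    // 566083654
console.log(fileHeader.chunkSize + 8);                // 566083662, the file size
console.log(readRiffHeader(piped).sizeIsPlaceholder); // true
```

A check like `sizeIsPlaceholder` is one way a consumer could tell that a wav arrived through a pipe and that its declared size should not be trusted.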

  • The problem is that I still need to utilize streams when piping from flac. In my code, the destination stream could be piping to an external remote storage solution like GCS. – Sebastian Olsen May 19 '19 at 14:53
  • Ah, my mistake. Having read your question, I mistakenly understood that you only wanted to pipe from ffmpeg to flac. The problem is then, as `flac -H` says, that flac cannot seek back into the stream (because it's a stream) to write the seekpoints that you said you need. So I don't see how you can get away with flac writing to a pipe. Maybe try modifying flac to write to a temporary file (maybe with O_TMPFILE so nothing else can see it; I've never tried it, but see `man 2 open`), then start piping that file to the stream only after the `flac` process has exited. Seems like a bad solution, imho :( –  May 19 '19 at 18:29
  • That is actually what I am doing right now, but I hate it. I would rather not utilize the disk as this is supposed to be a scalable solution. I guess I'm just gonna have to suck it up. Though, something strange I noticed is that if I stream a wav file directly to flac, without ffmpeg, it seems to work. – Sebastian Olsen May 19 '19 at 19:04
  • Looks like `ffmpeg` cannot seek back either (since it's a stream), so you get 4x2=8 bytes that are 0xFF if you have ffmpeg write the wav to a pipe rather than to a file. 4 bytes are [ChunkSize](https://web.archive.org/web/20190519191353/http://soundfile.sapp.org/doc/WaveFormat/) and the other 4 seem to be part of the Lavf58.2.7 LIST header (I don't know), but in the file version 3 of them are the same as in the first 4. Maybe this info is enough for `flac` to not have to seek back, because now it knows the sizes (chunk and file) it needs to calculate the seekpoints and so can write normally without seek-backs, thus even to a stream. –  May 19 '19 at 19:22
  • I updated the answer to include the above information, but I guess you're still stuck with having to output to a temporary file, either from ffmpeg or from flac. The latter seems to take less space (flac vs wav), while the former seems to avoid that _seemingly dangerous_ flac warning with `unexpected EOF`. (note: this comment can be deleted) –  May 19 '19 at 20:11
  • I think your answer is well detailed and even if I didn't get a solution, I did get an answer to why. I'll accept it. Thanks! – Sebastian Olsen May 19 '19 at 22:55
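The temporary-file workaround discussed in these comments could be sketched in TypeScript as below. This is only one possible shape for it, not the asker's actual code. `tempFlacPath` and `streamAndRemove` are hypothetical helpers, and the sketch assumes that flac has already been run with `-o` pointing at the temporary path and has exited before `streamAndRemove` is called.

```typescript
import { randomBytes } from "crypto";
import { createReadStream, promises as fs } from "fs";
import { tmpdir } from "os";
import { join } from "path";
import { pipeline, Writable } from "stream";

// Hypothetical helper: a hard-to-guess temporary output path for flac.
function tempFlacPath(): string {
  return join(tmpdir(), `transcode-${randomBytes(8).toString("hex")}.flac`);
}

// Hypothetical helper: stream a finished file to `destination`, then delete it.
function streamAndRemove(path: string, destination: Writable): Promise<void> {
  return new Promise((resolve, reject) => {
    pipeline(createReadStream(path), destination, err => {
      // Remove the temporary file whether or not the transfer succeeded.
      fs.unlink(path).then(
        () => (err ? reject(err) : resolve()),
        unlinkErr => reject(err || unlinkErr)
      );
    });
  });
}
```

Because the file is only read after flac has finished, the seektable and the sample count in the header are already final when the first byte reaches `destination`. The cost is the disk usage that the asker wanted to avoid.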