
We have a device that creates video files in MP4 file format containing H.264 video data.

Now we notice that within the first AVCC chunk, after the SPS, there are 4 null bytes (00 00 00 00). (I know that the SPS is technically not needed in the video data, but it is not disallowed either.) Within the stsd header, in the AVCConfigurationBox, we also see these extra null bytes.

The question is: are these technically allowed by the standard? We have some Python code that checks this and complains. So do we need to change the code in the device, or the checking code?

In an Annex-B byte-stream, they would be allowed, but not here, I think.

They can't be part of a NALU, or they should have been emulation-prevented into 00 00 03 00 00.
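To illustrate the emulation-prevention argument, here is a minimal Python sketch of the escaping rule as I understand it: inside a NALU, whenever two zero bytes are followed by a byte with value 0 to 3, a 0x03 byte is inserted first. The function name `add_emulation_prevention` is made up for this example and is not taken from the device code or from the checking code.

```python
def add_emulation_prevention(rbsp: bytes) -> bytes:
    """Escape a raw payload the way an H.264 encoder would (sketch).

    Hypothetical helper for illustration only: after two consecutive
    zero bytes, a following byte <= 0x03 is preceded by an inserted
    0x03 emulation-prevention byte.
    """
    out = bytearray()
    zeros = 0  # number of consecutive zero bytes written so far
    for b in rbsp:
        if zeros >= 2 and b <= 3:
            out.append(3)  # emulation_prevention_three_byte
            zeros = 0
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
    return bytes(out)


# Four raw zero bytes cannot survive unescaped inside a NALU payload.
escaped = add_emulation_prevention(bytes(4))
print(escaped.hex(" "))
```

So four literal zero bytes inside the SPS payload would show up in the file with a 03 in the middle, which is not what we see.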

ffmpeg and vlc don't complain about it at all, but they might just be more lenient so that as many video files as possible can be played.

EDIT: [image not preserved]

unlord

2 Answers


"We have a device that creates video files in MP4 file format containing H.264 video data."

Two possibilities:

Least likely...
(1) Those null bytes appear to be padding to make everything 32-bit aligned. This way you can read through the SPS in 4-byte chunks (using some readInt() command or similar).

For your length of 52 bytes (0x34) you would get 13 integers/chunks.

PS: Bytes can also be zero-padded so that the next NALU starts on a new line/row.
(e.g. this is obvious when the data is displayed in a traditional "16 bytes per row" hex view.)

Most likely...
(2) Those 4 zero-bytes are valid bytes of your SPS since the NALU size encapsulates them within SPS data. This would answer your question of: "Are these technically allowed by the standard?" as Yes since they are part of the actual SPS data itself. You unknowingly confirmed this with your "Within the stsd header, in the AVCConfigurationBox, we also see these extra null bytes." ...because they are supposed to be there.

...

"In an Annex-B byte-stream, they would be allowed, but not here, I think."

Note: SPS is known as Codec Private Data and can be stored as either Annex-B or AVCC format regardless of the MP4's own format (eg: they can be mixed together in some MP4 files).

...

"We have some Python code checking this and complaining. So do we need to change the code in the device, or the checking code?"

I would leave the MP4 bytes as they are (from the device?) and just fix the checking side. For example, what does it actually complain about? If the size is 52 bytes, then it must read the following 52 bytes as SPS content. It can then confirm that a new NALU follows by skipping 4 bytes (the "length size" bytes) and checking whether the next byte is 0x06 for an SEI, 0x65 for a keyframe, or 0x41 for a P/B frame.
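The length-prefixed walk described above could be sketched roughly as follows. This is only an illustration: `iter_avcc_nalus` is a hypothetical name, and the 4-byte length size is an assumption (the real value comes from the AVCC box in the stsd). The NAL type is taken from the low 5 bits of the first byte, which is why 0x65 reads as type 5 and 0x41 as type 1.

```python
def iter_avcc_nalus(sample: bytes, length_size: int = 4):
    """Yield (nal_unit_type, nalu_bytes) for each length-prefixed NALU.

    Hypothetical helper; length_size=4 is an assumption and should be
    read from the file's AVCC box in real code.
    """
    pos = 0
    while pos + length_size <= len(sample):
        # Big-endian size field that precedes every NALU in AVCC format.
        size = int.from_bytes(sample[pos:pos + length_size], "big")
        pos += length_size
        nalu = sample[pos:pos + size]
        if size == 0 or len(nalu) < size:
            raise ValueError(f"bad NALU size {size} at offset {pos}")
        # Low 5 bits of the first byte: 1 = non-IDR slice, 5 = IDR slice,
        # 6 = SEI, 7 = SPS, 8 = PPS.
        yield nalu[0] & 0x1F, nalu
        pos += size
```

Feeding it the bytes of the first sample would list the NAL types in order, which makes it easy to see whether the declared sizes line up with valid NALU headers.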

In your image: It looks like you have 52 bytes of SPS, then 4 bytes of PPS and then 36 bytes of SEI.

VC.One
  • Those bytes can't be part of the SPS: if they were, they would be emulation-prevented to 00 00 03 00 00. Furthermore, a NAL always ends with a 1-stop-bit, followed by zeroes to make it byte-aligned. So, the SPS ends at the 0A. – unlord Oct 19 '22 at 08:43
  • The Python code complains about the emulation-prevention and extra data after the rbsp_trailing_bits. Which is legit in my opinion, but not everyone agrees, here. – unlord Oct 19 '22 at 08:49
  • _"Those bytes can't be part of the SPS..."_ I hear you but please stop looking for what the textbook says... The reality is that your MP4 has an SPS that ends with 4 zero bytes. It's unusual but not impossible. **(1)** Your video starts as H264 before being packed into an MP4 container, where the AVCC box has a **direct copy** of the H264's **original SPS**. Your MP4's AVCC **also has** 4 zero bytes because they existed and were copied from the incoming H264's own SPS bytes. – VC.One Oct 19 '22 at 10:42
  • **(2)** It's not needed to have MP4 frames/samples that are type SPS, since all frames will reference the one `STSD`'s own AVCC content for needed SPS info. Anyways you added it and now you can see that those SPS bytes end with 4 zero bytes because **those zeroes are part of the SPS**, no matter where you put the SPS... in H264? in AVCC? in MDAT? **(3)** I'm not sure why your Python code is checking for `rbsp_trailing_bits`, unless it actually tries to decode the SPS bits themselves?... Is the `STSZ` showing a size for this SPS Nalu as `0x38` (56 bytes)? If yes, then I can't see a problem. – VC.One Oct 19 '22 at 11:07
  • PS: Meant to say _"Can't see a problem with getting correct playback"_. The video decoding side should only worry about the size of NALU and then read that number of bytes. – VC.One Oct 19 '22 at 11:12
  • _I'm not sure why your Python code is checking for rbsp_trailing_bits, unless it actually tries to decode the SPS bits themselves?_ yes, exactly and it finds extraneous data. If you shouldn't follow the textbook, why do you have a textbook in the first place... The video data comes from a hardware encoder that spits out the data as AnnexB, which then gets sent to ffmpeg, so the problem might very well be there and that's what I'm trying to verify. – unlord Oct 19 '22 at 12:35
  • You can check your encoder's output with `ffmpeg -i input -codec copy output.h264` then open the output in a hex editor. Does the SPS have four trailing zero bytes? If yes, that explains why FFmpeg writes them also into the MP4. – VC.One Oct 19 '22 at 16:34

It turned out some part of the code was doing 64-bit alignment for some unneeded legacy reasons, before passing it on to the ffmpeg libs.

So the null bytes did not belong there, and were certainly not part of the SPS.
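For anyone hitting the same thing, the arithmetic fits: a 52-byte SPS padded to a multiple of 8 bytes becomes 56 bytes, i.e. exactly 4 extra zeros. Below is a small Python sketch of that effect and of a check that flags it. Both function names are made up for this example and are not the actual device code; the check relies on my reading of the H.264 spec that the last byte of a NAL unit must not be 0x00.

```python
def pad_to_alignment(data: bytes, alignment: int = 8) -> bytes:
    """Zero-pad data to a multiple of `alignment` bytes (8 = 64-bit).

    Hypothetical stand-in for the legacy alignment step.
    """
    padding = (-len(data)) % alignment
    return data + bytes(padding)


def count_trailing_zero_bytes(nalu: bytes) -> int:
    """Count 0x00 bytes at the end of a NALU; a clean NALU should give 0."""
    return len(nalu) - len(nalu.rstrip(b"\x00"))


# Dummy 52-byte "SPS" ending in 0x0A, as in the question (contents invented).
sps = b"\x67" + b"\x11" * 50 + b"\x0a"
padded = pad_to_alignment(sps)
print(len(sps), len(padded), count_trailing_zero_bytes(padded))
```

Stripping the trailing zeros before handing the buffer to the ffmpeg libs (or simply dropping the alignment step) gets rid of the extra bytes.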

unlord