
Background: I have encoded a raw H.264 file using ffmpeg. I'm trying to create my own container, similar to how Smooth Streaming works with fragmented MP4 containers. I'm not happy with the security of Smooth Streaming, though, since anyone with the appropriate authentication can rip a complete file from IIS.

Problem: I have raw H.264 playback "kinda" working using MediaStreamSource within Silverlight with SSL enabled, but I can't get the timestamps right for the chunks that I'm sending from the server to the MediaStreamSource in the Silverlight client. There is a delay between the H.264 data chunks, which I split at SPS NAL units. I saw this question for getting duration. I'm wondering if there is an easy way to count the frames in an H.264 stream and get a duration, so that I can relay an accurate timestamp to the MediaStreamSource. It would be great if someone could point me toward an open source frame counter, give me some pseudocode for parsing out frames (maybe some hex parsing tips), or share experience with this exact issue. Any help would be greatly appreciated. Thanks in advance.

shibbybird

2 Answers


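I dug through the ISO/IEC 14496-10 specification and found some byte patterns for locating frames in a raw H.264 stream. Each one is the start code 0x000001 followed by a NAL unit header byte: 0x41 and 0x01 mark non-IDR slices, and 0x65 marks an IDR slice.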

0x00000141, 0x00000101, 0x00000165

If you go through your stream and count these patterns, and you're encoding with ffmpeg and libx264, this should give you a pretty solid frame count. (Someone please correct me if I'm wrong.) So if you have the total duration of the H.264 video and the FPS, which you should be able to get easily from ffmpeg, you can use the number of frames counted in each chunk of data passed to the MediaStreamSource to compute a very accurate timestamp for each sample. Hope this helps someone, because it was really frustrating me a couple of nights ago when my playback was all choppy.
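A minimal sketch of the counting idea in Python, not the exact code I used. Instead of matching the three literal header bytes, it masks out nal_unit_type (the low 5 bits of the byte after the start code), so slices with other nal_ref_idc values are counted too. The function names are made up for illustration, and the timestamp helper assumes a constant frame rate, one slice per frame, and timestamps expressed in 100-nanosecond units.

```python
START_CODE = b"\x00\x00\x01"


def count_coded_slices(data: bytes) -> int:
    """Count NAL units that carry a coded slice (nal_unit_type 1 or 5)."""
    count = 0
    pos = data.find(START_CODE)
    while pos != -1 and pos + 3 < len(data):
        nal_unit_type = data[pos + 3] & 0x1F  # low 5 bits of the NAL header
        if nal_unit_type in (1, 5):           # 1 = non-IDR slice, 5 = IDR slice
            count += 1
        pos = data.find(START_CODE, pos + 3)
    return count


def chunk_timestamp_100ns(frames_before_chunk: int, fps: float) -> int:
    """Start time of a chunk in 100 ns units, assuming a constant frame rate."""
    return round(frames_before_chunk * 10_000_000 / fps)
```

For example, a chunk that starts after 25 frames of a 25 fps stream would get a timestamp of 10,000,000 (one second). If the encoder writes more than one slice per frame, this overcounts.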

Edit

Having tested my playback feature in DirectShow, I have noticed that this is not perfect and only works for simply encoded H.264 streams produced with ffmpeg. H.264 allows variable frame rates and bitrates. Although the video runs pretty smoothly, a discerning eye can see that the timing is a bit off during more complex sequences. I think this is a fine method for a crude streaming video player, especially if keyframes are frequent. I thought it would be fair to add this before I marked this as answered.

shibbybird

This is actually a bit of a rabbit hole. Start with ISO 14496 part 10 and go to section 7.3 for syntax.

The first approximation is to combine the field rate from vui_parameters() (time_scale / num_units_in_tick; one tick normally corresponds to one field, so the frame rate is half that) with the number of slice_header()s.
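As a rough illustration of that arithmetic, here is a small Python sketch. It assumes you have already parsed time_scale and num_units_in_tick out of the SPS, that the stream has a fixed frame rate, and that one tick is one field period. The function names are hypothetical.

```python
from fractions import Fraction


def nominal_frame_rate(time_scale: int, num_units_in_tick: int) -> Fraction:
    """Frames per second, assuming two ticks (fields) per frame."""
    if time_scale <= 0 or num_units_in_tick <= 0:
        raise ValueError("VUI timing values must be positive")
    return Fraction(time_scale, 2 * num_units_in_tick)


def estimated_duration_seconds(picture_count: int, time_scale: int,
                               num_units_in_tick: int) -> float:
    """Duration estimate from a picture count and the VUI timing fields."""
    return float(picture_count / nominal_frame_rate(time_scale, num_units_in_tick))
```

With time_scale = 50 and num_units_in_tick = 1 this gives 25 fps, so 250 pictures come out as 10 seconds.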

This breaks down if you're dealing with HD content and your encoder emits multiple slice_header()s per picture; then you have to count only the slices with first_mb_in_slice == 0.
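One way to sketch that check in Python, under some assumptions: the input is an Annex B byte stream with start codes, and slices are not data-partitioned. first_mb_in_slice is the first syntax element of the slice header and is coded as ue(v), so a value of 0 is the single bit 1. That means the test reduces to looking at the top bit of the byte right after the NAL header. The function name is made up for illustration.

```python
START_CODE = b"\x00\x00\x01"


def count_pictures(data: bytes) -> int:
    """Count slice NAL units whose first_mb_in_slice is 0."""
    pictures = 0
    pos = data.find(START_CODE)
    while pos != -1 and pos + 4 < len(data):
        nal_unit_type = data[pos + 3] & 0x1F
        # ue(v) codes the value 0 as a lone '1' bit, so first_mb_in_slice == 0
        # exactly when the first payload bit is set.
        starts_picture = bool(data[pos + 4] & 0x80)
        if nal_unit_type in (1, 5) and starts_picture:
            pictures += 1
        pos = data.find(START_CODE, pos + 3)
    return pictures
```

Note that for field-coded content each field has its own slice with first_mb_in_slice == 0, so this counts fields rather than frames in that case.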

You'll also have to pay attention to frame_mbs_only_flag and field_pic_flag, since a stream that is not frames-only can code each field as its own picture.

The other hairball is Table D-1, which interprets the pic_struct field of the pic_timing SEI message. This covers things like field repetition (top-bottom-top or bottom-top-bottom), frame doubling, and frame tripling.
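To show how pic_struct feeds into timing, here is a hedged Python sketch. The tick counts below follow my reading of the standard (the DeltaTfiDivisor column in Annex E, which parallels the pic_struct values from Table D-1), so please check them against the spec before relying on them. It assumes you have already extracted one pic_struct value per picture, and the names are hypothetical.

```python
from fractions import Fraction

# Display duration of one picture, in clock ticks, per pic_struct value:
# 0 frame, 1 top field, 2 bottom field, 3 top+bottom, 4 bottom+top,
# 5 top+bottom+top, 6 bottom+top+bottom, 7 frame doubling, 8 frame tripling.
TICKS_PER_PIC_STRUCT = {0: 2, 1: 1, 2: 1, 3: 2, 4: 2, 5: 3, 6: 3, 7: 4, 8: 6}


def stream_duration_seconds(pic_structs, time_scale: int,
                            num_units_in_tick: int) -> Fraction:
    """Sum the per-picture tick counts and convert ticks to seconds."""
    ticks = sum(TICKS_PER_PIC_STRUCT[p] for p in pic_structs)
    return Fraction(ticks * num_units_in_tick, time_scale)
```

For instance, a 3:2 pulldown pattern such as [5, 4, 6, 3] spans 10 ticks for 4 pictures, which a plain frame count would get wrong.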

If you have a transport stream, you can make an end run around this by checking the DTS values on the PES headers (ISO 13818 part 1) for the first and last access unit.
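A small Python sketch of the DTS route, assuming you have already located the 5-byte DTS field in the first and last PES headers (finding it means walking the TS packets and PES header flags, which is not shown). PES timestamps are 33-bit values on a 90 kHz clock, split across the 5 bytes with marker bits in between. The function names are made up for illustration.

```python
PES_CLOCK_HZ = 90_000
TS_MODULO = 1 << 33  # PTS/DTS values are 33 bits wide and wrap around


def decode_pes_timestamp(b: bytes) -> int:
    """Decode a 5-byte PTS or DTS field from a PES header."""
    if len(b) != 5:
        raise ValueError("a PES timestamp field is exactly 5 bytes")
    return (
        (((b[0] >> 1) & 0x07) << 30)  # bits 32..30
        | (b[1] << 22)                # bits 29..22
        | ((b[2] >> 1) << 15)         # bits 21..15
        | (b[3] << 7)                 # bits 14..7
        | (b[4] >> 1)                 # bits 6..0
    )


def span_seconds(first_dts: int, last_dts: int) -> float:
    """Time between two DTS values, tolerating one 33-bit wraparound."""
    return ((last_dts - first_dts) % TS_MODULO) / PES_CLOCK_HZ
```

The span between the first and last DTS does not include the duration of the last access unit itself, so add one frame period if you need the full length.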

Mutant Bob