I'm having a little trouble understanding the terminology in H.264.
When I receive a stream over RTP, I usually get a series of fragmented packets that I need to reassemble. It looks something like this:
[RTP Frame 0 / has Start Bit]
[RTP Frame 1]
[RTP Frame 2]
[RTP Frame n / has Stop Bit]
[RTP Frame n+1 / has Start Bit]
[....]
In this example I would need to put Frame 0 through Frame n together (and then prepend the NAL bits, etc.).
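To make the reassembly step concrete, here is a rough sketch of what I mean. It assumes the packets are FU-A fragmentation units (NAL type 28 in RFC 3984), that the RTP header has already been stripped, and that the payloads arrive in order with none lost. The function name `reassemble_fu_a` and the choice of a 4-byte Annex B start code are just my assumptions for illustration:

```python
ANNEX_B_START_CODE = b"\x00\x00\x00\x01"  # assumed output framing (Annex B byte stream)
FU_A_TYPE = 28                            # FU-A fragmentation unit type


def reassemble_fu_a(payloads):
    """Rebuild one NAL unit from in-order FU-A RTP payloads (RTP header removed)."""
    nal = bytearray()
    started = False
    for payload in payloads:
        if len(payload) < 2:
            raise ValueError("FU-A payload too short")
        fu_indicator, fu_header = payload[0], payload[1]
        if fu_indicator & 0x1F != FU_A_TYPE:
            raise ValueError("not an FU-A fragment")
        start_bit = bool(fu_header & 0x80)
        end_bit = bool(fu_header & 0x40)
        if start_bit:
            # Rebuild the original NAL header: F and NRI bits come from the
            # FU indicator, the NAL unit type comes from the FU header.
            nal = bytearray([(fu_indicator & 0xE0) | (fu_header & 0x1F)])
            started = True
        elif not started:
            raise ValueError("fragment received before one with the start bit")
        # Everything after the two FU bytes is a piece of the NAL payload.
        nal += payload[2:]
        if end_bit:
            return ANNEX_B_START_CODE + bytes(nal)
    raise ValueError("no fragment with the end bit set")


# Three fragments of a NAL unit whose original header byte was 0x65.
fragments = [b"\x7c\x85\xaa", b"\x7c\x05\xbb", b"\x7c\x45\xcc"]
unit = reassemble_fu_a(fragments)
```

So the result of Frame 0 through Frame n here is a single reassembled unit with a start code in front of it.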
As I read RFC 3984, such a "unit" is referred to as a "Video Frame".
Now my question is: is such a Video Frame the same as a Reference Frame?