The documents just don't seem to provide an answer.
Microsoft tried to explain the subject clearly, but it is still ambiguous, at least in our case.
We have an encrypted MP4 stream. It contains "SampleEncryptionBox"es or "PIFF" boxes, which carry 8-byte (64-bit) Initialization Vectors for the encrypted blocks. BUT: the actual "counter block" used to decrypt the AES-128 Counter Mode (AES-CTR) encrypted video data is 128 bits wide, and I don't know where exactly the IV goes inside it!
The PIFF document says that a 16-byte IV is (obviously) the entire counter block for AES-CTR mode. It also says that an 8-byte IV is placed at the beginning of the counter block for AES-ECB mode (page 17). But for an 8-byte IV in AES-CTR mode, it says nothing!
This RFC document says that the 128-bit counter block should comprise a 4-byte nonce + 8-byte IV + 4-byte counter, and that the nonce should be taken from 4 extra bytes supplied along with the main 128-bit AES key. I can only obtain the 128-bit key from the Protection Header, so where should I get the 4-byte nonce?
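To make the question concrete, here is a rough Python sketch of the two candidate layouts I am choosing between. The function names are my own (hypothetical), and the first layout is only my guess that the "IV at the beginning" rule might also apply to AES-CTR; I have not found it confirmed for that mode. The starting counter value is left as a parameter because I am not sure what it should be for our stream either.

```python
import struct


def counter_block_iv_first(iv8: bytes, block_counter: int) -> bytes:
    """Candidate A (my assumption): 8-byte IV in bytes 0-7,
    64-bit big-endian block counter in bytes 8-15."""
    if len(iv8) != 8:
        raise ValueError("expected an 8-byte IV")
    return iv8 + struct.pack(">Q", block_counter)


def counter_block_nonce_iv_counter(nonce4: bytes, iv8: bytes, block_counter: int) -> bytes:
    """Candidate B (layout described in the RFC): 4-byte nonce,
    8-byte IV, then a 32-bit big-endian block counter."""
    if len(nonce4) != 4:
        raise ValueError("expected a 4-byte nonce")
    if len(iv8) != 8:
        raise ValueError("expected an 8-byte IV")
    return nonce4 + iv8 + struct.pack(">I", block_counter)


# Made-up example values, only to show the byte layout.
iv = bytes.fromhex("0102030405060708")
nonce = bytes.fromhex("aabbccdd")

block_a = counter_block_iv_first(iv, 0)
block_b = counter_block_nonce_iv_counter(nonce, iv, 1)

print(block_a.hex())
print(block_b.hex())
```

Both functions return 16 bytes, so either result could be fed to an AES-CTR implementation as the initial counter block; which one matches our stream is exactly what I cannot work out.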
Any bit of extra knowledge will be highly appreciated.