imageio get_data skips frames

Question

For a viewer, I am trying to randomly access frames in an MP4 file. Unfortunately, depending on which frame I read beforehand, I get different frames for the same index. In my case this affects every frame after frame 123:

import imageio
import hashlib
from tqdm import tqdm

reader = imageio.get_reader("360_0011.MP4")
reader2 = imageio.get_reader("360_0011.MP4")

# Build up a hash library
hashes = dict()
for i_fr, img in enumerate(tqdm(reader)):
    hashes[i_fr] = hashlib.md5(img).hexdigest()

# Query frame 123 after frame 0
fr_idx = 123
reader2.get_data(0)
fr_hash = hashlib.md5(reader2.get_data(fr_idx)).hexdigest()
print(fr_hash == hashes[fr_idx])
print(fr_hash in hashes.values())
list(hashes.values()).index(fr_hash)
>> True
>> True
>> 123

# Query frame 124 after frame 0
fr_idx = 124
reader2.get_data(0)
fr_hash = hashlib.md5(reader2.get_data(fr_idx)).hexdigest()
print(fr_hash == hashes[fr_idx])
print(fr_hash in hashes.values())
list(hashes.values()).index(fr_hash)
>> False
>> True
>> 125

# Query frame 125 after frame 0
fr_idx = 125
reader2.get_data(0)
fr_hash = hashlib.md5(reader2.get_data(fr_idx)).hexdigest()
print(fr_hash == hashes[fr_idx])
print(fr_hash in hashes.values())
list(hashes.values()).index(fr_hash)
>> False
>> True
>> 126

# Query frame 124
fr_idx = 124
fr_hash = hashlib.md5(reader2.get_data(fr_idx)).hexdigest()
print(fr_hash == hashes[fr_idx])
print(fr_hash in hashes.values())
list(hashes.values()).index(fr_hash)
>> False
>> True
>> 125

# Query frame 123
fr_idx = 123
fr_hash = hashlib.md5(reader2.get_data(fr_idx)).hexdigest()
print(fr_hash == hashes[fr_idx])
print(fr_hash in hashes.values())
list(hashes.values()).index(fr_hash)
>> True
>> True
>> 123

# Query frame 124
fr_idx = 124
fr_hash = hashlib.md5(reader2.get_data(fr_idx)).hexdigest()
print(fr_hash == hashes[fr_idx])
print(fr_hash in hashes.values())
list(hashes.values()).index(fr_hash)
>> True
>> True
>> 124

It seems that as long as I query frames in sequence, every frame after the first query is returned correctly (I checked multiple ranges):

reader2.get_data(0)
for i in range(1130,1140):
    fr_hash = hashlib.md5(reader2.get_data(i)).hexdigest()
    print(f"{i} - {fr_hash == hashes[i]} - {list(hashes.values()).index(fr_hash)}")

>> 1130 - False - 1131
>> 1131 - True - 1131
>> 1132 - True - 1132
>> 1133 - True - 1133
>> 1134 - True - 1134
>> 1135 - True - 1135
>> 1136 - True - 1136
>> 1137 - True - 1137
>> 1138 - True - 1138
>> 1139 - True - 1139

Is this intended or expected behavior, or is it a bug? Is there a way to work around it other than always querying two frames?

Please provide that video file, or any video file you can share that reproduces the issue. — Christoph Rackwitz, Feb 01 '23 at 19:40
Thanks. The videos contain me and some other stuff, so I am hesitant to post a public link, but I sent you an email and offer to do the same for others on request. — mcandril, Feb 02 '23 at 06:28
imageio.v3.imread with ffmpeg plugin shows the same behavior, only that imreading the previous frame does not fix it. — mcandril, Feb 02 '23 at 08:13
pims.Video also has it, but at a different position, and it seems to slow down the higher the frame idx is that I want to access ... — mcandril, Feb 02 '23 at 08:25
video files don't have frame indices in general. they have Presentation Timestamps. often, "indices" are turned into PTSes given some fixed frame rate, which is also an assumption that does not generally hold. it may be that the calculation in imageio is broken. unless the author/maintainer of imageio shows up and shows you a way... you may want to try PyAV. that requires you to give it a PTS, and if you can assume a fixed frame rate, then you can do the calculation yourself. it also gives you control over how the seeking happens (there are several options) — Christoph Rackwitz, Feb 02 '23 at 09:25
I've analyzed the files. they appear to have fixed frame rates but GOPs of 15 frames for the one and a variable 50-94 frames for the other, so seeking may only be cheap if allowed to jump to keyframes. — Christoph Rackwitz, Feb 02 '23 at 09:26
you could try choosing different backends for imageio. it supports PyAV and OpenCV. maybe a different choice helps. — Christoph Rackwitz, Feb 02 '23 at 09:33
Thanks for looking into this. I thought pims is already based on PyAV? But I'll continue playing around and file a bug report to imageio — mcandril, Feb 02 '23 at 15:07
I am unfamiliar with "pims". your question's code says that you use `imageio`. — Christoph Rackwitz, Feb 02 '23 at 21:42
I put in another comment that I also tried PIMS (https://github.com/soft-matter/pims), which as far as I saw used pyAV. I put in an issue at imageio, #938. Thanks again. — mcandril, Feb 05 '23 at 10:55
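
The PTS-based approach suggested in the comments could look roughly like the sketch below. It is only an illustration under stated assumptions, not a confirmed fix for the behavior described in the question: it assumes a constant frame rate, and the helper names `frame_index_to_pts` and `read_frame_at` are made up for this example. The idea is to convert the frame index to a presentation timestamp, seek to the keyframe at or before it with PyAV, and decode forward until the target is reached.

```python
from fractions import Fraction


def frame_index_to_pts(index, fps, time_base, start_pts=0):
    """Convert a frame index to a PTS in stream ticks (hypothetical helper).

    Assumes a constant frame rate; `fps` and `time_base` may be Fractions.
    """
    seconds = Fraction(index) / Fraction(fps)
    return start_pts + round(seconds / Fraction(time_base))


def read_frame_at(path, index):
    """Return frame `index` as an RGB array (hypothetical helper).

    Requires PyAV (`pip install av`), imported lazily so that
    frame_index_to_pts stays usable without it.
    """
    import av

    with av.open(path) as container:
        stream = container.streams.video[0]
        target = frame_index_to_pts(
            index,
            stream.average_rate,      # assumed to be the true, constant rate
            stream.time_base,
            stream.start_time or 0,   # streams do not always start at PTS 0
        )
        # backward=True lands on the nearest keyframe at or before `target`.
        container.seek(target, stream=stream, backward=True, any_frame=False)
        # Decode forward from that keyframe until the requested frame appears.
        for frame in container.decode(stream):
            if frame.pts is not None and frame.pts >= target:
                return frame.to_ndarray(format="rgb24")
    raise IndexError(f"frame {index} not found in {path}")
```

For example, `read_frame_at("360_0011.MP4", 124)` would decode from the preceding keyframe on every call, so the result should not depend on which frame was read before. The cost grows with the GOP length, which matches the comment above about seeking only being cheap when jumping to keyframes.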
