
I have collected a large-scale dataset of GIFs (more than 100k) from the Internet, and I ran into some rare, strange GIFs when extracting their frames with Python. Three commonly used packages (moviepy, PIL, imageio) give totally different results for these GIFs.

  1. moviepy>=1.0.3 blocks forever at the second frame of the VideoFileClip.iter_frames() loop, and the code never throws an exception.
from moviepy.video.io.VideoFileClip import VideoFileClip

video = VideoFileClip(path)
frame_iterator = video.iter_frames()
  2. PIL>=7.1.2 outputs multiple frames that are all identical to the first frame.
from PIL import Image, ImageSequence
from PIL import ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True

video = Image.open(path)
frame_iterator = ImageSequence.Iterator(video)
  3. imageio>=2.6.1 extracts all of the frames, but the output frames look strange.
import imageio

frame_iterator = imageio.get_reader(path)

You can then dump the frames from the frame_iterator provided by any of the packages above:

import os
import shutil

import numpy as np
from PIL import Image

def dump_video_frames(video_frames):
    # Start from an empty output directory.
    root = 'data/frames'
    if os.path.exists(root):
        shutil.rmtree(root)
    os.makedirs(root)

    for i, frame in enumerate(video_frames):
        frame.save(os.path.join(root, '%d.jpg' % i))

frames = []
for frame in frame_iterator:
    # moviepy and imageio yield numpy arrays; PIL yields Image objects.
    if isinstance(frame, np.ndarray):
        frame = Image.fromarray(np.uint8(frame))
    frames.append(frame.convert('RGB'))

dump_video_frames(frames)

Here is an example:

Original GIF: [image]

The output of PIL: [image]

The output of imageio: [image]

You can see that PIL only returns the first frame, without any black area, which is quite different from the output of imageio.

So my question is: how can I detect such a strange GIF in Python? I use moviepy first because it performs well on other GIFs, so I need to detect this kind of GIF before the code uses moviepy to extract its frames, in order to avoid the infinite loop in VideoFileClip.iter_frames(), which won't throw any exception. I couldn't find any information about such rare GIFs on Google.

Here are 2 more example GIFs:

[image]

[image]

jianjieluo
    Suggestion: look up the GIF format specification. Walk through the structure of these GIFs (e.g. with a hexdump tool) to determine what the correct interpretation is. When you have done this, you can submit bug reports to at least two of the three packages you've used since they aren't implementing the standard fully. Has your web browser got a library that displays the GIFs correctly? If so, find out what it is, and see if it has a python binding. – Mad Physicist Jun 15 '20 at 13:35
    These appear to be badly broken GIFs, created by a buggy piece of software I suppose. Specifically, they contain frames that do not fit within the bounds of the image, as given in the file header. Of the three packages you tried, one is choking on this invalid data, one is only showing you pixels within the given image bounds (which doesn't happen to include any of the areas that are actually being animated), and one is expanding the bounds to include all of the frames. – jasonharper Jun 15 '20 at 13:47
    To detect these broken files, you'd need to scan through the file & frame headers, and see if any frames are out of bounds (or possibly you could rewrite the file header so that its bounds encompass all of the frames). This can be done without having to decompress any of the pixel data, so should be reasonably fast even in pure Python. I'm not aware of any existing code or packages to do this, unfortunately. – jasonharper Jun 15 '20 at 13:52
    For reference: https://github.com/Zulko/moviepy/issues/1231 – Tom Burrows Jun 15 '20 at 14:42
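
The header scan jasonharper describes in the comments could be sketched roughly as follows. This is a minimal sketch, assuming his diagnosis is right and the problem GIFs contain frames that extend past the canvas size declared in the file header. The function names (read_gif_frame_rects, has_out_of_bounds_frame, enlarge_canvas) are made up for illustration, the block layout follows the GIF89a specification, and the code has not been run against the example GIFs above.

```python
import struct


def read_gif_frame_rects(data):
    """Return ((canvas_w, canvas_h), [(left, top, width, height), ...])."""
    if data[:6] not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF file")
    # Logical screen descriptor: canvas width, canvas height, packed flags.
    canvas_w, canvas_h, packed = struct.unpack_from("<HHB", data, 6)
    pos = 13  # 6-byte signature + 7-byte logical screen descriptor
    if packed & 0x80:  # global color table present
        pos += 3 * (2 << (packed & 0x07))

    def skip_sub_blocks(pos):
        # Sub-blocks are <size><size bytes>, terminated by a zero size byte.
        while data[pos] != 0:
            pos += data[pos] + 1
        return pos + 1

    rects = []
    try:
        while data[pos] != 0x3B:  # 0x3B is the trailer
            if data[pos] == 0x21:  # extension: label byte, then sub-blocks
                pos = skip_sub_blocks(pos + 2)
            elif data[pos] == 0x2C:  # image descriptor of one frame
                left, top, w, h, flags = struct.unpack_from("<HHHHB", data, pos + 1)
                rects.append((left, top, w, h))
                pos += 10
                if flags & 0x80:  # local color table present
                    pos += 3 * (2 << (flags & 0x07))
                pos = skip_sub_blocks(pos + 1)  # +1 skips the LZW code size byte
            else:
                break  # unknown block: stop scanning
    except (IndexError, struct.error):
        pass  # truncated file: keep the frames seen so far
    return (canvas_w, canvas_h), rects


def has_out_of_bounds_frame(data):
    """True if any frame rectangle extends past the declared canvas."""
    (canvas_w, canvas_h), rects = read_gif_frame_rects(data)
    return any(l + w > canvas_w or t + h > canvas_h for l, t, w, h in rects)


def enlarge_canvas(data):
    """Return GIF bytes whose header canvas encloses every frame."""
    (canvas_w, canvas_h), rects = read_gif_frame_rects(data)
    new_w = max([canvas_w] + [l + w for l, t, w, h in rects])
    new_h = max([canvas_h] + [t + h for l, t, w, h in rects])
    # The header fields are 16-bit, so clamp to 0xFFFF.
    size = struct.pack("<HH", min(new_w, 0xFFFF), min(new_h, 0xFFFF))
    return data[:6] + size + data[10:]
```

To use this as a pre-filter, read each file in binary mode and pass the bytes to has_out_of_bounds_frame; files for which it returns True could be routed to imageio instead of moviepy. No pixel data is decompressed, so the scan only walks block headers. Whether a file rewritten with enlarge_canvas makes moviepy or PIL behave is an untested assumption.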
