I have a simple script set up using ImageMagick to delete all images in a directory that aren't of size 157x200 pixels:

import subprocess, os, sys
from tqdm import tqdm
from pathlib import Path


def delete_opaque_files():
    pathlist = Path("faces").glob('*.png')
    for path in tqdm(pathlist):
        path_str = str(path)
        command = f"identify -format '%wx%h' {path_str}"
        process = subprocess.Popen(command.split(), stdout=subprocess.PIPE)
        output, error = process.communicate()
        if output.decode("utf-8") != "'157x200'":
            print(f"Deleting: {path_str}")
            os.remove(path_str)


delete_opaque_files()
sys.exit(0)

It should loop through all 14.5k images in the directory. However, tqdm reports the script running through only ~7220 images before the script apparently freezes (tqdm stops updating and nothing more is output to the console). When that happens, I need to manually kill the process in the terminal.

Are there any ways to diagnose why the script is freezing? I'm not seeing any error output.

Colin
  • Unless your files are in a very esoteric format, it might be better to use Pillow to identify the files instead of calling out to ImageMagick. – AKX Mar 10 '20 at 09:57
  • Memory issues? Or antivirus? – MEdwin Mar 10 '20 at 10:22
  • This would happen if the `identify` program prompted for user input. It would hang waiting, but you'd never see it because you are waiting in `communicate`. Does `identify` have a silent mode? You could do `output, error = process.communicate(timeout=somethingsane)` and deal with the timeout exception. – tdelaney Mar 10 '20 at 10:31
  • Try using `subprocess.run` and add a `timeout` argument. – FredrikHedman Mar 10 '20 at 10:32
  • @FredrikHedman - I doubt `subprocess.run` would make a difference, although the timeout is certainly a good idea. – tdelaney Mar 10 '20 at 10:35
  • Could well be, but it is a higher-level interface, and the `timeout` may help give a clue about what hangs. – FredrikHedman Mar 10 '20 at 10:59
  • Thanks @FredrikHedman. Sure enough, ImageMagick was getting hung on a specific file. – Colin Mar 10 '20 at 11:56
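
The `communicate(timeout=...)` idea from the comments can be sketched roughly as follows. This is a minimal illustration, not the asker's code: the helper name `run_with_timeout` and the 15-second default are assumptions made up for the example. The one detail worth noting is that `communicate()` does not stop the child process when the timeout expires, so the sketch kills it explicitly before moving on.

```python
import subprocess


def run_with_timeout(args, timeout_s=15):
    """Run `args` and return (stdout_text, timed_out).

    Hypothetical helper: the name and the default timeout are
    illustrative and not part of the original script.
    """
    process = subprocess.Popen(args, stdout=subprocess.PIPE,
                               stderr=subprocess.PIPE)
    try:
        output, _ = process.communicate(timeout=timeout_s)
        return output.decode("utf-8"), False
    except subprocess.TimeoutExpired:
        # The child keeps running after a timeout, so kill it and then
        # collect whatever it had already written.
        process.kill()
        output, _ = process.communicate()
        return output.decode("utf-8"), True
```

Calling it as `run_with_timeout(["identify", "-format", "%wx%h", path_str])` inside the loop and printing `path_str` whenever the second value is `True` should point at the file that makes `identify` hang.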

1 Answer

Something like the following should let you loop through all the files, catching exceptions and printing the errors as they occur. Note that some file names may contain a space. Because the command is passed to subprocess.run as a list of arguments, with no shell involved, each argument arrives intact and needs no extra quoting. For the same reason the format string is left unquoted, so the output to compare against is 157x200 without surrounding quotes.

import subprocess
import os
import sys
from tqdm import tqdm
from pathlib import Path


def delete_opaque_files():
    pathlist = Path("faces").glob('*.png')
    for path in tqdm(pathlist):
        path_str = str(path)
        # Arguments are passed as a list, so no shell quoting is needed.
        command = ["identify", "-format", "%wx%h", path_str]
        try:
            process = subprocess.run(command,
                                     capture_output=True, check=True,
                                     encoding='utf-8', timeout=15)
            if process.stdout != "157x200":
                print(f"Deleting: {path_str}")
                os.remove(path_str)
        except subprocess.TimeoutExpired as err:
            # identify hung on this file; report it and move on.
            print(f'Timed out on {path_str}: {err}')
        except subprocess.CalledProcessError as err:
            print(f'Error processing {path_str}: {err}')


delete_opaque_files()
sys.exit(0)
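
If the goal is only to compare dimensions, it may be possible to avoid the external process altogether, in the spirit of the Pillow suggestion in the comments. The sketch below is a standard-library stand-in for that idea rather than Pillow itself: it reads the width and height from the IHDR chunk at the start of each PNG file. The names `png_size` and `find_wrong_size` are hypothetical, and the code assumes the files are ordinary PNGs; it inspects only the first 24 bytes and does not validate the rest of the file.

```python
import struct
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(path):
    """Return (width, height) from a PNG header, or None if it is not a PNG."""
    with open(path, "rb") as f:
        header = f.read(24)
    # Layout: 8-byte signature, 4-byte chunk length, 4-byte chunk type
    # ("IHDR"), then width and height as big-endian 32-bit integers.
    if (len(header) < 24 or header[:8] != PNG_SIGNATURE
            or header[12:16] != b"IHDR"):
        return None
    width, height = struct.unpack(">II", header[16:24])
    return width, height


def find_wrong_size(directory="faces", expected=(157, 200)):
    """Yield (path, size) for each *.png whose size differs from `expected`.

    `size` is None when the header could not be read as a PNG, so the
    caller can decide whether to delete or just report those files.
    """
    for path in Path(directory).glob("*.png"):
        size = png_size(path)
        if size != expected:
            yield path, size
```

Looping over `find_wrong_size()` and printing each pair before calling `os.remove` gives a dry run first, which seems prudent for a script that deletes files.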
FredrikHedman