3

Is it possible to use a *nix program like 'find', or a scripting language like Python, PHP or Ruby, to search a hard drive and find all images whose width and height are equal, i.e. square images?

bafromca
  • 1
    What kinds of images? The file command will tell you the dimensions of PNG and GIF files – technosaurus Dec 20 '12 at 03:55
  • jpgs and pngs. But how would you return the dimensions from 'file' and be able to compare the two and then only return the filename if they're the same? – bafromca Dec 20 '12 at 04:04
  • 1
    Something along the lines `find . -type f | xargs -I {} sh -c 'res=\`identify -format "%w - %h" {} 2>|/dev/null\`; if [ -n "$res" ] && [ \`echo $res | bc\` -eq 0 ]; then echo {}; fi'` for a slow method and for any kind of image supported by ImageMagick. – mmgp Dec 20 '12 at 04:55
  • Right... not constructive. You people negatively surprise me. – mmgp Dec 20 '12 at 15:58

4 Answers

6

The code below recursively lists the files under a specified path, so it can cover every subfolder on a hard disk as you mentioned. It treats a file as an image if its extension is in a set that you can specify, and it prints the filename, width and height of every image whose width and height match. When you call the script, you pass the path you want it to search under. An example usage is shown below.

listimages.py

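import PIL.Image, os, sys

# Lower-case extensions treated as images; add '.png', '.jpeg', etc. as needed
EXTENSIONS = ['.jpg', '.bmp']

def list_files(path, extensions):
    # Walk the tree and yield the full path of every file with a listed extension
    for root, dirnames, filenames in os.walk(path):
        for file in filenames:
            if os.path.splitext(file)[1].lower() in extensions:
                yield os.path.join(root, file)

for file in list_files(sys.argv[1], EXTENSIONS):
    try:
        with PIL.Image.open(file) as image:
            width, height = image.size
    except IOError:
        continue  # unreadable, or not really an image despite its extension
    if width == height:
        print("found %s %sx%s" % (file, width, height))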

usage

# listimages.py /home/user/myimages/
found ./b.jpg 50x50
found ./a.jpg 340x340
found ./c.bmp 50x50
found ./d.BMP 50x50
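
If file names cannot be trusted, one possible variation (not part of the script above) is to look at the first bytes of each file instead of its extension. The sketch below assumes the well-known signatures of JPEG, PNG, GIF and BMP files; `sniff_image_type` and `MAGIC_NUMBERS` are hypothetical names introduced only for illustration.

```python
# Hypothetical helper: guess the image type from the first bytes of a file,
# so the result does not depend on the file name.
MAGIC_NUMBERS = {
    b"\xff\xd8\xff": "JPEG",
    b"\x89PNG\r\n\x1a\n": "PNG",
    b"GIF87a": "GIF",
    b"GIF89a": "GIF",
    b"BM": "BMP",  # only two bytes, so false positives are possible
}


def sniff_image_type(path):
    """Return a format name from MAGIC_NUMBERS, or None if nothing matches."""
    try:
        with open(path, "rb") as handle:
            header = handle.read(8)
    except OSError:
        return None  # unreadable file: treat it as "not an image"
    for magic, name in MAGIC_NUMBERS.items():
        if header.startswith(magic):
            return name
    return None
```

In `list_files`, the extension test could then be swapped for `sniff_image_type(path) is not None`. This still opens every file, but it reads only eight bytes from each one, and the `try`/`except` around the image open remains useful because a matching signature does not guarantee a valid image.
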
Marwan Alsabbagh
  • What if someone didn't name their files with correct extensions ? – mmgp Dec 20 '12 at 12:47
  • @mmgp If that is a concern then you can do as Michael Davis did in his answer and check all files and do a try except if it fails, thus ignoring the file extension. But this will be much slower as it will open and read every file on your computer instead of just files that have image extensions. – Marwan Alsabbagh Dec 20 '12 at 15:12
5

It would certainly be possible with Python.

You can use os.walk to traverse the file system, and PIL to check whether an image's width and height are equal.

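import os
from PIL import Image

# os.walk yields (directory, subdirectories, filenames) for each directory
for root, dirs, files in os.walk('/'):
    for name in files:
        filename = os.path.join(root, name)
        try:
            im = Image.open(filename)
        except IOError:
            continue  # not an image, or unreadable

        if im.size[0] == im.size[1]:
            print(filename)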
Michael Davis
2

In bash, you can get an image's size with ImageMagick's identify, for example:

identify -verbose jpg.jpg | awk '/Geometry/{print($2)}'

See also man find and man identify.
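
The command above only prints the size; the width and height still have to be compared. A minimal sketch of that comparison in Python, assuming ImageMagick's identify is on the PATH and using its `%w` and `%h` format escapes; `parse_dimensions` and `is_square` are hypothetical names, not part of any library.

```python
import subprocess


def parse_dimensions(text):
    """Parse 'WIDTH,HEIGHT' as printed by identify -format "%w,%h"."""
    width, height = text.strip().split(",")
    return int(width), int(height)


def is_square(path):
    """Return True if identify reports equal width and height for path."""
    result = subprocess.run(
        ["identify", "-format", "%w,%h\n", path],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0 or not result.stdout.strip():
        return False  # not an image, or identify could not read it
    # Multi-frame files (e.g. animated GIFs) print one line per frame;
    # only the first frame is checked here.
    width, height = parse_dimensions(result.stdout.splitlines()[0])
    return width == height
```

Calling `is_square("jpg.jpg")` starts one identify process per file, so it is convenient for a few files but slow for a whole disk.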

ymn
  • 3
    Using identify, you can drop the awk part by just using `identify -format "%w,%h"`, but he would still need to compare those values – iagreen Dec 20 '12 at 04:35
2

This can be done in a single shell line but I don't recommend doing so. Do it in two steps. First, collect all image files and needed attributes in a file:

find . -type f -print0 | xargs -J fname -0 -P 4 identify \
    -format "%w,%h,%m,\"%i\"\n" fname 2>|/dev/null | sed '/^$/d' > image_list

The sed only removes the blank lines that are produced. Adjust the -P 4 parameter of xargs (the maximum number of parallel processes) to suit your system. Note that -J is a BSD xargs option; GNU xargs does not have it, and -I is the nearest equivalent there. ImageMagick's identify is used because it recognizes many formats. The result is a file named image_list in a typical CSV format.

Now it is only a matter of filtering image_list according to your needs. For that I prefer to use Python as in:

import sys
import csv

EXT = ['JPEG', 'PNG']

with open(sys.argv[1], newline='') as image_list:
    for width, height, fformat, name in csv.reader(image_list):
        # Show images with square dimensions, and discard
        # those with width 0
        if int(width) == int(height) and int(width) > 0:
            if fformat in EXT:
                print(name)

The first part of this answer can be easily rewritten in Python, but since it would either involve using ImageMagick bindings for Python or calling it through subprocess, I left it as a combination of shell commands.
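
For reference, a rough sketch of what the subprocess route might look like. It assumes identify is on the PATH, runs one process per file (so no parallelism like -P 4), and uses hypothetical names (`probe`, `write_image_list`) chosen only for illustration.

```python
import csv
import os
import subprocess


def probe(path):
    """Ask identify for (width, height, format); return None on failure."""
    try:
        result = subprocess.run(
            ["identify", "-format", "%w,%h,%m\n", path],
            capture_output=True,
            text=True,
        )
    except OSError:
        return None  # identify is not installed or could not be started
    lines = result.stdout.splitlines()
    if result.returncode != 0 or not lines:
        return None  # not an image that identify understands
    width, height, fformat = lines[0].split(",")  # first frame only
    return int(width), int(height), fformat


def write_image_list(top, out_path, probe_func=probe):
    """Walk top and write width,height,format,name rows to out_path."""
    count = 0
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        for root, _dirs, files in os.walk(top):
            for name in files:
                path = os.path.join(root, name)
                info = probe_func(path)
                if info is not None:
                    writer.writerow([*info, path])
                    count += 1
    return count
```

`write_image_list(".", "image_list")` would produce rows in the same column order as the shell pipeline, so the filtering script above should be able to read the result unchanged. The `probe_func` parameter is there only so the walking and CSV logic can be exercised without ImageMagick installed.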

mmgp