
I'm making an image and video converter in Python. The basic approach is:

  1. Create a list which stores the 292 colors that comprise my output 'palette.'
  2. Create a dict which serves as a cache of previous color comparisons.
  3. Shrink image/frame to smaller dimensions (max width: 132)
  4. Iterate over every pixel in the image.
  5. For each pixel, check if its color is in the cache. If so, use the palette color defined in the cache.
  6. If the pixel's color is not found in the cache, compare it to each of the 292 colors in the palette list using a variation of the algorithm here.
  7. Choose the palette color which has lowest distance.

So I end up with a for loop that calls the color comparison function each time. Here's an approximation:

possibles = [ list of color dicts here ]
match_cache = { }

def color_comparison( pixel, possibles ):
    # Track the best match seen so far, starting from an infinite distance.
    closest_distance = float('inf')
    closest_possible = None
    for possible in possibles:
        d = color_distance( pixel, possible )
        if d < closest_distance:
            closest_distance = d
            closest_possible = possible
    # Cache the result so this exact color never has to be searched again.
    key = pixel.makeHash()    # 'key' avoids shadowing the built-in hash()
    match_cache[key] = closest_possible
    return closest_possible

def image_convert( image ):
    output = []
    for pixel in image:
        key = pixel.makeHash()
        if key in match_cache:
            output.append( match_cache[key] )
        else:
            new_color = color_comparison( pixel, possibles )
            output.append( new_color )
    return output

My question is: how can I make this faster? Is there some better approach rather than iterating over every possible for every pixel?

Kirkman14
  • The keyword is colour quantization; cf. http://stackoverflow.com/questions/5906693/how-to-reduce-the-number-of-colors-in-an-image-with-opencv or http://www.pyimagesearch.com/2014/07/07/color-quantization-opencv-using-k-means-clustering/ – tfv May 18 '16 at 17:15
  • 1
    You should take a look at the [colour quantization](http://pillow.readthedocs.io/en/3.1.x/reference/Image.html#PIL.Image.Image.quantize) function in Pillow, the modern fork of PIL. But if you want to do this by hand a quick improvement may be obtained by memoizing your `color_comparison` function so it doesn't have to search again for colors it has already found the nearest neighbour of. – PM 2Ring May 18 '16 at 17:49
  • To optimize the search in `color_comparison` take a look at Wikipedia's [Nearest neighbor search](https://en.wikipedia.org/wiki/Nearest_neighbor_search). If you can read C, you may be able to use this fairly simple colour searching code I posted on [xkcd](http://echochamber.me/viewtopic.php?t=41298#p1643328) a few years ago. – PM 2Ring May 18 '16 at 17:50
  • PM 2Ring, I totally forgot to include that in the post. I am caching lookups in a separate dictionary so that I can avoid calling the color comparison function if I've already compared that specific color. – Kirkman14 May 18 '16 at 17:56
  • 1
    Omit the square root, if you have one, at the end of `color_distance` function. – Mark Setchell May 18 '16 at 20:33
  • It occurred to me that maybe I could just pass the palette into PIL and let it do the conversion for me. However, my palette is actually 292 colors, which exceeds the maximum size of 256 for a PIL palette. – Kirkman14 May 18 '16 at 21:20
  • Omitting the square root does lead to a small savings. In my test (without a match_cache) it reduced the average "find closest color" function runtime from 1.53ms to 1.42ms. – Kirkman14 May 18 '16 at 21:34
  • How does `makeHash()` look? – Mark Setchell May 18 '16 at 21:48
  • Basically `hash = ''.join( map(str, desired_color) )` – Kirkman14 May 18 '16 at 21:58
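Pulling the comment thread's suggestions together, here is a minimal sketch of the memoized, square-root-free lookup. It simplifies the real setup by assuming pixels and palette entries are plain `(r, g, b)` tuples (so the tuple itself can serve as the cache key, in place of `makeHash()`); the names `color_distance_sq` and `closest_color` are placeholders, not the poster's actual functions.

```python
# Nearest palette color via squared Euclidean distance plus a cache.
# Squared distance is enough for ranking: sqrt() is monotonic, so it
# never changes which palette entry wins.

match_cache = {}

def color_distance_sq(c1, c2):
    # Sum of squared per-channel differences; no sqrt needed.
    return sum((a - b) ** 2 for a, b in zip(c1, c2))

def closest_color(pixel, palette):
    # Tuples are hashable, so the pixel itself is the cache key.
    if pixel in match_cache:
        return match_cache[pixel]
    best = min(palette, key=lambda p: color_distance_sq(pixel, p))
    match_cache[pixel] = best
    return best

palette = [(0, 0, 0), (255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 255)]
closest_color((250, 10, 10), palette)   # -> (255, 0, 0)
```

The second lookup of any color is a single dict hit, which matches the caching scheme already described in the question.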

1 Answer


What are these ~200 colors? The solution might depend on how the colors are distributed.

Is walking through a Python dict fast enough? A dictionary is intended for exact-match lookup, not for examining every item. Perhaps a list or array would be a better choice?

But the main problem is the exhaustive search over all possible colors.
A good data structure can accelerate the search greatly. Consider an octree that partitions on the R, G, B color components.
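As a hedged sketch of the same idea with an off-the-shelf structure: a k-d tree (a close cousin of the octree) built once over the palette answers each nearest-neighbor query without scanning all entries. This assumes SciPy is available and, again, that palette entries reduce to RGB triples; the `palette` and `pixels` arrays below are made-up illustrations.

```python
import numpy as np
from scipy.spatial import cKDTree

# Toy palette: one (r, g, b) row per entry. Building the tree is a
# one-time cost; each query is then roughly O(log N) instead of O(N).
palette = np.array([(0, 0, 0), (255, 0, 0), (0, 255, 0),
                    (0, 0, 255), (255, 255, 255)], dtype=float)
tree = cKDTree(palette)

# Query many pixels at once; returns distances and palette row indices.
pixels = np.array([(250, 10, 10), (5, 5, 5)], dtype=float)
distances, indices = tree.query(pixels)
nearest = palette[indices]   # nearest palette color for each pixel
```

With 292 palette entries the per-query win over a linear scan is modest, but batching every pixel into one `tree.query` call also moves the loop out of Python, which is where most of the time goes.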

Arbitrary example

MBo
  • Essentially I'm trying to match colors to ANSI/ASCII characters. I've created my own "palette" where particular character combinations correspond to specific RGB values. In this way, I can compare the colors in an image with the values in my 'palette' and return the desired character. I made a mistake with regards to the dict. It is actually a list of dicts/objects, something like: `{ 'fg': xx, 'bg': xx, 'character': xx, 'r': xx, 'g': xx, 'b': xx }` I have updated the question to reflect this. – Kirkman14 May 18 '16 at 18:59