
I'm relatively new to Python and even more so to TensorFlow, so I've been working through some tutorials such as this tutorial. A challenge given was to make an image greyscale. One approach taken here is to just take one colour channel value and duplicate it across all channels. Another is to take an average, which can be achieved using tf.reduce_mean, as done here. However, there are many ways to make an image monochromatic, as anyone who has played with GIMP or Photoshop will know. One standard method adjusts for the way humans perceive colour and requires weighting the three colour channels individually, like this:

Grey = (Red * 0.2126 + Green * 0.7152 + Blue * 0.0722)
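
For example, a pixel with (R, G, B) = (200, 100, 50) comes out as 200 × 0.2126 + 100 × 0.7152 + 50 × 0.0722 ≈ 117.65, so the greyscaled pixel is roughly (118, 118, 118).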

Anyway, I've achieved it by doing this:

import tensorflow as tf
import numpy as np
import matplotlib.image as mpimg
import matplotlib.pyplot as plt

filename = "MarshOrchid.jpg"
raw_image_data = mpimg.imread(filename)

image = tf.placeholder("float", [None, None, 3])

# slice out each colour channel: begin at [0, 0, c] and take one channel
r = tf.slice(image, [0, 0, 0], [-1, -1, 1])
g = tf.slice(image, [0, 0, 1], [-1, -1, 1])
b = tf.slice(image, [0, 0, 2], [-1, -1, 1])

# apply the perceptual weights to each channel
r = tf.scalar_mul(0.2126, r)
g = tf.scalar_mul(0.7152, g)
b = tf.scalar_mul(0.0722, b)

grey = tf.add(r, tf.add(g, b))

# duplicate the grey channel back across R, G and B, then cast for display
out = tf.concat(2, [grey, grey, grey])
out = tf.cast(out, tf.uint8)


with tf.Session() as session:

    result = session.run(out, feed_dict={image: raw_image_data})

    plt.imshow(result)
    plt.show()

This seems hugely inelegant to me: cutting the data up, applying calculations, and then recombining it. A matrix multiplication on individual RGB tuples would be efficient or, barring that, a function that takes an individual RGB tuple and returns a greyscaled tuple. I've looked at tf.map_fn but can't seem to make it work for this.

Any suggestions or improvements?

Peter Pudaite

2 Answers


How about this?

img = tf.ones([100, 100, 3])                 # dummy image; swap in your own tensor
r, g, b = tf.unstack(img, axis=2)            # split the three colour channels
grey = r * 0.2126 + g * 0.7152 + b * 0.0722  # apply the perceptual weights
out = tf.stack([grey, grey, grey], axis=2)   # duplicate grey back across channels
out = tf.cast(out, tf.uint8)
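
To run this on the question's image rather than a dummy tensor, a sketch reusing the placeholder setup from the question might look like:

image = tf.placeholder("float", [None, None, 3])
r, g, b = tf.unstack(image, axis=2)  # works here because the channel axis has static size 3
grey = r * 0.2126 + g * 0.7152 + b * 0.0722
out = tf.cast(tf.stack([grey, grey, grey], axis=2), tf.uint8)

with tf.Session() as session:
    result = session.run(out, feed_dict={image: raw_image_data})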

Here's a sample of map_fn. The shape of x is (2, 4), so the shape of elms_fn is (4,); if the shape of x were (100, 100, 3), the shape of elms_fn would be (100, 3).

x = tf.constant([[1, 2, 3, 4],
                 [5, 6, 7, 8]], dtype=tf.float32)

def avg_fc(elms_fn):
    # shape of elms_fn is (4,)
    # compute average for each row and return it
    avg = tf.reduce_mean(elms_fn)
    return avg

# map_fn will stack avg at axis 0
res = tf.map_fn(avg_fc, x)

with tf.Session() as sess:
    a = sess.run(res) #[2.5, 6.5]
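
Adapting that pattern to the greyscale problem, a minimal sketch (it maps over image rows of shape (width, 3), and reuses the old-style tf.concat(concat_dim, values) signature used elsewhere in this thread; map_fn still slices the tensor row by row under the hood, so I wouldn't expect it to beat unstack):

image = tf.placeholder("float", [None, None, 3])

def grey_row(row):
    # row has shape (width, 3): weight the channels and sum them
    weights = tf.constant([0.2126, 0.7152, 0.0722])
    g = tf.reduce_sum(row * weights, 1, keep_dims=True)  # shape (width, 1)
    return tf.concat(1, [g, g, g])                       # back to (width, 3)

# map_fn applies grey_row to each row and stacks the results at axis 0
grey_image = tf.map_fn(grey_row, image)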
xxi
  • Thanks. That's a little more compact than what I did. But essentially we're still unpacking the matrix to get at the individual elements and then repacking it. I'm trying to figure out a way to apply the calculations directly, in a way similar to the inbuilt methods. – Peter Pudaite Feb 03 '17 at 22:37
  • You mean you want to do this without stack? Can you show me some `inbuilt methods`, maybe in a language you're familiar with? I'll try to see if I can do this. – xxi Feb 04 '17 at 07:32
  • This Q&A touches on what I'm trying to do: http://stackoverflow.com/questions/41471540/tensorflow-apply-a-function-to-each-row-of-a-matrix-variable According to the answer, map_fn will apply a function to a slice, which is what we're doing explicitly. Presumably map_fn does this without the overhead of unpacking and repacking as we have done. – Peter Pudaite Feb 04 '17 at 19:49
  • `map_fn` splits the matrix at axis 0 and does `the same thing` for each split. E.g. if you have a 2D matrix and you want to compute the average of each row, `map_fn` may be a good choice. So I doubt that `map_fn` is better than `unpack` here; I've edited the post with a sample of `map_fn`. – xxi Feb 05 '17 at 12:41
  • I've spent several hours researching around this, and as far as I can see the current release of TensorFlow won't allow granular access to values in a tensor. I thought of reshaping the matrix to 1D, but as I have to process the values in sets of threes that's not possible either.... Thanks for all your help! – Peter Pudaite Feb 05 '17 at 21:08
  • I devised another way to achieve the same result by reshaping the matrix, doing a matrix multiply, and reshaping back. So, to compare, I profiled for execution time and memory usage. Fastest was reshaping and matrix multiply, then unstacking (your solution), and slowest was slicing. Memory usage was highest for matrix multiplication, then unstacking, and lowest was slicing..... – Peter Pudaite Feb 05 '17 at 23:06

So, having really looked into this topic, in the current release of TensorFlow (r0.12) there doesn't appear to be a simple way to apply custom functions to tuples of values, especially if the result is not a reduction. As with my initial effort and the answer from @xxi, you pretty much have to disaggregate the tuples before applying a function to them collectively.

I figured out another way to get the result I wanted, without slicing or unstacking, by instead reshaping and using matrix multiplication:

import tensorflow as tf
import numpy as np
import matplotlib.image as mpimg
import matplotlib.pyplot as plt

filename = "MarshOrchid.jpg"
raw_image_data = mpimg.imread(filename)

image = tf.placeholder("float", [None, None, 3])

out = tf.reshape(image, [-1, 3])  # flatten the image to a list of RGB tuples
out = tf.matmul(out, [[0.2126, 0, 0], [0, 0.7152, 0], [0, 0, 0.0722]])  # weight each channel
out = tf.reduce_sum(out, 1, keep_dims=True)  # sum the weighted channels

out = tf.concat(1, [out, out, out])      # duplicate grey across three channels
out = tf.reshape(out, tf.shape(image))   # restore the original image shape
out = tf.cast(out, tf.uint8)


with tf.Session() as session:

    result = session.run(out, feed_dict={image: raw_image_data})

    plt.imshow(result)
    plt.show()
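
Incidentally, the diagonal matmul followed by the reduce_sum can be collapsed into a single matmul against a (3, 1) weight column. A minimal sketch of that variant (same old-style tf.concat signature as above):

weights = tf.constant([[0.2126], [0.7152], [0.0722]])

grey = tf.matmul(tf.reshape(image, [-1, 3]), weights)  # shape (num_pixels, 1)
out = tf.concat(1, [grey, grey, grey])                 # duplicate grey across channels
out = tf.reshape(out, tf.shape(image))                 # restore original image shape
out = tf.cast(out, tf.uint8)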

This worked for the narrow purpose of greyscaling an image, but it doesn't really give a design pattern to apply to more generic calculations.

Out of curiosity I profiled these three methods in terms of execution time and memory usage. So which was better?

  • Method 1 - Slicing: 1.6 seconds & 1.0 GiB memory usage
  • Method 2 - Unstacking: 1.6 seconds & 1.1 GiB memory usage
  • Method 3 - Reshape: 1.4 seconds & 1.2 GiB memory usage

So no major differences in performance but interesting nonetheless.

In case you were wondering why the process is so slow: the image used is 5528 x 3685 pixels, which is about 20 million pixels (over 61 million float values). But yes, it's still pretty slow compared to GIMP and others.

Peter Pudaite
  • Nice work! I think the speed is caused by TensorFlow initialisation? How do you measure the time and memory usage? – xxi Feb 06 '17 at 02:22
  • For execution times I did this: `for i in range(0,1000): result = session.run(out, feed_dict={image: raw_image_data})` then `print((time.time()-start)/1000)`. For memory I used the python module memory_profiler. TensorFlow adds quite a significant overhead for such a simple operation, so I wasn't expecting anything particularly amazing in absolute performance. I was more interested in the relative performance between the different methods. – Peter Pudaite Feb 06 '17 at 07:53
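
A rough reconstruction of that timing harness (a sketch only; it assumes `start = time.time()` is taken just before the loop, which the comment elides, and reuses the graph and feed from the answer above):

import time

with tf.Session() as session:
    start = time.time()
    for i in range(0, 1000):
        result = session.run(out, feed_dict={image: raw_image_data})
    # average wall-clock seconds per run over 1000 iterations
    print((time.time() - start) / 1000)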