Convert a 3-channel RGB image into a 1-channel binary image

Question

I can convert the RGB image into binary but its dimensions are still too large (1280x720x3). Since each pixel of the binary image only has a value of 0 or 1, I want to reduce its dimension to (1280x720x1) so I won't have to deal with memory issues (since I'm working with thousands of images).

import cv2
import glob

def convert_to_binary(source_path, destination_path):
    i = 0

    for filename in glob.iglob("{}*.png".format(source_path)):
        im_gray = cv2.imread(filename, cv2.IMREAD_GRAYSCALE)
        (thresh, im_bw) = cv2.threshold(im_gray, 128, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
        cv2.imwrite("{}.png".format(destination_path + str(i)), im_bw)
        i += 1

How could I modify the above code to change the dimensions of saved images from (1280x720x3) to (1280x720x1)?

??? What's your true purpose? To get a binary image of `WxHx1`, or just to reduce the memory cost? — Kinght 金, Nov 19 '17 at 04:48
The second link I got when I googled "opencv binarize" was [this](https://docs.opencv.org/3.3.1/d7/d4d/tutorial_py_thresholding.html). Seems like it might help. — beaker, Nov 19 '17 at 20:26

Kinght 金 · Accepted Answer · 2017-12-31T11:17:17.453

Use np.newaxis or np.reshape to convert (H,W) to (H,W,1).

>>> import numpy as np
>>> h,w = 3,4
>>> binary = np.zeros((h,w))
>>> binary.shape
(3, 4)

(1) use np.newaxis to add the new dimension

>>> new_binary = binary[..., np.newaxis]
>>> new_binary.shape
(3, 4, 1)

(2) use reshape to change the dimension

>>> new_binary2 = binary.reshape((h,w,1))
>>> new_binary2.shape
(3, 4, 1)

Now see the result.

>>> binary
array([[ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.]])
>>> new_binary
array([[[ 0.],
        [ 0.],
        [ 0.],
        [ 0.]],

       [[ 0.],
        [ 0.],
        [ 0.],
        [ 0.]],

       [[ 0.],
        [ 0.],
        [ 0.],
        [ 0.]]])
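
To tie this back to the question, here is a minimal sketch using only NumPy. It assumes a synthetic array standing in for the thresholded image `im_bw` (values 0 or 255, dtype `uint8`); the helper names `to_single_channel` and `pack_binary` are hypothetical and not part of OpenCV or NumPy. It also illustrates that adding the trailing axis does not change the memory footprint, whereas packing the pixels into bits with `np.packbits` does.

```python
import numpy as np

def to_single_channel(binary):
    """Return a (H, W, 1) view of a 2-D (H, W) array (hypothetical helper)."""
    if binary.ndim != 2:
        raise ValueError("expected a 2-D array, got shape {}".format(binary.shape))
    return binary[..., np.newaxis]

def pack_binary(binary):
    """Pack a 0/255 (or 0/1) image into bits, 8 pixels per byte along each row."""
    return np.packbits(binary > 0, axis=-1)

# Synthetic stand-in for a thresholded 720x1280 image (assumption: values 0 or 255).
im_bw = np.zeros((720, 1280), dtype=np.uint8)
im_bw[:, 640:] = 255

single = to_single_channel(im_bw)
packed = pack_binary(im_bw)

print(single.shape)    # (720, 1280, 1)
print(single.nbytes)   # 921600, same as the 2-D array
print(packed.shape)    # (720, 160)
print(packed.nbytes)   # 115200, one bit per pixel

# Round trip: unpack the bits and restore the 0/255 values.
restored = np.unpackbits(packed, axis=-1)[:, :im_bw.shape[1]] * 255
print(np.array_equal(restored, im_bw))  # True
```

If the saved files still load with three channels, one possible cause is that `cv2.imread` defaults to `cv2.IMREAD_COLOR`; reading them back with `cv2.IMREAD_GRAYSCALE` or `cv2.IMREAD_UNCHANGED` returns a 2-D array.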

NaN · Answer 2 · 2017-11-19T09:10:37.660

Drawing on various sources, here is one way of converting RGB to grayscale.

import numpy as np

# num_92: gray-scale image from rgb
def num_92():
    """num_92... gray-scale image from rgb
    :Essentially gray = 0.2989 * r + 0.5870 * g + 0.1140 * b
    : np.dot(rgb[...,:3], [0.299, 0.587, 0.114])
    : http://stackoverflow.com/questions/12201577/how-can-i-convert
    :       -an-rgb-image-into-grayscale-in-python
    : https://en.m.wikipedia.org/wiki/Grayscale#Converting_color_to_grayscale
    : see https://en.m.wikipedia.org/wiki/HSL_and_HSV
     """
    frmt = """
    :---------------------------------------------------------------------:
    {}
    :---------------------------------------------------------------------:
    """
    import matplotlib.pyplot as plt
    a = np.arange(256).reshape(16, 16)
    b = a[::-1]
    c = np.ones_like(a)*128
    rgb = np.dstack((a, b, c))
    gray = np.dot(rgb[..., :3], [0.2989, 0.5870, 0.1140])
    plt.imshow(gray, cmap=plt.get_cmap('gray'))
    plt.show()
    args = [num_92.__doc__]
    print(frmt.format(*args))

Rename the function as you like; in the interim, call it using

num_92()

or experiment with portions of it:

import numpy as np
import matplotlib.pyplot as plt
a = np.arange(256).reshape(16, 16)
b = a[::-1]
c = np.ones_like(a)*128
rgb = np.dstack((a, b, c))
plt.imshow(rgb, cmap=plt.get_cmap('hot'))
plt.show()

But if you average the rgb channels instead, you get a different picture, so it depends on what you want:

avg = np.average(rgb, axis=-1)
avg.shape  # (16, 16)
plt.imshow(avg, cmap=plt.get_cmap('gray'))
plt.show()
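
The difference between the two conversions can also be checked numerically rather than by eye. A minimal sketch, rebuilding the same synthetic `rgb` array; the fixed threshold of 128 is an assumption for illustration only (the question uses Otsu's method through OpenCV instead).

```python
import numpy as np

# Same synthetic 16x16 RGB test image as above.
a = np.arange(256).reshape(16, 16)
b = a[::-1]
c = np.ones_like(a) * 128
rgb = np.dstack((a, b, c))

# Luminance-weighted gray versus a plain channel average.
weighted = np.dot(rgb[..., :3], [0.2989, 0.5870, 0.1140])
avg = np.average(rgb, axis=-1)

print(weighted.shape, avg.shape)     # both are 2-D: (16, 16) (16, 16)
print(np.allclose(weighted, avg))    # False: the two grays differ

# Threshold each gray image to a 0/1 single-channel array (assumed cutoff: 128).
binary_weighted = (weighted >= 128).astype(np.uint8)
binary_avg = (avg >= 128).astype(np.uint8)

# Number of pixels where the two binarizations disagree.
print(np.count_nonzero(binary_weighted != binary_avg))
```

Either result is already a single-channel (H, W) array, so the choice of gray conversion affects which pixels end up white, not the shape.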
