I am using images of driving data (i.e., the trajectories of individual users) to predict their store visits with a CNN model. My images are mostly black and white, 360 x 360 pixels. When I convert an image to a NumPy array, its dimensions are 360 x 360 x 4 instead of the 360 x 360 x 3 I expected for RGB.

import os

from PIL import Image
from numpy import asarray

# load the image (fig_path is the directory holding the PNG, defined elsewhere)
image = Image.open(os.path.join(fig_path, 'image.png'))

# convert image to numpy array
data = asarray(image)
print(type(data))
# summarize shape
print(data.shape)

# create Pillow image
image2 = Image.fromarray(data)
print(type(image2))

# summarize image details
print(image2.mode)

print(image2.size)

I then stack all the images needed for the CNN by appending them one by one to a `temp` list. But when I input this entire matrix as `input_shape` into a CNN model, it breaks. The error message is:

link to my image

Any idea what's happening with my dimensions?

Fresco
Narang U
  • What is the output of all those print statements? – xdhmoore Jan 26 '21 at 04:07
  • In the error message, does `ndim=4` refer to the `4` image channels, or the fact that there are 4 dimensions in the shape list? Is it possible it's the latter? – xdhmoore Jan 26 '21 at 04:13
  • It also might help to see the definition of the layer that's throwing the error. – xdhmoore Jan 26 '21 at 04:14
  • Would you please share your input layer of your model? – Uzzal Podder Jan 26 '21 at 04:14
  • Without any more information, my guess would be that one of your layers automatically adds a 4th channel to the RGB, but that the error is unrelated and due to a layer expecting a 5 dimension input, such as `(None, None, 360, 360, 4)` and instead receiving 4, ie `(None, 360, 360, 4)`. – xdhmoore Jan 26 '21 at 04:41
  • @xdhmoore Thanks. Yes, the dimensions are right now (20, 360, 360, 4) where I have 20 images of size 360 x 360 and 4 seems to be the RGB channels. Each image is (360, 360, 4), i.e., the print outputs are: "<class 'numpy.ndarray'>, (360, 360, 4), <class 'PIL.Image.Image'>, RGBA, (360, 360)". I will check what additional dimension the code may be expecting. This code was originally written to work without images, with just timestamp data in rows and columns and I'm trying to make it work with images. – Narang U Jan 26 '21 at 16:06
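
To make the distinction raised in the comments concrete, here is a minimal sketch (NumPy only) of how `ndim` differs from the channel count. The zero-filled arrays are placeholders standing in for the real images, and the names `images`, `batch`, and `rgb_batch` are made up for illustration; the actual model and error are not shown in the question, so this only demonstrates the shapes involved.

```python
import numpy as np

# Placeholder data: 20 blank RGBA "images" of 360 x 360 pixels
# (hypothetical stand-ins for the real PNGs).
images = [np.zeros((360, 360, 4), dtype=np.uint8) for _ in range(20)]

# Stack along a new leading axis to form one batch array.
batch = np.stack(images, axis=0)

print(batch.shape)  # (20, 360, 360, 4)
print(batch.ndim)   # 4 -> the number of axes, not the number of channels

# Dropping the alpha channel shrinks the last axis, but ndim is still 4.
rgb_batch = batch[..., :3]
print(rgb_batch.shape)  # (20, 360, 360, 3)
print(rgb_batch.ndim)   # 4
```

So if the error mentions `ndim=4`, it is probably about the number of axes the layer expects rather than the alpha channel. Note also that a Keras-style `input_shape` conventionally describes a single sample, e.g. `(360, 360, 4)`, without the leading batch axis.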

1 Answer

That could be the alpha channel. Try discarding it like this:

data = asarray(image)
data = data[:,:,:3]

print(type(data))
print(data.shape)

# <class 'numpy.ndarray'>
# (677, 586, 3)

Props to this answer.
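
As an alternative to slicing, Pillow can change the mode before the array is created. The sketch below is only an illustration: it builds a blank RGBA image in memory with `Image.new` instead of loading the asker's PNG, and the variable names `rgb` and `gray` are hypothetical. Whether three channels or one is the better fit depends on what the model expects, which the question does not show.

```python
import numpy as np
from PIL import Image

# Hypothetical stand-in for the PNG on disk: a blank 360 x 360 RGBA image.
image = Image.new("RGBA", (360, 360), (0, 0, 0, 255))

# Option 1: convert to RGB so the array has three channels.
rgb = np.asarray(image.convert("RGB"))
print(rgb.shape)   # (360, 360, 3)

# Option 2: for mostly black-and-white images, convert to a single
# grayscale channel and add the channel axis back explicitly.
gray = np.asarray(image.convert("L"))[..., np.newaxis]
print(gray.shape)  # (360, 360, 1)
```

Either array can then be stacked into a batch; the model's per-sample input shape would need to match, i.e. `(360, 360, 3)` or `(360, 360, 1)`.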

j2abro