I am trying to read multiple RGB images into one matrix such that the matrix dimensions are (image_size, image_size, index), e.g. data[:,:,0] should retrieve the first image.

data = np.zeros((image_dim, image_dim, numImages), dtype=np.float64)
for fname in os.listdir('images/sample_images/'):
    name='....'
    image=mpimg.imread(name)
    data = np.append(data, image)
return data

image.shape = (512, 512, 3)
data.shape = (512, 512, 100)

Apart from the fact that np.append leaves me with an empty data array, is there another way of appending the image-array values to a big data matrix?

Thanks in advance

jenpaff

3 Answers

Falko's post is certainly the canonical way to do it. However, if I can suggest a more NumPy / Pythonic way, I would let the first dimension be the index of the image you want, the second and third dimensions be the rows and columns of the image, and optionally the fourth dimension be the colour channel. Therefore, supposing that each image has dimensions M x N and you have K images, you would create an array of shape K x M x N, or K x M x N x 3 in the case of colour images.

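Given your current code, this could be written compactly in NumPy (note that os.listdir returns bare file names, so each one is joined with the folder path before reading):

folder = 'images/sample_images/'
data = np.array([mpimg.imread(os.path.join(folder, name)) for name in os.listdir(folder)], dtype=np.float64)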

To access the ith image, you would simply do data[i]. This works whether the images are RGB or grayscale: data[i] returns an RGB image or a grayscale image, depending on what you packed into the array. However, you need to make sure that all of the images are consistent, i.e. they are all colour or all grayscale, and all the same size.
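
The consistency requirement can be checked before packing. Below is a minimal sketch, assuming the images are already loaded as NumPy arrays. The helper name `stack_images` is hypothetical, and synthetic arrays stand in for the output of `mpimg.imread`. Mixed shapes can be one cause of errors such as "setting an array element with a sequence".

```python
import numpy as np


def stack_images(images):
    """Stack equally shaped images into one array with the image index first."""
    images = [np.asarray(img) for img in images]
    if not images:
        raise ValueError("no images to stack")
    shapes = {img.shape for img in images}
    if len(shapes) != 1:
        # Mixed sizes, or a mix of RGB and grayscale, cannot form one regular array.
        raise ValueError(f"inconsistent image shapes: {sorted(shapes)}")
    return np.stack(images, axis=0).astype(np.float64)


# Synthetic stand-ins for loaded images (hypothetical data, not real files).
rgb_images = [np.full((4, 6, 3), i, dtype=np.uint8) for i in range(3)]
batch = stack_images(rgb_images)
print(batch.shape)  # (3, 4, 6, 3)
```

Passing a list that mixes, say, a (4, 6, 3) and a (4, 6) array raises the `ValueError` up front instead of failing inside the array constructor.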

To show that this works, let's try it with 5 x 5 x 3 "RGB" images, where image i is filled with the value i, for i from 0 up to K-1 with K = 10:

data = np.array([i*np.ones((5,5,3)) for i in range(10)], dtype=np.float64)

Let's see a sample run (in IPython):

In [26]: data = np.array([i*np.ones((5,5,3)) for i in range(10)], dtype=np.float64)

In [27]: data.shape
Out[27]: (10, 5, 5, 3)

In [28]: img = data[0]

In [29]: img
Out[29]: 
array([[[ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.]],

       [[ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.]],

       [[ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.]],

       [[ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.]],

       [[ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.]]])

In [30]: img.shape
Out[30]: (5, 5, 3)

In [31]: img = data[7]

In [32]: img
Out[32]: 
array([[[ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.]],

       [[ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.]],

       [[ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.]],

       [[ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.]],

       [[ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.]]])

In [33]: img.shape
Out[33]: (5, 5, 3)

In the sample run above, I created the sample data array, and it is 10 x 5 x 5 x 3 as expected: ten 5 x 5 x 3 matrices. Next, I extract the first "RGB" image, which is all 0s as expected, with a size of 5 x 5 x 3. I also extract the eighth image (index 7), which is all 7s as expected, again with a size of 5 x 5 x 3.

Obviously, choose whichever answer you think is best, but I personally would go with the above route, as indexing into your array to grab the right image is simpler: indexing along the first axis with data[i] returns the whole image, with no slicing over the remaining dimensions needed.
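
To make the comparison between the two layouts concrete, here is a small sketch using synthetic data. It assumes nothing beyond NumPy. `np.moveaxis` converts the index-first layout into the index-last layout asked about in the question; the variable names are only illustrative.

```python
import numpy as np

# Ten synthetic 5 x 5 x 3 "images"; image i is filled with the value i.
index_first = np.array([i * np.ones((5, 5, 3)) for i in range(10)])
print(index_first.shape)  # (10, 5, 5, 3)

# Move the image index to the last axis: the shape becomes (5, 5, 3, 10).
index_last = np.moveaxis(index_first, 0, -1)
print(index_last.shape)  # (5, 5, 3, 10)

img_a = index_first[7]      # index-first: one plain index
img_b = index_last[..., 7]  # index-last: an ellipsis plus the index
print(np.array_equal(img_a, img_b))  # True
```

Both expressions return the same 5 x 5 x 3 image, so the choice of layout is mostly a matter of which indexing style you prefer.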

rayryeng
  • yes that seems like a really good way to do it, thanks. again, i'm having the problem when retrieving a single image from data: `img = data[0]`, the colors seem a bit off, shouldn't it give me the original image ? – jenpaff Aug 19 '15 at 11:52
  • I don't have experience in mpimg so I can't tell you for sure. It could be the datatype that you chose to load in the image as well. Can you post a snapshot of what the images look like? Post both the original and the off version. – rayryeng Aug 19 '15 at 12:01
  • @jenpaff - Great! no problem. – rayryeng Aug 20 '15 at 15:01
  • I get the following error with this code: ValueError: setting an array element with a sequence. – Spandyie Sep 05 '17 at 03:49
  • @Spandy I can't comment unless I see what you did. Do you have a question posted or somewhere I can see what you're doing? – rayryeng Sep 05 '17 at 03:53
  • path="C:/Users/...../00000" data = np.array([mpimg.imread(path+"/"+name) for name in os.listdir(path)], dtype=np.float64). So I am basically trying to load all *.ppm files present in a folder. – Spandyie Sep 05 '17 at 03:59
  • @Spandy are all your images RGB? – rayryeng Sep 05 '17 at 04:00
  • Yes Sir https://stackoverflow.com/questions/46046459/how-to-load-mutiple-ppm-files-present-in-a-folder-as-single-numpy-ndarray/46046851#46046851 – Spandyie Sep 05 '17 at 04:05

It is better to use np.dstack for stacking arrays along the third dimension:

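import numpy as np

# Start with an empty stack: 3 x 3 pixels and zero images so far.
data = np.zeros((3, 3, 0))
for i in range(5):
    image = np.random.rand(3, 3, 1)  # one random single-channel image
    data = np.dstack((data, image))  # append along the third axis
print(data.shape)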

Output:

(3, 3, 5)

Note: Here I assume that each (random) image has one channel. If you have RGB images, the third dimension ends up three times as long, i.e. shape (3, 3, 15) for five images.
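
For the RGB case described in the note, a short sketch with random stand-in images shows the resulting shape and how a single image can be sliced back out. The 3 x 3 size and the count of 5 are just the values from the example above. Collecting the images in a list and calling `np.dstack` once is a variation on the loop, not the original code; it avoids copying the growing array on every iteration.

```python
import numpy as np

num_images = 5
# Random stand-ins for RGB images, each of shape (3, 3, 3).
images = [np.random.rand(3, 3, 3) for _ in range(num_images)]

# One call stacks every image along the third axis.
data = np.dstack(images)
print(data.shape)  # (3, 3, 15)

# Image i occupies channels 3*i up to (but not including) 3*(i + 1).
i = 2
recovered = data[:, :, 3 * i:3 * (i + 1)]
print(np.array_equal(recovered, images[i]))  # True
```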

Falko
  • amazing that seems to work. so how would i retrieve a single image from the data ? I tried `data[:,:,:3]` but it wouldn't save the original image in the right colours. – jenpaff Aug 18 '15 at 17:47
  • @jenpaff - No, you use `data[:,:,i]` where `i` is the image index you want. That's for grayscale. If you have the case of RGB, you would do `data[:,:,3*i:3*(i+1)]` – rayryeng Aug 19 '15 at 01:25

This is how I read images from disk into a 4D NumPy array (for machine learning):

First, a utility function (my 14x64px images have 3 channels each, so the image shape is (14, 64, 3)):

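import cv2
import numpy as np

def read_image(image_path):
    # cv2.IMREAD_COLOR loads a 3-channel image (OpenCV uses BGR channel order);
    # pass cv2.IMREAD_GRAYSCALE instead to load a single-channel image.
    image = cv2.imread(image_path, cv2.IMREAD_COLOR)
    #print("image shape", image.shape)
    #plt.imshow(image, cmap='gray')
    #plt.show()
    return np.array(image)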

Next, I put all my images into a 4-dimensional NumPy array:

training_features = np.array([read_image(path) for path in image_paths])

The resulting array has shape (5626, 14, 64, 3): it holds 5626 colour images of 14x64px each.
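
One thing to keep in mind with this approach is that `cv2.imread` returns colour channels in BGR order, whereas matplotlib expects RGB. Below is a minimal sketch of reversing the channel axis for a whole 4D array at once. It uses a small synthetic batch in place of `training_features`, and the names `batch_bgr` and `batch_rgb` are only illustrative.

```python
import numpy as np

# Synthetic stand-in for a batch read with cv2.imread:
# 4 images of 14 x 64 pixels with channels in BGR order.
batch_bgr = np.zeros((4, 14, 64, 3), dtype=np.uint8)
batch_bgr[..., 0] = 255  # channel 0 is blue in BGR order

# Reverse the last axis to convert BGR to RGB for every image.
batch_rgb = batch_bgr[..., ::-1]

print(batch_rgb.shape)                # (4, 14, 64, 3)
print(batch_rgb[0, 0, 0].tolist())    # [0, 0, 255]
```

The slice returns a view, so no pixel data is copied until the array is modified or passed to something that needs contiguous memory.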

Uki D. Lucas