The code is below. X1 is the loaded hyperspectral image with dimensions (512x512x91). What I am trying to do is crop 64x64x91 sub-images out of it with a stride of 2. This gives me a total of 49952 images, each of size 64x64x91. However, when I run the for loop I get a memory error. My system has 8 GB of RAM.

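import numpy as np

# X1: hyperspectral image of shape (512, 512, 91), loaded beforehand
data_images_0 = np.zeros((49952, 256, 256, 91))
k = 0
r = 64  # crop size
for i in range(0, 512 - 64, 2):
    print(k)
    for j in range(0, 512 - 64, 2):
        data_images_0[k, :, :, :] = X1[i:i + r, j:j + r, :]
        k = k + 1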

I have a hyperspectral image loaded from a .mat file, with dimensions (512x512x91). I want to use chunks of this image as the input to my CNN, for example crops of size 64x64x91. The problem is that once I create the crops from the original image, loading all of them at once gives me a memory error. Is there a way to load the cropped data in batches so that I don't get this error? Should I convert my data to some other format, or approach the problem in some other way?

Newbie_101
  • Is your image simply stored as a matrix or in some kind of image format? – Pablo Jeken Rico Nov 06 '18 at 13:22
  • Welcome to SO. Please provide a Minimal, Complete, and Verifiable example. **Show us the code for your latest attempt** and where you got stuck, and explain why the result is not what you expected. Edit your question to include the code; please don't add it in a comment, as it will probably be unreadable. https://stackoverflow.com/help/mcve – Dragonthoughts Nov 06 '18 at 13:24
  • Yes, it's stored as a matrix. Basically, 512x512 are the spatial dimensions and 91 is the number of channels (the depth). It's all just a matrix of per-pixel values. – Newbie_101 Nov 06 '18 at 13:24
  • MATLAB's latest format for .mat files (v7.3) is compressed, so there is not much choice but to uncompress it all. Besides, data is stored in column-major order, so a 64x64x91 array will have samples spread all over the complete 512x512x91 volume. Now, the hyperspectral image contains only about 23.9M values (512 x 512 x 91): it should easily fit in memory. – Brice Nov 06 '18 at 13:30
  • Please include example code for how you generate the crops. This is relevant because it might explain why you run out of memory, show how you store the crops, and give ideas on how to put them into a file for easy access. – Cris Luengo Nov 06 '18 at 13:34
  • I have added the code. I am using multiple for loops, which I assume is a major reason why I am running out of memory, but I am unsure how else to approach the problem. – Newbie_101 Nov 06 '18 at 13:50
  • You need to extract the chunks one by one and work on them separately. The image will fit into memory easily; the set of all possible chunks will not. The memory required just to initialize data_images_0 with zeros exceeds 2000 GB (slightly short of 300 billion values stored on 8 bytes each). Even with the correct (50625,64,64,91) size, this would vastly exceed the available memory. – Brice Nov 06 '18 at 14:41
  • `scipy` `loadmat` lets you specify which variables it loads, but doesn't provide a means of specifying chunks or slices. – hpaulj Nov 06 '18 at 15:35
  • `h5py` can load slices from HDF5 datasets. But finding your way through a MATLAB generated file is complicated. – hpaulj Nov 06 '18 at 15:39
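
The sizes quoted in the comments above can be reproduced with a few lines of arithmetic. This is only a sketch: it assumes NumPy's default float64 dtype (8 bytes per value), and the helper name array_gigabytes is made up for illustration.

```python
from math import prod


def array_gigabytes(shape, bytes_per_value=8):
    """Size of a dense array of the given shape, in GB (10**9 bytes).

    bytes_per_value=8 assumes float64, the default dtype of np.zeros.
    """
    return prod(shape) * bytes_per_value / 1e9


# The full 512x512x91 image on its own.
image_gb = array_gigabytes((512, 512, 91))

# The array allocated in the question (256x256 instead of 64x64 crops).
allocated_gb = array_gigabytes((49952, 256, 256, 91))

# The intended 64x64 crops, all held in memory at once.
intended_gb = array_gigabytes((50625, 64, 64, 91))

print(f"image:     {image_gb:10.2f} GB")
print(f"allocated: {allocated_gb:10.2f} GB")
print(f"intended:  {intended_gb:10.2f} GB")
```

Under that assumption the image itself needs about 0.19 GB, the allocation in the question about 2,383 GB, and the full set of 64x64 crops about 151 GB, so only the image fits on an 8 GB machine.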

3 Answers

You are looking for the matfile function. It lets you access the array on your hard disk and load only parts of it.

Say your picture is stored in a variable named pic; then you can do something like

data = matfile("filename.mat");
part = data.pic(1:64,1:64,:);

%Do something

then only the (1:64,1:64,:) part of the variable pic will be loaded into part.

As always, note that working from the hard disk is not exactly fast and should be avoided where possible. On the other hand, if your variable is too large to fit in memory, there is no other way around it (apart from buying more memory).

Nicky Mattsson

I think you might want to use the matfile function, which opens a .mat file without pulling its entire content into RAM. You essentially read a header from your .mat file that contains information about the stored elements, such as size and data type. Suppose your .mat file hyperspectralimg.mat contains the matrix myImage. You would proceed like this:

filename = 'hyperspectralimg.mat';
img = matfile(filename, 'Writable', true); % write access is needed for the assignment below

A = doStuff2MyImg(img.myImage(1:64,1:64,:)); % Do stuff to your imageparts

img.myImage(1:64,1:64,:) = A; %Return changes to your file

This is a brief example of how you can use it, in case you haven't used matfile before. If you have already used it and it doesn't work, let us know. As a general recommendation, upload code snippets and data samples related to your issue; it helps.

A quick comment regarding the tags: if your question is about MATLAB, then don't tag python and similar things.

Pablo Jeken Rico
  • I am actually working in Python; however, my data file is a .mat file. I am unsure whether I should change my data format, but the basic idea is that I want to create multiple images from my 512x512x91 image and use them as the input to my convolutional neural network. The issue is that once I create so many crops and save them together in one variable, it becomes extremely large, and I don't know how to handle that. – Newbie_101 Nov 06 '18 at 13:47
  • Also important to note: this syntax is entirely wrong for Python. No camelCase, no need for semicolons; part of the beauty of Python over MATLAB. – 123 Oct 22 '19 at 19:22
  • Justin Mai, "beauty of python" is merely your point of view. It has nothing to do with the current matter. Using semicolons to suppress output (or deliberately not suppress it) is nice for debugging and involves very little effort. – Pablo Jeken Rico Oct 23 '19 at 15:11
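
Since the question is about Python, one way to avoid holding every crop at once is to keep only the full image in memory and cut crops on demand, one batch at a time. The following is a minimal sketch under that assumption, not a definitive implementation: iter_crop_batches and its parameters are hypothetical names, and the small random array merely stands in for the loaded X1.

```python
import numpy as np


def iter_crop_batches(image, crop=64, stride=2, batch_size=32):
    """Yield arrays of shape (n, crop, crop, channels), with n <= batch_size."""
    rows, cols = image.shape[:2]
    # Top-left corner of every crop; the +1 keeps the last valid position.
    corners = [(i, j)
               for i in range(0, rows - crop + 1, stride)
               for j in range(0, cols - crop + 1, stride)]
    for start in range(0, len(corners), batch_size):
        chunk = corners[start:start + batch_size]
        # np.stack copies only the crops of this batch into a new array.
        yield np.stack([image[i:i + crop, j:j + crop, :] for i, j in chunk])


# Small stand-in for the real image (assumption: X1 would be 512x512x91).
rng = np.random.default_rng(0)
X1 = rng.random((80, 80, 5)).astype(np.float32)

total = 0
for batch in iter_crop_batches(X1, crop=64, stride=2, batch_size=32):
    total += batch.shape[0]  # a training step on `batch` would go here

print("crops seen:", total)
```

With the real 512x512x91 image stored as float32, a batch of 32 crops occupies roughly 48 MB, so memory use is governed by the batch size and not by the total number of crops.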

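You can use a NumPy memory map (numpy.memmap). It plays a role similar to MATLAB's matfile: the array stays on disk and only the slices you index are read into memory. Note that it maps raw binary or .npy files, not .mat files, so the data has to be converted first.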

https://numpy.org/doc/stable/reference/generated/numpy.memmap.html
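
As a rough illustration of the idea, the sketch below assumes the image has already been exported once to a .npy file; the file name and the random stand-in array are placeholders, not part of the original question. It uses np.load with mmap_mode="r", which returns a numpy.memmap.

```python
import os
import tempfile

import numpy as np

# Stand-in for the hyperspectral image (assumption: real data come from the .mat file).
rng = np.random.default_rng(0)
image = rng.random((128, 128, 91)).astype(np.float32)

# One-off conversion to .npy; the path is a placeholder.
tmp_dir = tempfile.mkdtemp()
npy_path = os.path.join(tmp_dir, "hyperspectral.npy")
np.save(npy_path, image)

# mmap_mode="r" maps the file read-only; data are read only when sliced.
mapped = np.load(npy_path, mmap_mode="r")

# Copy a single 64x64x91 crop into RAM; the rest stays on disk.
crop = np.array(mapped[10:74, 20:84, :])

print(type(mapped).__name__, crop.shape)
```

Slicing the memory map this way can be combined with any batching scheme: only the crops that are actually indexed are copied into memory.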

Maulik Madhavi
  • What if the file is a struct in MATLAB, that is, not an ndarray? Could you elaborate a bit on how to do it? I went into the link and I am still confused. – Homero Esmeraldo Feb 18 '21 at 21:18