
I've been working on processing two lists of 855 matrices, each 4000x4000. One list holds the data values; the other holds the coordinates (another 855 4000x4000 matrices). It's important to do everything within a single loop, so as not to end up with thousands of useless variables. For every file, the code cuts the coordinate data (i.e. puts NaN where I don't need data), then cuts the data at those coordinates, and finally gathers all the values into one matrix. The code is:

for x = 1:length(list_with_par)
    cd 'D:\Coord'
    par_lon = ncread(list_with_coordinates(x,:), 'longitude');
    par_lon(par_lon>=15) = nan;
    par_lon(par_lon<=-18) = nan;
    
    par_lat = ncread(list_with_coordinates(x,:), 'latitude');
    par_lat(par_lat>=84) = nan;
    par_lat(par_lat<=76) = nan;
    
    cd 'D:\Par'
    par = ncread(list_with_par(x,:), 'PAR');
    
    for i = 1:size(ncread(list_with_par(x,:),'PAR'),1) %size(,1)
        for z = 1:size(ncread(list_with_par(x,:),'PAR'),2) %size(,2)
            if isnan(par_lon(i,z))
                par(i,z) = nan;
            end
            if isnan(par_lat(i,z))
                par(i,z) = nan;
            end
        end
    end
    if size(par,2) < size(PAR_main,2)
        left_cells = size(PAR_main,2) - size(par,2);
        temp_cell = NaN(4865,left_cells);
        C2 = cat(2,par,temp_cell);
    end
    if size(par,2) == size(PAR_main,2)
        C2 = par(:,:,1);
    end
    PAR_main(:,:,x) = C2(:,:,1);
end

But after 4-5 hours of processing, an error suddenly pops up:

Error using netcdflib
The NetCDF library encountered an error during execution of 'open' function - 'HDF error (NC_EHDFERR)'.

Error in netcdf.open (line 67)
        [varargout{:}] = netcdflib ( 'open', filename, varargin{1} );

Error in internal.matlab.imagesci.nc/openToRead (line 1278)
            this.ncRootid = netcdf.open(this.Filename,'NOWRITE');

Error in internal.matlab.imagesci.nc (line 121)
                    this.openToRead();

Error in ncread (line 61)
ncObj   = internal.matlab.imagesci.nc(ncFile);

What might be the issue?

Rus
    Do you pre-allocate `PAR_main`? That could save you a lot of time and memory fragmentation. Also, at the end of your loop you have two `if` statements that fill out `C2`; if neither of them is triggered, you won't define `C2`, and will potentially use the previous iteration's values. This is likely a bug. You should always make sure you define variables in all possible situations. Maybe the second `if` is just an `else`? – Cris Luengo May 09 '22 at 20:23

1 Answer


I'm not really familiar with ncread (and associated functions), but there are two things that jump out at me as very inefficient:

  1. In your loops over 'i' and 'z', is there a reason to read in the data again to determine its size instead of just using the 'par' variable that you already saved?

    for i = 1:size(par,1)

    for z = 1:size(par,2)

  2. For that matter, unless I am missing something specific to this set of functions, you should be able to skip the loops over 'i' and 'z' completely and vectorize the calculation:

    par(isnan(par_lon))= nan;

    par(isnan(par_lat)) = nan;

This is certainly slowing your code down significantly. It is hard to say beyond that, but I could definitely see how having millions of extraneous file reads could cause issues with either temporary file handles or memory leaks.
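Putting both points together, the start of the loop body could look something like the sketch below. This is untested against your files and reuses the variable names from your question; the pre-allocation size (4865 rows, as in your `NaN(4865, ...)` call) and the column count are assumptions you would need to adjust:

```matlab
% Pre-allocate once, before the loop (also suggested in the comments).
% Note: 855 layers of 4865x4000 doubles is on the order of 100 GB, so a
% smaller type such as 'single' (or downscaled layers) may be necessary.
PAR_main = NaN(4865, 4000, length(list_with_par), 'single');

for x = 1:length(list_with_par)
    % Read each file exactly once, and avoid cd by building full paths
    par_lon = ncread(fullfile('D:\Coord', list_with_coordinates(x,:)), 'longitude');
    par_lat = ncread(fullfile('D:\Coord', list_with_coordinates(x,:)), 'latitude');
    par     = ncread(fullfile('D:\Par',  list_with_par(x,:)), 'PAR');

    % Vectorized masking: one logical index replaces the nested i/z loops
    mask = par_lon >= 15 | par_lon <= -18 | par_lat >= 84 | par_lat <= 76;
    par(mask) = NaN;

    % ... padding / assignment into PAR_main as in the question ...
end
```

The single logical `mask` also makes the cut criteria easy to read and to change in one place.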

Jim Quirk
  • Yeah, those are reasonable suggestions. Apparently, the problem was hidden in the most obvious thing: my laptop with 32 GB of RAM simply can't handle a 4000x4000x855 matrix. So I had to interpolate every layer and decrease the resolution at each step of the loop. – Rus May 18 '22 at 13:01
  • Glad it is resolved. Memory can be a big issue, and minimizing the number of copies you make of a huge matrix will help you get the most out of your laptop's limited memory. – Jim Quirk May 18 '22 at 17:32
  • The suggestions from @CrisLuengo are also spot-on and worth implementing. If/when you feel the issue is resolved, please remember to close out the question. – Jim Quirk May 18 '22 at 17:35