
I am working on a model. The results are stored in a NetCDF file as masked arrays of lon, lat and time per particle. I want to get the last real (unmasked) value of lon, lat and time for each particle. I have managed to get the position of the last real value, but not the value itself.

Do you have any suggestions?

My code looks like this:

lat1= masked_array(data=[[-14.33945369720459, -14.33945369720459, -14.339454650878906,
     -14.339454650878906, -14.339454650878906, -14.339454650878906,
     -14.339454650878906, -14.339454650878906, -14.339457511901855,
     -14.339459419250488, -14.339459419250488, -14.339459419250488,
     --, --, --, --, --, --, --, --],
    [-5.621851444244385, -5.621865272521973, -5.621881008148193,
     -5.621898651123047, -5.621916770935059, -5.621936321258545,
     -5.6219563484191895, -5.621973037719727, -5.621990203857422,
     -5.622012615203857, -5.622034072875977, -5.622053146362305, --,
     --, --, --, --, --, --, --]], mask=[[False, False, False, False, False, False, False, False, False,
     False, False, False,  True,  True,  True,  True,  True,  True,
      True,  True],
    [False, False, False, False, False, False, False, False, False,
     False, False, False,  True,  True,  True,  True,  True,  True,
      True,  True]], fill_value=nan, dtype=float32)         #latitude values of 2 particles


def last_nonzero(lat1, axis, invalid_val=-9999):
    mask = lat1!=0
    val = lat1.shape[axis] - np.flip(mask, axis=axis).argmax(axis=axis) - 1
    return np.where(mask.any(axis=axis), val, invalid_val)

last_nonzero(lat1, axis=1, invalid_val=-9999)        # for each particle, gives the position of the last real number
print(lat1[last_nonzero(lat1, axis=1, invalid_val=-9999)])
Georgy
Emma
  • Could you provide a [Minimal, Complete, and Verifiable example](https://stackoverflow.com/help/mcve)? I think you can safely remove the part with NetCDF, and just provide examples of input as NumPy arrays. – Georgy Dec 11 '18 at 09:52
  • I am not sure how to create a NumPy array similar to the one I get from the netCDF file. I have replaced the NetCDF part with the result I get for 2 particles from my files. Does that help? – Emma Dec 11 '18 at 11:58
  • Could you resolve the issue in the end? – Georgy Dec 14 '18 at 13:58

1 Answer


If I understand correctly, what you could do is:

  1. Get the indices of the last non-zero elements (you already know how to get them):

    >>> last_nonzero_indices = last_nonzero(lat1, axis=1, invalid_val=-9999)
    >>> last_nonzero_indices
    array([11, 11], dtype=int64)
    
  2. Get only the valid entries of your initial array:

    >>> valid_values = lat1[~lat1.mask]
    >>> valid_values
    masked_array(data=[-14.33945369720459, -14.33945369720459,
                       -14.339454650878906, -14.339454650878906,
                       -14.339454650878906, -14.339454650878906,
                       -14.339454650878906, -14.339454650878906,
                       -14.339457511901855, -14.339459419250488,
                       -14.339459419250488, -14.339459419250488,
                       -5.621851444244385, -5.621865272521973,
                       -5.621881008148193, -5.621898651123047,
                       -5.621916770935059, -5.621936321258545,
                       -5.6219563484191895, -5.621973037719727,
                       -5.621990203857422, -5.622012615203857,
                       -5.622034072875977, -5.622053146362305],
                 mask=[False, False, False, False, False, False, False, False,
                       False, False, False, False, False, False, False, False,
                       False, False, False, False, False, False, False, False],
           fill_value=nan,
                dtype=float32)
    
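  3. As the returned array is flattened, calculate the corresponding flat indices from the per-row indices that we calculated before. Each row contributes `index + 1` valid values (this assumes the valid values form one unbroken run at the start of each row), so take the cumulative sum of those counts and subtract one:

    >>> last_nonzero_indices = np.cumsum(last_nonzero_indices + 1) - 1
    >>> last_nonzero_indices
    array([11, 23], dtype=int64)
    
  4. Get the desired last non-zero values:

    >>> valid_values[last_nonzero_indices]
    masked_array(data=[-14.339459419250488, -5.622053146362305],
                 mask=[False, False],
           fill_value=nan,
                dtype=float32)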
    

I don't really like this solution though, and hope that someone with better knowledge of masked arrays can propose something better.
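One possible alternative, offered only as a sketch: work on the mask directly instead of flattening, and pick the values with `np.take_along_axis`. The helper names `last_valid_index` and `last_valid` and the small sample array are hypothetical stand-ins, not taken from the question's NetCDF data. The sketch assumes NumPy 1.15 or later and that invalid entries are masked rather than stored as zeros.

```python
import numpy as np

# Hypothetical stand-in for the NetCDF data: two particles, eight time steps,
# with NaN marking the steps where a particle has no real value.
raw = np.array(
    [[-14.25, -14.5, -14.75, np.nan, np.nan, np.nan, np.nan, np.nan],
     [-5.0, -5.25, -5.5, -5.75, -6.0, np.nan, np.nan, np.nan]],
    dtype=np.float32,
)
lat1 = np.ma.masked_invalid(raw)


def last_valid_index(arr, axis=1):
    """Position of the last unmasked entry along `axis`."""
    valid = ~np.ma.getmaskarray(arr)          # True where the data is real
    n = arr.shape[axis]
    # Reverse along the axis, find the first True, then convert back.
    # For a fully masked row this yields n - 1; last_valid() masks that case.
    return n - 1 - np.flip(valid, axis=axis).argmax(axis=axis)


def last_valid(arr, axis=1):
    """Last unmasked value along `axis`; masked where a row has none."""
    valid = ~np.ma.getmaskarray(arr)
    idx = np.expand_dims(last_valid_index(arr, axis), axis)
    picked = np.take_along_axis(np.ma.getdata(arr), idx, axis=axis)
    picked = np.squeeze(picked, axis=axis)
    return np.ma.masked_where(~valid.any(axis=axis), picked)


print(last_valid_index(lat1))
print(last_valid(lat1))
```

Because the index is computed per row, the result of `last_valid_index(lat1)` could in principle be reused to pick the matching lon and time values, provided those arrays share the same mask.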

Georgy
  • For a one-dimensional array, your `~mask` method can pick the last unmasked value, for example `next(reversed(line.get_ydata()[~line.get_ydata().mask]), float("inf"))` – Frank Mar 22 '21 at 18:47
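The one-dimensional case from the comment above can be sketched without a plot object. The name `last_unmasked_1d` and the sample values are hypothetical, and the sketch swaps the `~mask` indexing for `compressed()`, which returns only the unmasked values as a plain array.

```python
import numpy as np


def last_unmasked_1d(values, default=float("inf")):
    """Last unmasked value of a 1-D masked array, or `default` if none exist."""
    valid = np.ma.asarray(values).compressed()   # unmasked values only
    return valid[-1] if valid.size else default


# Hypothetical sample: the trailing NaNs become masked entries.
y = np.ma.masked_invalid(np.array([1.0, 2.5, 4.0, np.nan, np.nan]))
print(last_unmasked_1d(y))
```

The `default` argument plays the same role as the `float("inf")` fallback in the comment's `next(...)` call.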