
I am saving a structured numpy array to a mat file using scipy.io.savemat this way:

import scipy.io as sio
sio.savemat(filename, {'myStructuredArray': myStructuredArray}, appendmat=True)

then I reload it this way:

mat = sio.loadmat(filename, squeeze_me=True)  # mat actually contains additional variables not shown here
myStructuredArray = mat['myStructuredArray']

The content of the reloaded array is correct but the dtype of the reloaded array has changed.

Before saving, the dtype looks like this:

[('time', '<f8'), ('eType', '<i8'), ('name', 'S10')]

but after reload it looks like this instead:

[('time', 'O'), ('eType', 'O'), ('name', 'O')]

Therefore, when I subsequently try to append further structured data (with dtype [('time', '<f8'), ('eType', '<i8'), ('name', 'S10')]) to the reloaded array, Python throws the following error:

TypeError: invalid type promotion with structured datatype(s).

How can I make sure that the dtypes are preserved when I use savemat?

Baba
    Have you experimented with the various dtype/struct related options discussed in [the documentation](https://docs.scipy.org/doc/scipy/reference/generated/scipy.io.loadmat.html)? – BrenBarn Feb 02 '22 at 22:41
    I've loaded a variety of `.mat` files, and know that MATLAB structs are returned as structured arrays, and that the use of `object` dtype is common. But I haven't tried this kind of write and read. I wonder what this mat looks like when loaded in MATLAB/Octave. Keep in mind that this pair of functions is intended primarily for MATLAB compatibility, not as transparent `numpy` storage. The `np.save/load` pair is better for pure `numpy` use. – hpaulj Feb 02 '22 at 23:29
  • This change makes more sense when we realize that the `loadmat` equivalent of a MATLAB cell array is an object dtype numpy array. https://www.mathworks.com/help/matlab/matlab_prog/cell-vs-struct-arrays.html – hpaulj Feb 03 '22 at 21:14

1 Answer

In [102]: import numpy as np
In [103]: from scipy import io
In [104]: arr = np.ones(3, 'i,f,U10')
In [105]: arr
Out[105]: 
array([(1, 1., '1'), (1, 1., '1'), (1, 1., '1')],
      dtype=[('f0', '<i4'), ('f1', '<f4'), ('f2', '<U10')])
In [106]: io.savemat('teststruct.mat', {'arr':arr})

loading:

In [107]: io.loadmat('teststruct.mat')
Out[107]: 
{'__header__': b'MATLAB 5.0 MAT-file Platform: posix, Created on: Wed Feb  2 17:41:27 2022',
 '__version__': '1.0',
 '__globals__': [],
 'arr': ...
In [109]: x=_['arr']
In [110]: x
Out[110]: 
array([[(array([[1]], dtype=int32), array([[1.]], dtype=float32), array(['1'], dtype='<U1')),
        (array([[1]], dtype=int32), array([[1.]], dtype=float32), array(['1'], dtype='<U1')),
        (array([[1]], dtype=int32), array([[1.]], dtype=float32), array(['1'], dtype='<U1'))]],
      dtype=[('f0', 'O'), ('f1', 'O'), ('f2', 'O')])
In [111]: x.shape
Out[111]: (1, 3)
In [112]: x.dtype
Out[112]: dtype([('f0', 'O'), ('f1', 'O'), ('f2', 'O')])
In [113]: x[0,0]
Out[113]: (array([[1]], dtype=int32), array([[1.]], dtype=float32), array(['1'], dtype='<U1'))

So the returned array is structured with object dtype fields, and each entry is itself a numpy array (2d for the numeric fields) with a dtype corresponding to the original field.
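If the `.mat` file has to stay, one option is to rebuild the concrete dtype after loading. The following is only a minimal sketch: `restore_dtype` is a hypothetical helper (not part of scipy or numpy), and it assumes every field of every record holds exactly one value, as in the output above. It would fail on empty entries, for example an empty string. To keep it runnable without a `.mat` file, the input is built by hand to mimic the layout `loadmat` returned.

```python
import numpy as np

def restore_dtype(loaded, dtype):
    """Hypothetical helper: copy an object-field structured array
    (the layout loadmat returned above) into a concrete dtype.
    Assumes each entry wraps exactly one value."""
    dtype = np.dtype(dtype)
    flat = np.atleast_1d(loaded).ravel()   # (1, n) or 0-d -> (n,)
    out = np.empty(flat.shape, dtype=dtype)
    for name in dtype.names:
        # Entries are small arrays (or scalars with squeeze_me=True);
        # unwrap the single value each one holds.
        out[name] = [np.asarray(v).ravel()[0] for v in flat[name]]
    return out

target = [('time', '<f8'), ('eType', '<i8'), ('name', 'S10')]

# Hand-built stand-in for a loadmat result, with made-up sample values.
loaded = np.empty((1, 2), dtype=[('time', 'O'), ('eType', 'O'), ('name', 'O')])
loaded[0, 0] = (np.array([[0.5]]), np.array([[1]]), np.array(['start']))
loaded[0, 1] = (np.array([[1.5]]), np.array([[2]]), np.array(['stop']))

restored = restore_dtype(loaded, target)

# With matching dtypes, appending further records works again.
more = np.array([(2.5, 3, b'again')], dtype=target)
combined = np.concatenate([restored, more])
print(combined.dtype)
```

The target dtype has to be known in advance (or stored alongside the data), because the object fields no longer carry the original string width.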

The Octave load looks like a normal struct. The class isn't evident in the display, but does show as expected in the Workspace directory window.

In Octave

>> load teststruct.mat
>> arr
arr =

  1x3 struct array containing the fields:

    f0
    f1
    f2

>> arr.f0
ans = 1
ans = 1
ans = 1
>> arr.f1
ans =  1
ans =  1
ans =  1
>> arr.f2
ans = 1
ans = 1
ans = 1

As I commented, the primary goal of the savemat/loadmat pair is to interact with MATLAB as seamlessly as possible, and it seems to be doing just that. Use np.save/load for transparent numpy round trips.
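For comparison, here is a small sketch of the `np.save`/`np.load` round trip with the dtype from the question. The sample values and the file name `events.npy` are made up for illustration; a temporary directory is used so nothing is left behind.

```python
import os
import tempfile

import numpy as np

dtype = [('time', '<f8'), ('eType', '<i8'), ('name', 'S10')]
events = np.array([(0.5, 1, b'start'), (1.5, 2, b'stop')], dtype=dtype)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'events.npy')   # illustrative file name
    np.save(path, events)
    reloaded = np.load(path)

# The dtype survives the round trip unchanged.
print(reloaded.dtype)

# Appending further structured data of the same dtype now works.
more = np.array([(2.5, 3, b'again')], dtype=dtype)
combined = np.concatenate([reloaded, more])
print(combined.shape)
```

The trade-off is that a `.npy` file is not readable by MATLAB directly, so this only helps when the data stays within numpy.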

hpaulj