I need to perform a multiplication involving a 60000×70000 matrix in either Python or MATLAB. I have 16 GB of RAM and can easily load each row of the matrix (which is all I require). I am able to create the matrix as a whole in Python but not in MATLAB. Is there any way I can save the array as a v7.3 .mat file using h5py or scipy so that I can load each row separately?
1 Answer
For MATLAB v7.3 files you can use hdf5storage, which requires h5py. Download the package from https://pypi.python.org/pypi/hdf5storage, extract it, then type python setup.py install from a command prompt.
import h5py
import hdf5storage
import numpy as np
matfiledata = {} # make a dictionary to store the MAT data in
# The u prefix makes each variable name a unicode string. It causes no issues
# through Python 3.5, and based on feedback I advise keeping it despite the docs.
matfiledata[u'variable1'] = np.zeros(100)
matfiledata[u'variable2'] = np.ones(300)
hdf5storage.write(matfiledata, '.', 'example.mat', matlab_compatible=True)
If MATLAB can't load the whole thing at once, I think you'll have to save it in separate variables: matfiledata[u'chunk1'], matfiledata[u'chunk2'], matfiledata[u'chunk3'], etc.
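The chunk variables could be built with a small helper like the one below. This is only a sketch: build_chunk_dict and rows_per_chunk are hypothetical names of mine, not part of hdf5storage, and the demo uses a tiny list of rows in place of the real matrix.

```python
def build_chunk_dict(matrix, rows_per_chunk):
    """Split `matrix` by rows into a dict of chunk1, chunk2, ... variables.

    `matrix` can be anything that supports len() and row slicing, such as a
    2-D NumPy array (basic slices of an array are views, so no data is copied).
    """
    chunks = {}
    n_rows = len(matrix)
    for k, start in enumerate(range(0, n_rows, rows_per_chunk), start=1):
        # The last chunk is simply shorter if n_rows is not a multiple
        # of rows_per_chunk.
        chunks[u'chunk%d' % k] = matrix[start:start + rows_per_chunk]
    return chunks


# Tiny demo: 5 rows of 3 values standing in for the real matrix.
demo = [[r * 10 + c for c in range(3)] for r in range(5)]
matfiledata = build_chunk_dict(demo, rows_per_chunk=2)
print(list(matfiledata.keys()))
```

The resulting dictionary has the same shape as the matfiledata dictionary passed to hdf5storage.write in the example above, with one key per chunk.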
Then, in MATLAB, with each chunk saved as its own variable, load and process them one at a time:
load(filename,'chunk1')
% do stuff...
clear chunk1
load(filename,'chunk2')
% do stuff...
clear chunk2
etc.
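How many rows to put in each chunk depends on the element type and on how much memory you want a single load to use. Here is a rough sketch of the arithmetic. The matrix shape comes from the question, but the float64 element size and the 2 GiB budget per chunk are my assumptions, and rows_per_chunk_for_budget is a hypothetical helper name.

```python
def rows_per_chunk_for_budget(n_cols, itemsize, budget_bytes):
    """Largest number of whole rows that fits in budget_bytes (at least 1)."""
    bytes_per_row = n_cols * itemsize
    return max(1, budget_bytes // bytes_per_row)


N_ROWS, N_COLS = 60000, 70000   # matrix shape from the question
ITEMSIZE = 8                    # assumption: float64, 8 bytes per element
BUDGET = 2 * 1024 ** 3          # assumption: about 2 GiB per chunk

total_bytes = N_ROWS * N_COLS * ITEMSIZE
rows = rows_per_chunk_for_budget(N_COLS, ITEMSIZE, BUDGET)
n_chunks = -(-N_ROWS // rows)   # ceiling division

print("full matrix: %.1f GB" % (total_bytes / 1e9))
print("rows per chunk:", rows)
print("number of chunks:", n_chunks)
```

Under these assumptions the full matrix is far larger than 16 GB, while a single row is only 560,000 bytes. Change ITEMSIZE if the matrix uses a smaller dtype.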
hdf5storage.savemat has a parameter that lets the file be read back into Python correctly later, so it is worth checking out. It is modeled on scipy.io.savemat. Alternatively, if you are saving data from MATLAB, you can do something like this to make it easy to import back into Python:
MATLAB
save('example.mat','-v7.3')
Python
matdata = hdf5storage.loadmat('example.mat')
That will load back into Python as a dictionary which you can then convert into whatever datatypes you need.

- Did you forget to write the dictionary keys as u'name', which makes each one a unicode key? – Matt Apr 02 '16 at 02:28
- Thank you! But I get a memory error when I try to execute the `hdf5storage.write` command. Is there a workaround? – SH_V95 Apr 02 '16 at 02:36
- Yes, I had the wrong syntax for the dictionary key, which I have now fixed. – SH_V95 Apr 02 '16 at 02:37
- Note that the first example at http://www.mathworks.com/help/matlab/import_export/load-parts-of-variables-from-mat-files.html shows that you can read a v7.3 MAT file one column at a time (just swap (:,1) for (1,:) to get rows) and avoid loading the whole MAT file into memory. That lets you use as much memory as your computer can support without hitting an out-of-memory error. – Matt Apr 02 '16 at 04:34
- Use a for loop with i running from 1 to size(array,2) for columns, or to size(array,1) for rows, to cycle through each in turn. There is probably a better way that groups more vectors together, but this gives you the minimal memory usage of one row or column at a time. – Matt Apr 02 '16 at 04:47