41

I'd like to save the contents of a numpy float array into a raw binary file as signed 16 bit integers. I tried to accomplish this using `ndarray.tofile`, but I can't figure out the right format string. The file seems to be saved in double format, no matter how I choose the format string. How do I do this? Thanks.

Peter
  • 515
  • 1
  • 5
  • 8

3 Answers

78

I think the easiest way to do this is to first convert the array to int16,

array.astype('int16').tofile(filename)
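As a sketch of the round trip (the filename and values here are made up for illustration): since a raw file carries no metadata, the reader has to supply the dtype again with `numpy.fromfile`.

```python
import numpy as np

# Write floats as raw signed 16-bit integers; values must fit in int16 range.
data = np.array([0.0, 1.0, -2.0, 32000.0])
data.astype(np.int16).tofile("samples.raw")

# Reading back: the dtype must be given explicitly, since the raw file
# stores nothing but the element bytes.
restored = np.fromfile("samples.raw", dtype=np.int16)
print(restored)  # [    0     1    -2 32000]
```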
Bi Rico
  • 25,283
  • 3
  • 52
  • 75
  • 3
    The [numpy documentation](http://docs.scipy.org/doc/numpy-1.10.1/reference/generated/numpy.ndarray.tofile.html) states "Information on endianness and precision is lost". Is this guaranteed to work on all platforms? – Gilly Apr 08 '16 at 02:55
  • 7
    `tofile` only writes the raw binary data of the array, not the metadata of the array. A typical use case is to open a file, write a header appropriate for the file type, and use `tofile` to fill in the raw data. It's the responsibility of the software reading the file to infer the metadata (endianness, precision, shape) from the header and mutate the raw data into a form appropriate for that platform. Without metadata there is no way to correctly interpret the raw content of the array. If all you need is to read and write arrays, look into "python hd5" or `numpy.save`. – Bi Rico Apr 08 '16 at 16:21
  • Thank you! This extra `astype` was needed even when I created the array specifying `np.array([], dtype=np.int8)`. – Aaron D. Marasco May 05 '16 at 11:35
  • 2
    Haven't tested this... but you can [explicitly set endianness](https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.dtypes.html). `dtype(' – RTbecard Apr 28 '20 at 12:51
8

Take a look at the struct module, try this example:

import struct
import numpy

mydata = numpy.random.random(10)
print(mydata)
myfmt = 'f' * len(mydata)
#  You can use 'd' for double and < or > to force endianness
buf = struct.pack(myfmt, *mydata)
print(buf)
with open("myfile", "wb") as f:
    f.write(buf)
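For the original question specifically, a hedged variant of the same approach: `h` is struct's format character for a signed 16-bit integer, and a `<` prefix forces little-endian output regardless of platform (filename and values here are illustrative).

```python
import struct

values = [0, 1, -2, 32000]

# '<4h' means: little-endian, four signed 16-bit integers.
payload = struct.pack('<%dh' % len(values), *values)

with open("myints.raw", "wb") as f:
    f.write(payload)

# Each value occupies exactly 2 bytes:
print(len(payload))  # 8
```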
Jay M
  • 3,736
  • 1
  • 24
  • 33
  • 17
    Based on experience, I'd strongly advise **against** the use of `struct.pack`. Its API requires you to demangle the array (i.e. use the `*` operator) into variables which is going affect performance **drastically** for large array sizes. Every element of the array will need a new, on-the-fly 64-bit pointer (on a 64-bit machine) to be created on the stack to point into that element. Just stick to `tostring`/`tobytes`/`tofile` to avoid this overhead. For example, in our case, we were dealing with 100 Mb arrays which leads to a whopping 6.4 Gb of RAM so `struct.pack` can do its work. – André Anjos Jan 18 '16 at 12:40
-3

You may use scipy.io.savemat, which lets you save a dictionary of names and arrays into a Matlab-style file:

import scipy.io as sio
sio.savemat(filename, pydict)

Here `pydict` may be something like `{'name1': array1, 'name2': array2, ...}`.

To load the dict you just need:

pydict = sio.loadmat(filename)
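A sketch of the full round trip (filename and keys are made up). Note two caveats: this writes a .mat container rather than a raw binary file, and `loadmat` adds metadata keys (`__header__`, etc.) and returns every array as at least 2-D, so a 1-D input comes back as a row vector.

```python
import numpy as np
import scipy.io as sio

pydict = {'name1': np.array([1, 2, 3]), 'name2': np.array([[4.0, 5.0]])}
sio.savemat("arrays.mat", pydict)

loaded = sio.loadmat("arrays.mat")
# 1-D arrays come back 2-D: shape (3,) becomes (1, 3).
print(loaded['name1'].shape)  # (1, 3)
```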
Alexey Antonenko
  • 2,389
  • 1
  • 18
  • 18