np.fromfile with count=-1 adds unexpected zeros

Question

I am trying to use np.fromfile to read a binary file that I wrote with Fortran using direct access. However, if I set count=-1 instead of max_items, np.fromfile returns a larger array than expected, with zeros appended to the vector I wrote to the file.

Fortran test code:

program testing
  implicit none
  integer*4 :: i
  ! With ifort, recl is counted in 4-byte units by default, so recl=20 is 80 bytes
  open(1, access='DIRECT', recl=20, file='mytest', form='unformatted', convert='big_endian')
  write(1, rec=1) (i, i=1,20)
  close(1)
end program

How I am using np.fromfile:

import numpy as np

with open('mytest', 'rb') as f:
    # Read 20 big-endian 4-byte integers from the start of the file
    x = np.fromfile(f, dtype='>i4', count=20)
print(len(x), x)

If I use it like this, it returns exactly my [1,...,20] array, but setting count=-1 returns [1,...,20,0,0,0,0,0] with a size of 1600.

I am using a little-endian machine (which shouldn't affect anything), and I am compiling the Fortran code with ifort.

I am just curious about the reason this happens, to avoid any surprises in the future.
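
On the byte-order remark above: the `>` in `'>i4'` fixes the on-disk layout as big-endian, so the values should decode the same way on any host. A minimal sketch, assuming an in-memory buffer built with `struct` stands in for the Fortran file (this keeps the example self-contained and is not the file the program writes):

```python
import struct
import numpy as np

# Twenty 4-byte big-endian integers, the layout the Fortran record is expected to have.
raw = struct.pack('>20i', *range(1, 21))

big = np.frombuffer(raw, dtype='>i4')     # explicit big-endian: independent of the host
little = np.frombuffer(raw, dtype='<i4')  # deliberately wrong byte order for this data

print(big[:3])     # [1 2 3]
print(little[:3])  # [16777216 33554432 50331648]
```

A wrong byte order produces large garbage values, not trailing zeros, so it would not explain the symptom described here.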

Avoid using units less than 10 in Fortran. It is not a problem here, but it is risky. Small unit numbers are often pre-connected to something else. Most often units `0`, `5` and `6`, but they can be different numbers. — Vladimir F Героям слава, Jun 02 '17 at 14:01
I think you may be writing the array in Fortran incorrectly. `recl=4*20` is not portable and is incorrect for ifort. — Vladimir F Героям слава, Jun 02 '17 at 14:04
It is a duplicate of https://stackoverflow.com/questions/37770912/why-direct-access-i-o-works-incorrectly-with-intel-visual-fortran I made a mistake when closing, I reopened it again and now I can't vote anymore. — Vladimir F Героям слава, Jun 02 '17 at 14:07
Never mind, setting recl = 20 didn't change anything for Python, and the Fortran code can now only read the file if larger == initial_size. — , Jun 02 '17 at 14:16
*"and the Fortran code can now only read the file if larger == initial_size"* But that is correct! That is exactly how it should be! — Vladimir F Героям слава, Jun 02 '17 at 14:17
I know, but why does Python still concatenate 1520 zeros to my vector? — , Jun 02 '17 at 14:18
How large is the datafile? 1600 integers is too much, that cannot be explained by that Fortran issue, because 4*4*20 is only 320 bytes. Anyway, it would be better to correct your code in the question to `recl=20` so that it is clear this was fixed. — Vladimir F Героям слава, Jun 02 '17 at 14:20
I am not sure what you mean. Does numpy.fromfile automatically set 1600 as the minimum? — , Jun 02 '17 at 14:45
Dumb mistake: I forgot to delete the file, so it kept the size of an earlier test with a larger zero vector... oops. Thanks, guys. — , Jun 02 '17 at 15:14
You should specify `status='replace'` (it's been so long since I used direct access that I thought that was the default). Really, you should probably use `access='stream'` anyway. — agentp, Jun 02 '17 at 18:27
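
The cause found in the comments can be reproduced without Fortran. `count=-1` tells `np.fromfile` to read every item up to the end of the file, so a file left over from an earlier, larger test yields extra elements. The sketch below is only an illustration: the file name `mytest_demo` is hypothetical, and Python's `'r+b'` mode is assumed to imitate an open that does not truncate the existing file. It is not the exact I/O path ifort takes.

```python
import os
import tempfile
import numpy as np

path = os.path.join(tempfile.mkdtemp(), 'mytest_demo')  # hypothetical file name

# Earlier test: 1600 big-endian zeros are written and the file is left behind.
np.zeros(1600, dtype='>i4').tofile(path)

# Later test: overwrite only the first 20 integers without truncating the file.
with open(path, 'r+b') as f:
    f.write(np.arange(1, 21).astype('>i4').tobytes())

# count=-1 returns as many items as fit in the file.
item_size = np.dtype('>i4').itemsize
n_items = os.path.getsize(path) // item_size

everything = np.fromfile(path, dtype='>i4', count=-1)
first_record = np.fromfile(path, dtype='>i4', count=20)

print(n_items, len(everything), len(first_record))  # 1600 1600 20
```

Comparing `os.path.getsize` against the expected record size before reading is a cheap way to spot a stale file. Opening with `status='replace'` on the Fortran side, or deleting the file between runs, avoids the leftover data.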
