I am trying to use np.fromfile to read a binary file that I wrote with Fortran using direct access. However, if I set count=-1 instead of the expected item count (max_items), np.fromfile returns a larger array than expected, with zeros appended to the vector I wrote.

Fortran test code:

program testing
implicit none
integer*4::i
open(1,access='DIRECT', recl=20, file='mytest', form='unformatted',convert='big_endian')
write(1,rec=1) (i,i=1,20)
close(1)
end program

How I am using np.fromfile:

import numpy as np
f=open('mytest','rb')
f.seek(0)
x=np.fromfile(f,'>i4',count=20)
print(len(x), x)

Used like this, it returns exactly my [1,...,20] array, but setting count=-1 returns [1,...,20,0,0,0,0,0,...] with a size of 1600.

I am on a little-endian machine (which shouldn't affect anything) and I am compiling the Fortran code with ifort.

I am just curious about the reason this happens, to avoid any surprises in the future.

  • Avoid using units less than 10 in Fortran. It is not a problem here, but it is risky. Small unit numbers are often pre-connected to something else. Most often units `0`, `5` and `6`, but they can be different numbers. – Vladimir F Героям слава Jun 02 '17 at 14:01
  • I think you may be writing the array in Fortran incorrectly. `recl=4*20` is not portable and is incorrect for ifort. – Vladimir F Героям слава Jun 02 '17 at 14:04
  • It is a duplicate of https://stackoverflow.com/questions/37770912/why-direct-access-i-o-works-incorrectly-with-intel-visual-fortran I made a mistake when closing, I reopened it again and now I can't vote anymore. – Vladimir F Героям слава Jun 02 '17 at 14:07
  • I see, thanks for your quick answer –  Jun 02 '17 at 14:08
  • Never mind, setting recl = 20 didn't change anything for Python, and the fortran code now can only read the file, if larger == initial_size. –  Jun 02 '17 at 14:16
  • *"and the fortran code now can only read the file, if larger == initial_size"* But that is correct! That is exactly how it should be! – Vladimir F Героям слава Jun 02 '17 at 14:17
  • I know, but why does Python still concatenate 1520 zeros to my vector? –  Jun 02 '17 at 14:18
  • How large is the datafile? 1600 integers is too much, that cannot be explained by that Fortran issue, because 4*4*20 is only 320 bytes. Anyway, it would be better to correct your code in the question to `recl=20` so that it is clear this was fixed. – Vladimir F Героям слава Jun 02 '17 at 14:20
  • I suspect that is just the minimum block size of the file. – Jack Jun 02 '17 at 14:36
  • I am not sure what you mean. Does numpy.fromfile automatically set 1600 as the minimum? –  Jun 02 '17 at 14:45
  • What is the size of the file? – agentp Jun 02 '17 at 14:58
  • Dumb mistake: I forgot to delete the file, so it kept the size of an earlier test with a larger zero vector... oops. Thanks, guys. –  Jun 02 '17 at 15:14
  • You should specify `status='replace'` (it's been so long since I used direct access that I thought that was the default). Really you should probably use `access='stream'` anyway. – agentp Jun 02 '17 at 18:27
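
The resolution reached in the comments can be reproduced from Python alone. The sketch below is only an illustration under assumptions: the temporary path and the variable names are made up, and the "earlier test" is imitated by writing 1600 zeros with NumPy rather than with Fortran. It shows that overwriting the first record of an existing file leaves the old length in place, so `count=-1` (read to end of file) returns every 4-byte item still on disk, while `count=20` returns only the record that was written.

```python
import os
import tempfile

import numpy as np

# Hypothetical scratch location; the question's file is simply 'mytest'.
path = os.path.join(tempfile.mkdtemp(), "mytest")

# Imitate the earlier test run: 1600 big-endian 4-byte zeros (6400 bytes).
np.zeros(1600, dtype=">i4").tofile(path)

# Imitate the later run: overwrite the first 20 integers in place.
# Mode "r+b" does not truncate, so the leftover zeros stay in the file.
with open(path, "r+b") as f:
    f.write(np.arange(1, 21, dtype=">i4").tobytes())

# The file size tells you how many items count=-1 will return.
n_items = os.path.getsize(path) // 4

x_all = np.fromfile(path, dtype=">i4", count=-1)  # reads to end of file
x_rec = np.fromfile(path, dtype=">i4", count=20)  # reads one record only

print(n_items, len(x_all), len(x_rec))
```

Checking `os.path.getsize` before reading is a cheap way to catch a stale file. On the Fortran side, the `status='replace'` suggestion from the comment above avoids the leftover data in the first place.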
