
I have this code, which reads a Fortran unformatted data file and writes the ASCII output to a new file, output.dat. I want to read this output.dat file into a numpy array. However, np.fromfile returns strange values, which I think is due to a dtype mismatch. I have tried every dtype I could think of, but I still do not get the proper values. Can someone tell me what I should do here?

My code to read the unformatted Fortran file, write the ASCII output, and read the ASCII file back into a numpy array:

# Code to read unformatted Fortran files with Python

import numpy as np
from struct import *
import fortranfile as fofi
from array import array

f = fofi.FortranFile('extract.bin',endian='>',header_prec='i')
x = f.readInts()
xx = f.readReals('f')

print x
print 'Die Lange von x ist',len(x)
print 'Dies ist'
print xx[0:20]
print 'Die Lange ist',len(xx)
dd = list(xx)
d  = list(x)


df=len(xx)/8
print 'Der Wert ist',df
g = fofi.FortranFile('output.dat',mode='w')
g.writeRecord(str(d))
g.write('\n')
g.writeRecord(str(dd))
g.close()

filename = open('output.dat','rb')
field = np.fromfile(filename,dtype=np.float64)
filename.close()
print field

Python reads the unformatted Fortran file and writes the following to the output file. The file includes some DLE, FS and NUL characters, which I don't know how to remove. The 'YS' characters are also a product of the conversion.

  [1, 167, 133, 6]   
YS [0.0, 4.3025989e-07, 1.5446712e-06, 3.1393029e-06, 5.0430463e-06, 7.1382601e-06, 9.301104e-06, 1.1476222e-05, 1.3561337e-05, 1.5552534e-05, 1.7355138e-05, 1.9008177e-05, 2.0416919e-05, 2.1655113e-05, 2.2624969e-05, 2.3426954e-05, 2.3961067e-05, 2.4346635e-05, 2.4482841e-05, 2.4501234e-05, 2.4301233e-05, 2.4020905e-05, 2.3559202e-05, 2.3056287e-05, 2.2411346e-05, 2.1758024e-05, 2.1005515e-05, 2.0265579e-05, 1.9453466e-05, 1.8686056e-05, 1.7860904e-05, 1.7103739e-05, 1.6299076e-05, 1.5573576e-05, 1.4809892e-05, 1.4126301e-05, 1.3412908e-05, 1.2775883e-05, 1.2116507e-05, 1.1522323e-05, 1.0915101e-05, 1.0356307e-05, 

Currently, my output is

[  1 167 133   6]
Die Lange von x ist 4 // The length of x is
Dies ist  // This is ( The actual value)
[  0.00000000e+00   4.30259888e-07   1.54467125e-06   3.13930286e-06
   5.04304626e-06   7.13826012e-06   9.30110400e-06   1.14762224e-05
   1.35613373e-05   1.55525340e-05   1.73551380e-05   1.90081773e-05
   2.04169191e-05   2.16551125e-05   2.26249686e-05   2.34269537e-05
   2.39610672e-05   2.43466347e-05   2.44828407e-05   2.45012343e-05]
Die Lange ist 133266 // The length is
Der Wert ist 16658  // The value (after reading with numpy) is
[  4.66529177e-062   3.47245665e-313   3.28870023e-086 ...,
   1.05249949e-153   1.69339332e-052   3.30673243e+093]

The values read back by numpy do not match the original array. How can I fix this and read all these values into numpy arrays of my choice? Also, if you have better suggestions for reading Fortran unformatted files, please comment.

atmaere
  • can you show an example file that needs to be read? – ev-br Jun 27 '13 at 12:32
  • The example file is the long row of numbers in the middle...the one which starts with [1,167,133,6] and then YS following.. – atmaere Jun 27 '13 at 15:28
  • with square brackets and such? What is the precise format of the file then? – ev-br Jun 27 '13 at 15:36
  • Yes, with square brackets and everything else shown there. It's in .dat format. It's the output.dat file that my code produces, as you can see from the source. As much as I don't want the square brackets and those other ASCII characters like NUL, DLE and YS, I do not know how to remove them or how to read the file properly into a numpy array. – atmaere Jun 27 '13 at 15:49

1 Answer


If you're on Linux, use the translate utility tr to remove all characters except: 0-9 + - . e f inf NaN blanks tabs newlines:

tr -C -d '0-9 + \- . ef EF inf NaN \t\n'  < in  > out  # delete non-numbers

(not quite sure if that's what you want to do).
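If tr is not available, roughly the same cleanup can be sketched in Python with a regular expression. This is only an illustration, not part of the original answer: the names NUMBER and extract_numbers and the sample bytes are made up here, and the pattern does not handle inf or NaN. Like the tr approach, it will also pick up any record-marker byte that happens to be an ASCII digit.

```python
import re

# Matches integers and floats such as 167, 0.0, 4.3025989e-07, -1.5E+3.
NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")

def extract_numbers(raw):
    """Return every numeric token found in a bytes object as a float."""
    # latin-1 maps every possible byte to a character, so decoding never fails.
    text = raw.decode("latin-1")
    return [float(tok) for tok in NUMBER.findall(text)]

# Hypothetical sample imitating the file: control bytes, brackets, commas, 'YS'.
sample = (b"\x10\x00\x00\x00[1, 167, 133, 6]\x10\x00\x00\x00\n"
          b"YS [0.0, 4.3025989e-07, 1.5446712e-06]")
values = extract_numbers(sample)
print(values)
```

For the real file, the bytes would come from open('output.dat', 'rb').read(), and the resulting list can be passed to np.array.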

Also, use fromfile( file, sep=' ' )
to read a text file with numbers separated by whitespace (blanks, tabs, newlines);
the default sep='' is for reading binary files.
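A small round trip may make the difference between the two modes concrete. It is a sketch under assumptions: the scratch file path is hypothetical, and np.savetxt is used here in place of the question's fortranfile writer so that the file holds plain whitespace-separated text, with no brackets, commas or record markers.

```python
import os
import tempfile

import numpy as np

values = np.array([0.0, 4.3025989e-07, 1.5446712e-06, 3.1393029e-06])

# Write one number per line as plain text (hypothetical scratch file).
path = os.path.join(tempfile.mkdtemp(), "output.dat")
np.savetxt(path, values)

# sep=' ' switches fromfile to text mode; a space matches any whitespace,
# including the newlines written by savetxt.
as_text = np.fromfile(path, dtype=np.float64, sep=' ')

# The default sep='' reinterprets the ASCII bytes as raw 8-byte floats,
# which gives meaningless values like those shown in the question.
n_items = os.path.getsize(path) // 8
as_binary = np.fromfile(path, dtype=np.float64, count=n_items)

print(as_text)
```

Here as_text recovers the four original values, while as_binary has a different length and unrelated contents, because each group of eight text characters is treated as one double.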

Good luck

denis