
Unless one uses the stream access introduced in Fortran 2003, Fortran classically treats a file as a sequence of records. If a file is connected for direct access, any record can be accessed in any order by specifying its record number. For example:

open(newunit=funit, file=filename, form='unformatted', access='direct', &
     recl=64, status='old')
read(funit, rec=2) data
close(funit)

This sounds great. However, I'm not sure I understand the RECL parameter, or how direct access can be used effectively if the correct record length isn't already known. From the docs (various Intel Fortran versions):

All records have the length specified by the RECL option in the OPEN statement.

In other words, direct access lets a single read or write transfer an amount of data equal to or less than RECL, while the record number moves through the file in increments of RECL. That is, you can specify any value you like (equal to or less than the size of the file, I assume). I guess that's obvious in hindsight, yet I was hoping that the appropriate RECL was discoverable in some way [1]. Perhaps I'm doing this wrong, but I would like to get the data from the specified record only: not more, not less.

Aside from encoding the appropriate RECL value in a 'header' section of the file, is there a way to access a single record at a time with a file connected for unformatted (or even formatted) direct-access if the correct record length is not known beforehand? What tricks-of-the-trade are used to do this?

[1] I had hoped inquire(funit, recl=rl) would provide the appropriate RECL, but if the file was connected for direct access, it returns the RECL value specified when the file was opened. If connected for sequential access, it seems that it returns the maximum record length allowed (?), 2040 in my case.

Matt P
  • To your footnote: if a file is connected for sequential access then the returned value `rl` indeed corresponds to the maximum record length. – francescalus May 10 '17 at 17:13
  • @francescalus Since this is ifort, does that mean 4*2040 bytes? – Matt P May 10 '17 at 17:17
  • Possibly, but it could vary depending on options used when compiling. To be explicit: the maximum record length of the sequentially accessed file is unrelated to the appropriate record length for direct access. – francescalus May 10 '17 at 17:19
  • if you do not know `RECL` how do you know the file was written at a fixed record length at all? This seems to be a case where you should explain better what you are actually trying to do. – agentp May 10 '17 at 18:22
  • @agentp That's a reasonable question. The files are produced by a separate and independent program. In each file, the records are known to be all of the same length, but not necessarily what that length is. I think I may need to modify the file-writing program so that I can assume a maximum record length in the file-reading program, rather than trying to determine what the appropriate length is for each file that must be processed. – Matt P May 10 '17 at 18:35
  • If you have control of the writing side just use streams. – agentp May 10 '17 at 19:01
  • @agentp I am not very familiar with stream-access. While the capability for random access (using position) and for variable length "records" are advantages, it seems that you would then need to know the length/position of all records before you could access & use them effectively. Maybe I'm missing something...If you like, I can make a new question: to use streams would you have to read the entire file, and save the length/position of each record? In my case, I suppose I could read the first entry, and calculate the position of all subsequent records? – Matt P May 10 '17 at 19:33
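
The stream-access idea raised in these comments can be sketched as follows. This is only an illustration under stated assumptions, not the layout of the asker's files: it assumes the writer stores the number of elements per record as a 32-bit integer header at the start of the file, followed by equal-length integer records. The file name `records_demo.bin` and all variable names are hypothetical. With that layout, only the header has to be read; the position of any record follows by arithmetic, so the whole file does not need to be scanned.

```fortran
program stream_records
  use, intrinsic :: iso_fortran_env, only: int32, int64, file_storage_size
  implicit none
  character(*), parameter :: fname = 'records_demo.bin'  ! hypothetical file name
  integer, parameter :: n = 5, nrec = 3                  ! demo sizes only
  integer(int32) :: nelem, i
  integer(int32), allocatable :: rec(:)
  integer(int64) :: hdr_units, rec_units, offset
  integer :: u, k

  ! Writer side (assumed layout): header = elements per record, then the records.
  open(newunit=u, file=fname, access='stream', form='unformatted', status='replace')
  write(u) int(n, int32)
  do k = 1, nrec
     write(u) [(int(100*k + i, int32), i = 1, n)]
  end do
  close(u)

  ! Reader side: the record length is not known until the header is read.
  open(newunit=u, file=fname, access='stream', form='unformatted', status='old')
  read(u, pos=1) nelem
  allocate(rec(nelem))

  ! POS counts file storage units (8 bits where the standard's recommendation is followed).
  hdr_units = storage_size(nelem) / file_storage_size
  rec_units = int(nelem, int64) * (storage_size(rec) / file_storage_size)

  ! Jump straight to record 2; POS is 1-based.
  offset = 1 + hdr_units + (2 - 1) * rec_units
  read(u, pos=offset) rec
  close(u, status='delete')

  print '(*(i0,1x))', rec   ! second record written above: values 201 to 205
end program stream_records
```

If the writer cannot be changed to emit such a header, this sketch does not apply, and the record length still has to come from somewhere outside the file.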

1 Answer


Indeed, it is not possible to find the record length by looking at the file, because a direct-access file contains just the data and (normally) no record markers, so the compiler sees only a stream of unstructured bytes. That holds at least on byte-oriented systems; I know nothing about record-oriented filesystems, only that they exist.

If you know what kind of data is stored in a direct-access record, you can ask the compiler for the record length by inquiring not about the file, but about the data.

For example, if the record consists of variables a, b, c, whatever they are,

 !just an example
 real :: a(10)
 type(my_type) :: b
 character(5) :: c(3)

you ask how large such a record is

 inquire(iolength=rl) a, b, c

and then you connect your file with recl=rl

open(newunit=funit, file=filename, form='unformatted', access='direct', &
     recl=rl, status='old')
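
The steps above can be put together in a small self-contained program. This is a minimal sketch, not the asker's actual data layout: the definition of `my_type`, the file name `direct_demo.bin`, and the values written are all made up for illustration. It writes three records and reads back the second one, using the same `rl` from `inquire(iolength=...)` for both the writing and the reading connection.

```fortran
program direct_demo
  implicit none
  character(*), parameter :: fname = 'direct_demo.bin'  ! hypothetical file name

  type :: my_type            ! hypothetical record component
     integer :: id
     real    :: w
  end type my_type

  real          :: a(10)
  type(my_type) :: b
  character(5)  :: c(3)
  integer :: funit, rl, k

  ! Ask the compiler how long a record holding a, b, c is, in its own RECL units.
  inquire(iolength=rl) a, b, c

  ! Write three records of identical layout.
  open(newunit=funit, file=fname, form='unformatted', access='direct', &
       recl=rl, status='replace')
  do k = 1, 3
     a = real(k)
     b = my_type(k, 0.5*k)
     c = 'rec' // achar(iachar('0') + k)
     write(funit, rec=k) a, b, c
  end do
  close(funit)

  ! Reconnect with the same rl and fetch only record 2.
  open(newunit=funit, file=fname, form='unformatted', access='direct', &
       recl=rl, status='old')
  read(funit, rec=2) a, b, c
  close(funit, status='delete')

  print '(a,i0)', 'recl units per record: ', rl
  print *, 'record 2: id =', b%id, ' a(1) =', a(1), ' c(1) = ', c(1)
end program direct_demo
```

Because `rl` comes from the same I/O list in both places, the program never needs to know whether the compiler counts in bytes or in words.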

See, for example, the question "Why direct access I/O works incorrectly with Intel Visual Fortran".

Be careful: the RECL value is not portable and will vary between compilers. Some measure it in bytes and some in 4-byte words. gfortran counts in bytes, while ifort by default counts unformatted records in 4-byte words (it switches to bytes with -assume byterecl).

If you find yourself specifying RECL with a magic constant, as in recl=64, you are doing something wrong, because the same number will not mean the same record length in a different compiler. You should always use a variable obtained from inquire(iolength=...), not a fixed number.
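
To see which unit a given compiler (and set of compiler options) uses for RECL, one can probe it with a value of known size. The following is a small sketch; the variable names are illustrative. It compares the `iolength` of a single 32-bit integer with its size in bits, and also prints `FILE_STORAGE_SIZE` from `ISO_FORTRAN_ENV`.

```fortran
program recl_units
  use, intrinsic :: iso_fortran_env, only: int32, file_storage_size
  implicit none
  integer(int32) :: probe
  integer :: rl

  ! Record length of one 32-bit integer, in this compiler's RECL units.
  inquire(iolength=rl) probe

  print '(a,i0)', 'file_storage_size (bits): ', file_storage_size
  print '(a,i0)', 'iolength of one int32   : ', rl
  ! 32 bits divided by the number of units gives the bits per RECL unit.
  print '(a,i0)', 'bits per RECL unit      : ', storage_size(probe) / rl
end program recl_units
```

A result of 8 bits per unit means RECL is counted in bytes; 32 means it is counted in 4-byte words. The outcome can change with compiler options, which is one more reason not to hard-code the number.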

Community
  • Then it shouldn't be opened that way. – Vladimir F Героям слава May 10 '17 at 17:51
  • Using `inquire` on the data makes sense. I tried using this previously, but incorrectly and was stumped. Now I think I have it... – Matt P May 10 '17 at 18:08
  • Ah, now I remember my problem with `inquire(iolength=rl) a`. Suppose `a` is an integer array. All records (arrays) in the file can be assumed to be the same length, but this length is not known *a priori*. Can you suggest how this might work? If not, I imagine I will need to modify the program that writes the file so that all arrays are created with a known maximum length, padded with zeros if needed. Does that make sense? – Matt P May 10 '17 at 18:25
  • Fortran standard does require recl measured in bytes. Ifort changes to this mode under -standard-semantics which implies byterecl – tim18 May 10 '17 at 18:32
  • @tim18 I did not reference the standard in my answer, I wrote about what common compilers do. I am also not sure the standard requires that, do you have a citation? IIRC the standard even allows record markers. – Vladimir F Героям слава May 10 '17 at 18:48
  • @tim18 I've also read that the standard suggests byterecl, but that it wasn't required. I haven't actually read the standard, however. I have read the Intel docs; with default compiler option: `assume nobyterecl ... Units for OPEN statement RECL values with unformatted files are in four-byte (longword) units.` – Matt P May 10 '17 at 19:13
  • For unformatted IO the requirement is for the record length to be specified in terms of _file storage units_, which may not be bytes, although an 8-bit octet is recommended where practical. For formatted IO even that doesn't hold. [@tim] – francescalus May 10 '17 at 20:15
  • To quote the standard: "The number of bits in a file storage unit is given by the constant FILE_STORAGE_SIZE (13.8.2.9) defined in the intrinsic module ISO_FORTRAN_ENV. It is recommended that the file storage unit be an 8-bit octet where this choice is practical." That recommendation was not in the standard prior to Fortran 2003 and some compilers, such as DEC's, chose "numeric storage unit" as the file storage unit. Intel, of course, inherited this as part of acquiring the DEC Fortran team. – Steve Lionel May 10 '17 at 20:41