
Below is an excerpt of the hex dump of an unformatted Fortran file (generated by AERMOD compiled with gfortran):

00f3ee50: 0000da50 00b746d7 00000001 204c4c41 20202020 462df2dd 403f41fa c5f77f92...     
00f4c886: 6a031d65 f2923f8c 658037cc 01813f8b 740e846e d7f83f8a 93a93e15 da503f89  
00f4c8a8: 0000da50 00b746d8 00000001 204c4c41 20202020 1bce0f33 4040cf25 059a6d45...     
00f5a2de: 04de57c7 6f803fa0 cc5e786d c1983f9e a6fd14ae 05803f9d 970266e8 da503f9c
00f5a300: 0000da50 00b74725 00000001 204c4c41 20202020 9e95fa2a 4087b60e ef189339...     
00f67d36: 7d9a5b20 bbe53fd8 467cf2bf be063fd7 292414d4 0c943fd6 22a6cc90 da503fd5
00f67d58: 0000da50 00b74726 00000001 204c4c41 20202020 92ee2eb6 40868bcc 991a0bf2...     
00f7578e: 128e3196 a8063fe2 2418d1d5 185e3fe1 49a7e799 009a3fe0 01ea4bf1 da503fdf
00f757b0: 0000da50 00b74727 00000001 204c4c41 20202020 00000000 00000000 00000000...     
00f831e6: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 da500000

Each record starts with 0x0000da50, and ends with 0xda503f## (or 0xda500000 when it's a record full of zeroes). I understand that the compiler inserts four bytes at the beginning and end of each record to indicate its length.

I know from the documentation that all of the records have a constant length, and that the first 16 bytes of each record contain routine info such as counters and labels, but not any calculated data.

Now if you take the difference between the positions of two consecutive start-of-record lines, you get 55896 (e.g., 0xf4c8a8 - 0xf3ee50 = 55896), and subtracting the 8 bytes that the compiler added, you get 55888, or 0xda50. So only two of the four marker bytes are needed to hold the actual record length.
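The arithmetic above can be checked in a few lines of Python. This is only a sketch: the variable names are illustrative, and the 8-byte overhead assumes one 4-byte length marker at each end of a record, as described earlier.

```python
# Offsets of two consecutive record starts, taken from the hex dump above.
first_start = 0x00F3EE50
second_start = 0x00F4C8A8

MARKER_BYTES = 4  # assumed: one 4-byte length marker before and after each record

stride = second_start - first_start   # bytes from one record start to the next
payload = stride - 2 * MARKER_BYTES   # record length without the two markers

print(stride, payload, hex(payload))
```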

What are the last four bytes of each record doing? Why do they read 0xda503f## rather than 0x0000da50?

Update

I've found that the stray bytes only appear when certain column sizes are specified in xxd. Here are full hex dumps for a smaller file (generated using the same Fortran program), with two different column specifications (first with 16 octets, then with 25). Maybe it's an issue with xxd?

$ xxd -e -c16 test.pst
00000000: 00000250 00b74275 00000001 204c4c41  P...uB......ALL 
00000010: 20202020 1273c268 4042050f 6ec665a7      h.s...B@.e.n
00000020: 404c165c 2e64c72d 40469557 87eefed5  \.L@-.d.W.F@....
00000030: 404d86c1 349e2405 40502c03 435bacfe  ..M@.$.4.,P@..[C
00000040: 40501966 fe8ace3c 404e4ef7 7382fb52  f.P@<....NN@R..s
00000050: 404b73f0 01369ac9 4048029b 19a95098  .sK@..6...H@.P..
00000060: 40442608 39972ac4 403ff81f 87f5457a  .&D@.*.9..?@zE..
00000070: 403760aa 0933541e 402e31cf 96229471  .`7@.T3..1.@q.".
00000080: 40201210 31afaa2e 400ab597 06b2618a  .. @...1...@.a..
00000090: 3ff934b7 a5814a70 3ff6c2cf 13454924  .4.?pJ.....?$IE.
000000a0: 3ff6d543 9389bf37 3ff6db74 3630fe04  C..?7...t..?..06
000000b0: 3ff6d48e 0f5e2bc8 3ff6c0b8 fdbc2622  ...?.+^....?"&..
000000c0: 3ff6a063 a26cd9d4 3ff67447 150d1668  c..?..l.Gt.?h...
000000d0: 3ff63d53 37d9abef 3ff5fca5 b6c1bba5  S=.?...7...?....
000000e0: 3ff5b37e be32f197 3ff56334 586c0c65  ~..?..2.4c.?e.lX
000000f0: 3ff50d24 2bcc15c8 3ff4b2a7 15798027  $..?...+...?'.y.
00000100: 3ff4550a dca18880 3ff3f585 039bd6d6  .U.?.......?....
00000110: 3ff3953a 9221d30d 3ff33529 97fa7215  :..?..!.)5.?.r..
00000120: 3ff2d639 1909878a 3ff27931 4e95ad98  9..?....1y.?...N
00000130: 400b42d6 4641e3b8 400a4f52 c5ddb926  .B.@..AFRO.@&...
00000140: 40096e88 aa92c3ba 40089fd0 93f8d3c8  .n.@.......@....
00000150: 4007e269 749bbc48 40073588 afbbc2f8  i..@H..t.5.@....
00000160: 40069860 9005d1f8 40060a2a bde7eb31  `..@....*..@1...
00000170: 40058a27 3a0dc41a 400517a6 434deebc  '..@...:...@..MC
00000180: 4004b202 77353ddd 400458a7 6920fdd8  ...@.=5w.X.@.. i
00000190: 40040b11 dccebcd6 4003c8cb c5b1ee47  ...@.......@G...
000001a0: 40039172 2388cc5f 400364b2 ce0c693f  r..@_..#.d.@?i..
000001b0: 40034245 3c8b166d 400329f9 52837682  EB.@m..<.).@.v.R
000001c0: 40031ba7 377415b1 4003173a 3de1aa1e  ...@..t7:..@...=
000001d0: 40031cab dbcb3888 40032c02 b5346967  ...@.8...,.@gi4.
000001e0: 40034558 b7f5a14d 400368d3 466d447d  XE.@M....h.@}DmF
000001f0: 400396aa 6cd87c23 4003cf22 1ad6f796  ...@#|.l"..@....
00000200: 40041292 57a28483 4004615f 63a836e2  ...@...W_a.@.6.c
00000210: 4004bc00 b4f68bf3 400522fb b5195f5e  ...@.....".@^_..
00000220: 400596e7 1dd722b0 4006186a 5bf7aba7  ...@."..j..@...[
00000230: 4026b8df b0158858 402bd56c f3245c1f  ..&@X...l.+@.\$.
00000240: 40310f07 e8781d38 4034d28c 97d0997b  ..1@8.x...4@{...
00000250: 403a0634 00000250                    4.:@P...

And with 25 octets per line:

    $ xxd -e -c25 test.pst
    00000000: 00000250 00b74275 00000001 204c4c41 20202020 1273c268     P.0fuB......ALL     h.s..
    00000019: a7404205 5c6ec665 2d404c16 572e64c7 d5404695 c187eefe     .B86e.n\.L@-.d.W.F@......
    00000032: 2405404d 2c03349e acfe4050 1966435b ce3c4050 4ef7fe8a     M@4e.4.,P@..[Cf.P@<....NN
    0000004b: 82fb5240 4b73f073 369ac940 48029b01 a9509840 44260819     @R40s.sK@..6...H@.P...&D@
    00000064: 39972ac4 403ff81f 87f5457a 403760aa 0933541e 402e31cf     .*71..?@zE...`7@.T3..1.@q
    0000007d: 10962294 2e402012 9731afaa 8a400ab5 b706b261 703ff934     ."4a. @...1...@.a...4.?pJ
    00000096: c2cfa581 49243ff6 d5431345 bf373ff6 db749389 fe043ff6     ..30.?$IE.C..?7...t..?..0
    000000af: f6d48e36 5e2bc83f f6c0b80f bc26223f f6a063fd 6cd9d43f     6.a2?.+^....?"&..c..?..l.
    000000c8: 3ff67447 150d1668 3ff63d53 37d9abef 3ff5fca5 b6c1bba5     Gt7eh...S=.?...7...?....~
    000000e1: 973ff5b3 34be32f1 653ff563 24586c0c c83ff50d a72bcc15     ..b2.2.4c.?e.lX$..?...+..
    000000fa: 80273ff4 550a1579 88803ff4 f585dca1 d6d63ff3 953a039b     .?f3y..U.?.......?....:..
    00000113: 21d30d3f f3352992 fa72153f f2d63997 09878a3f f2793119     ?.3f.)5.?.r..9..?....1y.?
    0000012c: 4e95ad98 400b42d6 4641e3b8 400a4f52 c5ddb926 40096e88     ..ba.B.@..AFRO.@&....n.@.
    00000145: d0aa92c3 c840089f 6993f8d3 484007e2 88749bbc f8400735     ..c2..@....i..@H..t.5.@..
    0000015e: 9860afbb d1f84006 0a2a9005 eb314006 8a27bde7 c41a4005     ..0d.@....*..@1...'..@...
    00000177: 0517a63a 4deebc40 04b20243 353ddd40 0458a777 20fdd840     :.69@..MC...@.=5w.X.@.. i
    00000190: 40040b11 dccebcd6 4003c8cb c5b1ee47 40039172 2388cc5f     ..b2.......@G...r..@_..#.
    000001a9: 3f400364 45ce0c69 6d400342 f93c8b16 82400329 a7528376     d.1bi..EB.@m..<.).@.v.R..
    000001c2: 15b14003 173a3774 aa1e4003 1cab3de1 38884003 2c02dbcb     .@03t7:..@...=...@.8...,.
    000001db: 34696740 034558b5 f5a14d40 0368d3b7 6d447d40 0396aa46     @g40.XE.@M....h.@}DmF...@
    000001f4: 6cd87c23 4003cf22 1ad6f796 40041292 57a28483 4004615f     #|e2"..@.......@...W_a.@.
    0000020d: 0063a836 f34004bc fbb4f68b 5e400522 e7b5195f b0400596     6.22..@.....".@^_.....@."
    00000226: 186a1dd7 aba74006 b8df5bf7 88584026 d56cb015 5c1f402b     ..24.@...[..&@X...l.+@.\$
    0000023f: 310f07f3 781d3840 34d28ce8 d0997b40 3a063497 00025040     ..00@8.x...4@{...4.:@P...

After a 16-byte preamble (ending with 20202020), the output is supposed to be a sequence of 72 numbers. Since they take up 576 bytes (8 bytes each), I'm guessing that they are double-precision reals in Fortran:

36.03952 56.17470 45.16672 59.05278 64.68770 64.39687 60.61694 54.90578 48.02036 40.29712 31.96923 23.37760 15.09728 8.03528 3.33867 1.57537 1.42256 1.42707 1.42858 1.42689 1.42205 1.41416 1.40339 1.38997 1.37418 1.35632 1.33672 1.31571 1.29362 1.27076 1.24744 1.22393 1.20048 1.17730 1.15459 3.40764 3.28873 3.17897 3.07803 2.98555 2.90114 2.82440 2.75496 2.69246 2.63655 2.58692 2.54329 2.50540 2.47305 2.44602 2.42417 2.40736 2.39549 2.38850 2.38634 2.38900 2.39649 2.40886 2.42619 2.44857 2.47614 2.50907 2.54755 2.59180 2.64208 2.69868 2.76192 11.36108 13.91684 17.05872 20.82246 26.02424
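The double-precision guess can be tested by decoding the first two data words of the `-c16` dump. The sketch below assumes the file is little-endian and that `xxd -e` prints each 4-byte word with its bytes reversed, so the words have to be swapped back before decoding. It also checks that 72 doubles plus the 16-byte preamble add up to the 0x250 stored in the leading marker.

```python
import struct

# First two data words as printed by `xxd -e -c16` (xxd shows each 4-byte
# word most-significant byte first, i.e. already byte-swapped).
words = ["1273c268", "4042050f"]

# Undo the swap to recover the on-disk byte order (assumed little-endian).
raw = b"".join(bytes.fromhex(w)[::-1] for w in words)

(value,) = struct.unpack("<d", raw)  # one little-endian IEEE 754 double
print(f"{value:.5f}")

# Size check: 16-byte preamble plus 72 doubles of 8 bytes each.
record_len = 16 + 72 * struct.calcsize("<d")
print(hex(record_len))
```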
  • Is this for a sequential access file? Note that what gets written to disk can vary depending on open specifiers, compile options and environment variables. The default is that the starting and ending record lengths are both four bytes. – IanH Jan 30 '17 at 02:50
  • @IanH The `OPEN` statement in the source code doesn't have the `FORM` flag specified. I think that means it's sequential access by default. –  Jan 30 '17 at 03:06
  • `ACCESS` is the specifier to look for. Yes - the default if not specified otherwise is SEQUENTIAL. – IanH Jan 30 '17 at 03:19
  • But if it is sequential, then form should be specified to unformatted – Vladimir F Героям слава Jan 30 '17 at 06:41
  • The first and last four bytes of each sequential unformatted record produced by gfortran SHOULD be the record length. (An exception is if the record is longer than 1GB in which case there are some other things done.) I'd want to see the whole file for more analysis, but if what you show is accurate I'd wonder if there's a bug in gfortran. Note the use of record lengths like this is entirely implementation-dependent and not necessarily portable (though most current compilers use a similar method.) – Steve Lionel Jan 30 '17 at 23:59
  • @SteveLionel thanks for the comment. The file I was looking at was about 500 MB (though I'd be happy to send it to you if you were willing to look at it). I tried generating a much shorter one (about 0.3 MB) whose records were only 32 bytes apiece. Interestingly, I get `00000020 00b74275 00000001 204c4c41 20202020 53aad716 3e95eac5 56a96a69 3f1370da 00000020` as one of the records (note the first and last four bytes have only the record length, and no mystery bytes). I wonder if the issue is related to the large record size. –  Jan 31 '17 at 02:09
  • Test with a file with just a single record, and show the code (a test example should be very short) and show the beginning and end of the hex dump. – agentp Jan 31 '17 at 15:27
  • @agentp that's a good idea. I started playing with a one-line binary file and found that the stray bytes appear or disappear depending on the number of columns specified in xxd. I'm going to update the question based on this. –  Feb 01 '17 at 01:20
  • Looks like specifying a bytes-per-row value that isn't a multiple of four is messing it up. There are other issues, e.g., the first byte (a7) in the second row does not appear in the first sequence – agentp Feb 01 '17 at 03:09
  • @agentp good catch. Now I want to see if either version is actually the correct output if parsed. The output is supposed to be a sequence of 72 numbers. I'll add them to the question text. –  Feb 01 '17 at 04:27
  • @agentp the `a7` definitely shouldn't be there, as I think `1273c268 4042050f` is meant to represent `36.03952` (you have to read it as `0x4042050F1273C268`). –  Feb 01 '17 at 05:13
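One possible reading of the `a7` discussed in these comments, offered as an interpretation and not confirmed against the xxd source: when a row starts at an offset that is not a multiple of four, each 4-byte group straddles two neighbouring values. The sketch below regroups bytes 0x18 to 0x27 of `test.pst` (reconstructed from the `-c16` dump) starting at 0x18 and then at 0x19. The helper name `le_words` is made up for this illustration.

```python
def le_words(data: bytes, start: int, count: int) -> list:
    """Format `count` 4-byte groups beginning at `start` the way `xxd -e` prints them."""
    return [
        data[i:i + 4][::-1].hex()  # reverse each group: little-endian display
        for i in range(start, start + 4 * count, 4)
    ]

# Bytes 0x18..0x27 of test.pst, reconstructed from the `-c16` dump.
chunk = bytes.fromhex("0f054240" "a765c66e" "5c164c40" "2dc7642e")
base = 0x18  # file offset of chunk[0]

aligned = le_words(chunk, 0x18 - base, 2)  # word-aligned start
shifted = le_words(chunk, 0x19 - base, 3)  # one byte later, as in the -c25 row

print(aligned)
print(shifted)
```

The shifted grouping reproduces the first three words of the `00000019` row in the `-c25` dump. By the same reading, `da503f89` in the first excerpt would be the top two bytes of the last double (`3f89`) followed by the low two bytes of the trailing marker (`da50`).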

1 Answer

Parsing the file with Perl's `unpack` function, I can account for all of the bytes properly. It seems to be just a small glitch in xxd when certain column sizes are used.
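For readers without Perl, roughly the same check can be sketched in Python with the `struct` module. This is not the original script. It assumes little-endian 4-byte length markers at both ends of each record, a 16-byte preamble, and 8-byte reals after it, and it ignores the special handling of very large records mentioned in the comments. The names `read_records` and `split_record` are hypothetical, and the demonstration runs on a synthetic record, not on real AERMOD output.

```python
import io
import struct

def read_records(stream):
    """Yield the payload of each sequential unformatted record in `stream`."""
    while True:
        head = stream.read(4)
        if not head:
            return  # clean end of file
        if len(head) < 4:
            raise ValueError("truncated leading marker")
        (length,) = struct.unpack("<i", head)  # leading length marker
        payload = stream.read(length)
        tail = stream.read(4)
        if len(payload) != length or len(tail) != 4:
            raise ValueError("truncated record")
        (check,) = struct.unpack("<i", tail)   # trailing length marker
        if check != length:
            raise ValueError(f"marker mismatch: {length} != {check}")
        yield payload

def split_record(payload):
    """Split a payload into the 16-byte preamble and a tuple of doubles."""
    header, body = payload[:16], payload[16:]
    values = struct.unpack(f"<{len(body) // 8}d", body)
    return header, values

# Synthetic record for demonstration only: two integers, an 8-character
# label, and three doubles, wrapped in matching length markers.
demo_payload = struct.pack("<ii8s3d", 12010101, 1, b"ALL     ",
                           36.03952, 56.1747, 45.16672)
marker = struct.pack("<i", len(demo_payload))
demo_file = io.BytesIO(marker + demo_payload + marker)

records = [split_record(p) for p in read_records(demo_file)]
print(len(records), records[0][1])
```

To run it on a real file, open the file in binary mode (for example `open("test.pst", "rb")`) and pass the file object to `read_records`. A marker mismatch would raise an error, so reading the whole file without one is the sign that every byte is accounted for.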