Different results using OCCURS with different compilers

Question

I'm attempting to output the following row using DISPLAY and am getting the correct result in Micro Focus COBOL in Visual Studio and the Tutorialspoint COBOL compiler, but something strange when running it on a z/OS Mainframe using IBM's Enterprise COBOL:

01 W05-OUTPUT-ROW.
   05 W05-OFFICE-NAME PIC X(13).
   05 W05-BENEFIT-ROW OCCURS 5 TIMES.
       10 PIC X(2) VALUE SPACES.
       10 W05-B-TOTAL PIC ZZ,ZZ9.99 VALUE ZEROS.
   05 PIC X(2) VALUE SPACES.
   05 W05-OFFICE-TOTAL PIC ZZ,ZZ9.99 VALUE ZEROS.

In Enterprise COBOL it appears that the spaces are being ignored and an extra zero-filled column is being added, even though the PERFORM VARYING and DISPLAY code is exactly the same in both versions:

PERFORM VARYING W02-O-IDX FROM 1 BY 1
   UNTIL W02-O-IDX > W12-OFFICE-COUNT

   MOVE W02-OFFICE-NAME(W02-O-IDX) TO W05-OFFICE-NAME

   PERFORM 310-CALC-TOTALS VARYING W02-B-IDX FROM 1 BY 1
       UNTIL W02-B-IDX > W13-BENEFIT-COUNT

   MOVE W02-O-TOTAL(W02-O-IDX) TO W05-OFFICE-TOTAL
   DISPLAY W05-OUTPUT-ROW
END-PERFORM

W13-BENEFIT-COUNT is 5 and never changes in the program, so the 6th column is a mystery to me.

Correct output:

Strange output:

Edit: as requested, here is W02-OFFICE-TABLE:

01 W02-OFFICE-TABLE.
    05 W02-OFFICE-ROW OCCURS 11 TIMES
    ASCENDING KEY IS W02-OFFICE-NAME
    INDEXED BY W02-O-IDX.
        10 W02-OFFICE-CODE PIC X(6).
        10 W02-OFFICE-NAME PIC X(13).
        10 W02-BENEFIT-ROW OCCURS 5 TIMES
        INDEXED BY W02-B-IDX.
            15 W02-B-CODE PIC 9(1).
            15 W02-B-TOTAL PIC 9(5)V99 VALUE ZERO.
        10 W02-O-TOTAL PIC 9(5)V99 VALUE ZERO.

and W12-OFFICE-COUNT is always 11, never changes:

01 W12-OFFICE-COUNT PIC 99 VALUE 11.

Need to see: W02-O-IDX; W02-B-IDX; W12-OFFICE-COUNT. Because some people use the suffix "IDX" in a pseudo-random way, we need to see the definitions. Take off the VALUE clause on W-B-TOTAL and the zeros will magically go away (they'll be something else). – Bill Woodger, Mar 27 '16 at 19:30
Now that the fix is in, does all the output line up the same? Look at your screenshots. Why is the TOTAL column, unaffected by the OCCURS issue, in a different place relative to the headings on the two examples? — Bill Woodger, Mar 28 '16 at 06:29

Bill Woodger · Answer 1 · 2016-03-28T11:43:35.347

The question is not so much "why does Enterprise COBOL do that?", because it is documented, as "why do those other two compilers generate programs that do what I want?", which is probably also documented.

Here's a quote from the draft of what became the 2014 COBOL Standard (the actual Standard costs money):

C.3.4.1 Subscripting using index-names

In order to facilitate such operations as table searching and manipulating specific items, a technique called indexing is available. To use this technique, the programmer assigns one or more index-names to an item whose data description entry contains an OCCURS clause. An index associated with an index-name acts as a subscript, and its value corresponds to an occurrence number for the item to which the index-name is associated.

The INDEXED BY phrase, by which the index-name is identified and associated with its table, is an optional part of the OCCURS clause. There is no separate entry to describe the index associated with index-name since its definition is completely hardware oriented. At runtime the contents of the index correspond to an occurrence number for that specific dimension of the table with which the index is associated; however, the manner of correspondence is determined by the implementor. The initial value of an index at runtime is undefined, and the index shall be initialized before use. The initial value of an index is assigned with the PERFORM statement with the VARYING phrase, the SEARCH statement with the ALL phrase, or the SET statement.

[...]

An index-name may be used to reference only the table to which it is associated via the INDEXED BY phrase.

From the second paragraph, it is clear that how an index is implemented is down to the implementor of the compiler. Which means that what an index actually contains, and how it is manipulated internally, can vary from compiler to compiler, as long as the results are the same.

The last paragraph quoted indicates that, by the Standard, a specific index can only be used for the table which defines that specific index.

You have some code equivalent to this in 310-CALC-TOTALS: take a source data-item using the index from its table, and use that index from the "wrong" table to store a value derived from that in a different table.

This breaks the rule that "An index-name may be used to reference only the table to which it is associated via the INDEXED BY phrase."

So you changed your code in 310-CALC-TOTALS to: take a source data-item using the index from its table, and use a data-name or index defined on the destination table to store a value derived from that in a different table.
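
The body of 310-CALC-TOTALS is not shown in the question, so the following is only a sketch of the kind of change described above. It assumes a cut-down, one-dimensional source table, an index W05-B-IDX added to the output table (a hypothetical name), and sample data invented for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. IDXFIX.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Source table, cut down from the question to one dimension.
       01  W02-BENEFIT-TABLE.
           05  W02-BENEFIT-ROW OCCURS 5 TIMES
                   INDEXED BY W02-B-IDX.
               10  W02-B-CODE   PIC 9(1).
               10  W02-B-TOTAL  PIC 9(5)V99.
      * Destination table, now with its own index (assumed name).
       01  W05-OUTPUT-ROW.
           05  W05-BENEFIT-ROW OCCURS 5 TIMES
                   INDEXED BY W05-B-IDX.
               10  FILLER       PIC X(2).
               10  W05-B-TOTAL  PIC ZZ,ZZ9.99.
       PROCEDURE DIVISION.
           MOVE SPACES TO W05-OUTPUT-ROW
           PERFORM VARYING W02-B-IDX FROM 1 BY 1
                   UNTIL W02-B-IDX > 5
      *        Sample data only, so the sketch has something to show.
               MOVE 123.45 TO W02-B-TOTAL (W02-B-IDX)
      *        Copy the occurrence number, not the raw index value.
               SET W05-B-IDX TO W02-B-IDX
               MOVE W02-B-TOTAL (W02-B-IDX)
                 TO W05-B-TOTAL (W05-B-IDX)
           END-PERFORM
           DISPLAY W05-OUTPUT-ROW
           GOBACK.
```

SET with two index-names transfers the occurrence number, so each table is only ever referenced through its own index. A numeric data-name used as a subscript for both tables would work as well.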

So your code now works, and will give you the same result with each compiler.

Why did the Enterprise COBOL code compile, if the Standard (and this was the same for prior Standards) forbids that use?

IBM has a Language Extension. In fact two Extensions, which are applicable to your case (quoted from the Enterprise COBOL Language Reference in Appendix A):

Indexing and subscripting ... Referencing a table with an index-name defined for a different table

and

OCCURS ... Reference to a table through indexing when no INDEXED BY phrase is specified

Thus you get no compile error, as using an index from a different table and using an index when no index is defined on the table are both OK.

So, what does it do, when you use another index? Again from the Language Reference, this time on Subscripting using index-names (indexing):

An index-name can be used to reference any table. However, the element length of the table being referenced and of the table that the index-name is associated with should match. Otherwise, the reference will not be to the same table element in each table, and you might get runtime errors.

Which is exactly what happened to you. The difference in lengths of the items in the OCCURS is down to the "insertion editing" symbols in your PICture for the table you DISPLAY from. If the items in the two tables were the same length, you'd not have noticed a problem.

You gave a VALUE clause for your table items (unnecessary, as you would always put something in them before they are output), and that is what left your "sixth" column: the five previous columns were written as shorter items, so the tail of the table kept its initial value. Note the confusion caused when the editing is done to one length and the storing is done with a different implicit length; you even overwrite the second decimal place.

IBM's implementation of INDEXED BY means that the length of the item(s) being indexed is intrinsic to the value held in the index. Hence the unexpected results when the fields referenced are actually different lengths.
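
To make the length mismatch concrete, here is a small sketch of the arithmetic. It assumes, as IBM's Programming Guide describes for Enterprise COBOL, that an index holds a byte displacement of (occurrence - 1) times the element length. The element lengths come from the definitions in the question: 8 bytes for W02-BENEFIT-ROW (1 + 7) and 11 bytes for W05-BENEFIT-ROW (2 + 9). The WS- names are made up for this illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. IDXOFFS.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Element lengths taken from the definitions in the question.
       01  WS-SRC-LEN       PIC 99 VALUE 8.
       01  WS-DST-LEN       PIC 99 VALUE 11.
       01  WS-OCC           PIC 99.
      * What the index for the 8-byte table would hold (assumed).
       01  WS-DISPLACEMENT  PIC 99.
      * Where the same occurrence starts in the 11-byte table.
       01  WS-EXPECTED      PIC 99.
       PROCEDURE DIVISION.
           PERFORM VARYING WS-OCC FROM 1 BY 1 UNTIL WS-OCC > 5
               COMPUTE WS-DISPLACEMENT = (WS-OCC - 1) * WS-SRC-LEN
               COMPUTE WS-EXPECTED     = (WS-OCC - 1) * WS-DST-LEN
               DISPLAY "OCCURRENCE " WS-OCC
                       " INDEX HOLDS " WS-DISPLACEMENT
                       " TABLE NEEDS " WS-EXPECTED
           END-PERFORM
           GOBACK.
```

Under that assumption the index holds 0, 8, 16, 24 and 32 for occurrences 1 to 5, where the 11-byte output elements start at 0, 11, 22, 33 and 44. Since W05-B-TOTAL sits 2 bytes into its element, the five 9-byte stores land at offsets 2, 10, 18, 26 and 34, each overlapping the last byte of the one before and none reaching the end of the 55-byte table.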

What about the other two compilers? You'd need to consult their documentation to be certain of what was happening. Something as simple as the index being represented by an entry-number (plain 1, 2, 3, etc.), together with allowing an index to reference another table, would be enough. There should be two extensions: one to allow an index to be used on a table which did not define that index, and one to allow an index to be used on a table where no index is defined. The two logically come as a pair, and both only need to be stated explicitly (otherwise the first would suffice) because they are specifically against the Standard.

Micro Focus do have a Language Extension whereby an index from one table may be used to reference data from another table. It is not explicit that this includes referencing a table with no indexes defined, but this is obviously so.

Tutorialspoint uses OpenCOBOL 1.1. OpenCOBOL is now GnuCOBOL. GnuCOBOL 1.1 is the current release, which is different from and more up-to-date than OpenCOBOL 1.1. GnuCOBOL 2.0 is coming soon. I contribute to the discussion area for GnuCOBOL at SourceForge.Net and have raised the issue there. Simon Sobisch of the GnuCOBOL project has previously approached Ideone and Tutorialspoint about their use of the outdated OpenCOBOL 1.1. Ideone have provided positive feedback; Tutorialspoint, whom Simon has contacted again today, nothing yet.

As a side-issue, it looks like you are using SEARCH ALL to do a binary-search of your table. For "small" tables, it is likely that the overhead of the mechanics of the generalised binary-search provided by SEARCH ALL outweighs any expected savings in machine resources. If you were to be processing large amounts of data, it is likely that a plain SEARCH would be more efficient than the SEARCH ALL.
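
For comparison, a plain serial SEARCH might look like the sketch below. The table is cut down from the question, the data is invented, and WS-WANTED-NAME is a hypothetical search argument. A serial SEARCH does not need the ASCENDING KEY phrase or sorted data, which is why the sample entries are deliberately out of order.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SRCHDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Cut-down table; names follow the question where possible.
       01  W02-OFFICE-TABLE.
           05  W02-OFFICE-ROW OCCURS 3 TIMES
                   INDEXED BY W02-O-IDX.
               10  W02-OFFICE-CODE PIC X(6).
               10  W02-OFFICE-NAME PIC X(13).
      * WS-WANTED-NAME is a hypothetical search argument.
       01  WS-WANTED-NAME          PIC X(13) VALUE "LONDON".
       PROCEDURE DIVISION.
      *    Sample data only, not from the original program.
           MOVE "OFF001" TO W02-OFFICE-CODE (1)
           MOVE "PARIS"  TO W02-OFFICE-NAME (1)
           MOVE "OFF002" TO W02-OFFICE-CODE (2)
           MOVE "BERLIN" TO W02-OFFICE-NAME (2)
           MOVE "OFF003" TO W02-OFFICE-CODE (3)
           MOVE "LONDON" TO W02-OFFICE-NAME (3)
      *    A serial SEARCH starts from the current index value.
           SET W02-O-IDX TO 1
           SEARCH W02-OFFICE-ROW
               AT END
                   DISPLAY "OFFICE NOT FOUND"
               WHEN W02-OFFICE-NAME (W02-O-IDX) = WS-WANTED-NAME
                   DISPLAY "FOUND " W02-OFFICE-CODE (W02-O-IDX)
           END-SEARCH
           GOBACK.
```

The SET before the SEARCH matters: unlike SEARCH ALL, the serial form does not initialise the index itself.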

How small is "small" depends on your data. Five is likely to be small close to 100% of the time.

Better performance than SEARCH and SEARCH ALL can be achieved by hand-coding the search, but remember that SEARCH and SEARCH ALL don't make mistakes.

However, especially with SEARCH ALL, mistakes by the programmer are easy. If the data is out of sequence, SEARCH ALL will not operate correctly. Defining more data than is populated gets a table quickly out of sequence as well. If using SEARCH ALL with a variable number of items, consider using OCCURS DEPENDING ON for the table, or "padding" unused trailing entries with a value beyond the maximum key-value that can exist.
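
The "padding" suggestion could be sketched as follows. This assumes an alphanumeric key, so that HIGH-VALUES collates after any real office name. The table is cut down from the question and the three loaded entries are invented sample data.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PADDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  W02-OFFICE-TABLE.
           05  W02-OFFICE-ROW OCCURS 11 TIMES
                   ASCENDING KEY IS W02-OFFICE-NAME
                   INDEXED BY W02-O-IDX.
               10  W02-OFFICE-CODE PIC X(6).
               10  W02-OFFICE-NAME PIC X(13).
       PROCEDURE DIVISION.
      *    Pad every entry first; real entries overwrite the padding.
           PERFORM VARYING W02-O-IDX FROM 1 BY 1
                   UNTIL W02-O-IDX > 11
               MOVE HIGH-VALUES TO W02-OFFICE-ROW (W02-O-IDX)
           END-PERFORM
      *    Sample load of only three entries, in ascending name order.
           MOVE "OFF002" TO W02-OFFICE-CODE (1)
           MOVE "BERLIN" TO W02-OFFICE-NAME (1)
           MOVE "OFF003" TO W02-OFFICE-CODE (2)
           MOVE "LONDON" TO W02-OFFICE-NAME (2)
           MOVE "OFF001" TO W02-OFFICE-CODE (3)
           MOVE "PARIS"  TO W02-OFFICE-NAME (3)
      *    The eight unused entries still sort after the real keys.
           SEARCH ALL W02-OFFICE-ROW
               AT END
                   DISPLAY "OFFICE NOT FOUND"
               WHEN W02-OFFICE-NAME (W02-O-IDX) = "LONDON"
                   DISPLAY "FOUND " W02-OFFICE-CODE (W02-O-IDX)
           END-SEARCH
           GOBACK.
```

With OCCURS DEPENDING ON instead, the unused entries would simply be outside the table as far as SEARCH ALL is concerned, so no padding would be needed.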

Thanks for the explanation. I just assumed an index was a simple for-loop counter and entirely reusable. I never would have suspected that it was bound to the size of the table, but that makes complete sense now. Also, Microfocus gave this warning which I ignored because I was getting the correct results before I switched to z/OS: `"warning COBCH1118 : Index-name belongs to different table - unexpected behaviour may occur"` It would have been nice if the mainframe compiler threw a warning as well. — Jeremy Robson, Mar 28 '16 at 00:40
@JeremyRobson It is valid for Enterprise COBOL, so why would there be a warning? The way it works smoothly is "everyone knows you can't do that", COBOL 101, as it were. So no-one does it. Even though you can. For me, always take warnings seriously. Always find out exactly what they (or any other messages) mean. — Bill Woodger, Mar 28 '16 at 06:10

Magoo · Accepted Answer · 2016-03-27T22:53:57.557

I'd be very hesitant about mixing VALUE with OCCURS and would re-code the WORKING-STORAGE as

01 W05-OUTPUT-ROW.
   05 W05-OFFICE-NAME  PIC X(13).
   05 W05-BENEFITS     PIC X(55) VALUE SPACES.
   05 FILLER REDEFINES W05-BENEFITS.
     07 W05-BENEFIT-ROW OCCURS 5 TIMES.
       10 FILLER       PIC X(02).
       10 W05-B-TOTAL  PIC ZZ,ZZ9.99.
   05 FILLER           PIC X(02) VALUE SPACES.
   05 W05-OFFICE-TOTAL PIC ZZ,ZZ9.99 VALUE ZEROS.

Perhaps it has something to do with the missing fieldname?

Ah! evil INDEXED. I'd make both ***-IDX variables simple 99s.

edited Mar 27 '16 at 22:53

answered Mar 27 '16 at 20:30

There's no problem with the VALUE statement, although it is not particularly useful. This will make no difference to the output. What missing fieldname? – Bill Woodger Mar 27 '16 at 21:07
@BillWoodger : I don't have access to the mainframe, so I can't test. I believe that it will indeed fix the problem - OP can determine. I've never used or seen the like of `10 PIC X(2)` and would always code it as `10 FILLER PIC X(2)`. – Magoo Mar 27 '16 at 21:17
I tried adding the FILLER keyword to the spaces values, no change. But you're right that removing the VALUE ZEROS removes the unexplained zeros. I also tried your code and the result is the same: the spaces are ignored (or perhaps being pushed to the end?) It must have something to do with DISPLAY-ing a table working differently on the mainframe. – Jeremy Robson Mar 27 '16 at 22:57
An unnamed data-item is an implicit FILLER. A VALUE clause is legal, and does what is expected (gives an initial value to all the occurrences of that field in the table). Both of those were introduced in the 1985 COBOL Standard. Prior to that they would both be compile errors. – Bill Woodger Mar 27 '16 at 23:11
@JeremyRobson "Take off the VALUE clause on W-B-TOTAL and the zeros will magically go away (they'll be something else)" I said in my comment. Look at the output in hex, you'll probably see them as binary zeros. – Bill Woodger Mar 27 '16 at 23:13
It was because I was using INDEXED instead of separate PICs for the subscripts. I still require the indices for SEARCH, but used different PICs for the PERFORM VARYING and the display now has the correct output. Thanks for your help, Magoo! – Jeremy Robson Mar 27 '16 at 23:15
@JeremyRobson do you want the explanation? I've already started writing eight minutes ago. Are you still interested that the other two compilers behave differently with your original code? Which code do you consider correct? You have come to a correct solution, and the problem is broadly as you describe. Are you interested in the detail? – Bill Woodger Mar 27 '16 at 23:25
Yes, definitely. Since this is part of an assignment for a COBOL course it would be helpful for everyone who might be stuck now or in the future who wants to reuse indices. I'm especially interested in why Microfocus and tutorialspoint.com gave the same result, and z/OS gave something different. – Jeremy Robson Mar 27 '16 at 23:33
And what is a "simple 99"? – Bill Woodger Mar 28 '16 at 06:11
And the advice to not use indexes is no use if @JeremyRobson is using SEARCH. – Bill Woodger Mar 28 '16 at 06:35
