I assume you want to do this "on your own" without external tools (a faster approach can be found at the end).
You first have to "know" your source:
- which compiler was it compiled with (get a manual for this compiler)
- which options were used
Then you have to preparse the source:
- include copybooks (applying the given `REPLACING` rules, if any)
- if the source is in fixed-form reference format: append the content of the current line to the previous line if you find a `-` in column 7
- check for `REPLACE` and change the result accordingly
- remove all comments: in fixed-form reference format or similar (extensions like "variable" format, "terminal" format, ... exist) maybe only lines with `*` or `/` in column 7; in free-form reference format maybe only inline comments; otherwise maybe also inline comments (`*>`) or compiler-specific extensions like `|`. Depending on the further re-engineering you want to do, it could be a good idea to extract the comments and store them, at least with a line number reference.
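The preparsing steps for fixed-form source could be sketched roughly as follows. This is only a minimal illustration in Python, not a complete preprocessor: the function name `preparse_fixed_form` is made up for this example, `COPY`/`REPLACE` handling is left out, continued literals are not treated correctly, and a `*>` inside a literal would wrongly be taken as a comment.

```python
def preparse_fixed_form(lines):
    """Strip comments and join continuation lines of fixed-form COBOL.

    Returns (code, comments); both are lists of (line_no, text) pairs.
    Assumption: sequence area in columns 1-6, indicator in column 7,
    program text in columns 8-72.
    """
    code, comments = [], []
    for line_no, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n").expandtabs(8)[:72]  # ignore columns 73+
        indicator = line[6] if len(line) > 6 else " "
        text = line[7:]
        if indicator in ("*", "/"):
            # Full comment line: keep it with its line number for later use.
            comments.append((line_no, text.strip()))
            continue
        inline = text.find("*>")
        if inline != -1:
            # Inline comment (simplification: literals are not checked).
            comments.append((line_no, text[inline + 2:].strip()))
            text = text[:inline]
        if indicator == "-" and code:
            # Continuation line: glue it onto the previous code line.
            prev_no, prev_text = code[-1]
            code[-1] = (prev_no, prev_text.rstrip() + text.strip())
            continue
        if text.strip():
            code.append((line_no, text.rstrip()))
    return code, comments
```

Keeping the original line number with every entry makes it possible to map results back to the source later.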
Then you can finally track the procedure name with the following rules:
- go backwards to the last separator period (there are more rules, but the rule "a period followed by at least one line break, another period, a space, a comma or a semicolon" [I've never seen the last two in real code, but they are possible] should be enough)
- check if there is only one word between this separator period and the next one
- if this word is not a reserved COBOL word (this depends on your compiler), it is very likely a procedure name
Start from here and check the output, then fine-tune the rule based on actual false positives or missing entries.
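As a starting point, the rule above might look like this in Python. It is a rough sketch under several assumptions: the input is a list of `(line_no, text)` pairs of already preparsed code, only whitespace is treated as the boundary after a period, literals containing periods are not handled, and `RESERVED` is a tiny illustrative subset that you would replace with the reserved word list of your compiler. The names `RESERVED` and `find_procedure_names` are invented for this example.

```python
# Illustrative subset only; take the real list from your compiler manual.
RESERVED = {"EXIT", "CONTINUE", "GOBACK", "DECLARATIVES",
            "END-IF", "END-PERFORM", "END-EVALUATE", "END-READ"}

def find_procedure_names(code):
    """Return (line_no, name) for every likely procedure name.

    code: list of (line_no, text) pairs without comments.
    """
    names = []
    sentence = []  # tokens seen since the last separator period
    in_procedure_division = False
    for line_no, text in code:
        for token in text.split():
            sentence.append((line_no, token))
            if not token.endswith("."):
                continue
            # A separator period was reached: look at the sentence it closes.
            words = [t.rstrip(".").upper() for _, t in sentence]
            words = [w for w in words if w]
            first_line = sentence[0][0]
            sentence = []
            if not in_procedure_division:
                in_procedure_division = words[:2] == ["PROCEDURE", "DIVISION"]
                continue
            if len(words) == 1 and words[0] not in RESERVED:
                names.append((first_line, words[0]))
    return names
```

Section headers (`name SECTION.`) consist of two words and are therefore not reported by this one-word rule; that is one of the cases to add when refining it.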
If you want to do more than just extract the procedure names for `PERFORM` and `GO TO` (you should at least check the sources for `PERFORM ... THRU`), then this can become a lot of work...
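To get a first impression of which procedures are referenced, a regular expression may already be enough. The following Python sketch is an assumption-laden illustration, not a parser: the names `REFERENCE`, `INLINE_KEYWORDS` and `find_references` are invented here, literals are not skipped, `GO TO` without `TO` is ignored, and an inline `PERFORM identifier TIMES` would be reported as a false positive.

```python
import re

# PERFORM / GO TO followed by a name, optionally THRU/THROUGH a second name.
# The lookbehind keeps END-PERFORM from being matched as PERFORM.
REFERENCE = re.compile(
    r"(?<![A-Z0-9-])(?:PERFORM|GO\s+TO)\s+([A-Z0-9][A-Z0-9-]*)"
    r"(?:\s+(?:THRU|THROUGH)\s+([A-Z0-9][A-Z0-9-]*))?",
    re.IGNORECASE,
)

# Words that start an inline PERFORM instead of naming a procedure.
INLINE_KEYWORDS = {"UNTIL", "VARYING", "WITH", "TEST", "FOREVER"}

def find_references(text):
    """Return (first_procedure, last_procedure_or_None) for each reference."""
    refs = []
    for match in REFERENCE.finditer(text):
        first = match.group(1).upper()
        last = match.group(2)
        if first in INLINE_KEYWORDS or first.isdigit():
            continue  # inline PERFORM, e.g. PERFORM UNTIL ... or PERFORM 5 TIMES
        refs.append((first, last.upper() if last else None))
    return refs
```

For a `THRU` range you would still have to resolve every procedure located between the two names, which needs the procedure positions found earlier.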
Faster approach with external tools:
- run a COBOL compiler on the complete sources and tell it to do the preparsing only - this way the big second step (preparsing) is already solved
- if you have the option: tell the compiler or an external tool to create a symbol table / cross reference - this will tell you each procedure's name and the line it is in (you can then find the correct procedure simply by comparing line numbers)
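Once you have the procedure names with their line numbers, the "comparing the line" part is a simple lookup. A small Python sketch, assuming you have already turned the symbol table into a list of `(start_line, name)` pairs sorted by line (the listing format differs between compilers, so that extraction is not shown; `procedure_at` is a name made up for this example):

```python
from bisect import bisect_right

def procedure_at(line_no, procedures):
    """Return the name of the procedure that contains line_no.

    procedures: list of (start_line, name) pairs sorted by start_line.
    Returns None if line_no lies before the first procedure.
    """
    starts = [start for start, _ in procedures]
    # Index of the last procedure starting at or before line_no.
    index = bisect_right(starts, line_no) - 1
    return procedures[index][1] if index >= 0 else None
```

For example, with `[(4, "MAIN-PARA"), (7, "SUB-PARA")]` a statement in line 5 belongs to `MAIN-PARA`.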
Just a note: you may want to check GnuCOBOL (formerly OpenCOBOL) for the preparsing and/or the generation of symbol tables / cross references, printcbl as a completely external tool for preparsing, and cobxref for complete cross-reference generation.