How do I decompress data compressed with CSRCMPSC (Compression/Expansion) Macro or CMPSC mainframe instruction?

Question

I'm working on a mainframe migration, and the input data file is compressed using the CMPSC instruction. I've read about compressing and expanding data here: CSRCMPSC (Compression/Expansion) Macro and here: Compressing and Expanding data, but neither goes into any detail. What I'm looking for is code (in any language) or an algorithm to decompress the file that I can run on Linux (my target language is Java). I see references to a manual, ESA/390 Data Compression (SA22-7208), but I can't find it anywhere online. Any help would be appreciated!

If the company running the mainframe uses a program (using CMPSC) to compress data, they must also have a program to decompress it. Otherwise, how would they be able to use that data once it has been compressed? Is there no option to run that decompression program and send only the decompressed data to Linux? — phunsoft, May 12 '23 at 11:21

Mark Adler · Answer 1 · 2023-05-10T00:48:05.673

I found SA22-7208 here, where it is in "BKMGR" format, which seems to be an IBM-proprietary format for books. This is a Windows reader for that format. I was able to open the book and read it.

This may also help, at least with what a decompressor would need. It comes from PKWare's ZIP format appnote:

5.17 IBM z/OS CMPSC Compression - Method 16
-------------------------------------------

Method 16 utilizes the IBM hardware compression facility available
on most IBM mainframes.  Hardware compression can significantly 
increase the speed of data compression.  This method uses a variant 
of the LZ78 algorithm.  CMPSC hardware compression is performed
using the COMPRESSION CALL instruction.  

ZIP archives can be created using this method only on mainframes
supporting the CP instruction.  Extraction MAY occur on any
platform supporting this compression algorithm.  Use of this 
algorithm requires creation of a compression dictionary and
an expansion dictionary.  The expansion dictionary MUST be
placed into the ZIP archive for use on the system where
extraction will occur.

Additional information on this compression algorithm and dictionaries
can be found in the IBM provided document titled IBM ESA/390 Data 
Compression (SA22-7208-01). Storage requirements for using CMPSC 
compression are as follows.

The format for the compressed data stream placed into the ZIP
archive following the Local Header is:

    [dictionary header]
    [expansion dictionary]
    [CMPSC compressed data] 

If encryption is used to encrypt a file compressed with CMPSC, these 
sections MUST be encrypted as a single entity.

The format of the dictionary header is:

          Value            Size          Description
          -----            ----          -----------
          Version          1 byte        1
          Flags/Symsize    1 byte        Processing flags and
                                         symbol size
          DictionaryLen    4 bytes       Length of the 
                                         expansion dictionary

Explanation of processing flags and symbol size:

The high 4 bits are used to store the processing flags.  The low
4 bits represent the size of a symbol, in bits (values range
from 9-13).  Flag values are defined below.

    0x80 - expansion dictionary
    0x40 - expansion dictionary is compressed using Deflate
    0x20 - Reserved
    0x10 - Reserved
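
Since your target is Java, here is a minimal sketch of reading that 6-byte dictionary header and splitting the compressed data into fixed-width symbols. Several things in it are assumptions, not something the appnote excerpt states: the class and method names are made up for illustration, the byte order of DictionaryLen is passed in as a parameter (ZIP fields are normally little-endian, but check against a real file), and the symbols are assumed to be packed most-significant-bit first, as is usual for mainframe data. It does not do the expansion itself; for that you need the expansion dictionary entry formats described in SA22-7208.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of any library: parses the dictionary
// header described in the appnote excerpt (Version, Flags/Symsize, DictionaryLen).
final class CmpscZipHeader {
    static final int HEADER_LEN = 6;   // 1 + 1 + 4 bytes

    final int version;
    final boolean hasExpansionDictionary; // flag 0x80
    final boolean dictionaryDeflated;     // flag 0x40
    final int symbolSizeBits;             // low 4 bits, expected 9..13
    final long dictionaryLen;             // unsigned 32-bit length

    private CmpscZipHeader(int version, int flagsAndSize, long dictionaryLen) {
        this.version = version;
        this.hasExpansionDictionary = (flagsAndSize & 0x80) != 0;
        this.dictionaryDeflated = (flagsAndSize & 0x40) != 0;
        this.symbolSizeBits = flagsAndSize & 0x0F;
        this.dictionaryLen = dictionaryLen;
    }

    // ASSUMPTION: the caller chooses the byte order of DictionaryLen;
    // the excerpt above does not say which one is used.
    static CmpscZipHeader parse(byte[] data, ByteOrder order) {
        if (data.length < HEADER_LEN) {
            throw new IllegalArgumentException("need at least 6 header bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(data, 0, HEADER_LEN).order(order);
        int version = buf.get() & 0xFF;
        int flagsAndSize = buf.get() & 0xFF;
        long dictLen = buf.getInt() & 0xFFFFFFFFL;
        CmpscZipHeader h = new CmpscZipHeader(version, flagsAndSize, dictLen);
        if (h.symbolSizeBits < 9 || h.symbolSizeBits > 13) {
            throw new IllegalArgumentException("symbol size out of range: " + h.symbolSizeBits);
        }
        return h;
    }

    // Splits a byte range into fixed-width symbols.
    // ASSUMPTION: symbols are packed MSB-first with no gaps; fewer than
    // 'bits' leftover bits at the end are treated as padding and dropped.
    static int[] unpackSymbols(byte[] data, int offset, int length, int bits) {
        int count = (length * 8) / bits;
        int[] out = new int[count];
        long acc = 0;      // bit accumulator
        int accBits = 0;   // number of valid bits currently in acc
        int n = 0;
        for (int i = offset; i < offset + length && n < count; i++) {
            acc = (acc << 8) | (data[i] & 0xFF);
            accBits += 8;
            while (accBits >= bits && n < count) {
                accBits -= bits;
                out[n++] = (int) ((acc >>> accBits) & ((1L << bits) - 1));
            }
            acc &= (1L << accBits) - 1;  // keep only the unconsumed bits
        }
        return out;
    }
}
```

You would call CmpscZipHeader.parse on the first 6 bytes after the Local Header, read dictionaryLen bytes of expansion dictionary (inflating them first if the 0x40 flag is set), and then pass the rest to unpackSymbols with symbolSizeBits. Each resulting symbol is an index into the expansion dictionary.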

Thank you, Mark! Can you also help me find the reference manuals for IBM BAL and the s390 instruction set? TIA! — snoopyjc, May 11 '23 at 20:08
Ok I found this one https://www.ibm.com/resources/publications/OutputPubsDetails?PubID=SA22720108 — snoopyjc, May 12 '23 at 08:52
And here is the BAL one - this was hard to find!! https://www.ibm.com/resources/publications/OutputPubsDetails?PubID=SC26494004 — snoopyjc, May 12 '23 at 17:01
