Simplifying a DF/Sort job that reads SMF to analyse a dataset's lifecycle

Question

So I have a batch job that extracts SMF type 14, 15 and 17 records into 3 separate files and then formats the files to produce a list of which datasets were read, written to and deleted by which jobs. This is then sorted by timestamp so you can see the 'lifecycle' for a particular dataset.

However, I know that DF/Sort is pretty powerful and I think that my initial step to separate out the type 14, 15 and 17 records isn't necessary, and it could be done in one step, but I'm not really sure where to start as DFSort/ICETOOL has gotten pretty sophisticated.

Here's my current JCL:

//JBSP03DL JOB (JSDBBSP,P10),'SMF F NOW',                              
//         NOTIFY=&SYSUID,
//         CLASS=L,
//         MSGCLASS=X,
//         REGION=8M                                            
//*
//DELETE   EXEC PGM=IEFBR14
//OUTDSN DD DISP=(MOD,DELETE),DSN=JSDBSP.JBSP03.DSLIFE.TXT,
//       UNIT=SYSDA
//*
//SMFDUMP  EXEC PGM=IFASMFDP,REGION=6M
//*
//SYSPRINT DD SYSOUT=*
//* Extract type 14, 15 and 17 records into 3 temporary datasets
//DUMPIN   DD DISP=SHR,DSN=JSHSMF.SMF.JXSF.MANDUMP
//*
//DUMP14   DD  DISP=(,PASS),DSN=&&TYPE14,
//            UNIT=SYSDA,SPACE=(CYL,(500,200),RLSE),
//            BUFNO=20,BLKSIZE=27998,LRECL=32760,RECFM=VBS
//DUMP15   DD  DISP=(,PASS),DSN=&&TYPE15,
//            UNIT=SYSDA,SPACE=(CYL,(500,200),RLSE),
//            BUFNO=20,BLKSIZE=27998,LRECL=32760,RECFM=VBS
//DUMP17   DD  DISP=(,PASS),DSN=&&TYPE17,
//            UNIT=SYSDA,SPACE=(CYL,(500,200),RLSE),
//            BUFNO=20,BLKSIZE=27998,LRECL=32760,RECFM=VBS
//*
//SYSIN    DD *
INDD(DUMPIN,OPTIONS(DUMP))
OUTDD(DUMP14,TYPE(14))
OUTDD(DUMP15,TYPE(15))
OUTDD(DUMP17,TYPE(17))
//*
//SORTPROC PROC
//SORTWRTE EXEC PGM=SORT,REGION=8M
//SORTOUT  DD  DISP=MOD,DSN=&&SORTTMP,
//             SPACE=(CYL,(20,20)),UNIT=SYSDA
//SYSOUT   DD   SYSOUT=*
//SYSPRINT DD   SYSOUT=*
//SORTWK01 DD   DISP=(NEW,DELETE),DSN=&&TEMPSORT,UNIT=SYSDA,
//  SPACE=(CYL,(50,50))
//         PEND
//*
//* Process the type 14 records
//TYPE14   EXEC SORTPROC
//SORTIN   DD   DISP=SHR,DSN=&&TYPE14
//SORTOUT  DD  DISP=(,PASS),DSN=&&SORTTMP,
//             SPACE=(CYL,(20,20)),UNIT=SYSDA,
//             LRECL=133
//SYSIN    DD   *
  SORT FIELDS=(11,4,PD,A,7,4,PD,A)
  SUM FIELDS=NONE
  OUTREC BUILD=(11,4,DT1,EDIT=(TTTT-TT-TT),   DATE OF RECORD
                C' AT ',
                7,4,TM4,EDIT=(TT:TT:TT.TT), TIME OF RECORD
                C' ',
                69,44,
                C' was opened by ',
                19,8),CONVERT
//*
//* Process the type 15 records
//TYPE15   EXEC SORTPROC
//SORTIN   DD   DISP=SHR,DSN=&&TYPE15
//SYSIN    DD   *
  SORT FIELDS=(11,4,PD,A,7,4,PD,A)
  SUM FIELDS=NONE
  OUTREC BUILD=(11,4,DT1,EDIT=(TTTT-TT-TT),   DATE OF RECORD
                C' AT ',
                7,4,TM4,EDIT=(TT:TT:TT.TT), TIME OF RECORD
                C' ',
                19,8,
                C' opened ',
                69,44,
                C' for output'),CONVERT
//*
//* Process the type 17 records
//TYPE17   EXEC SORTPROC
//SORTIN   DD   DISP=SHR,DSN=&&TYPE17
//SYSIN    DD   *
  SORT FIELDS=(11,4,PD,A,7,4,PD,A)
  SUM FIELDS=NONE
  OUTREC BUILD=(11,4,DT1,EDIT=(TTTT-TT-TT),   DATE OF RECORD
                C' AT ',
                7,4,TM4,EDIT=(TT:TT:TT.TT), TIME OF RECORD
                C' ',
                19,8,
                C' deleted ',
                44,44),CONVERT
//*
//*  Finally sort the output file by the date & time stamp 
//*
//FINAL   EXEC SORTPROC
//SORTIN   DD   DISP=(OLD,DELETE),DSN=&&SORTTMP
//SORTOUT  DD   DISP=(NEW,CATLG),DSN=JSDBSP.JBSP03.DSLIFE.TXT,
//            UNIT=SYSDA,LRECL=121,RECFM=FB,SPACE=(CYL,(20,30))
//SYSIN    DD   *
SORT FIELDS=(1,23,CH,A)

Is it possible to do this without separating the 14, 15 and 17 records into separate files?

Edit: the above JCL does exactly what I want, but I'd like to be able to filter by dataset name or job name if possible, as this can produce a lot of output, which is then too big for ISPF Edit or View for further analysis.

Edit:

    Type 14:
Offset  Hex  Name            Length  Format  Description
5       5    SMF14RTY        1       binary  Record type 14 (X'0E').
18      12   SMF14JBN        8       EBCDIC  Job name.
68      44   SMF14_JFCBDSNM  44      EBCDIC  Data set name (DSNAME=).

    Type 15:
Offset  Hex  Name            Length  Format  Description
5       5    SMF15RTY        1       binary  Record type 15 (X'0F').
18      12   SMF15JBN        8       EBCDIC  Job name.
68      44   SMF15_JFCBDSNM  44      EBCDIC  Data set name (DSNAME=).

    Type 17:
Offset  Hex  Name            Length  Format  Description
5       5    SMF17RTY        1       binary  Record type 17 (X'11').
18      12   SMF17JBN        8       EBCDIC  Job name.
44      2C   SMF17DSN        44      EBCDIC  Data set name.

A further enhancement would be to check if an OPEN was actually creating the dataset. I should also add RENAMES, otherwise you might lose track of what happened to a particular dataset.

Edit:

Following Bill's guidelines, my JCL is now:

//DELETE   EXEC PGM=IEFBR14                                   
//OUTDSN DD DISP=(MOD,DELETE),DSN=JSDBSP.JBSP03.DSLIFE.TXT,   
//       UNIT=SYSDA                                           
//*                                                           
//SORTWRTE EXEC PGM=SORT,REGION=8M                            
//*                                                           
//SORTIN   DD   DISP=SHR,DSN=JSHSMF.SMF.JXSG.MANDUMP          
//SORTOUT  DD  DISP=(MOD,CATLG),DSN=JSDBSP.JBSP03.DSLIFE.TXT, 
//             SPACE=(CYL,(20,20)),                           
//             UNIT=SYSDA,LRECL=133                           
//*                                                           
//SYSOUT   DD   SYSOUT=*                                      
//SYSPRINT DD   SYSOUT=*                                      
//SYMNOUT  DD   SYSOUT=*                                      
//SYMNAMES DD   *                                             
 SMF-RECORD-TYPE,5,1,BI                                       
 SMF-JOB-NAME,19,8,CH                                         
 SMF-14-15-DSN,69,44,CH                                       
 SMF-17-DSN,44,44,CH                                          
 SMF-DATE,11,4,DT1                                            
 SMF-TIME,7,4,TM4                                             
//*                                                           
//SYSIN    DD   *                                             
  SORT FIELDS=(11,4,PD,A,7,4,PD,A)                            
  OUTREC IFTHEN=(WHEN=(SMF-RECORD-TYPE,EQ,14),                
              BUILD=(SMF-DATE,EDIT=(TTTT-TT-TT),              
                     C' AT ',                                 
                     SMF-TIME,EDIT=(TT:TT:TT.TT),             
                     C' ',                                    
                     SMF-14-15-DSN,                           
                     C' was opened by ',                      
                     SMF-JOB-NAME)),CONVERT

But this gives:

OUTREC IFTHEN=(WHEN=(5,1,BI,EQ,14),BUILD=(11,4,DT1,EDIT=(TTTT-TT-TT),C' AT ',7,4
,TM4,EDIT=(TT:TT:TT.TT),C' ',69,44,C' was opened by ',19,8)),CONVERT            
                                                            *                                  
WER268A  OUTREC STATEMENT  : SYNTAX ERROR

Leaving off the

,CONVERT

gives me :

WER235A  OUTREC   RDW NOT INCLUDED

Edit - latest update:

Just trying to isolate type 14 records, so current input is now:

//SYMNAMES DD   *       
 SMF-RECORD-TYPE,6,1,BI 
 SMF-JOB-NAME,11,8,CH   
 SMF-14-15-DSN,65,44,CH 
 SMF-17-DSN,44,44,CH    
 SMF-DATE,11,4,DT1      
 SMF-TIME,7,4,TM4       

//SYSIN    DD   *
    SORT FIELDS=(11,4,PD,A,7,4,PD,A)                  
    OUTFIL IFTHEN=(WHEN=(SMF-RECORD-TYPE,EQ,14),      
                BUILD=(1,4,SMF-DATE,EDIT=(TTTT-TT-TT),
                       C' AT ',                       
                       SMF-TIME,EDIT=(TT:TT:TT.TT),   
                       C' ',                          
                       SMF-14-15-DSN,                 
                       C' was opened by ',            
                       SMF-JOB-NAME))

score 2 · Answer 1 · dstaudacher · 2016-10-16T12:30:31.810

"Is it possible to do this without separating the 14, 15 and 17 records into separate files?"

According to http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/IEA2G2C0/3.2.1

... a DD statement

//DUMP   DD  DISP=(,PASS),...

... with a control statement of

OUTDD(DUMP,TYPE(14,15,17))

... would combine all the types into one file.

edited Oct 16 '16 at 12:30

answered Oct 16 '16 at 12:22



Hi David. Yes, but the question is how to combine the SORT statements thereafter. – Bill Woodger Oct 16 '16 at 14:07
David - yes, now I can distinguish between the record types in code, I can simply use the SMF archive as input to my SORT. I previously separated them out so I could run 3 slightly different sets of control statements against each type. – Steve Ives Oct 16 '16 at 16:48

score 2 · Accepted Answer · answered Oct 16 '16 at 15:10

Yes, and it is fairly painless.

IFTHEN=(WHEN= allows various types of conditional processing.

Here you can use the IFTHEN=(WHEN=(logicalexpression) to make a case/select/evaluate-type structure:

IFTHEN=(WHEN=(6,1,BI,EQ,14),
         ...),
IFTHEN=(WHEN=(6,1,BI,EQ,15),
         ...),
IFTHEN=(WHEN=NONE,
         ...)

WHEN=NONE is the "catch-all", for when none of the previous tests is true. IFTHEN=(WHEN=(logicalexpression) stops for the current record when one test is true. Even if a second condition on the current record were to be true, it would not get actioned. If you want two or more "hits" in IFTHEN=(WHEN=(logicalexpression) then you have to use HIT=NEXT at the end of each test where you may want to "pass it on" to the next test. Here, that isn't relevant, since it is the same field tested for a single value.
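
As a rough illustration of HIT=NEXT, here is a sketch against a made-up layout: an 80-byte fixed-length record with a type code in column 1 and a region code in column 2. The layout, the literals and the flag columns 71 and 72 are all assumptions for the example, nothing to do with the SMF records. Without HIT=NEXT on the first clause, a record matching both tests would only get the first flag.

```jcl
* Hypothetical 80-byte FB layout: col 1 = type code, col 2 = region
* code, cols 71-72 = flag bytes set by this step.
  OUTREC IFTHEN=(WHEN=(1,1,CH,EQ,C'A'),
                 OVERLAY=(71:C'Y'),HIT=NEXT),
         IFTHEN=(WHEN=(2,1,CH,EQ,C'X'),
                 OVERLAY=(72:C'Y'))
```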

IFTHEN can appear on INREC, OUTREC, or OUTFIL. You have your processing on OUTREC, so you would have (although see my later comment):

OUTREC IFTHEN=(WHEN=(6,1,BI,EQ,14),
                ...),
       IFTHEN=(WHEN=(6,1,BI,EQ,15),
                ...),
       IFTHEN=(WHEN=NONE,
                ...)

BUILD, OVERLAY and PARSE can be used within IFTHEN.

Some thoughts and tips.

I am suspicious of your SUM FIELDS=NONE. This would drop any records with a duplicate key. Which of the duplicate input records is retained depends on your options. If you use OPTION EQUALS or EQUALS on the SORT (or MERGE) then the first record will always be retained. If you don't, the record which is retained when the key is duplicate can vary from run to run. EQUALS has some impact on performance.

Anyway, I'm not sure why you have SUM FIELDS=NONE here. You can even get an "accidental" match across entirely different data sets.
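
If the SUM FIELDS=NONE does need to stay, a minimal sketch of making the retained record predictable would be something like the following. The sort key is the one from the question; whether dropping duplicates is wanted at all is still the open issue.

```jcl
* EQUALS keeps records with equal keys in their input order, so
* SUM FIELDS=NONE retains the first record of each duplicate set.
  OPTION EQUALS
  SORT FIELDS=(11,4,PD,A,7,4,PD,A)
  SUM FIELDS=NONE
```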

If you are going to SORT and then select only part of the data (in OUTREC or OUTFIL), then always consider "cutting down" the record which is to be sorted, so that it only includes the data you will later use. When SORTing, the less data, the less time, memory and temporary storage is used.
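
A sketch of what "cutting down" might look like here, under some assumptions: only type 14 and 15 records are wanted, the record type is at position 6, and the other positions are the ones used in the question (time at 7, date at 11, job name at 19, data set name at 69). I have also assumed the SMF time is a 4-byte binary count of hundredths of a second and so sorted it as BI rather than PD.

```jcl
* Keep only types 14 and 15, then shrink each record before the sort.
  INCLUDE COND=(6,1,BI,EQ,14,OR,6,1,BI,EQ,15)
* RDW, time+date (7,8), job name, data set name.
  INREC BUILD=(1,4,7,8,19,8,69,44)
* After INREC: time is at 5, date at 9, job name at 13, DSN at 21.
  SORT FIELDS=(9,4,PD,A,5,4,BI,A)
```

INCLUDE works on the original record, but SORT and any later OUTREC or OUTFIL see the shortened record, so they have to use the new positions.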

Consider using DYNAM for temporary storage, and remove your SORTWKn DD names from the JCL (you only have one here, but...). Dynamic allocation of workspace means you don't have to think much at all about the workspace (unless you have huge datasets with widely variant record-lengths for the data) and you don't "overallocate".
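
In DFSORT this is the DYNALLOC option. A minimal sketch, assuming SYSDA is an acceptable work unit at your site and that up to four work data sets is a sensible ceiling (both values are placeholders); the Syncsort spelling and defaults may differ, so check the manual for your product:

```jcl
* Assumed values: SYSDA as the work unit, up to 4 work data sets.
  OPTION DYNALLOC=(SYSDA,4)
  SORT FIELDS=(11,4,PD,A,7,4,PD,A)
```

With that in place the SORTWK01 DD can come out of the PROC.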

SORT Symbols. Symbols allow you to name your data, so references to the same field can be done by name, and SORT looks after the less thrilling task of typing the start-position and length each time. It also reduces the amount of comments required, because the field already has a name, which you can make descriptive.

Symbols are defined in a separate data set (F/FB 80) with a SYMNAMES DD. The translated symbols (which also provide a record of what was used) are held in a SYMNOUT dataset, which is not required, but is useful.

SORT then applies the symbols to your control cards, and as well as showing your original source in the SYSOUT, shows you the translated cards.

Symbols for this task could be specified along these lines

SMF-RECORD-TYPE,6,1,BI
SMF-JOB-NAME,19,8,CH
SMF-14-15-DSN,69,44,CH
SMF-17-DSN,45,44,CH
SMF-DATE,11,4,DT1
SMF-TIME,7,4,TM4

Then you can replace the multiple definitions of the same field with the symbol, and let SORT do the work.

If you want to do selection on data sets, you can look at using the PARM and the special symbols JP0-JP9. Or hard-coding. Or generating the SORT control cards from a list of data sets, or by using JOINKEYS.
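
For the hard-coded route, a sketch along these lines might do. The job name MYJOB and the string PAYROLL are invented for the example, the positions assume the type 14/15 layout from the question (record type at 6, job name at 19, data set name at 69), and VLSCMP is there so that records too short to contain a compare field are not treated as errors.

```jcl
* Hypothetical filter: type 14/15 records for job MYJOB where the
* data set name contains PAYROLL anywhere in its 44 bytes.
  OPTION VLSCMP
  INCLUDE COND=((6,1,BI,EQ,14,OR,6,1,BI,EQ,15),AND,
                19,8,CH,EQ,C'MYJOB   ',AND,
                69,44,SS,EQ,C'PAYROLL')
```

Doing the selection on INCLUDE, before the SORT, also means fewer records get sorted.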

Oh, and I know that you know, but you are actually using SYNCSORT. DFSORT does not have CONVERT on OUTREC, but it does on OUTFIL. To be transportable, here simply change your OUTREC to OUTFIL.
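
As a sketch of the OUTFIL form, here is the type 14 formatting from the question moved onto OUTFIL with VTOF (CONVERT is the other spelling of the same parameter). It assumes only type 14 records reach this point; note that with VTOF the BUILD does not include the RDW, because the output is fixed-length.

```jcl
* Variable-length input converted to fixed-length output records.
  OUTFIL FNAMES=SORTOUT,VTOF,
         BUILD=(11,4,DT1,EDIT=(TTTT-TT-TT),C' AT ',
                7,4,TM4,EDIT=(TT:TT:TT.TT),C' ',
                69,44,C' was opened by ',19,8)
```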

score 1 · Answer 3 · answered Oct 19 '16 at 14:02

OK - with help from Bill (whose answer I'm accepting as he got me going) and after taking the decision to get stuck into the manuals, this is my result:

//jobname JOB (acct_code),'pgmr_name',                               
//         NOTIFY=&SYSUID,
//         CLASS=L,
//         MSGCLASS=X,
//         REGION=8M
//*
// SET OUTFILE=your.results.file
//*
//DELETE   EXEC PGM=IEFBR14
//OUTDSN DD DISP=(MOD,DELETE),DSN=&OUTFILE,
//       UNIT=SYSDA
//*
//SMFDUMP  EXEC PGM=IFASMFDP,REGION=6M
//*
//SYSPRINT DD SYSOUT=*
//*
//DUMPIN   DD DISP=SHR,DSN=your.smf.dataset
//*
//DUMPOUT  DD  DISP=(,PASS),DSN=&&SMFTEMP,
//            UNIT=SYSDA,SPACE=(CYL,(500,200),RLSE),
//            BUFNO=20,DCB=*.DUMPIN
//*
//SYSIN    DD *
INDD(DUMPIN,OPTIONS(DUMP))
OUTDD(DUMPOUT,TYPE(14,15,17,18))
//*
//SORTPROC PROC
//SORTWRTE EXEC PGM=SORT,REGION=8M
//SORTIN   DD   DUMMY
//SORTOUT  DD   DUMMY
//SYSOUT   DD   SYSOUT=*
//SYSPRINT DD   SYSOUT=*
//SYMNAMES DD   *
 RDW,1,4,BI
 SMF-RECORD-TYPE,6,1,BI
 SMF-JOB-NAME,19,8,CH
 SMF-14-15-DSN,69,44,CH
 SMF-17-18-DSN,45,44,CH
 SMF-17-DSN,45,44,CH
 SMF-18-DSN,45,44,CH
 SMF-18-NDSN,89,44,CH
 SMF-DATE,11,4,DT1
 SMF-TIME,7,4,TM4
 SMFDEBOP,253,1
//         PEND
//*
//PROCESS  EXEC SORTPROC
//SORTIN   DD   DISP=OLD,DSN=&&SMFTEMP
//SORTOUT  DD  DISP=(,PASS),DSN=&&SORTTMP,
//             SPACE=(CYL,(20,20)),UNIT=SYSDA
//SYSIN    DD   *
  SORT FIELDS=(11,4,PD,A,7,4,PD,A)
  OUTREC IFTHEN=(WHEN=(SMF-RECORD-TYPE,EQ,14),
            BUILD=(RDW,SMF-DATE,EDIT=(TTTT-TT-TT),
                C' AT ',
                SMF-TIME,EDIT=(TT:TT:TT.TT),
                C' ',
                SMF-14-15-DSN,
                C' was opened by ',
                SMF-JOB-NAME)),
         IFTHEN=(WHEN=(SMF-RECORD-TYPE,EQ,15),
            BUILD=(RDW,SMF-DATE,EDIT=(TTTT-TT-TT),
                C' AT ',
                SMF-TIME,EDIT=(TT:TT:TT.TT),
                C' ',
                SMF-JOB-NAME,
                C' opened ',
                SMF-14-15-DSN,
                C' for output')),
         IFTHEN=(WHEN=(SMF-RECORD-TYPE,EQ,17),
            BUILD=(RDW,SMF-DATE,EDIT=(TTTT-TT-TT),
                C' AT ',
                SMF-TIME,EDIT=(TT:TT:TT.TT),
                C' ',
                SMF-JOB-NAME,
                C' deleted ',
                SMF-17-DSN)),
         IFTHEN=(WHEN=(SMF-RECORD-TYPE,EQ,18),
            BUILD=(RDW,SMF-DATE,EDIT=(TTTT-TT-TT),
                C' AT ',
                SMF-TIME,EDIT=(TT:TT:TT.TT),
                C' ',
                SMF-JOB-NAME,
                C' renamed ',
                SMF-18-DSN,
                C' to ',
                SMF-18-NDSN))
//*
//FINAL   EXEC SORTPROC
//SORTIN   DD   DISP=OLD,DSN=&&SORTTMP
//SORTOUT  DD   DSN=&OUTFILE,
//           DISP=(NEW,CATLG),UNIT=SYSDA,SPACE=(CYL,(20,30),RLSE)
//SYSIN    DD   *
  OPTION VLSHRT,VLSCMP
  SORT FIELDS=(5,25,CH,A)
  INCLUDE COND=(1,125,SS,EQ,C'PEEL',
                AND,
                1,125,SS,EQ,C'XCOM')
  OUTFIL FNAMES=SORTOUT,VTOF,OUTREC=(5,126)

I couldn't work out if it was possible to incorporate the final step into the main step, but I'm happy with it as it is. Note that we are actually using Syncsort, not DF/Sort, so be aware that changes might be required if you are a DF/Sort shop.

The INCLUDE COND is there because most of the time, the output dataset is too large for ISPF Edit or View, otherwise you could just edit the output and filter it there.

The original SORT I think is only there to control the order in the output, rather than something necessary for processing (or thought necessary, going back to the SUM). You could SORT for your final output, and use INCLUDE= on OUTFIL to do that selection. However, if you are going to "prepare" the data with the first step, and do various runs of the FINAL, then better to have them split. Remember, the symbols can help in FINAL as well. – Bill Woodger Oct 19 '16 at 14:54
@BillWoodger - indeed - this is driven by an ISPF panel so I'll probably add an option to save the SMF extract and then run another 'PROCESS' step against it, to save the potentially lengthy SMF step. – Steve Ives Oct 19 '16 at 14:57
@Bill - Just realised that I should have a 'filter' step before the main sort with the search arguments, to cut down on the amount of data being sorted. I got a reduction in elapsed time of 32% (12.91 minutes down to 8.89) – Steve Ives Oct 21 '16 at 10:46
Yes, it would depend on the projected use. If occasional but needs up-to-date information, cut it down early. If "historical" but multiple, prepare the data for the period (end of month/week/day/whatever) and do the multiple queries off the back. And of course anything in between, and possibly different concurrent usage :-) – Bill Woodger Oct 21 '16 at 12:04
