5

I have a long log file of the following form:

Processin SCRIPT10 file..
Submitted batch job 1715572
Processin SCRIPT100 file..
Processin SCRIPT1000 file..
Submitted batch job 1715574
Processin SCRIPT10000 file..
Processin SCRIPT10001 file..
Processin SCRIPT10002 file..
Submitted batch job 1715577
Processin SCRIPT10003 file..
Submitted batch job 1715578
Processin SCRIPT10004 file..
Submitted batch job 1715579

I want to find the jobs (script names) that were not submitted, that is, every "Processin" line that is not immediately followed by a "Submitted batch job" line.

So far I have tried to do that task using

pcregrep -M "Processin.*\n.*Processin" execScripts2.log | awk 'NR % 2 == 0'

But it does not properly handle the case where several scripts in a row are not submitted. Surprisingly, it outputs only the SCRIPT1000 and SCRIPT10001 lines. Can you show me a better one-liner?

Ideally the output would contain only the lines that are not followed by a 'Submitted' line (or just the script names), that is:

SCRIPT100
SCRIPT10000
SCRIPT10001

Thanks.

VojtaK

2 Answers

3

This awk command can do the job:

awk -v s='Submitted' '$1 != s{if(p != "") print p; p=$2} $1 == s{p=""}' file

SCRIPT100
SCRIPT10000
SCRIPT10001
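The one-liner keeps the last unconfirmed script name in p and prints it only when another non-"Submitted" line arrives, so a "Processin" line at the very end of the file would go unreported. Below is a minimal sketch of a variant with an END block for that case. It assumes the log format shown in the question; the temporary file and the trailing SCRIPT10005 line are hypothetical, added only to exercise the END block.

```shell
# Hypothetical sample log: the data from the question plus one extra
# trailing "Processin" line (SCRIPT10005) that is never submitted.
log=$(mktemp)
cat > "$log" <<'EOF'
Processin SCRIPT10 file..
Submitted batch job 1715572
Processin SCRIPT100 file..
Processin SCRIPT1000 file..
Submitted batch job 1715574
Processin SCRIPT10000 file..
Processin SCRIPT10001 file..
Processin SCRIPT10002 file..
Submitted batch job 1715577
Processin SCRIPT10003 file..
Submitted batch job 1715578
Processin SCRIPT10004 file..
Submitted batch job 1715579
Processin SCRIPT10005 file..
EOF

# p holds the most recent script name that has not yet seen a "Submitted" line.
result=$(awk '
  $1 == "Submitted" { p = ""; next }          # the pending script was submitted
  p != ""           { print p }               # two "Processin" lines in a row
                    { p = $2 }                # remember this script name
  END               { if (p != "") print p }  # last script was never submitted
' "$log")

printf '%s\n' "$result"
rm -f "$log"
```

With this sample the sketch prints SCRIPT100, SCRIPT10000, SCRIPT10001 and SCRIPT10005, one per line.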

Reference: Effective AWK Programming

anubhava
1

If you would rather not use awk, you can put the logic in a bash script and run that. For someone less familiar with awk, a bash script may be easier to customize further.

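#!/bin/bash
# Print each "Processin" line that is not directly followed by a "Submitted" line.
# Usage: ./check.sh filename

tempText=""

while IFS= read -r line
do
  if [[ "$line" == Processin* ]]
  then
        tempText=$line
        # Look at the line right after the "Processin" line.
        IFS= read -r line
        if [[ "$line" != Submitted* ]]
        then
                # No "Submitted" line followed, so report the script.
                echo "$tempText"
                tempText=$line
                # Keep reporting until a "Submitted" line ends the run.
                while IFS= read -r line
                do
                        if [[ "$line" != Submitted* ]]
                        then
                                echo "$tempText"
                                tempText=$line
                        else
                                break
                        fi
                done
        fi
  fi
done < "$1"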

Run using ./check.sh filename

The current answer works fine though.

Basit Anwer
  • 1
    That will be immensely slow and fail in various ways given various input values. Don't do it. Read [why-is-using-a-shell-loop-to-process-text-considered-bad-practice](https://unix.stackexchange.com/questions/169716/why-is-using-a-shell-loop-to-process-text-considered-bad-practice) to learn some, but not all, of the problems with it. – Ed Morton May 24 '17 at 11:43
  • 1
    Hmm, I didn't know about that. Thank you! – Basit Anwer May 24 '17 at 12:21
  • 1
    I think this is useful too, because a lot of the 'magic' in awk harms the readability of the code. – VojtaK Jun 14 '17 at 14:07