I have input data in the format below, produced by a command, and I need to parse it with a foreach loop, awk, or any other efficient technique to get the desired output shown below.

Input Data:

General Information
  Job = Job1
  Workstation Folder = /test/
  Workstation = xyz-test
  Monitored = No
  Requires Confirmation = No
  Interactive = No
  Critical = No
Runtime Information
  Status = Successful
  Internal Status = SUCC
  Not Satisfied Dependencies = 0
  Rerun Options =
  Rerun on the same workstation = No
  Information =
  Promoted = No
  Return Code = 0
  Return Code Mapping Expression =
Time Information
  Actual Start = 04/30/2023 00:00:05 TZ EDT
  Earliest Start = 04/30/2023 00:00:00 TZ EDT
  Latest Start = 04/30/2023 23:59:00 TZ EDT
  Latest Start Action = Suppress
  Maximum Duration =
  Maximum Duration Action =
  Minimum Duration =
  Minimum Duration Action =
  Critical Latest Start =
Recovery Information
  Action = Stop
  Message =
  Job Definition =
  Workstation Folder =
  Workstation =
Extra Run Information
  -
General Information
  Job = Job2
  Workstation Folder = /test/
  Workstation = xyz-test
  Monitored = No
  Requires Confirmation = No
  Interactive = No
  Critical = No
Runtime Information
  Status = Running
  Internal Status = EXEC
  Not Satisfied Dependencies = 0
  Rerun Options = Every run
  Rerun on the same workstation = No
  Information = Every
  Promoted = No
  Return Code =
  Return Code Mapping Expression =
Time Information
  Actual Start = 04/30/2023 19:17:12 TZ EDT
  Earliest Start = 04/30/2023 19:15:00 TZ EDT
  Latest Start = 04/30/2023 23:59:00 TZ EDT
  Latest Start Action = Suppress
  Maximum Duration =
  Maximum Duration Action =
  Minimum Duration =
  Minimum Duration Action =
  Critical Latest Start =
Recovery Information
  Action = Stop
  Message =
  Job Definition =
  Workstation Folder =
  Workstation =
  Extra Run Information
  -

Each output line should be built by parsing the input data and joining the following fields with dots:

E.g. $Job (from General Information).$Actual Start date (from Time Information).$Actual Start time (from Time Information).$Internal Status (from Runtime Information)

Expected Output:

Job1.04/30/2023.00:00:05.SUCC
Job2.04/30/2023.19:17:12.EXEC
sandy

2 Answers

A potential awk solution:

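awk 'BEGIN {
    RS = "General Information"   # one record per job (multi-character RS works in gawk and mawk; POSIX leaves it unspecified)
    FS = "\n"                    # one field per input line
    OFS = "."
}

# The input starts with the separator, so the first record is empty; skip it.
NF > 0 {
    for (i = 1; i <= NF; i++) {
        if ($i ~ "Job =") {
            gsub(".*= ", "", $i)       # keep only the value after "= "
            a = $i
        } else if ($i ~ "Actual Start =") {
            gsub(".*= ", "", $i)
            gsub(" TZ.*", "", $i)      # drop the time zone
            gsub(" ", ".", $i)         # join date and time with "."
            b = $i
        } else if ($i ~ "Internal Status =") {
            gsub(".*= ", "", $i)
            c = $i
        }
    }
    print a, b, c
}' logfile.txt

Output:

Job1.04/30/2023.00:00:05.SUCC
Job2.04/30/2023.19:17:12.EXEC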
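
A detail worth knowing when the record separator also begins the input: awk reports an empty record before the first separator, so a block that runs for every record also runs once before any job has been read. The sketch below illustrates this with a made-up one-character separator X standing in for "General Information"; the function name leading_record_demo and the sample input are assumptions for illustration only.

```shell
# Hypothetical demo: "X" stands in for the "General Information" separator.
leading_record_demo() {
    # The input begins with the separator, so record 1 is the empty text before it.
    printf 'Xjob1Xjob2' | awk 'BEGIN { RS = "X" } { printf "%d:[%s]\n", NR, $0 }'
}

leading_record_demo
```

Record 1 should come out empty, which is why a per-record block is usually guarded with a condition such as NF > 0 or NR > 1 in this kind of script.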
jared_mamrot

I would use GNU AWK for this task in the following way. Let file.txt contain the input data shown in the question; then

awk 'BEGIN{OFS="."}/Job =/{job=$3}/Actual Start =/{sdate=$4;stime=$5}/Internal Status =/{status=$4}$1=="-"{print job, sdate, stime, status}' file.txt

gives output

Job1.04/30/2023.00:00:05.SUCC
Job2.04/30/2023.19:17:12.EXEC

Explanation: I tell GNU AWK to use the . character as the output field separator (OFS). On a line containing Job = I save the 3rd field in the variable job; on a line containing Actual Start = I save the date and time (4th and 5th fields) in the variables sdate and stime; on a line containing Internal Status = I save the 4th field in the variable status. On a line whose 1st field is - I print the collected data.

Disclaimer: this solution assumes that the job name never contains whitespace, that all of the mentioned data are present, and that - in the 1st field terminates each job. If any of these does not hold, ignore this answer entirely.

(tested in GNU Awk 5.1.0)
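
The assumptions in the disclaimer can be relaxed somewhat. What follows is a minimal sketch, not the method from the answer itself: it splits each line at the first " =" instead of relying on field positions, so a job name may contain spaces, and it starts a new job at each "General Information" line instead of waiting for the - terminator. The helper name parse_jobs is hypothetical, and the sketch assumes section headers carry no trailing whitespace. Missing values are left empty in the output.

```shell
# parse_jobs is a hypothetical helper name; it reads the report on stdin.
parse_jobs() {
    awk '
    function emit() { print job, sdate, stime, status }

    BEGIN { OFS = "." }

    {
        line = $0
        sub(/^[ \t]+/, "", line)              # drop leading indentation
    }

    # A new job starts: flush the previous one and reset its fields.
    line == "General Information" {
        if (seen) emit()
        seen = 1
        job = sdate = stime = status = ""
        next
    }

    {
        pos = index(line, " =")               # first " =" separates key and value
        if (pos == 0) next                    # section headers and "-" lines
        key = substr(line, 1, pos - 1)
        val = substr(line, pos + 2)
        sub(/^[ \t]+/, "", val)               # value may be empty or contain spaces

        if (key == "Job") {
            job = val
        } else if (key == "Actual Start") {
            split(val, t, " ")                # t[1] = date, t[2] = time
            sdate = t[1]; stime = t[2]
        } else if (key == "Internal Status") {
            status = val
        }
    }

    END { if (seen) emit() }
    '
}
```

It can be run as parse_jobs < file.txt. Because keys are compared exactly, a line such as Job Definition = is not mistaken for Job =.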

Daweo