I'm working with a simplified example in which there are workers that can have multiple lifecycles, during which they perform tasks. (This is similar to the example of users logging into different sessions and performing shell commands given in https://community.splunk.com/t5/Splunk-Search/Any-example-for-MAP-command/m-p/88473.)
When a task is started, a taskID and lifecycleID are logged. However, I would also like to look up the corresponding workerID, which would have been logged together with the lifecycleID in a previous log line when the lifecycle started.
Consider the following example data:
{
"level": "info",
"lifecycleID": "af331787-654f-441f-ac06-21b6b7e0c984",
"msg": "Started lifecycle",
"time": "2022-04-02T21:15:38.07991-07:00",
"workerID": "c51df20b-f157-4002-8292-4583ebd3ba9e"
}
{
"level": "info",
"lifecycleID": "af331787-654f-441f-ac06-21b6b7e0c984",
"msg": "Started task",
"taskID": "9de93d09-5e6e-4648-9488-dda0e3e58765",
"time": "2022-04-02T21:15:38.181107-07:00"
}
{
"level": "info",
"lifecycleID": "03d2148c-b697-4d8e-a3ca-f0fb68d2bbb9",
"msg": "Started lifecycle",
"time": "2022-04-02T21:15:38.282264-07:00",
"workerID": "c51df20b-f157-4002-8292-4583ebd3ba9e"
}
{
"level": "info",
"lifecycleID": "03d2148c-b697-4d8e-a3ca-f0fb68d2bbb9",
"msg": "Started task",
"taskID": "243bf757-85c6-4c6e-9eec-6d74886ec407",
"time": "2022-04-02T21:15:38.383176-07:00"
}
{
"level": "info",
"lifecycleID": "9cab44b4-5600-47b3-9acd-47b2641cb0d5",
"msg": "Started lifecycle",
"time": "2022-04-02T21:15:38.483304-07:00",
"workerID": "0b82966c-cc98-48f0-9a36-a699e2cee48c"
}
{
"level": "info",
"lifecycleID": "9cab44b4-5600-47b3-9acd-47b2641cb0d5",
"msg": "Started task",
"taskID": "864819ed-208d-4d3d-96b9-1af4c4c42b08",
"time": "2022-04-02T21:15:38.584478-07:00"
}
{
"level": "info",
"lifecycleID": "9cab44b4-5600-47b3-9acd-47b2641cb0d5",
"msg": "Finished task",
"taskID": "864819ed-208d-4d3d-96b9-1af4c4c42b08",
"time": "2022-04-02T21:15:38.684633-07:00"
}
I would like to generate a table which shows the workerID, lifecycleID, and taskID for each of the three tasks started. So far what I've come up with is:
index="workers" msg="Started task"
| stats count by lifecycleID
| map search="search index=workers msg=\"Started lifecycle\" lifecycleID=$lifecycleID$"
| table workerID, lifecyleID, taskID
However, this doesn't appear to retain the lifecycleID and taskID (like it would if I were to omit the map and simply count by lifecycleID, taskID).
How can I display all three values in the table?
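To make the intended result concrete, here is the lookup I'm after sketched in plain Python rather than SPL. It is only an illustration of the desired join, not Splunk syntax: the function name `tasks_with_workers` and the shortened IDs (`W1`, `L1`, `T1`, ...) are made up for the example and stand in for the UUIDs in the sample data above.

```python
import json

# Abbreviated stand-ins for the sample log lines (hypothetical short IDs
# replace the UUIDs; the event order and structure mirror the example data).
LOG_LINES = [
    '{"msg": "Started lifecycle", "lifecycleID": "L1", "workerID": "W1"}',
    '{"msg": "Started task", "lifecycleID": "L1", "taskID": "T1"}',
    '{"msg": "Started lifecycle", "lifecycleID": "L2", "workerID": "W1"}',
    '{"msg": "Started task", "lifecycleID": "L2", "taskID": "T2"}',
    '{"msg": "Started lifecycle", "lifecycleID": "L3", "workerID": "W2"}',
    '{"msg": "Started task", "lifecycleID": "L3", "taskID": "T3"}',
    '{"msg": "Finished task", "lifecycleID": "L3", "taskID": "T3"}',
]


def tasks_with_workers(lines):
    """Return (workerID, lifecycleID, taskID) for every "Started task" event."""
    events = [json.loads(line) for line in lines]

    # Step 1: build a lifecycleID -> workerID lookup from the lifecycle events.
    worker_by_lifecycle = {
        e["lifecycleID"]: e["workerID"]
        for e in events
        if e["msg"] == "Started lifecycle"
    }

    # Step 2: for each started task, attach the worker found via its lifecycle.
    # .get() yields None if no matching lifecycle event was logged.
    return [
        (worker_by_lifecycle.get(e["lifecycleID"]), e["lifecycleID"], e["taskID"])
        for e in events
        if e["msg"] == "Started task"
    ]


if __name__ == "__main__":
    for row in tasks_with_workers(LOG_LINES):
        print(row)
```

In other words, I want one row per started task, with the workerID filled in from the earlier "Started lifecycle" event that shares the same lifecycleID.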
Update
I've attempted RichG's answer using a subsearch,
index=workers msg="Started lifecycle"
[ search index="workers" msg="Started task"
| stats count by lifecycleID
| fields lifecycleID
| format ]
| table workerID, lifecyleID, taskID
but it generates output identical to that of my own attempt using map, i.e. without the lifecycleID or taskID.