
I have some log files that are written on a Unix server by a front-end application. Every logging statement in these files starts with a timestamp value followed by the logging text. A sample of how the logging is done in these files is shown below:

02 07:31:05.578 logging text........ (I will use this first timestamp to explain the timestamp notation below)

02 07:31:05.579 logging text........

02 07:31:05.590 logging text........

02 07:31:05.591 logging text........

02 07:31:05.593 logging text........

Timestamp value explanation:

02 : Day of the month (if the date is July 02, the value will be 02)

07 : Hours

31 : Minutes

05 : Seconds

578 : Milliseconds

Note : There is no 'YYYY' (year) field; for simplicity, please stick to the format above.

What I have to achieve : I have to find the exact pair of consecutive timestamps in a given file that has the maximum difference between them, compared to all other pairs of consecutive timestamps in that file.

Example : in the logging sample above, the only pair of consecutive timestamps with the maximum difference is 02 07:31:05.579 and 02 07:31:05.590.

I am looking for a shell script that I can run on the required file and that outputs the two consecutive timestamps with the maximum difference.

Why I need it : There are many such log files that I need to monitor for cases where there is a huge gap between logging statements. This could help me find situations where, for example, a SQL query is waiting a long time for a transaction to complete because of locks, or an API request is not getting a response from its destination.

If anyone can also share links to related posts or a more efficient approach, that would be helpful.

Thank you everyone for reading and taking your time. Please let me know if any more information is required.

utkarsh-k
  • ... You must read [ask] – J. Chomel Sep 12 '18 at 12:50
  • @J.Chomel, thanks. Please see, I updated the post after going through your link. – utkarsh-k Sep 12 '18 at 13:59
  • In general this will not be possible because of the truncated date used. You could have `28 23:59:59.999`, followed by `01 00:00:00.001`. This could mean 2 milliseconds (at the end of February in a non-leap-year) or three days (at the end of August). But since we don't know the month, we can just take a guess. Converting the timestamps into UNIX epoch times (seconds since 1970-01-01) also isn't possible given this input. So whoever came up with this timestamp format made your job quite hard. – Alfe Sep 13 '18 at 09:28

2 Answers


What you could do is write a script with the awk command. There are examples of how to convert dates with awk here: Converting dates in AWK.

This will help you parse the file, and add 2 columns at the beginning of each line:

  • line number
  • difference compared to previous line

Then you have to sort the resulting file using the second column, and you are done.

Of course, it would be too easy if I wrote the whole script for you (and I don't really have the time). So try the above on your own, and then come back with specific questions; as it stands, your question is too broad compared to what is on topic on SO.
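
For illustration only, the parsing pass described above could look roughly like this. This is just my own sketch, assuming GNU awk and sort, the exact DD HH:MM:SS.mmm layout from your sample, and that all lines fall within the same month; logfile stands for the file you want to check.

awk '{
    split($2, t, /[:.]/)                 # t[1]=HH, t[2]=MM, t[3]=SS, t[4]=mmm
    ts = $1 * 86400 + t[1] * 3600 + t[2] * 60 + t[3] + t[4] / 1000
    diff = (NR == 1) ? 0 : ts - prev     # gap to the previous line, in seconds
    printf "%d %.3f %s\n", NR, diff, $0  # line number, gap, original line
    prev = ts
}' logfile | sort -k2,2g

After the sort, the largest gap sits on the last line; the first column keeps the original line number, so you can look up the preceding line in the log to get the other timestamp of the pair.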

J. Chomel
  • thanks for your reply. I have started working on it, and yes my question is broad. Let me work on the script first and I will get back to this post with my work. – utkarsh-k Sep 12 '18 at 16:07

I would propose to walk through the lines and convert every timestamp into a UNIX epoch time (seconds since 1970-01-01; date can do this). Unfortunately you lack the month and year, but you can just assume the current month and year; except at month borders this should give correct results for the differences anyway.

Then I would just print each line again, prefixed with the difference between its timestamp and the previous one. So out of

02 07:31:05.579 logging text........
02 07:31:05.590 logging text........
02 07:31:05.591 logging text........
02 07:31:05.593 logging text........

I would make

0.000 02 07:31:05.579 logging text........
0.011 02 07:31:05.590 logging text........
0.001 02 07:31:05.591 logging text........
0.002 02 07:31:05.593 logging text........

Then you can simply sort -g this new output to order it by the time between each line and its predecessor. The last line will be the one with the maximum timestamp difference.
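
A minimal sketch of this idea as a bash script (my own sketch, not tested against your real logs; it assumes GNU date for the -d option, bc for the millisecond arithmetic, that all lines belong to the current month and year as discussed above, and that the log file is passed as the first argument):

#!/bin/bash
# Prefix every line with the gap (in seconds) to the previous line's timestamp,
# then sort so the largest gap ends up last.
month_year=$(date +%Y-%m)        # assumption: all lines are from this month
prev=""
while IFS= read -r line; do
    day=${line%% *}              # "02"
    rest=${line#* }
    time=${rest%% *}             # "07:31:05.578"
    hms=${time%.*}               # "07:31:05"
    ms=${time##*.}               # "578"
    epoch=$(date -d "$month_year-$day $hms" +%s)     # GNU date
    now=$(echo "$epoch + $ms / 1000" | bc -l)
    if [ -z "$prev" ]; then
        diff="0.000"
    else
        diff=$(echo "$now - $prev" | bc -l)
    fi
    printf '%.3f %s\n' "$diff" "$line"
    prev=$now
done < "$1" | sort -g | tail -n 1    # drop the tail if you want the full sorted list

Note that calling date once per line is slow on large files; doing the per-line arithmetic inside a single awk process, as in the other answer, avoids that at the cost of never producing real epoch values.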

Alfe