Consider the access log of a REST API. It contains lines that (simplified) look like this:

2017-01-01T12:12:41Z "GET /api/posts" HTTP/1.1 200 "-"
2017-01-01T12:12:42Z "GET /api/posts/56/comments" HTTP/1.1 200 "-"
2017-01-01T12:12:42Z "GET /api/posts" HTTP/1.1 200 "-"
2017-01-01T12:12:56Z "POST /api/posts" HTTP/1.1 202 "Safari"
2017-01-01T12:12:58Z "GET /api/posts/134/comments" HTTP/1.1 200 "-"

To parse that, you could write something like:

_collector=access.log | regex parse "(?<method>[A-Z]+) /api/(?<path>[\w\d\/]+) HTTP"

This would extract the method and path from the log lines, but you would see these unique values:

  • GET posts
  • POST posts
  • GET posts/56/comments
  • GET posts/134/comments

I want to throw away all the dynamic parts of the URL, so that I get the following instead:

  • GET posts
  • POST posts
  • GET posts/{id}/comments

I could easily do this with a search-and-replace regex, but is it even possible in Sumo Logic?
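
To make the goal concrete, here is a rough sketch of the search-and-replace I have in mind, written in Python rather than as a Sumo Logic query. It assumes the dynamic parts are always purely numeric path segments, as in the sample lines above; the names `normalize`, `REQUEST` and `NUMERIC_SEGMENT` are made up for illustration.

```python
import re

# Pulls the method and the path after /api/ out of the quoted request.
REQUEST = re.compile(r'"(?P<method>[A-Z]+) /api/(?P<path>[\w/]+)"')

# Assumption: an ID is a path segment made only of digits, preceded by "/"
# and followed by "/" or the end of the path.
NUMERIC_SEGMENT = re.compile(r"(?<=/)\d+(?=/|$)")


def normalize(line):
    """Return 'METHOD path' with numeric segments replaced by {id}, or None."""
    match = REQUEST.search(line)
    if match is None:
        return None
    path = NUMERIC_SEGMENT.sub("{id}", match.group("path"))
    return f"{match.group('method')} {path}"


sample = [
    '2017-01-01T12:12:41Z "GET /api/posts" HTTP/1.1 200 "-"',
    '2017-01-01T12:12:42Z "GET /api/posts/56/comments" HTTP/1.1 200 "-"',
    '2017-01-01T12:12:42Z "GET /api/posts" HTTP/1.1 200 "-"',
    '2017-01-01T12:12:56Z "POST /api/posts" HTTP/1.1 202 "Safari"',
    '2017-01-01T12:12:58Z "GET /api/posts/134/comments" HTTP/1.1 200 "-"',
]

# Collapse the sample lines into their unique normalized forms.
unique = sorted({key for key in map(normalize, sample) if key is not None})
```

For the five sample lines this collapses the two comment requests into a single `GET posts/{id}/comments` entry, which is the grouping I would like to get from the query itself.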

Alexander Morland
  • Is the ID consistently the second resource in the URL? For example, "/api/posts/56/comments" or "/api/posts/56/shares", or "/api/users/72". You know what I mean? If so, why not parse out each resource of the URL, then you can keep or toss whatever you want (parent resource, id, child resource, etc.). Maybe I'm misunderstanding the question though... – the-nick-wilson Oct 31 '17 at 21:24
  • Also what exactly are you trying to see in the Sumo Logic output? – the-nick-wilson Oct 31 '17 at 21:24

0 Answers