Goal
I wish to use RRDTool to count logical "user activity" from our web application's Apache/Tomcat access logs.
Specifically, we want to count occurrences of several URL patterns per time period.
Example
We have two applications (call them 'foo' and 'bar').
These URLs interest us. They indicate when users 'did interesting stuff'.
/foo/hop
/foo/skip
/foo/jump
/bar/crawl
/bar/walk
/bar/run
Basically, we want to know, for a given interval (10 minutes, hour, day, etc.), how many users hopped, skipped, jumped, crawled, walked, etc.
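To make the counting concrete, here is a minimal Python sketch of the bucketing described above. It assumes Common Log Format timestamps and request lines (the real formats on the two servers may differ), and the names `LOG_LINE`, `PATTERNS`, and `count_activity` as well as the sample log lines are made up for illustration.

```python
import re
from collections import Counter
from datetime import datetime

# Assumption: Common Log Format, e.g.
# 10.0.0.1 - - [13/May/2014:10:05:03 +0000] "GET /foo/hop HTTP/1.1" 200 512
LOG_LINE = re.compile(r'\[(?P<ts>[^\]]+)\] "(?:GET|POST) (?P<path>\S+)')

# The URLs of interest from the example above.
PATTERNS = ["/foo/hop", "/foo/skip", "/foo/jump",
            "/bar/crawl", "/bar/walk", "/bar/run"]


def count_activity(lines, interval=600):
    """Count matching requests per (bucket_start_epoch, pattern).

    interval is the bucket width in seconds (600 = 10 minutes).
    """
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # not a request line we understand
        ts = datetime.strptime(match.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
        bucket = int(ts.timestamp()) // interval * interval
        path = match.group("path").split("?", 1)[0]  # drop any query string
        for pattern in PATTERNS:
            if path == pattern or path.startswith(pattern + "/"):
                counts[(bucket, pattern)] += 1
                break
    return counts


# Hypothetical sample lines, not taken from real logs.
sample = [
    '10.0.0.1 - - [13/May/2014:10:05:03 +0000] "GET /foo/hop HTTP/1.1" 200 512',
    '10.0.0.2 - - [13/May/2014:10:07:41 +0000] "GET /foo/hop?x=1 HTTP/1.1" 200 512',
    '10.0.0.3 - - [13/May/2014:10:12:10 +0000] "POST /bar/walk HTTP/1.1" 200 128',
    '10.0.0.4 - - [13/May/2014:10:13:00 +0000] "GET /index.html HTTP/1.1" 200 900',
]
counts = count_activity(sample)
for (bucket, pattern), n in sorted(counts.items()):
    print(bucket, pattern, n)
```

Note that this counts requests rather than distinct users; counting users would also need the client or session field from each line.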
Reference/Starting point
This article on importing access logs into RRDTool seemed like a helpful starting point. http://neidetcher.com/programming/2014/05/13/just-enough-rrdtool.html
However, to clarify: that example uses the access log directly, whereas we want to put a handful of URLs 'in buckets' and count the 'number in each bucket'.
Some Scripting Required..
I could do this with bash, grep, and wc, iterating through the patterns and sending output to an 'intermediate results' text file. That said, I believe RRDTool could do this with minimal 'outside coding', but I am unclear on the details.
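For illustration, a rough Python sketch of what that 'outside coding' might look like once per-interval counts exist: it only builds `rrdtool create` and `rrdtool update` argument lists and does not run them. Everything specific here is an assumption: the file name `activity.rrd`, the data source names, the GAUGE type, the 10-minute step, the one-week RRA, and the sample counts.

```python
# Hypothetical data source names, one per URL of interest.
DS_NAMES = ["foo_hop", "foo_skip", "foo_jump",
            "bar_crawl", "bar_walk", "bar_run"]


def create_command(rrd_path, start, step=600):
    """Build an 'rrdtool create' argument list (one possible layout)."""
    args = ["rrdtool", "create", rrd_path,
            "--start", str(start), "--step", str(step)]
    # GAUGE stores the supplied value as-is; heartbeat is two steps.
    args += [f"DS:{name}:GAUGE:{2 * step}:0:U" for name in DS_NAMES]
    # Keep one week of 10-minute values (7 * 144 rows).
    args.append("RRA:AVERAGE:0.5:1:1008")
    return args


def update_commands(rrd_path, counts_by_bucket, step=600):
    """Build one 'rrdtool update' argument list per bucket, oldest first.

    counts_by_bucket maps bucket start (epoch seconds) to {ds_name: count}.
    Each update is stamped with the END of its bucket, since an update
    describes the interval leading up to its timestamp.
    """
    template = ":".join(DS_NAMES)
    commands = []
    for bucket in sorted(counts_by_bucket):
        values = [str(counts_by_bucket[bucket].get(name, 0)) for name in DS_NAMES]
        commands.append(["rrdtool", "update", rrd_path,
                         "--template", template,
                         f"{bucket + step}:" + ":".join(values)])
    return commands


# Made-up counts for two consecutive 10-minute buckets.
example_counts = {
    1399975200: {"foo_hop": 2},
    1399975800: {"bar_walk": 1},
}
create_cmd = create_command("activity.rrd", start=1399975200)
update_cmds = update_commands("activity.rrd", example_counts)
for cmd in [create_cmd] + update_cmds:
    print(" ".join(cmd))
```

Each list could then be passed to `subprocess.run(cmd, check=True)` on a machine with rrdtool installed. Whether GAUGE or another data source type suits these counts best is part of what I am unsure about.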
Some points
- I mention 'two applications' because we actually serve them from separate servers with different log file formats. I'd like to get them into the same RRD file.
- Eventually I'd like to report this in Cacti; initially, however, I want to understand the RRDTool details.
- I'm open to doing any coding, but would like to keep it as efficient as possible, both administratively and in computer resources. (By administratively, I mean: easy to monitor new instances.)
- I am very new to RRDTool and am RTM'ing (and walking through the tutorial). I'm used to relational databases, spreadsheets, etc., and don't yet have my mind around all the nuances of the RRA format.
Thanks in advance!