I am setting up an ad tracking system in which I need to store and analyze access logs. I am using an image pixel for this purpose. The values to be tracked are sent as HTTP GET parameters, so every call to the pixel carries the parameters (such as IP, user ID and timestamp) that I need to store and analyze.
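To make the setup concrete, here is a minimal sketch of how the parameters could be pulled out of one pixel request. The host `tracker.example.com`, the path `/pixel.gif` and the parameter names `ip`, `userid` and `ts` are placeholders I made up for illustration, not a fixed format.

```python
from urllib.parse import urlsplit, parse_qs


def parse_pixel_url(url):
    """Return the tracking parameters carried in a pixel URL's query string."""
    query = urlsplit(url).query
    # parse_qs maps each key to a list of values; keep the first value of each.
    return {key: values[0] for key, values in parse_qs(query).items()}


# Hypothetical pixel request; every name and value here is an assumption.
example = "http://tracker.example.com/pixel.gif?ip=203.0.113.7&userid=42&ts=1300000000"
params = parse_pixel_url(example)
print(params)
```

All values come back as strings, so the timestamp would still need converting before analysis.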
Which of these two workflows would be better?
1. Use Apache logging. Set up a process to gather the logs in a common place (HDFS?) and analyze them there.
2. Store each log entry in a data store (Cassandra?) as it arrives, and analyze it there.
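For option 1, the analysis step would start by turning each Apache log line into a record. A rough sketch, assuming the logs use Apache's Common Log Format and that the pixel lives at a hypothetical `/pixel.gif` (the sample line and the `param_` prefix are my own invention):

```python
import re
from urllib.parse import urlsplit, parse_qs

# Matches Common Log Format: host ident user [time] "method target protocol" status size
CLF = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<target>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def parse_log_line(line):
    """Parse one access-log line into a flat dict, or return None if it does not match."""
    m = CLF.match(line)
    if m is None:
        return None
    record = {
        "host": m.group("host"),
        "time": m.group("time"),
        "status": int(m.group("status")),
    }
    # Copy the pixel's GET parameters into the record, prefixed to avoid key clashes.
    query = urlsplit(m.group("target")).query
    for key, values in parse_qs(query).items():
        record["param_" + key] = values[0]
    return record


# Hypothetical log line; the path and parameter names are assumptions.
SAMPLE = ('203.0.113.7 - - [13/Mar/2011:07:06:40 +0000] '
          '"GET /pixel.gif?userid=42&ts=1300000000 HTTP/1.1" 200 43')
record = parse_log_line(SAMPLE)
print(record)
```

For option 2, the same record would presumably be written to the data store at request time instead of being recovered from the log afterwards.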
It would be good to hear the pros and cons of both approaches from someone who has done this before.
Regards,