One of my servers has recently been switched to using php-fpm.
The error logs now record 404s in a new format:
[Sun Dec 26 00:11:37.827426 2021] [proxy_fcgi:error] [pid 25239:tid 140600822003456] [client 66.249.66.136:37676] AH01071: Got error 'PHP message: File does not exist: /ads.txt'
[Sun Dec 26 00:14:53.732771 2021] [proxy_fcgi:error] [pid 24741:tid 140601015035648] [client 207.46.13.93:9600] AH01071: Got error 'PHP message: File does not exist: /events/view/id/633/supercharge'
I previously used a command-line script (written in awk by one of my colleagues many years ago) to parse the logs and extract the URLs that were returning 404s, then did some manual Excel work to tally the addresses that were erroring but still receiving a reasonable number of requests. With the awk manual to hand, I'm reasonably comfortable updating this script...
But before I jump in and start editing this script, I suspect there must be a better way to parse these large log files. Any suggestions for a better approach?
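
For illustration, here is a rough sketch of the extraction and tally described above, in Python rather than awk. It is not the existing script. The regular expression is an assumption based only on the two sample lines shown, it assumes each log entry sits on one physical line in the file, and the name `tally_missing` is made up for this example.

```python
import re
import sys
from collections import Counter

# Assumed pattern, based only on the two sample lines above: capture whatever
# follows "File does not exist: " up to the closing single quote.
MISSING = re.compile(
    r"AH01071: Got error 'PHP message: File does not exist: ([^']*)'"
)


def tally_missing(lines):
    """Count how often each missing path appears in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = MISSING.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # The log path is taken from the first command-line argument.
    # Reading line by line keeps memory use flat on large files.
    with open(sys.argv[1], errors="replace") as log:
        for path, hits in tally_missing(log).most_common(20):
            print(f"{hits:8d}  {path}")
```

Run against a log file, this would print the 20 most frequently requested missing paths with their counts, which would replace both the extraction step and the manual tally in Excel.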