
I want to analyze access_log and store each visitor's IP, the date, and whether the request was a page view or an image view in a MySQL table.

I plan to use a cron job to call a PHP script every minute that opens access_log.

But access_log is already larger than 500MB, and it grows by about 0.4MB per minute (usually 50-350 records).

So how do I open such a big file in PHP? My idea is to read the last 500 records, then use a regex to pick out the last minute's records. My server has 32GB of RAM, so memory is no problem, but I need low CPU usage. Can anyone help me with some simple code? Thanks.

Edit

Taking the advice of @Jeremiah Winsley, I now use rotatelogs in my httpd.conf, but no log file has been created. Where is the problem?

<IfModule log_config_module>
    LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
    LogFormat "%h %l %u %t \"%r\" %>s %b" common
    <IfModule logio_module>
      LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %I %O" combinedio
    </IfModule>
    CustomLog "logs/access_log" combined env=!dontlog
    CustomLog "|sbin/rotatelogs -f logs/my_log 60" combined env=!dontlog
    #create a my_log every 1 minute.
    SetEnvIf Remote_Addr "127\.0\.0\.1" dontlog
    SetEnvIf Remote_Addr "::1" dontlog
</IfModule>
fish man
  • Please provide your current code! – cmorrissey Jan 30 '15 at 21:31
  • That is the wrong way to go about doing that. If you really need to handle these yourself instead of using a log analytics service, you should use logrotate with cron to parse the logfiles at reasonable intervals, instead of trying to read the live access log every minute. – Jeremiah Winsley Jan 30 '15 at 21:38
  • @Jeremiah Winsley, how do I set that up in logrotate? Can it save the log twice? One copy would be the original access_log, which stores everything; the other would store access information per minute. Thanks. – fish man Jan 30 '15 at 21:47
  • http://www.rackspace.com/knowledge_center/article/understanding-logrotate-utility may be helpful - note the postrotate setting, which you could use to trigger your script each time you rotated it. – Jeremiah Winsley Jan 30 '15 at 22:01

1 Answer


Don't try to reinvent the wheel. Always search first for tools others may have developed. Web servers have been around for a long time, so there are plenty of log parsers and server tools.

Give this PHP parser a try and save yourself some time.

https://github.com/kassner/log-parser

As noted, I wouldn't do this on live files. You should probably rotate the logs more frequently, set up a cron job that runs every 5 or 10 minutes, and have it process the most recently rotated file. Be aware that frequent rotation can cause a lot of files to build up.
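If you would rather not pull in a library, the cron-driven step could look roughly like the sketch below. It is only an illustration, not the parser linked above: it assumes the rotated file uses the `combined` format from the question, treats a handful of file extensions as "image views", and writes to a hypothetical `visits` table. The function names, table name, and column names are all made up for the example. Reading with `fgets` keeps only one line in memory at a time, and the inserts go through one prepared statement inside a single transaction.

```php
<?php
// Parse one line of Apache "combined" format (hypothetical helper).
// Returns null when the line does not match the expected layout.
function parse_access_line(string $line): ?array
{
    $pattern = '/^(\S+) \S+ \S+ \[([^\]]+)\] "(?:\S+) (\S+)[^"]*" (\d{3}) (\S+)/';
    if (!preg_match($pattern, $line, $m)) {
        return null;
    }
    // Apache writes timestamps like 30/Jan/2015:21:31:05 +0000.
    $time = DateTime::createFromFormat('d/M/Y:H:i:s O', $m[2]);
    if ($time === false) {
        return null;
    }
    // Drop the query string before looking at the extension.
    $path = (string) parse_url($m[3], PHP_URL_PATH);
    // Assumption: these extensions are what counts as an "image view".
    $isImage = (bool) preg_match('/\.(?:png|jpe?g|gif|webp|svg|ico)$/i', $path);

    return [
        'ip'   => $m[1],
        'date' => $time->format('Y-m-d H:i:s'),
        'type' => $isImage ? 'image' : 'page',
        'path' => $path,
    ];
}

// Yield parsed rows from an already rotated log file, one line at a time.
function read_rotated_log(string $file): Generator
{
    $handle = fopen($file, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $file");
    }
    try {
        while (($line = fgets($handle)) !== false) {
            $row = parse_access_line(rtrim($line, "\r\n"));
            if ($row !== null) {
                yield $row;
            }
        }
    } finally {
        fclose($handle);
    }
}

// Insert all rows in one transaction.
// The table "visits" and its columns are hypothetical.
function store_rows(PDO $pdo, iterable $rows): void
{
    $stmt = $pdo->prepare(
        'INSERT INTO visits (ip, visited_at, view_type, path) VALUES (?, ?, ?, ?)'
    );
    $pdo->beginTransaction();
    foreach ($rows as $r) {
        $stmt->execute([$r['ip'], $r['date'], $r['type'], $r['path']]);
    }
    $pdo->commit();
}
```

The cron script would then call something like `store_rows($pdo, read_rotated_log($rotatedFile));`, where `$pdo` is a PDO connection to your MySQL database and `$rotatedFile` is the path of the most recently rotated file. Both variables are placeholders here.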

Panama Jack