
I have a setup where multiple Apache/PHP servers collect data from each request (mainly the GET parameters), process it, and save it to either a database or a flat file. The database case is fine, since every server can connect independently and do its updates. For the flat file, I am using a custom-made queue that sends all data to the one server where the flat file resides. My questions are:

  1. Are there any good, reliable log file processing systems I can use? I basically need to aggregate the data coming in through the log files and save it to the DB after some post-processing. If this is possible, I can simply have nginx log all requests (access.log) and run backend daemons to crunch the logs. I receive 1000+ requests per second, so I definitely need a very robust system.

  2. Are there any good queue systems that are compatible with PHP and can be shared across multiple machines? I am mainly thinking of a solution built on memcache, where data can be added from any node and read from any node at very high speed. I need a system that can take a bulk of data from the queue every second, process it, and save it in the DB. I don't think having queues on individual servers is very scalable, because I need to do some level of aggregation before saving the data. Since data can arrive at any of the PHP servers (round robin), I currently do the processing in MySQL, and hence use complicated queries. If I could collect all the data on one server and let it do the processing and save to the DB, that would make my job easier.
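To make the aggregation step in (2) concrete, here is a rough sketch in Python rather than PHP, purely for illustration. It collapses one batch of request records into per-key totals so that only the totals need to be written to the DB. The `page` parameter name and the sample records are made up for this example and are not from my real data.

```python
from collections import Counter


def aggregate(batch):
    """Collapse a batch of request records into per-key hit counts.

    Each record is a dict of GET parameters. `page` is a hypothetical
    parameter name used only for this example.
    """
    totals = Counter()
    for params in batch:
        # Records without the parameter are grouped under "unknown".
        totals[params.get("page", "unknown")] += 1
    return dict(totals)


# One second's worth of (made-up) requests collected from all nodes.
batch = [{"page": "home"}, {"page": "home"}, {"page": "about"}, {}]
summary = aggregate(batch)
print(summary)
```

With something like this running on a single node, the DB would see one write per key per batch instead of one write per request.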

Thanks, Sparsh

Sparsh Gupta
For (1), you could use something like syslog-ng and log to a central set of servers. I'm not entirely sure that's appropriate for your situation, but it might be a bit easier to manage. For (2), have you looked at http://www.zeromq.org/ ? – MrTuttle Jun 08 '11 at 22:22

1 Answer


Are you really suggesting that you are going to use access logs as a data substrate for an asynchronous message handling system? If so, please don't. It's not transactionally secure, and it's not intended for concurrent access.

I've read your question several times and it's not clear what you are trying to process, where, and why.

"Are there any good queue systems compatible with PHP and which is shared across multiple machines"

OK, that's a proper question. One solution I used a long time ago was to use the BSD LPD system to manage job queues, but if I were implementing a solution today, I'd be looking at RabbitMQ, beanstalkd, sam...
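Whichever broker you pick, the consumer side tends to look the same: one worker wakes up periodically, pulls off whatever has accumulated up to some cap, and handles it as a single batch. Here is a minimal sketch of that pattern in Python. Note that `queue.Queue` is only an in-process stand-in for a networked broker, and `drain` and `max_items` are names invented for this example, not part of any broker's API.

```python
import queue


def drain(q, max_items=1000):
    """Pull up to max_items entries off the queue without blocking."""
    batch = []
    while len(batch) < max_items:
        try:
            batch.append(q.get_nowait())
        except queue.Empty:
            # Nothing left right now; hand back what we have.
            break
    return batch


# Stand-in for the shared queue. With a real broker, the producers
# would be the PHP front ends on the other machines.
shared = queue.Queue()
for i in range(5):
    shared.put({"id": i})

first = drain(shared, max_items=3)   # takes the three oldest entries
rest = drain(shared, max_items=3)    # takes the remaining two
print(len(first), len(rest))
```

Capping the batch size keeps a traffic spike from turning into one enormous DB transaction; anything left over is simply picked up on the next pass.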

symcbean