I have added a chat capability to a site using jQuery and PHP and it seems to generally work well, but I am worried about scalability. I wonder if anyone has some advice. The key area for me, I think, is efficiently managing awareness of who is online.

Detail: I haven't implemented long polling (yet) and I'm worried about the raw number of long-running processes in PHP (Apache) getting out of control.

My code runs a periodic jQuery ajax poll (every 4 seconds) that first updates the DB to say I am active and sets a timestamp. Then there is a routine that checks the timestamp for all active users and sets those outside the window (10 minutes) to inactive. This is fairly normal from my research so far. However, I am concerned that if I allow every active user to check every other active user, and then everyone updates the DB to kick off inactive users, I will get duplicated effort, record locks and unnecessary server load.

So I have implemented an idea of the role of a 'sweeper'. This is just one of the online users, who inherits the role of the person doing the cleanup. Everyone else just checks whether there is a 'sweeper' in existence (DB read) and carries on. If there is no sweeper when they check, they make themselves sweeper (DB write for their own record). If there are more than one, make yourself 'non-sweeper', sleep for a random period and check again.
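
To make the scheme concrete, here is a minimal in-memory sketch of it in JavaScript. It is only an illustration: the `Map` stands in for the users table, the function names (`heartbeat`, `electSweeper`, `sweep`) are hypothetical, and treating a sweeper whose timestamp has gone stale as "not in existence" is an assumption added so the role can be inherited when the sweeper leaves.

```javascript
// In-memory stand-in for the users table; in the real app these are DB rows.
const INACTIVE_AFTER_MS = 10 * 60 * 1000; // the 10-minute cutoff

// Every poll: write only to the caller's own record.
function heartbeat(users, id, now) {
  const user = users.get(id) || { sweeper: false };
  user.active = true;
  user.lastSeen = now;
  users.set(id, user);
}

// Every poll: one read to count live sweepers, and at most one write to own record.
function electSweeper(users, id, now) {
  const liveSweepers = [...users.values()].filter(
    (u) => u.sweeper && now - u.lastSeen <= INACTIVE_AFTER_MS // assumption: stale sweepers do not count
  );
  const me = users.get(id);
  if (liveSweepers.length === 0) {
    me.sweeper = true; // nobody is sweeping: take the role
  } else if (liveSweepers.length > 1) {
    me.sweeper = false; // conflict: step down, caller retries after a random delay
  }
  return me.sweeper;
}

// Only the sweeper runs this multi-row update.
function sweep(users, now) {
  let expired = 0;
  for (const u of users.values()) {
    if (u.active && now - u.lastSeen > INACTIVE_AFTER_MS) {
      u.active = false;
      u.sweeper = false;
      expired++;
    }
  }
  return expired;
}
```

Against a real database the check-then-set in `electSweeper` is not atomic, which is why the "more than one sweeper" rule is needed at all.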

My theory is that this way only one user regularly writes updates to several records in the relevant table, and everyone else is either reading or just writing to their own record. So it works OK, but the possible problem is that the process requires a few DB reads and may actually be less efficient than just letting everyone do the cleanup, as in the other approaches I found in my research.

I have had over 100 concurrent users running OK so far, but the client wants to scale up to several hundred, even over 1,000, and I have no way of knowing at this stage whether this idea is good or not.

Does anyone know whether this is a good approach or not, whether it is scalable to hundreds of active users, or whether you can recommend a different approach?

As an aside, long polling / Comet for the actual chat messages seems simple and I have found a good resource for the code, but several blog comments suggest it's dangerous with PHP and Apache specifically, because of active threads etc. Apparently the impact can be minimised with usleep and session_write_close.

Again, does anyone have any practical experience of a PHP long-polling setup for hundreds of active users? Maybe you can put my mind at ease! Do I really have to look to migrate this to node.js (no experience)?

Thank you in advance

Tony

Tony Jackson

3 Answers

My advice would be to do this with the Meteor framework, which should be fairly trivial even if you are not an expert, and then simply load the chat into your PHP website via an iframe.

It will be scalable, won't consume much resources, and it will get only better in the future, I presume.

And it sure beats both PHP comet solutions and jquery & ajax timeout based calls to server.

I even believe you could find a more or less complete solution on GitHub that just requires tweaking.

But of course, do read the docs before you implement it.

If you worry about security issues, read security with meteor

tonino.j
  • Meteor is in early alpha; started as one gaping security hole and way too magic for most needs. I think anyone who's just getting started with this stuff will be more capable of grasping pure nodejs, rather than meteor. – Evert Nov 19 '12 at 15:25
  • Especially since this is an answer to a question from a person that himself has not yet gotten to the conclusion Node (or something else) is the answer; One is better off reducing the problem space and simplifying the code. – Evert Nov 19 '12 at 15:27
  • Except if what you build is a high-budget highly confidential priority app, meteor is a good solution. It has solid, growing community, professional team behind it, working with budget, it is simple to apply and sure beats both PHP comet and building a solution in node from scratch. Believe me, that would present many more security issues for a beginner. Both cases. – tonino.j Nov 19 '12 at 15:44
  • I would agree with you for most frameworks, but with Meteor security has been an afterthought. Anything that shares the entire database by default, allows live code to be transmitted with ease and actively shares code between browser and server should not be trusted as much as something simpler. Not arguing for PHP Comet; I would go for NodeJS and Socket.io. – Evert Nov 19 '12 at 15:58
  • But aside from these statements; It's a heavy framework with lots of magic. The fact that the developers consider it early alpha should at the very least raise some concerns. – Evert Nov 19 '12 at 16:04
  • It looks quite nice for quick prototyping and experimenting though.. As long as it's not public-facing. – Evert Nov 19 '12 at 16:05
  • Security IS completely viable with meteor. http://britto.co/blog/security_with_meteor Here is a little quote from 5 months ago: Just because you can request data and functionality from the client doesn't mean you'll get said data or functionality. On Meteor the database API is the same on client and server, but that's just an API. A Mongo command on the client doesn't go directly to MongoDB on the server. You could have an access layer (or controller) on the server thru which all client requests are passed. And this access layer determines which client requests are relayed to the database. – tonino.j Nov 19 '12 at 16:09
  • And **auth** is implemented in meteor. You should be more up-to-date. – tonino.j Nov 19 '12 at 16:10
  • It is my humble opinion that security is an afterthought. Access restriction should be implied, and all access should be closed down by default, not the other way around. – Evert Nov 19 '12 at 16:10
  • Well, we all have our opinions. But numbers in https://github.com/meteor/meteor say more. Obviously there's an enormous number of people that differ. Anyway, for auth, implemented in latest version, and security, here's the link to meteor docs: https://github.com/meteor/meteor http://docs.meteor.com/#dataandsecurity – tonino.j Nov 19 '12 at 16:14
  • Here is also a reply by meteor developers on quora: http://www.quora.com/Meteor-web-framework/Whats-cool-about-Meteor/answer/Rory-I-Sinclair/comment/878076 – tonino.j Nov 19 '12 at 16:17
  • "By default, a new Meteor app includes the autopublish and insecure packages, which together mimic the effect of each client having full read/write access to the server's database. These are useful prototyping tools, but typically not appropriate for production applications. When you're ready, just remove the packages." – tonino.j Nov 19 '12 at 16:18
  • I understand that this is how the developers feel; I have a very strong (healthy?) paranoia for the type of architecture of this framework. I stand by my main point that in order to write secure Meteor applications, you have to really understand how the entire stack works, and I feel it's a trap for newcomers, because it's so easy, magic and automatic. I have no doubt that the authors of the system can write secure applications with it. – Evert Nov 19 '12 at 16:19
  • I also sincerely doubt that a newcomer to node, socket.io and meteor would create a more secure application with node & socket.io than with meteor. I think meteor is the way to go. And if I have time in the future, I'll be glad to make an app with it and dare you to hack it :) – tonino.j Nov 19 '12 at 16:23
  • **"meteor is secure enough to be used by banks"** http://britto.co/blog/security_with_meteor – tonino.j Nov 19 '12 at 16:26
  • I saw that statement, and it made me frown a bit. If a real security researcher would agree with that, I will immediately concede and agree you were correct all along. Meteor is known to have had security problems in the past, and has gotten a bad name for it. This guy, who has a vested interest in Meteor, claims quite the opposite with this bold statement and little backup. How can you as a thinking person believe he would be unbiased in making such a statement? – Evert Nov 19 '12 at 21:39
  • Thank you. It's a very interesting technology but it just feels too invasive and heavy for what I want to do. – Tony Jackson Nov 23 '12 at 10:39
  • You would need to use it in an iframe or something. But other than that, it's the fastest thing you can do. If you went with socket.io & node, let me know how it went. – tonino.j Nov 25 '12 at 18:56

Long polling is indeed pretty disastrous for PHP. PHP always runs with a limited number of concurrent processes, and it scales well as long as you optimize for handling each request as quickly as possible. Long polling and similar solutions hold a process open for every waiting client and will quickly fill up your pipe.
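
A rough back-of-the-envelope calculation shows the difference. The sketch below uses Little's law (average busy workers = request rate × time each request holds a worker). The 4-second poll interval and the 1,000 users come from the question; the 50 ms handling time and the 30-second long-poll timeout are assumed figures for illustration only, and `busyWorkers` is a hypothetical helper.

```javascript
// Little's law: busy workers = users x requests per user per second x seconds each request holds a worker.
function busyWorkers(users, requestsPerUserPerSec, holdSeconds) {
  return users * requestsPerUserPerSec * holdSeconds;
}

// Short polling: one request every 4 s per user, each assumed to finish in 0.05 s.
const shortPolling = busyWorkers(1000, 1 / 4, 0.05); // about 12.5 workers busy on average

// Long polling: each user re-issues a request as soon as the previous one ends,
// so each user occupies roughly one worker all the time (assumed 30 s timeout).
const longPolling = busyWorkers(1000, 1 / 30, 30); // about 1000 workers busy on average
```

Under these assumptions, short polling keeps a dozen or so processes busy, while long polling needs roughly one process per connected user regardless of the timeout chosen.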

It could be argued that PHP is simply not the right technology for this type of stuff, with the current tools out there. If you insist on using PHP you could try ReactPHP, which is a framework for PHP quite similar to how NodeJS is built. The implication with React is also that it's expected to run as a separate daemon, and not within a webserver such as Apache. I have no experience with how stable it is or how well it scales, so you will have to do the testing yourself.

NodeJS is not hard to get into if you know JavaScript well. NodeJS + socket.io make it really easy to write the chat server and client with websockets. This would be my recommendation. When I started with this, I had something nice up and running within several hours.
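
With persistent connections, the "who is online" problem from the question mostly goes away, because presence can follow the connection lifetime instead of timestamps. The following is a minimal sketch of that idea, not a full chat server: the `Presence` class is hypothetical, and a plain Node `EventEmitter` stands in for a socket so the example runs without extra packages. With socket.io you would call `attach` from the server's `connection` handler, and each socket emits `disconnect` when it closes.

```javascript
const { EventEmitter } = require('events');

// Hypothetical presence tracker: online state follows connection lifetime,
// so there are no timestamps to update and nothing to sweep.
class Presence {
  constructor() {
    this.connections = new Map(); // userId -> number of open connections (tabs, devices)
  }

  // Register one connection for a user and remove it again when the socket disconnects.
  attach(userId, socket) {
    this.connections.set(userId, (this.connections.get(userId) || 0) + 1);
    socket.once('disconnect', () => {
      const left = this.connections.get(userId) - 1;
      if (left > 0) this.connections.set(userId, left);
      else this.connections.delete(userId);
    });
  }

  // Users with at least one open connection.
  online() {
    return [...this.connections.keys()];
  }
}
```

Counting connections per user, rather than storing a boolean, keeps someone online while they still have another tab open.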

Evert
  • Thank you. As with many comments I've seen, NodeJS + socket.io seems the way to go, assuming I build this myself. I won't waste any more time dabbling with long polling, I'll steer clear ! Have now downloaded node and socket and will dabble. Thanks again! – Tony Jackson Nov 23 '12 at 10:44

If you want to keep your application stack in PHP, want the chat running in your actual web app (not an iframe), and are concerned about scaling your realtime infrastructure, then I'd recommend you look at a hosted service for the realtime updates, such as Pusher, who I work for. The hosted service handles the scaling of the realtime infrastructure for you and lets you concentrate on building your application functionality.

This way you only need to handle the chat message requests - sanitize/verify the content - and then push the information through Pusher to the 1000's of connected clients.
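
As a rough sketch of that request handler (shown in JavaScript; the same steps apply in PHP): verify the message, escape it, then hand it to the service. Everything here is illustrative. `handleChatMessage`, `escapeHtml` and the 500-character limit are assumptions, and `publish` is a hypothetical callback standing in for whatever call the hosted service's server library provides.

```javascript
const MAX_LENGTH = 500; // assumed limit, pick whatever suits the application

// Escape characters that are significant in HTML so messages cannot inject markup.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Verify and sanitize one incoming chat message, then pass it on.
// `publish` is a hypothetical stand-in for the hosted service's client call.
function handleChatMessage(user, rawText, publish) {
  if (typeof user !== 'string' || typeof rawText !== 'string') return false;
  const text = rawText.trim();
  if (text.length === 0 || text.length > MAX_LENGTH) return false;
  publish({ user: escapeHtml(user), text: escapeHtml(text) });
  return true;
}
```

The web server's work per message stays short and bounded, which is the kind of request PHP handles well.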

The quick start guide is available here: http://pusher.com/docs/quickstart

I've a full list of hosted services on my realtime web tech guide.

leggetter
  • Really interesting. Thank you. I like the pusher concept and have read through some of the docs, seems reasonably simple to put together. I may be in contact, but will start researching with nodejs first. Thanks again. – Tony Jackson Nov 23 '12 at 10:50