5

I am considering building a site using PHP, but several parts of it would perform far better if written in node.js. At the same time, large portions of the site need to remain in PHP, because a lot of functionality is already developed in PHP, and redeveloping and retesting it would be too large an undertaking. Quite frankly, those parts of the site run perfectly fine in PHP.

I am considering rebuilding in node.js the sections that would benefit most from it, then having PHP pass those requests to node.js using Gearman. This way, I can scale out by launching more workers and have gearman handle the load distribution.

Our site gets a lot of traffic, and I am concerned about whether gearman can handle this load. I want to keep this question productive, so let's focus largely on the following addressable points:

  • Can gearman handle all of our expected load assuming we have the memory (potentially around 3000+ queued jobs at a time, with several thousand being processed per second)?
  • Would this run better if I just passed the requests to node.js using cURL, and if so, does node.js provide any way to distribute the load over multiple instances of a given script?
  • Can gearman be configured in a way that there is no single point of failure?
  • What are some issues that you guys can see arising both in terms of development and scaling?

I am addressing this wide range of points so that anyone viewing this post can find, in one place, information on matters that strongly affect each other.

Of course I will test all of this, but I want to collect as much information as possible before potentially undertaking something like this.

Edit: A large reason I am using gearman is not its non-blocking structure, but its sheer speed.

user396404
  • _Can gearman handle all of our expected load?_ That's a how-long-is-a-piece-of-string question, and depends on how many worker servers you have available, how quickly your queues fill up, and how much effort each item takes to process. If you can do several thousand per second on one server, and you are not queueing them at a higher constant rate, I should think you'll be fine. Gearman can be used in blocking and non-blocking modes btw, and is definitely worth a go. – halfer May 30 '12 at 11:46

2 Answers

4

I can only speak to your questions on Gearman:

Can gearman handle all of our expected load assuming we have the memory (potentially around 3000+ queued jobs at a time, with several thousand being processed per second)?

Short: Yes

Long: Everything has its limit. If your job payloads are inordinately large, you may run into issues. Gearman stores its queue in memory, so if your payloads exceed the amount of memory available to Gearman, you'll run into problems.
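
As a rough back-of-the-envelope check on the numbers in the question, the sketch below estimates queue memory and average wait time. The payload size and the per-job overhead are assumptions made up for illustration, not measured Gearman figures, and both function names are hypothetical.

```javascript
// Rough sizing only. perJobOverheadBytes is a placeholder guess,
// not a documented Gearman value; measure your own setup.
function estimateQueueMemoryBytes(queuedJobs, avgPayloadBytes, perJobOverheadBytes = 0) {
  return queuedJobs * (avgPayloadBytes + perJobOverheadBytes);
}

// Little's law: average time a job waits = jobs in the queue / throughput.
function averageWaitSeconds(queuedJobs, jobsPerSecond) {
  return queuedJobs / jobsPerSecond;
}

// 3000 queued jobs with an assumed 10 KiB payload each.
const bytes = estimateQueueMemoryBytes(3000, 10 * 1024);
console.log(`${(bytes / (1024 * 1024)).toFixed(1)} MiB queued`);

// 3000 queued jobs drained at an assumed 3000 jobs per second.
console.log(`${averageWaitSeconds(3000, 3000)} s average wait`);
```

Under those assumed figures the queue itself is small; memory only becomes a concern if payloads are far larger or the queue grows faster than the workers drain it.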

Can gearman be configured in a way that there is no single point of failure?

Gearman has a plugin/extension/component available to use MySQL as a persistence store. That way, if Gearman or the machine itself goes down you can bring it right back up where it left off. Multiple worker-servers can help keep things going if other workers go down.

Mike B
  • Thanks. This answers some of the key points I had about gearman. I will test these things out to see if it works properly for our implementation. – user396404 May 25 '12 at 20:12
1

Node has a cluster module that can do basic load balancing against n processes. You might find it useful.
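
A minimal sketch of what that could look like, assuming a plain HTTP worker. The port number and the `startCluster` helper are hypothetical names chosen for illustration, not something from the question.

```javascript
const cluster = require('cluster');
const http = require('http');
const os = require('os');

// PORT is an assumed value for illustration.
const PORT = 8000;

function startCluster(workerCount = os.cpus().length) {
  // isPrimary is the current name; isMaster is the older alias.
  const isPrimary = cluster.isPrimary ?? cluster.isMaster;

  if (isPrimary) {
    // The primary process only forks workers; connections to PORT
    // are distributed among them by the cluster module.
    for (let i = 0; i < workerCount; i++) cluster.fork();

    // Replace any worker that dies so capacity stays constant.
    cluster.on('exit', (worker) => {
      console.error(`worker ${worker.process.pid} exited, restarting`);
      cluster.fork();
    });
  } else {
    // Each worker runs its own server instance on the shared port.
    http.createServer((req, res) => {
      res.end(`handled by worker ${process.pid}\n`);
    }).listen(PORT);
  }
}
```

Calling `startCluster()` at the bottom of the entry script starts the workers. Note that this balances across processes on one machine only; spreading load over several machines still needs something in front of them.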

A common architecture here in nodejs-land is to have your nodes talk HTTP and then load balance them with something like an HTTP proxy or a service registry. I'm sure it's more or less the same elsewhere. I don't know enough about gearman to say if it'll be "good enough," but if this is the general idea then I'd imagine it would be fine. At the least, other people would be interested in hearing how it went, I'm sure!
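
To make the HTTP proxy idea concrete, here is a minimal round-robin sketch built on Node's core `http` module. It is an illustration under simple assumptions (no health checks, no retries), and `roundRobin`, `createBalancer`, and the backend addresses in the usage note are hypothetical. In production a dedicated proxy is the more usual choice.

```javascript
const http = require('http');

// Returns a function that cycles through the given backends in order.
function roundRobin(backends) {
  let next = 0;
  return () => backends[next++ % backends.length];
}

// Creates a server that forwards each request to the next backend in turn.
function createBalancer(backends) {
  const pick = roundRobin(backends);
  return http.createServer((clientReq, clientRes) => {
    const { host, port } = pick();
    const upstream = http.request(
      {
        host,
        port,
        path: clientReq.url,
        method: clientReq.method,
        headers: clientReq.headers,
      },
      (upstreamRes) => {
        // Relay the backend's status, headers, and body to the client.
        clientRes.writeHead(upstreamRes.statusCode, upstreamRes.headers);
        upstreamRes.pipe(clientRes);
      }
    );
    // If the backend is unreachable, answer with 502 instead of hanging.
    upstream.on('error', () => {
      if (!clientRes.headersSent) clientRes.writeHead(502);
      clientRes.end('bad gateway');
    });
    clientReq.pipe(upstream);
  });
}
```

With two node.js instances assumed to be listening on ports 8001 and 8002, `createBalancer([{ host: '127.0.0.1', port: 8001 }, { host: '127.0.0.1', port: 8002 }]).listen(8000)` would alternate requests between them.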

Edit: Remember, number-crunching will block node's event loop! This is somewhat obvious if you think about it, but definitely something to keep in mind.
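
A small sketch that shows the effect: a timer scheduled for 0 ms cannot fire until the synchronous work on the same thread has finished. The helper names and the 200 ms figure are made up for the demonstration.

```javascript
// Spins the CPU for roughly `ms` milliseconds, standing in for number crunching.
function busyWait(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // Synchronous loop: nothing else in this process can run meanwhile.
  }
}

// Resolves with how many milliseconds late a 0 ms timer actually fired.
function measureTimerDelay(blockMs) {
  return new Promise((resolve) => {
    const scheduled = Date.now();
    setTimeout(() => resolve(Date.now() - scheduled), 0);
    busyWait(blockMs); // runs to completion before the timer callback can fire
  });
}

measureTimerDelay(200).then((delay) => {
  console.log(`0 ms timer fired after ${delay} ms`);
});
```

Every other request handled by the same process is delayed in the same way, which is why CPU-heavy work is usually pushed out to separate worker processes.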

Josh Holbrook
  • Thanks. This seems like a viable load balancing option for node.js. I will have to look into how it distributes the load, and what types of redundancy and persistence are available. – user396404 May 25 '12 at 20:14
  • Also, I should have clarified in my post that event blocking is not a major concern of mine since I am using node.js primarily for its speed. I have updated my original post to make this more clear. – user396404 May 25 '12 at 20:20
  • While this is likely too early stage for your site, you may be interested in this port of Node to PHP: http://nodephp.org – Wes Johnson May 26 '12 at 14:14