
Let's say that when a user logs into a webapp, he sees a list of information.

Let's say that list of information is served by one of two dynos (via Heroku), but that the list originates from a single MongoDB database (i.e., the Node.js dynos are just passing the MongoDB data to a user when he logs into the webapp).

Question: Suppose I want to make it possible for a user to both modify and add to that list of information.

At a scale of 1,000-10,000 users, is the following strategy suitable:

  1. User modifies/adds to data; HTTP POST sent to one of the two nodejs dynos with the updated data.
  2. Dyno (whichever one it may be) takes modification/addition of data and makes a direct query into the mongo database to update the data.
  3. Dyno sends confirmation back to the client that the update was successful.
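The three steps above could be sketched roughly as follows. This is only an illustration under assumptions: `collection` stands for a collection object from the MongoDB Node.js driver (only its `updateOne` method is used), the handler assumes an Express-style `req`/`res` with a JSON body parser already applied, and the field names `userId`, `itemId`, and `text` are hypothetical.

```javascript
// Sketch only. `collection` is anything exposing the MongoDB Node.js driver's
// updateOne(filter, update, options) method. Field names are hypothetical.
async function saveListItem(collection, userId, item) {
  // Step 2: one single-document upsert. MongoDB applies a write to a single
  // document atomically, so concurrent users cannot leave an item half-written.
  const result = await collection.updateOne(
    { userId: userId, itemId: item.itemId },
    { $set: { text: item.text, updatedAt: new Date() } },
    { upsert: true }
  );

  // Step 3: summarize the outcome so the client can confirm the update.
  return {
    ok: true,
    created: result.upsertedCount === 1,
    modified: result.modifiedCount === 1,
  };
}

// Step 1 glue: an Express-style POST handler (assumes req.body is parsed JSON).
function makePostHandler(collection) {
  return async function (req, res) {
    try {
      const body = await saveListItem(collection, req.body.userId, req.body.item);
      res.status(200).json(body);
    } catch (err) {
      // Tell the client the write failed instead of leaving the request hanging.
      res.status(500).json({ ok: false, error: 'update failed' });
    }
  };
}
```

Either dyno can run this same handler, since neither holds any state of its own; the database is the only shared piece.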

Is this OK? Would I likely have to add more dynos (Heroku)? I'm basically worried that if a bunch of users are trying to access a single database at once, it will be slow, or that I'm somehow risking corrupting the entire database at the 1,000-10,000 person scale. Is this fear reasonable?

George

1 Answer


Short answer: yes, it's a reasonable fear. Longer answer: it depends.

MongoDB will queue the requests and handle them in the order it receives them. Depending on how much of the data is being served from memory, it may or may not be fast enough.

Node.js follows the same design pattern: it queues the requests it can't process right away and executes them when resources become available.

The only way to tell if performance is being hindered is by monitoring it, and seeing if resources consistently hit a threshold you're uncomfortable with passing. On the upside, during your discovery phase your clients will probably only notice a few milliseconds of delay.
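As one possible way to do that monitoring on the Node side, here is a minimal sketch that samples event-loop delay and memory use with Node's built-in `perf_hooks` module. The function name `sampleEventLoopDelay` and the 10 ms resolution are my own choices for illustration, and what counts as "too high" is something you would have to decide for your own app.

```javascript
const { monitorEventLoopDelay } = require('perf_hooks');

// Sample event-loop delay for `durationMs` milliseconds and resolve with a
// summary. Consistently high delay suggests the dyno is struggling to keep up.
function sampleEventLoopDelay(durationMs) {
  const histogram = monitorEventLoopDelay({ resolution: 10 });
  histogram.enable();

  return new Promise((resolve) => {
    setTimeout(() => {
      histogram.disable();
      resolve({
        // The histogram reports nanoseconds; convert to milliseconds.
        meanMs: histogram.mean / 1e6,
        p99Ms: histogram.percentile(99) / 1e6,
        maxMs: histogram.max / 1e6,
        // Resident set size of this process, in megabytes.
        rssMb: process.memoryUsage().rss / (1024 * 1024),
      });
    }, durationMs);
  });
}
```

You could call it periodically, for example `sampleEventLoopDelay(5000).then(console.log)`, and watch how the numbers move as traffic grows. The database side needs its own monitoring; this only covers the Node process.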

The proper way to handle that is to spin up a new instance (a dyno, in your case) as resources get consumed, so the extra traffic has somewhere to go.

Your database likely won't get corrupted, but if your data is important (and why would you collect it if it isn't?), you should be creating a replica set. I would probably set up a replica set before adding a second instance of Node.
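From the application's point of view, moving to a replica set is mostly a change to the connection string: you list several members and name the set. The sketch below builds such a string; the host names, database name, and set name are made up, and `w=majority` (wait until a majority of members acknowledge each write) is an option I'm assuming you'd want for data you care about.

```javascript
// Build a MongoDB connection string that targets a replica set.
// All host, database, and replica set names passed in are placeholders.
function buildReplicaSetUri(hosts, dbName, replicaSetName) {
  if (!Array.isArray(hosts) || hosts.length === 0) {
    throw new Error('at least one host is required');
  }
  const hostList = hosts.join(',');
  const query =
    'replicaSet=' + encodeURIComponent(replicaSetName) + '&w=majority';
  return 'mongodb://' + hostList + '/' + encodeURIComponent(dbName) + '?' + query;
}

// Example with hypothetical hosts:
const exampleUri = buildReplicaSetUri(
  ['db1.example.com:27017', 'db2.example.com:27017', 'db3.example.com:27017'],
  'webapp',
  'rs0'
);
```

Both dynos would use the same string, and the driver works out which member is currently the primary.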

Stephen Punwasi
  • So the bottleneck isn't going to be the node dynos, it's going to be the fact that there's only one db? What do you mean "spin up a new instance as the resources get consumed" -- spin up a new instance in node? – George Apr 05 '15 at 19:21
  • @George Well, not necessarily, but unless your application is CPU-intensive, your bottleneck will most likely be the database reads/writes. 1k-10k isn't a huge number of users unless they all decide to use it in the same second. Are you already hitting this number with your users? I used the term "instance" as in EC2 instance, but I meant dyno in your case (dynos are fractional uses of EC2 instances). Heroku has Adept Scale, which will add or remove dynos automatically, but you still need to optimize your app to use it properly. Starting point: https://devcenter.heroku.com/articles/node-concurrency – Stephen Punwasi Apr 05 '15 at 21:51