I have an application that I am preparing for multi-tenant SaaS use, and I am trying to figure out the best way to provide server redundancy and load balancing so it can support a lot of simultaneously active users.

I use ISPConfig, which appears to support clustering or server mirroring, but is mirroring the right method to support large-scale usage? Should one set up ISPConfig in a multi-server configuration with many separate servers/databases, or should one run many servers as a mirrored cluster? What effect would this have on the database? Should one look for an external database, perhaps Amazon RDS, which all the servers connect to?

I appreciate any direction anybody can provide with this.

Alexis Wilke
  • Have you heard of database systems such as [Cassandra](https://cassandra.apache.org/)? – Alexis Wilke Sep 07 '16 at 20:53
  • I have not. I will look into it. Do you know if it's compatible with ISPConfig at all? I would prefer not to change that if possible. – Darren Peck Sep 07 '16 at 21:50
  • It looks like they only support MySQL. Mirroring is probably your best bet in that case. Anyway... if you have never heard of Cassandra and already have an app, you're too late. Converting would be a lot of work... You may want to check out what people say about that situation here: http://serverfault.com/search?q=mirror+mysql – Alexis Wilke Sep 07 '16 at 22:31
  • Thanks Alexis! I'll take a look. The good news is I am working on version 2 of the app, and am rebuilding the architecture to prepare it for a lot more users down the road. So now is the time I'm trying to pick the right way to get this working well. – Darren Peck Sep 07 '16 at 23:44

1 Answer

If you are working on a new version of software that must support a very large number of users, there is usually one big bottleneck: the database.

For that reason, many large-scale websites, such as Facebook and Twitter, have moved at least part of their data away from SQL databases to systems that scale mostly horizontally. These are called NoSQL databases. The one I use is called Cassandra. The more users you get, the more Cassandra nodes you add, and it can grow pretty much as big as you'd like. Netflix said around 2015 that they had over 2,000 such nodes.

Cassandra automatically mirrors data between nodes.

If you'd like, you can also look at ScyllaDB, which is a Cassandra-compatible database written in C++ instead of Java. I would be using that if I did not have a problem with deletion from the database... right now, it is not an option for me, unfortunately. There are several other NoSQL systems, but these are the only two I have tried.

Next... you write your application so that it runs on a separate computer from the database (during development you can run both on the same computer, that's fine, but you need at least 4 GB of RAM for Cassandra...). Make 100% sure that ALL the data which is not read-only static data installed by your application is saved IN THE DATABASE. Otherwise you would have a copy on computer A and not on computer B, so someone accessing computer B would never have access to that data on computer A. However, if all the data (i.e. even uploaded files) is in the database, any frontend application instance can access it. What does this mean? That you can have as many frontends as you need to sustain your load. Your structure becomes something like this:

                        Internet Users
                               |
                               v
                     +--------------------+
                     |                    |
                     |   Load Balancer    |
                     |                    |
                     +--------------------+
                               |
       +-----------------------+--------------------+
       |                       |                    |
       v                       v                    v
+----------------+    +----------------+    +----------------+
|                |    |                |    |                |
|  Apache/App    |    |  Apache/App    |    |  Apache/App    |
|                |    |                |    |                |
+----------------+    +----------------+    +----------------+
       ^                       ^                    ^
       |                       |                    |
       v                       v                    v
+----------------+    +----------------+    +----------------+
|                |    |                |    |                |
|  Cassandra     |<-->|  Cassandra     |<-->|  Cassandra     |
|                |    |                |    |                |
+----------------+    +----------------+    +----------------+

Note: Each box represents a computer, so your minimum cluster will probably be around 7 computers. However, each can be a pretty cheap cloud computer instead of a 16-processor dedicated server...

One important note: To run properly (as in safely for the data), Cassandra requires at least 4 nodes and a replication factor of 3. You also need a secure (private) network. If one is not available, you will have to install something such as OpenVPN to make sure all data travels encrypted between nodes (this is true even with other products that offer such replication).
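To see why a replication factor of 3 is the usual starting point, here is a small sketch of the quorum arithmetic. It assumes your reads and writes use Cassandra's QUORUM consistency level, which is my assumption and not something stated above; the function names are just illustrative.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must answer a QUORUM read or write: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def tolerated_replica_failures(replication_factor: int) -> int:
    """Replicas of one row that can be down while QUORUM still succeeds."""
    return replication_factor - quorum(replication_factor)


# Assumption: every request is issued at consistency level QUORUM.
for rf in (1, 2, 3, 5):
    print(f"RF={rf}: quorum={quorum(rf)}, "
          f"tolerates {tolerated_replica_failures(rf)} replica(s) down")
```

With a replication factor of 1 or 2, losing a single replica already blocks QUORUM requests for the rows it holds; 3 is the smallest value that survives one replica being down.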

The load balancer is just a process that checks the load on the frontend computers and decides which one receives the next request. Apache2 has a module for that. I only know Apache, but I'd bet other HTTP servers offer similar modules. It's up to you to decide which solution is best for your company. You can also skip the load balancer entirely and use DNS round robin. It's not as good, but it is definitely much easier to set up and may be enough if requests to your app are fairly uniform (each takes about the same amount of time). If some requests take 3 seconds while others take 50 ms, you may run into problems with a DNS-only round-robin setup.
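The difference between the two approaches can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions: the workload is made up, strict rotation is an idealization of DNS round robin (real resolvers cache answers, so the rotation is less even), and the "least loaded" picker stands in for what a real balancer estimates from live measurements.

```python
import itertools


def round_robin(durations_ms, n_servers):
    """Assign requests to servers in fixed rotation; return total work per server."""
    load = [0] * n_servers
    for server, duration in zip(itertools.cycle(range(n_servers)), durations_ms):
        load[server] += duration
    return load


def least_loaded(durations_ms, n_servers):
    """Assign each request to whichever server has the least work assigned so far."""
    load = [0] * n_servers
    for duration in durations_ms:
        server = load.index(min(load))
        load[server] += duration
    return load


# Hypothetical workload: every third request is slow (3 s), the rest take 50 ms.
requests = [3000, 50, 50] * 4

rr = round_robin(requests, 3)
ll = least_loaded(requests, 3)
print("round robin :", rr, "busiest server:", max(rr), "ms")
print("least loaded:", ll, "busiest server:", max(ll), "ms")
```

In this contrived pattern, strict rotation sends every slow request to the same frontend, while the load-aware picker spreads them out. With uniform requests the two would end up nearly identical, which is why round robin can be good enough.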

The application can be in any language. I use C++, but I could have used Java, PHP, plain C (really?!), Perl, etc. Whatever works for you. Just remember, NO DATA on the frontends. Keep it all in the database and you'll be fine. The mirroring is done by Cassandra, so your work is done at that point.
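The "no data on the frontends" rule can be shown with a tiny model. This is a sketch only: the `Frontend` class is hypothetical, and a plain dict stands in for the Cassandra cluster.

```python
class Frontend:
    """One Apache/App box. `shared_db` stands in for the Cassandra cluster."""

    def __init__(self, name, shared_db):
        self.name = name
        self.shared_db = shared_db   # the same store for every frontend
        self.local_disk = {}         # visible only on this machine

    def upload_to_local_disk(self, key, data):
        self.local_disk[key] = data

    def upload_to_database(self, key, data):
        self.shared_db[key] = data

    def download(self, key):
        # Look on this machine first, then in the shared database.
        if key in self.local_disk:
            return self.local_disk[key]
        return self.shared_db.get(key)


database = {}   # hypothetical stand-in: a dict instead of a real cluster
a = Frontend("A", database)
b = Frontend("B", database)

a.upload_to_local_disk("avatar.png", b"local bytes")
a.upload_to_database("report.pdf", b"shared bytes")

print(b.download("avatar.png"))   # None: the file only exists on frontend A
print(b.download("report.pdf"))   # b'shared bytes'
```

A user whose next request lands on frontend B cannot see the file that was written to A's disk, but can see the one that went to the database.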

Note that the diagram I showed above is not quite correct. The frontends should be able to connect to any Cassandra node, not just the one drawn below them. Often the drivers will connect to 2 or 3 nodes to make sure you always have access to the data. Then, once you go from 1,000 users to 10,000, you just add a few frontends and a few Cassandra nodes, and voilà! It will all work as before (same speed for the clients).

As for using ISPConfig: certainly. Will ISPConfig help you with all of those pieces? Probably not, at least not without adding your own module(s). But you should be able to use it for what it offers: set up your DNS, set up your email, probably do a large part of the Apache setup, etc.
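The idea of giving a frontend several nodes to try can be sketched as follows. Real Cassandra drivers handle this for you once you hand them a list of contact points; the code below is only an illustration, and `connect_to_any`, `fake_connect`, `NodeDown`, and the IP addresses are all hypothetical.

```python
import random


class NodeDown(Exception):
    """Raised by the stub below when a node cannot be reached."""


def connect_to_any(contact_points, connect):
    """Try contact points in random order; return the first session that opens."""
    candidates = list(contact_points)
    random.shuffle(candidates)   # avoid every frontend hitting the same node first
    failed = []
    for host in candidates:
        try:
            return connect(host)
        except NodeDown:
            failed.append(host)
    raise NodeDown(f"no contact point reachable: {sorted(failed)}")


# Hypothetical stub in place of a real driver call; 10.0.0.1 is "down".
def fake_connect(host):
    if host == "10.0.0.1":
        raise NodeDown(host)
    return f"session via {host}"


session = connect_to_any(["10.0.0.1", "10.0.0.2", "10.0.0.3"], fake_connect)
print(session)
```

As long as one listed node is reachable, the frontend gets a session, which is why listing 2 or 3 nodes is safer than listing one.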

Alexis Wilke
  • Alexis, thank you so much! That is very helpful information! – Darren Peck Sep 09 '16 at 00:01
  • I know you can set up ISPConfig to cluster Apache and MySQL. It would make sense then for me to still use ISPConfig (which I am very comfortable with) and PHP, which I program in. ISPConfig can take care of my Apache server clustering, and from there I just need to figure out a good way to connect to the Cassandra cluster. How does the Cassandra cluster work in this case? Your diagram shows each Apache server connected to one Cassandra node; is a 1:1 ratio of Apache to Cassandra servers good practice? How does an Apache server know which Cassandra node to use? – Darren Peck Sep 09 '16 at 00:03
  • @DarrenPeck, you want to read up on Cassandra to learn how it works. To connect, you use the driver, and generally you specify what is called a seed node. You should have 3 to 6 seeds in one "rack" (one location). If you create multiple locations, have another few seeds there. As you read about Cassandra, you'll learn all those things, and you should ask other questions after you have started testing with it to see how it works for you. It's not like MySQL... Their language is called CQL and has limits... http://www.planetcassandra.org/apache-cassandra-client-drivers/ – Alexis Wilke Sep 09 '16 at 00:43
  • Thanks Alexis, I appreciate your help! – Darren Peck Sep 09 '16 at 12:53