Clustering with vSphere and Apache load balancing

Question

I was looking at vSphere to automatically load balance my Apache web servers and MySQL servers, but will this actually do the job?

I know it says it does automatic load balancing, but I'm not sure that means what I want to achieve.

Is there an easier way to set up a PHP-Apache and MySQL cluster within virtual machines?

Or are there any guides to clustering? (I have tried googling without much luck.)

Any help towards understanding and setting up clustering and load balancing within virtual machines would be appreciated.

GioMac · Accepted Answer · 2012-09-12T08:41:05.307

What vSphere does:

Clustering - DRS = running several VMs across more than one host; vSphere automatically decides which host a running VM should be moved to
High Availability = if a host goes down, its VMs are restarted on another host in the cluster (running a guest on two hosts in parallel, so that the second one takes over when the first fails, is the separate Fault Tolerance feature)

MySQL clustering - not easy, and I cannot describe the details here, but generally:

Depends on DB Engine (MyISAM, InnoDB, XtraDB)
Very limited
Shared storage - mostly not possible
Easier to implement single write and multiple read nodes with replication
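The single-write, multiple-read layout can be sketched from the application side. This is only a minimal illustration, assuming one writable master and a set of read-only replicas; the `ReplicatedPool` class and the host names are hypothetical, not part of MySQL or mysql-proxy.

```python
import itertools

# Statements that are safe to send to a read-only replica.
# Anything else is treated as a write and goes to the master.
READ_VERBS = ("SELECT", "SHOW")


class ReplicatedPool:
    """Pick a server for a statement: writes to the master, reads round-robin."""

    def __init__(self, master, replicas):
        self.master = master
        # Fall back to the master when no replicas are configured.
        self._readers = itertools.cycle(replicas or [master])

    def pick(self, sql):
        parts = sql.split(None, 1)
        verb = parts[0].upper() if parts else ""
        if verb in READ_VERBS:
            return next(self._readers)
        return self.master


# Hypothetical host names, for illustration only.
pool = ReplicatedPool("db-master:3306", ["db-read1:3306", "db-read2:3306"])
```

Because replication is normally asynchronous, a read sent to a replica right after a write may not see that write yet, so reads that must be current should go to the master.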

Apache/web load balancing, optimal config:

Network level - TCP load balancing, requires at least 1 balancer node for normal operation, 2 for HA (active-backup). Implemented via Linux Virtual Server (LVS) kernel module:
- Piranha web GUI (RHEL and Fedora) (enough, easy to implement)
- Keepalived
- UltraMonkey
- Linux-HA
OS level - networking configuration, gateway, ARP configuration (LVS-DR), assigning global IP's to the localhost (127.0.0.1 - LVS-NAT).
Application level
- Apache level - no changes, just keep it running on port 80, so applications can see themselves (loopback requests)
- PHP level - sessions, tmp must be shared between nodes too
- Use mysql-proxy running on both socket and port listening mode, forwarding requests to mysql server
Storage level - data between web servers must be shared with something very fast, like GlusterFS
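To make the network-level step more concrete, here is a rough sketch of the kind of decision a weighted least-connection scheduler makes for each new connection. It is an illustration only: LVS does this inside the kernel and its real scheduler is more involved, and the `RealServer` and `pick_wlc` names are made up for this example.

```python
class RealServer:
    """One web node behind the balancer (hypothetical bookkeeping record)."""

    def __init__(self, address, weight=1):
        self.address = address
        self.weight = weight   # higher weight = can take more connections
        self.active = 0        # currently open connections
        self.healthy = True    # set by an external health check


def pick_wlc(servers):
    """Return the healthy server with the fewest connections per unit of weight."""
    candidates = [s for s in servers if s.healthy and s.weight > 0]
    if not candidates:
        return None
    return min(candidates, key=lambda s: s.active / s.weight)
```

A node that fails its health check is simply skipped, which is why keeping the web nodes identical matters: any of them may receive the next request.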

LVS performance is pretty high: it runs at kernel level, and a dual-core 1.6 GHz machine can push more than a gigabit in NAT mode and prevent many network-level problems. The virtual IP is held at L2 by the LB server, and the kernel forwards requests to the real servers, which either all carry the same IP (LVS-DR) or sit behind NAT (LVS-NAT). LVS-DR requires specific ARP configuration on the server side, because all the servers run with the same IP. LVS-NAT is easier to implement, and you can load-balance anything you want to any server. For LVS-NAT to work normally, use kernel 2.6.37 or later.

The balancer can act as gateway and firewall too. Connection persistence must be set to avoid some problems (see the LVS docs), and kernel TCP timeouts must be set to minimal values. You must also write scripts that check host availability, i.e. whether each host is working fine or not. Try to keep the configuration on the web nodes 100% identical.
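A host availability check can be as simple as trying to open a TCP connection to the web port. The sketch below assumes that "working fine" means "accepts connections on port 80"; a real check would probably also request a page and inspect the response. The function names are made up for this example.

```python
import socket


def is_alive(host, port=80, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def healthy_nodes(nodes, port=80):
    """Filter a list of node addresses down to the ones that answer."""
    return [node for node in nodes if is_alive(node, port)]
```

A script like this would be run periodically on the balancer, with the nodes it reports as dead taken out of rotation until they answer again.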

Overall, it's a very nice, very effective and optimal architecture, but it will require some tweaking afterwards. The bottlenecks are storage response and MySQL. PHP-Apache HA/balancing works ideally.

Squid, mod_proxy_balancer, HAProxy etc. are user-level apps, ineffective and dumb:

Squid can be used as a proxy or reverse proxy (better to use Varnish) for HTTP termination and/or as a security filter. You can add it on each web server and forward requests internally to hide your web server info etc.; you will additionally need mod_extract_forwarded so that source IPs are correctly identified by the web server and application.
mod_proxy_balancer is just another "had nothing to do - I wrote an Apache module" thing in this situation.
HAProxy is a very dumb thing - an application listening on a port and initiating requests to the backend - I hate it.

You can play with the Red Hat Cluster Suite and get some improvements for file sharing (GFS2 etc.) and application HA there, but that will require more effort on your side.

For the balancer node I strongly recommend Fedora 15 or later (the newer the kernel, the better). For everything else use whatever you want, even Windows (but you will have some problems with loopback HTTP access).

Also, I recommend using LACP-based bonding on the storage and web sides.

Wow, I never knew clustering was that complicated. Just from reading your post I know I'm going to be in over my head. Thanks for your post - very detailed, and it does help a lot. I was wondering if you could point me in a direction where I could learn how to use Fedora as a load balancer, please. I'm not exactly familiar with Linux; I can use it, but nothing past configuring it, setting up anything complicated etc. As for MySQL clustering, I've looked at the MySQL Cluster tool on their website; it's complicated, but I'm sure after a while I could figure it out. Thanks – RobAtStackOverflow, Sep 12 '12 at 11:40

Chopper3 · Answer 2 · 2012-09-12T07:02:16.093

This isn't the solution you're looking for. vSphere clustering is about protecting whole VMs, not the applications on those VMs. You need a load balancer for that, sorry; anything from something as simple as Squid/HAProxy all the way up to Cisco ASA/F5 kit would be appropriate.

edited Sep 12 '12 at 07:02

answered Sep 12 '12 at 06:56


Ah right, OK, thanks. I must've totally misunderstood what vSphere was about. Thanks for the clarification though, appreciate that. – RobAtStackOverflow Sep 12 '12 at 11:44
