What vSphere does:
- Clustering (DRS) = running several VMs across more than one host; vSphere automatically decides which host each running VM should be moved to
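- High Availability = if a host goes down, its guests are restarted on another host in the cluster. Running a guest on two hosts in parallel, so the second one takes over without interruption, is the separate Fault Tolerance feature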
MySQL clustering is not easy and I cannot describe the details here, but in general:
- It depends on the storage engine (MyISAM, InnoDB, XtraDB)
- It is very limited
- Shared storage is mostly not possible
- It is easier to implement a single write node and multiple read nodes with replication
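
The single-writer, multiple-reader layout can be sketched at the application side roughly as follows. This is a minimal illustration, not a real driver: the class name `ReplicatedPool`, the node names, and the rule "only SELECT and SHOW count as reads" are all assumptions made for the example.

```python
import itertools


class ReplicatedPool:
    """Pick a MySQL node per statement: one primary for writes, replicas for reads.

    Hypothetical helper for illustration; it only returns a node name and
    does not open any connection.
    """

    # Assumption: anything that is not clearly a read goes to the primary.
    READ_VERBS = {"SELECT", "SHOW"}

    def __init__(self, primary, replicas):
        self.primary = primary
        # With no replicas configured, reads fall back to the primary.
        self._readers = itertools.cycle(replicas or [primary])

    def pick(self, sql):
        parts = sql.split(None, 1)
        verb = parts[0].upper() if parts else ""
        if verb in self.READ_VERBS:
            return next(self._readers)  # round-robin over the read nodes
        return self.primary
```

Statements such as `SELECT ... FOR UPDATE`, anything inside a transaction, and reads that must see a write made a moment ago should still go to the write node, because replication is asynchronous and the read nodes can lag behind.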
Apache/web load balancing, optimal config:
- Network level - TCP load balancing. Requires at least 1 balancer node for normal operation, 2 for HA (active-backup). Implemented via the Linux Virtual Server (LVS) kernel module, managed with one of:
  - Piranha web GUI (RHEL and Fedora); sufficient and easy to implement
  - Keepalived
  - UltraMonkey
  - Linux-HA
- OS level - networking configuration: gateway, ARP configuration (LVS-DR), assigning the global IPs to the loopback interface (127.0.0.1, LVS-NAT).
- Application level
  - Apache level - no changes, just keep it running on port 80 so applications can reach themselves (loopback requests)
  - PHP level - sessions and tmp must be shared between nodes too
  - Use mysql-proxy listening on both a socket and a port, forwarding requests to the MySQL server
- Storage level - data must be shared between the web servers over something very fast, like GlusterFS
LVS performance is high because it runs at kernel level: a dual-core 1.6 GHz machine can push more than a gigabit in NAT mode, and it avoids many network-level problems. The balancer takes over the virtual IP at layer 2, and the kernel forwards requests to the real servers, which either all carry that same IP (LVS-DR) or sit behind NAT (LVS-NAT).
LVS-DR requires specific ARP configuration on the real servers, because they all run with the same IP. LVS-NAT is easier to implement, and you can load-balance anything you want to any server. For LVS-NAT to work properly, use kernel 2.6.37 or later. The balancer can act as gateway and firewall too.
Connection persistence must be set to avoid some problems (see the LVS docs), and the kernel TCP timeouts should be set to minimal values. You must also write scripts that check whether each host is available and working properly. Try to keep the configuration on the web nodes 100% identical.
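
A host availability check of the kind mentioned above could look roughly like this. It is a minimal sketch, assuming a plain TCP connect to the web port is a good enough signal; the function names `is_alive` and `healthy_nodes` are made up for the example, and a real check would more likely request a known URL and compare the response.

```python
import socket


def is_alive(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def healthy_nodes(nodes, check=is_alive):
    """Filter a list of (host, port) pairs down to those passing the check."""
    return [node for node in nodes if check(*node)]
```

The result would then be used to add or remove real servers from the LVS table. Tools such as Keepalived run this kind of check themselves, so a hand-written script is only needed when the balancer is managed directly.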
Overall, it's a very nice, effective and optimal architecture, but it will require some tweaking afterwards.
The bottlenecks are storage response time and MySQL. PHP/Apache HA and balancing work ideally.
Squid, mod_proxy_balancer, HAProxy etc. are user-level apps, ineffective and dumb for this job:
- Squid can be used as a proxy or reverse proxy (better to use Varnish) for HTTP termination and/or as a security filter. You can add it on each web server and forward requests internally to hide your web server info etc. You will additionally need mod_extract_forwarded so that the web server and the application still identify the source IPs correctly.
- mod_proxy_balancer is just another "had nothing to do, so I wrote an Apache module" thing in this situation.
- HAProxy is a very dumb thing: an application that listens on a port and initiates requests to the backend. I hate it.
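
To illustrate why something like mod_extract_forwarded is needed behind a reverse proxy: the web server only sees the proxy's address, and the real client has to be recovered from the `X-Forwarded-For` header. The sketch below shows the general idea only. The function `client_ip` and its trust rule are assumptions for the example, not the module's actual algorithm.

```python
import ipaddress


def client_ip(remote_addr, forwarded_for, trusted_proxies):
    """Recover the client address behind trusted proxies.

    remote_addr     -- address the TCP connection came from
    forwarded_for   -- raw X-Forwarded-For header value ("" if absent)
    trusted_proxies -- addresses or CIDR networks of our own proxies
    """
    trusted = [ipaddress.ip_network(p) for p in trusted_proxies]

    def parse(addr):
        try:
            return ipaddress.ip_address(addr)
        except ValueError:
            return None

    def is_trusted(addr):
        ip = parse(addr)
        return ip is not None and any(ip in net for net in trusted)

    hops = [h.strip() for h in forwarded_for.split(",") if h.strip()]
    candidate = remote_addr
    # Walk the header right to left, stepping back one hop at a time
    # for as long as the current address is one of our own proxies.
    for hop in reversed(hops):
        if not is_trusted(candidate) or parse(hop) is None:
            break
        candidate = hop
    return candidate
```

The header is only believed when the connection itself comes from a known proxy. Otherwise any client could put an arbitrary address into `X-Forwarded-For` and have the application log it as the source.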
You can play with Red Hat Cluster Suite and get some improvements for file sharing (GFS2 etc.) and application HA there, but it will require more effort on your side.
For the balancer node I strongly recommend Fedora 15 or later (the newer the kernel, the better). For everything else, use whatever you want, even Windows (but it will have some problems with loopback HTTP access).
I also recommend using LACP-based bonding on the links between the storage and the web servers.