5

I've developed a web application that will run on a farm of five IIS servers. It's absolutely essential that once a session is established with a server in the farm, further HTTP requests within that session are directed to that same server for the remainder of the session.

Today I discovered F5's web site and learned about "sticky sessions." Because most of my users will be mobile (e.g. on iPhones), their IP address may change in the middle of a single session, which means the source IP can't be used to identify a session. White papers suggest that the F5 LTM device solves this by letting me use content within the HTTP request itself [or a cookie] to determine session identity.

So far so good. But then I got to thinking… that F5 device is a single point of failure. Furthermore, what if I drastically increase capacity and want to add more F5 devices? Setting up a cluster of them makes sense. But despite my best Googling, I can't find a white paper describing the basic concepts of how such a cluster functions "under the hood."

Consider… Let's say I buy two F5 devices. They share a common "virtual" IP on the external (Internet-facing) interface of my network. An HTTP connection comes in and somehow the two devices determine that F5 #1 should answer the call. F5 #1 now has an in-memory map that associates the identity of that session [via a cookie, let's say] with internal web server #4. Two minutes later that same customer initiates a new HTTP connection as part of the same session. The destination "virtual" IP is the same, but his source IP address has changed. How in the world could I guarantee that F5 #1 will receive that connection instead of F5 #2? If the former receives it, we're in good shape because it has an in-memory map identifying the session. But if the latter receives it, it won't be aware that the session is associated with web server #4.
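Here's a rough Python sketch of the situation as I understand it (purely illustrative, not how an F5 works internally; all the names are made up):

```python
# Each load balancer keeps its own in-memory map of session cookie -> backend server.
import random

BACKENDS = ["web1", "web2", "web3", "web4", "web5"]  # hypothetical IIS farm

class LoadBalancer:
    def __init__(self, name):
        self.name = name
        self.persistence = {}  # session cookie value -> backend web server

    def route(self, session_cookie):
        # Sticky behaviour: reuse the backend we already chose for this session.
        backend = self.persistence.get(session_cookie)
        if backend is None:
            backend = random.choice(BACKENDS)  # first request in the session
            self.persistence[session_cookie] = backend
        return backend

f5_1 = LoadBalancer("F5 #1")
f5_2 = LoadBalancer("F5 #2")

cookie = "session-abc123"
print("F5 #1 routes to", f5_1.route(cookie))  # e.g. web4 -- F5 #1 learns the mapping
print("F5 #2 routes to", f5_2.route(cookie))  # F5 #2 has no entry and may pick another server
```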

Do the two F5 devices share information with each other somehow in order to make this work? Or is the configuration I'm describing just not a practical/common way to do things?

Sorry for the newb questions… this stuff is all new to me.

Chad Decker
  • Usually you'll find that clustering in these devices is for High Availability, so #1 or #2 will answer *every* request until they lose the heartbeat to each other, and then the idle device will start serving the requests. All the session information is synced between the two devices in real time, so when the other device kicks in, it should have an up-to-date copy of the data. I don't know much about these F5 devices though (way out of our price range), so your mileage may vary. – Mark Henderson Dec 01 '11 at 02:20
  • yes, I've noticed that they are quite pricey. I will probably get a used one. Thanks for the response. – Chad Decker Dec 01 '11 at 02:32
  • 1
    "It's absolutely essential that once a session is established with a server in the farm, further HTTP requests within that session are directed to that same server for the remainder of the session." What rationale could result in such an unorthodox design? Seems to me a better alternative would have the clients download a list of available servers, and connect to a specific server. Or learn how to maintain and restore session state between requests. – Greg Askew Dec 01 '11 at 02:46
  • 1
    With all due respect, Greg, I've only been studying this topic for a matter of hours and even *I* am aware that using sticky sessions is not an "unorthodox design." It's very common. – Chad Decker Dec 01 '11 at 03:00
  • @GregAskew - that's very, very common practice. For example, PHP sessions are stored on a server's local disk. If your request moves to another server, PHP loses the session. That's very, very normal behaviour. – Mark Henderson Dec 01 '11 at 03:01
  • ... but then isn't each session-binding webserver now a single point of failure? – danlefree Dec 01 '11 at 03:06
  • @danlefree - at least then all you lose is a user's session, not an entire website – Mark Henderson Dec 01 '11 at 03:20

5 Answers

6

Most F5s come in HA pairs, so these would be clustered. As soon as one F5 goes down, its IPs are assumed by the other F5 in the pair, so there is no downtime. To your question: each IP is assigned to only one F5 at a time, so it is not truly active/active on both.

That is your solution. The next question you should ask is: what happens if the whole site where both F5s are hosted goes down? (Then look into global load balancing.)
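To make the failover idea concrete, here is a purely conceptual Python sketch (nothing to do with F5's real implementation; the timeout and names are invented): the standby simply assumes the shared IPs once it stops hearing its peer's heartbeat.

```python
# Conceptual active/standby failover: the standby watches the peer heartbeat
# and takes over the virtual IP when the heartbeat goes silent.
import time

HEARTBEAT_TIMEOUT = 3.0  # seconds without a heartbeat before failover (made-up value)

class HAMember:
    def __init__(self, name, active):
        self.name = name
        self.active = active
        self.last_peer_heartbeat = time.monotonic()

    def receive_heartbeat(self):
        self.last_peer_heartbeat = time.monotonic()

    def check_peer(self):
        # If the active peer goes silent, the standby assumes the virtual IP.
        if not self.active and time.monotonic() - self.last_peer_heartbeat > HEARTBEAT_TIMEOUT:
            self.active = True
            print(f"{self.name}: peer heartbeat lost, taking over the VIP")

standby = HAMember("F5 #2", active=False)
standby.last_peer_heartbeat -= 10  # simulate a peer that stopped sending heartbeats
standby.check_peer()               # F5 #2 becomes active and serves the VIP
```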

R D
  • Ah, yes, I see what you mean. The failure aspect is critical but I guess I was coming from a standpoint of increasing capacity. Let's say two years from now my single F5 is getting overwhelmed. Wouldn't I "cluster" it with another one to share the load? And if so, do they share their state with each other? Maybe I'm using the term "cluster" incorrectly. Sorry if that's the case. Thanks. – Chad Decker Dec 01 '11 at 02:37
  • I believe on the higher end you can do active/active clustering, and by active/active it actually means active/passive for certain IPs and passive/active for another set of IPs. So at any given time, one IP will be active on only one F5. As your capacity increases, in practice you would end up just upgrading your F5 hardware to a higher-performance box. – R D Dec 01 '11 at 02:44
  • @ChadDecker - the term "cluster" is a bit ambiguous, which is why you'll usually find it called an HA cluster, an Active/Passive cluster, or an Active/Active cluster. What you're looking for is a cluster used for scaling out, rather than high availability. – Mark Henderson Dec 01 '11 at 03:04
  • Aha, this makes sense. I tried to like/up-vote your response but the system says I need fifteen reputation points in order to do that. This site reminds me of a role-playing game. I like it though. Very informative stuff! – Chad Decker Dec 01 '11 at 03:07
  • @ChadDecker - well, you're now at 13 rep. Only 2 more rep to go! – Mark Henderson Dec 01 '11 at 03:26
3

I don't believe that you can truly 'cluster' two F5 content switches, though I believe they're working on the feature (I could be wrong). Clustering load-balancers is a huge engineering challenge: how do they share information at layer 4 or layer 7, how do they communicate over layer 2 or layer 3, and how is clustering and information-sharing enabled without affecting performance, given that load-balancers essentially have to operate at wire speed?

Think of firewalls in the old days: proxy-based firewalls were always standalone nodes because they did so much at layer 7 that it was essentially impossible to share this information across nodes without killing performance, whereas stateful packet filters only had to transfer layer 4 information, and even that was an overhead. Load-balancers are typically deployed with much of their configuration, as in your case with VIPs, acting as an endpoint, so the whole TCP session is rewritten and the load-balancer becomes the client to the server (i.e. there are essentially two flows, which is one reason why the actual client can't perform HTTP pipelining direct to the back-end server).
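To illustrate the "two flows" point, here's a bare-bones Python sketch of a full proxy (single-threaded, no error handling; the backend address is made up). The client's TCP connection terminates at the proxy, and a second, independent connection is opened to the back-end server:

```python
# Minimal full-proxy sketch: two separate TCP flows, one per side of the proxy.
import socket

BACKEND = ("10.0.0.14", 80)  # hypothetical internal web server #4

def handle(client_sock):
    request = client_sock.recv(65536)          # flow 1: client <-> load balancer
    with socket.create_connection(BACKEND) as upstream:
        upstream.sendall(request)              # flow 2: load balancer <-> backend
        response = upstream.recv(65536)
    client_sock.sendall(response)
    client_sock.close()

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(("0.0.0.0", 8080))
listener.listen(5)
while True:
    conn, _ = listener.accept()
    handle(conn)
```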

With HA, you don't achieve the scale-out that you want, so you'll have to scale up your load-balancer to handle the load. Vendors like that, because scaling up generally means a new, bigger box (not always, as sometimes you can enable extra CPU with a licence upgrade). HA does, obviously, provide resiliency and reliability though. With HA you generally have command propagation, configuration synchronisation and some element of session exchange (which can be configured to a degree, as this can cause load).

You could look at scaling by load-balancing your load-balancers (i.e. LB -> LB -> web farm), but that's not great: it can introduce latency, it's (very) costly and your infrastructure gains another point of failure, though I have seen it successfully implemented.

You can use something like VRRP, which is almost a quasi-cluster. In this implementation you could have two HA pairs of load-balancers in front of your web farm, call them HA1 and HA2. With VRRP, you could create two VIPs, one live on HA1 (vip1) and the other live on HA2 (vip2), thanks to the higher VRRP priority configured on each. Both vip1 and vip2 can fail over to the other HA pair through VRRP, either automatically (based on monitors etc.) or manually by lowering the VRRP priority.
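A rough Python sketch of the idea (illustrative only, not an actual VRRP implementation; the priorities are made up): whichever healthy pair holds the highest priority for a VIP owns it, so each VIP normally lives on its preferred HA pair and moves over when that pair fails or its priority is lowered.

```python
# Each VIP is owned by the healthy HA pair with the highest VRRP priority for it.
priorities = {
    "vip1": {"HA1": 200, "HA2": 100},   # HA1 owns vip1 by default
    "vip2": {"HA1": 100, "HA2": 200},   # HA2 owns vip2 by default
}

def vip_owner(vip, healthy):
    candidates = {pair: prio for pair, prio in priorities[vip].items() if pair in healthy}
    return max(candidates, key=candidates.get) if candidates else None

print(vip_owner("vip1", healthy={"HA1", "HA2"}))  # HA1
print(vip_owner("vip1", healthy={"HA2"}))         # HA2 takes over once HA1 fails
```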

Most vendors have KB articles on the above configurations. I believe that there is one vendor who has true clustering in their product but I'll let you google for that.

All load-balancers have various forms of persistence, which you apply to the back-end server association. Popular forms today are cookie and hash (based on the 4-tuple and one or two other things). When the load-balancer acts as an endpoint, as in your scenario, once the TCP connection is fully established it will create what is essentially a protocol control buffer, containing information on the connection (essentially the 4-tuple and a couple of other things again). There are two such buffers, one representing each side of the connection, and they live in memory on the load-balancer until the session ends, when they're cleared to free up the memory for reuse.
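For example, hash persistence can be sketched roughly like this in Python (illustrative, not F5's actual algorithm): hash a stable key and map it onto the pool, so the same key always lands on the same backend without keeping any per-session table. The key has to be something that doesn't change mid-session, which is exactly why a mobile client's source IP is a poor choice and a cookie value is a better one.

```python
# Stateless hash persistence: same key -> same backend, deterministically.
import hashlib

POOL = ["web1", "web2", "web3", "web4", "web5"]  # hypothetical pool members

def pick_backend(persistence_key: str) -> str:
    digest = hashlib.sha256(persistence_key.encode()).digest()
    return POOL[int.from_bytes(digest[:4], "big") % len(POOL)]

print(pick_backend("session-abc123"))  # always the same backend for this key
```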

Mark Hillick
1

Citrix NetScaler seems to be the most advanced load-balancing solution, especially in the area of clustering. You don't need to install an HA pair; just use a two-box cluster. Then, as you want to grow, just add more boxes to the cluster. They can also cluster load balancers running in VMs.

0

For all load balancers it is possible to set up an HA pair to prevent a single point of failure; this has always been the case. The BIG-IP is expensive. Have you considered the Citrix NetScaler? There is a virtual machine version that is much cheaper. In fact, there is a free version that you can use in production, though with limited throughput. Not only that, but you can deploy the NetScaler VPX on the same servers as your application if necessary. Just a thought. Email me if you want more details.

Jason
0

You basically have two options here on the F5 platform, depending on the type of VIP you're configuring. On a Layer 4 VIP (effectively a NAT) you can configure connection mirroring, which allows the TCP session to survive an HA event uninterrupted. This isn't possible for a Layer 7 VIP - there's simply too much state to "back up" to the standby in real time - but you can mirror cookie persistence records, which ensures that after the HA failover, when the client reconnects, it will be redirected to the same backend server.
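Conceptually (a Python sketch only, not how F5 mirroring is actually implemented; the names are invented), mirroring persistence records just means every record the active unit learns is also pushed to the standby, so the cookie-to-server mapping survives a failover:

```python
# Every persistence record the active unit learns is copied to the standby.
class Unit:
    def __init__(self, name):
        self.name = name
        self.persistence = {}  # cookie value -> backend server

active = Unit("active")
standby = Unit("standby")

def learn_persistence(cookie, backend):
    active.persistence[cookie] = backend
    standby.persistence[cookie] = backend   # mirrored to the peer

learn_persistence("session-abc123", "web4")
print(standby.persistence["session-abc123"])  # "web4" -- still known after a failover
```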

I don't have first-hand knowledge, but I believe that the Netscaler has similar capabilities.

That said, a scheme that's completely dependent on this type of persistence is going to have problems, especially if you have backend servers coming in and out of the VIP rotation on any sort of regular basis. I'd encourage you to investigate standing up a shared cache (memcached is great for this) that any member of your server pool can query to validate the cookie on an incoming request. It's easier than you think :)
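As a rough sketch of what I mean (assuming memcached is running locally and the pymemcache package is installed; the key names and timeout are made up): every server in the farm writes and validates sessions against the same cache, so it stops mattering which backend the load balancer picks.

```python
# Shared session cache: any pool member can create or validate a session.
from pymemcache.client.base import Client

cache = Client(("localhost", 11211))

def create_session(session_id: str, user_id: str) -> None:
    # Any pool member can write the session...
    cache.set(f"session:{session_id}", user_id, expire=1800)

def validate_session(session_id: str) -> bool:
    # ...and any other pool member can validate it on the next request.
    return cache.get(f"session:{session_id}") is not None

create_session("abc123", "user-42")
print(validate_session("abc123"))  # True, from any server in the farm
```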

caw