
I am far from an expert on Apache or any server for that matter, so I apologize if this question is poorly worded, which it likely is.

We have always relied on a vendor for split-path testing (a.k.a. "A/B testing"). If you're not familiar with that term, it's a form of marketing research in which you slightly modify one of your web pages (usually one nearest the point of conversion), say, by changing the position of the "Buy Now" button or its color/contrast/texture, and then serve one of the two versions to a given user based on random selection.

By doing split-path testing ourselves, I suspect we can do it far more cheaply and shorten our test cycle times as well.

What is the optimal setup for these tests? "Optimal" is based on the following criteria:

  • how quickly/easily new tests can be set up and put online; and

  • minimal disruption to overall site performance

doug

1 Answer


If you have an environment of any scale, you probably have redundant servers (at least a pair of everything, if not more). Given that, my choice here would be to update the configuration on half my servers to reflect Path A, and the other half to reflect Path B. This is relatively trivial to do with, for example, Puppet or Chef.
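To make that concrete, here is a minimal sketch (in Python, not any particular configuration-management tool) of the decision such a Puppet or Chef manifest would encode: each server derives its path from its own hostname and points the web root at the matching variant. The hostnames, paths, and the even/odd split rule are all assumptions for illustration.

```python
# Sketch: decide which test path this server serves, based on its hostname,
# and point the web root at the matching variant.  Hostnames, paths, and the
# even/odd split rule are hypothetical.
import os
import re
import socket

DOCROOTS = {
    "A": "/var/www/path_a",   # original page
    "B": "/var/www/path_b",   # page with the modified "Buy Now" button
}

def path_for_server(hostname: str) -> str:
    """Even-numbered servers (web02, web04, ...) serve B; the rest serve A."""
    short_name = hostname.split(".")[0]
    match = re.search(r"(\d+)$", short_name)
    number = int(match.group(1)) if match else 0
    return "B" if number % 2 == 0 else "A"

if __name__ == "__main__":
    variant = path_for_server(socket.gethostname())
    target = DOCROOTS[variant]
    link = "/var/www/current"          # what the web server actually serves
    if os.path.islink(link):
        os.remove(link)                # replace any previous assignment
    os.symlink(target, link)
    print(f"{socket.gethostname()} -> Path {variant} ({target})")
```

In practice you would express the same rule directly in your configuration-management tool rather than in a standalone script; the point is just that the A/B decision lives in server configuration, not in application code.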

From there your regular load balancing system takes over and will, hopefully, distribute your users evenly around your cluster, which ensures that you get an unbiased random set of users going down each path (as long as your load balancing is itself unbiased).
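If you want to check that "inherently unbiased" assumption, a quick sanity test on whatever counts you collect (requests or, better, distinct users per path) is cheap. This is only a rough sketch using a normal approximation to a 50/50 binomial; the example counts and the z-score threshold are assumptions.

```python
# Sketch: check whether an observed A/B split is consistent with a fair
# 50/50 draw, using a normal approximation to the binomial distribution.
import math

def split_looks_unbiased(count_a: int, count_b: int, z_threshold: float = 3.0) -> bool:
    """Return True if the split is within z_threshold standard deviations of 50/50."""
    n = count_a + count_b
    if n == 0:
        raise ValueError("no observations")
    expected = n / 2
    std_dev = math.sqrt(n * 0.25)      # sqrt(n * p * (1 - p)) with p = 0.5
    z = abs(count_a - expected) / std_dev
    return z <= z_threshold

if __name__ == "__main__":
    # e.g. 10,482 users went down Path A and 10,305 down Path B
    print(split_looks_unbiased(10_482, 10_305))   # True: well within random noise
```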

This has a few practical advantages in my mind:

  1. You don't have to worry about random path selection. Your load balancer is doing it for you.
  2. If one path is performing terribly you can just disable those servers until you update them.
  3. If a path makes the website unusable (bad code or similar) you're only breaking some of your servers. Users can keep performing transactions on the other working machines.
  4. Your users will probably be oblivious to the change.

There are also some disadvantages:

  1. You probably can't control where your users go.
    Since your load balancer doesn't know who a user is, you can't have a rule like "John Doe goes through path A, Jane Smith goes through path B", etc. (see the sketch after this list for what per-user assignment would look like).
  2. You have to maintain a server configuration profile for each path.
    This is probably not too much work if your configuration management is sane.
  3. You can't easily have compound paths.
    You can only have full paths end-to-end with this model. You can't break someone out in the middle of a chain and have them go through a side-road detour.
    Such detour paths would be difficult to analyze anyway, so I don't think this is a deal-breaker.
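For contrast with disadvantage 1: deterministic per-user routing (the same user always sees the same path) would have to key off something that identifies the user, such as a hash of a user ID or cookie, which a plain load balancer never sees. The function below is only a hypothetical illustration of what this model gives up; the identifier and experiment name are made up.

```python
# Sketch: per-user (rather than per-server) path assignment, keyed off a
# stable user identifier.  This is exactly what the load-balancer-only
# approach cannot do; the identifier and experiment name are hypothetical.
import hashlib

def path_for_user(user_id: str, experiment: str = "buy-now-button") -> str:
    """Hash the user id with the experiment name so each test splits independently."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).digest()
    return "A" if digest[0] % 2 == 0 else "B"

if __name__ == "__main__":
    for user in ("john.doe", "jane.smith"):
        print(user, "-> Path", path_for_user(user))
```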
voretaq7