We're currently using Varnish because it does a good job of caching, and I'm now trying to get it to perform well on cache misses, where it acts purely as a load balancer. My benchmarks are giving me results I don't understand.
I have three boxes from Rackspace, all running CentOS 5.5. One has Varnish installed (a 4 GB instance); the other two each run a simple Hello World Node application (two 1 GB instances).
A single node server performs well
httperf --hog --server 50.57.151.229 --port 1337 --uri / --rate=2000 --num-conns=20000 --num-call=1 --timeout 5
At a rate of 2000 this completes, but many requests return errors. At a rate of 1750 it works fine and sustains 1750 requests per second.
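To narrow down where a single Node server starts failing between those two rates, the same command can be generated for a range of rates. This is only a sketch: `httperf_cmd` is a helper name I made up, the list of rates is arbitrary, and it assumes `--num-conns` should stay at ten times the rate, as in the commands in this post.

```shell
#!/bin/sh
# Sketch: build the httperf command line from this post for a given rate.
# httperf_cmd is a hypothetical helper; it only prints the command, it does not run it.
httperf_cmd() {
    server=$1 port=$2 rate=$3
    # Ten seconds' worth of connections, matching the 2000/20000 ratio used above.
    conns=$((rate * 10))
    echo "httperf --hog --server $server --port $port --uri / --rate=$rate --num-conns=$conns --num-call=1 --timeout 5"
}

# Print one command per rate between the working and the failing value.
for rate in 1750 1800 1850 1900 1950 2000; do
    httperf_cmd 50.57.151.229 1337 "$rate"
done
```

Piping the output to `sh` would run the benchmarks one after another.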
The load balancer crashes
httperf --hog --server 50.56.80.227 --uri / --rate=1800 --num-conns=18000 --num-call=1 --timeout 5
If I point httperf at the load balancer instead, it works fine for a few seconds: varnishstat shows 1800 requests and 1800 connections per second. Then Varnish's connections drop to 0 and httperf starts getting errors. The Node processes seem to be doing OK.
Sometimes the connections pick back up, but they never get back to 1800. The drop happens sooner if I increase the load beyond 1800.
Here are my Varnish configuration, the Node code, and the varnishstat output from both before and after the drop:
https://gist.github.com/1296924
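In case it helps with reading the two varnishstat dumps, here is a minimal sketch of how they could be compared. It assumes both were captured with `varnishstat -1`, so each line starts with a counter name followed by its value. `snapshot_delta` and the file names are placeholders of mine and are not part of the gist.

```shell
#!/bin/sh
# Sketch: print every counter whose value differs between two `varnishstat -1` snapshots.
# Assumes each line has the form "<counter_name> <value> ..." (name first, value second).
snapshot_delta() {
    # $1 = snapshot taken before the drop, $2 = snapshot taken after it
    awk 'NR == FNR { before[$1] = $2; next }
         ($1 in before) && ($2 != before[$1]) {
             # Print the counter name and how much it grew between snapshots.
             printf "%s %.0f\n", $1, $2 - before[$1]
         }' "$1" "$2"
}

# Hypothetical usage on the Varnish box (file names are placeholders):
#   varnishstat -1 > before.txt
#   ... run httperf until the connections drop to 0 ...
#   varnishstat -1 > after.txt
#   snapshot_delta before.txt after.txt
```

Counters that did not change are left out, so the output is limited to what moved while the connections were dropping.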
The Question
What is going on with Varnish when its connections drop to 0? Which counters should I be looking at in varnishstat?