2

Every few hours, Web Farm Framework takes my farm down with a 502 error, and both WFEs in the farm are marked as Unhealthy.

I have no Validation URL set up in Health Monitoring. If I manually 'Make Server Available', everything is fine for another few hours, and then the same thing happens again.

WFF is load balancing an ASP.NET application. How can I find out why the servers are being marked as unhealthy, or just disable the health detection so the WFEs only go offline on a deployment error?

EDIT: This is the latest Web Farm Framework on IIS.net as of yesterday.

Brandon
  • Sorry, don't know about the answer, but another question: is there any indication that the servers generate an error code that might lead the controller to believe they're down? WFF debug logs, App/System event logs (and/or Web and HTTPERR logs) might confirm this? – TristanK Apr 15 '11 at 00:57
  • @TristanK - no, I originally expected to see a ton of ASP.NET errors on one of the WFE's, but nothing, app works perfectly, and hitting the WFE's directly is fine. – Brandon Apr 15 '11 at 01:58

3 Answers

4

I think I found the answer. If you recycle the ARR application pool, you get a 502.4 error when trying to access the secondary servers through the WFF controller (which is the HTTP load balancer). I set the Idle time-out to 0 minutes so that the worker process is never shut down for being idle.

From http://forums.iis.net/t/1158399.aspx

"Functionally speaking, this value has no impact on how ARR works. The idle time-out is designed to bring down the worker process in order to free up more memory. (The default value is 20 min. So for example, if you have multiple sites/applications in multiple applications pools, and if there has been no activity on one of them, IIS will bring down the worker process - so that other processes/etc can consume the resource on the machine.)

Since ARR is proxying the all requests to the content/application servers behind it, we recommend that the worker process is running all the time. (That said, if there is a constant flow traffic, then the worker process would be running all the time, irrespective of this value. ie. It won't be idle for 20 min.)"

Antonio
3

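Also remember to disable the application pool's default recycling.

By default, the DefaultAppPool that ARR runs in has Idle Time-out = 20 (minutes) and Recycling > Fixed Intervals > Regular time intervals = 1740 (minutes, i.e. every 29 hours).

Set Idle Time-out to 0 and uncheck the Recycling > Fixed Intervals option.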
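For anyone who would rather script this than click through IIS Manager, here is a rough sketch of the same two changes applied to the XML directly. It assumes the standard IIS schema, where the idle time-out is the `idleTimeout` attribute of `processModel` and the fixed interval is the `time` attribute of `recycling/periodicRestart`, both set to `00:00:00` to disable them. The function name and the sample XML are made up for illustration, and the sample is far smaller than a real applicationHost.config. Back up the real file before touching it.

```python
import xml.etree.ElementTree as ET

# Hypothetical, cut-down stand-in for applicationHost.config.
SAMPLE_CONFIG = """
<configuration>
  <system.applicationHost>
    <applicationPools>
      <add name="DefaultAppPool">
        <processModel idleTimeout="00:20:00" />
      </add>
    </applicationPools>
  </system.applicationHost>
</configuration>
"""


def disable_idle_and_periodic_recycle(config_xml: str, pool_name: str) -> str:
    """Return config_xml with idle time-out and fixed-interval recycling
    switched off for the named application pool."""
    root = ET.fromstring(config_xml)
    pool = root.find(f".//applicationPools/add[@name='{pool_name}']")
    if pool is None:
        raise LookupError(f"application pool {pool_name!r} not found")

    # Idle Time-out = 0: never stop the worker process for being idle.
    process_model = pool.find("processModel")
    if process_model is None:
        process_model = ET.SubElement(pool, "processModel")
    process_model.set("idleTimeout", "00:00:00")

    # Regular time interval = 0: no scheduled recycle (default is 1740 min).
    recycling = pool.find("recycling")
    if recycling is None:
        recycling = ET.SubElement(pool, "recycling")
    periodic = recycling.find("periodicRestart")
    if periodic is None:
        periodic = ET.SubElement(recycling, "periodicRestart")
    periodic.set("time", "00:00:00")

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(disable_idle_and_periodic_recycle(SAMPLE_CONFIG, "DefaultAppPool"))
```

If the pool inherits these values from the pool defaults, the sketch adds explicit elements on the pool itself, which override the inherited ones.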

Grant
0

One additional thing, which I found after lots of frustrating 502 outages:

http://forums.iis.net/t/1183539.aspx/1

"I broke down and paid for a support incident with MSFT to help on this. The serverAutoStart was set to false for the farm that was having issues. This was in the C:\Windows\System32\inetsrv\config\applicationHost.config file."

This setting was false for us too. Since setting it to true, we've had no 502s.
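If you want to check your own config for this, a small script can list every element that carries a `serverAutoStart` attribute. This is only a sketch: the quote above does not say which element the attribute sits on, so the code searches the whole document instead of assuming a path. The sample XML, including the `webFarm` placement and the farm name, is a guess made up for the demo, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real element layout may differ.
SAMPLE_CONFIG = """
<configuration>
  <webFarms>
    <webFarm name="ExampleFarm" serverAutoStart="false" />
  </webFarms>
</configuration>
"""


def find_server_auto_start(config_xml: str):
    """Return (tag, name attribute, serverAutoStart value) for every
    element in the document that has a serverAutoStart attribute."""
    root = ET.fromstring(config_xml)
    hits = []
    for elem in root.iter():
        if "serverAutoStart" in elem.attrib:
            hits.append((elem.tag, elem.get("name"), elem.attrib["serverAutoStart"]))
    return hits


if __name__ == "__main__":
    for tag, name, value in find_server_auto_start(SAMPLE_CONFIG):
        print(f"<{tag} name={name!r}> serverAutoStart={value}")
```

To run it against the real file, read C:\Windows\System32\inetsrv\config\applicationHost.config into a string and pass that in.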

Another little piece of voodoo to note when changing applicationHost.config on 64-bit machines:

https://stackoverflow.com/questions/5696801/iis-7-5-applicationhost-config-file-is-not-being-updated

Matt Evans