Stopping Default IIS website causes Azure Application Gateway '502 Bad Gateway' Error for ALL websites in IIS

Question

I'm having an issue with hosting multiple .NET websites on Windows Server/IIS and Azure Application Gateway.

We host multiple sites on a single Azure Windows VM running IIS, sitting behind Azure Application Gateway WAFv2. The VM is connected to App Gateway using a backend pool configured to point to the private IP of the VM, with the VNets peering configured between the App Gateway and VM VNets.

When I stop the default website in IIS, ALL websites then return a '502 Bad Gateway' error from Azure Application Gateway, and the backend health status changes to 'Unhealthy' for the backend pool where the VM resides.

Can anyone tell me why stopping the Default site would cause Application Gateway to error for all sites?

EDIT: Screenshot of IIS bindings as requested

EDIT 2: Apparently I can't answer my own question; however, after working through this with our CSP, I have the answer. By default, the App Gateway backend health check looks at the default IIS site. If you stop that site, the health check fails and the backend goes Unhealthy. At that point, App Gateway no longer even ATTEMPTS to route any requests to that backend pool, regardless of URL.

Can you include a screenshot of how you have the sites created in IIS? Are you stopping the default site or the default app pool? If it's the default app pool, that could explain it if all the sites are using the same app pool. Are you creating applications under the default site or creating new sites? — Ron, Jul 10 '20 at 00:51
What are the site bindings? Read https://docs.jexusmanager.com/tutorials/binding-diagnostics.html#background for hints. — Lex Li, Jul 10 '20 at 01:12
Application Gateway runs its health check probe against port 80/443 for the backend pool. So if your default website is bound to either of those ports, stopping it will cause an unhealthy status for the backend pool. Since your only backend pool is unhealthy, Application Gateway will throw a 502 Bad Gateway error. — John, Jul 20 '20 at 14:23
It's a multi-tenant hosting situation, so each Site (tenant) has its own Site/AppPool separate from the Default Site. The Default Site is not used, and its app pool is not shared with any other sites. I'm stopping both the Default Site and its App Pool. No bindings beyond the defaults are configured on the Default Site. — Chris Butler, Jul 22 '20 at 01:32
@John, that might explain why the Backend Health reports an error; there are multiple sites running on port 80 (HTTPS/SSL on 443 is handled by App Gateway). It wouldn't make sense for this to bring EVERY site on the VM down, though? In some cases we have up to 40 different sites on a single VM, each with its own unique Site and App Pool in IIS. — Chris Butler, Jul 22 '20 at 01:37
Updated to add screenshot of Site binding configurations on one of the DEV VMs in question. Have redacted anything potentially sensitive, and left as much as I can. — Chris Butler, Jul 22 '20 at 01:44

score 0 · Answer 1 · answered Jul 10 '20 at 03:28

If the application gateway has no VMs or virtual machine scale set configured in the back-end address pool, it can’t route any customer request and returns a bad gateway error. Run the command below to show the gateway configuration, including the back-end address pool.

Get-AzApplicationGateway -Name "SampleGateway" -ResourceGroupName "ExampleResourceGroup"

Here is an official guideline for troubleshooting the 502 error.
https://learn.microsoft.com/en-us/azure/application-gateway/application-gateway-troubleshooting-502#overview
Also, here is a simple troubleshooter.
https://support.microsoft.com/en-us/help/4504111/azure-application-gateway-with-bad-gateway-502-errors

There are currently 6 different backend pools, each containing at least one VM configured. The problem is that when I stop the default site on any one of these VMs, the Backend Health reports an error and ALL sites on that VM go offline. — Chris Butler, Jul 22 '20 at 01:38

score 0 · Answer 2 · answered Jul 23 '20 at 15:17

If I were to try and troubleshoot this, I would likely start with a brand new "test" instance of IIS and set up a reverse proxy on port 80 whose only job is to listen to incoming requests to port 80. Those requests would then be forwarded by your reverse proxy to your actual websites bound to different ports (e.g. 81, 82, 83, etc).

The idea here is to have all of your websites running on different ports such that when you stop one of your sites, the others continue to run without a problem.

Given your setup with up to 40 sites hosted in a single instance of IIS, I would only attempt this type of troubleshooting with a brand new "test" instance of IIS.

1. Create a brand new "test" instance of IIS.
2. Create a reverse proxy. To do this, create a new site, name it (e.g. rev-proxy), and give it a binding on port 80.
3. Deploy one actual site (e.g. myfirstsite). Give it a port binding of something other than 80 (e.g. 81).
4. Double-click your rev-proxy site and add a URL Rewrite -> Inbound Rules -> Blank rule. See attached picture. Add a rule such that when a user requests "myfirstsite", that request is forwarded on to port 81. Use the "Test Pattern" button to test your pattern. The image is only a suggestion; your pattern should correspond to the URL your users use to request your site, not necessarily to the name you give your site in IIS.

An example of a reverse proxy with a URL Rewrite
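
The matching step of such an inbound rule can be modelled roughly as follows. This is only a sketch in Python, not IIS configuration: the `RULES` table, the `rewrite` function, and the `localhost` target are hypothetical names chosen for illustration, and the pattern assumes users reach the site under a `myfirstsite/` path prefix. It relies on the fact that URL Rewrite inbound rules match the URL path without its leading slash.

```python
import re
from typing import Optional

# Hypothetical rule table: (pattern applied to the URL path, backend port).
# The site name and port 81 come from the example steps above.
RULES = [
    (re.compile(r"^myfirstsite(?:/(.*))?$"), 81),
]


def rewrite(path: str) -> Optional[str]:
    """Return the forwarded URL for a request path, or None if no rule matches."""
    trimmed = path.lstrip("/")  # inbound rules see the path without a leading slash
    for pattern, port in RULES:
        match = pattern.match(trimmed)
        if match:
            rest = match.group(1) or ""
            return f"http://localhost:{port}/{rest}"
    return None


if __name__ == "__main__":
    print(rewrite("/myfirstsite/home"))
    print(rewrite("/somethingelse"))
```

Adding a second site would mean adding another tuple with its own pattern and port, which mirrors adding another inbound rule on the rev-proxy site.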

Chris Butler · Accepted Answer · 2021-07-05T23:03:39.253

Found the answer to this after many months of messing about!

With Azure Application Gateway, the default health probes for each backend pool ping and look for a response on the configured IP address or FQDN in the backend pool itself.

In my case this is set to the local IP address of the Virtual machine (when I configured this 18-24 months ago I recall our Azure CSP telling me there was a bug with using the FQDN in the backend pool configuration).

This means that when the Health Probe attempts to communicate with the VM, the Default Website in IIS is the only site configured to respond to requests addressed to this IP address.

If you stop the Default Site, the Health Probe gets no response to its requests and the Backend Pool status goes to Unhealthy, as you would expect.
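
To illustrate why only the Default Site answers the probe, here is a simplified Python model of how a request is matched to a site by its binding. It is a sketch under assumptions, not IIS's actual implementation: the `Site` class, `pick_site`, the tenant host names, and the `10.0.0.4` address are all hypothetical. The one thing it relies on is that a binding with no host name acts as a catch-all, while bindings with a host name only match requests for that name.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Site:
    name: str
    port: int
    host_header: str  # "" models a binding with no host name (catch-all)
    running: bool = True


def pick_site(sites: List[Site], port: int, host: str) -> Optional[Site]:
    """Pick the running site for a request: exact host match first, then catch-all."""
    candidates = [s for s in sites if s.running and s.port == port]
    for site in candidates:
        if site.host_header and site.host_header.lower() == host.lower():
            return site
    for site in candidates:
        if site.host_header == "":
            return site
    return None


def make_sites(default_running: bool) -> List[Site]:
    # Hypothetical tenants; only the Default Web Site has a catch-all binding.
    return [
        Site("Default Web Site", 80, "", running=default_running),
        Site("tenant-a", 80, "tenant-a.example.com"),
        Site("tenant-b", 80, "tenant-b.example.com"),
    ]


PROBE_HOST = "10.0.0.4"  # assumed private IP; the probe carries no tenant host name

if __name__ == "__main__":
    for default_running in (True, False):
        hit = pick_site(make_sites(default_running), 80, PROBE_HOST)
        print(default_running, hit.name if hit else "no site answers the probe")
```

In this model the tenant sites keep answering requests that carry their own host names, yet the probe finds nothing once the catch-all site is stopped.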

The really interesting thing here is that as soon as the Backend Pool Health Probe status goes Unhealthy, Azure Application Gateway ceases to even attempt to route any traffic to the affected backend pool. Instead, it immediately returns the 502 Bad Gateway error, and will continue to do so until the Health Probe status is corrected and goes back to Healthy!
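
The gating behaviour described above can be sketched in a few lines of Python. This is a toy model, not Application Gateway's code: the function names are hypothetical, and it assumes the default probe's rule of treating HTTP status codes 200 through 399 as healthy, with `None` standing in for "no response at all".

```python
from typing import Optional


def backend_is_healthy(probe_status: Optional[int]) -> bool:
    """Default-probe rule: a response with status 200-399 counts as healthy."""
    return probe_status is not None and 200 <= probe_status <= 399


def gateway_response(probe_status: Optional[int], site_status: int) -> int:
    """Status the client sees for a request to any site in the backend pool."""
    if not backend_is_healthy(probe_status):
        # The request is never forwarded, so the site's own status is irrelevant.
        return 502
    return site_status


if __name__ == "__main__":
    print(gateway_response(200, 200))   # probe answered by the Default Site
    print(gateway_response(None, 200))  # Default Site stopped, tenant site still fine
```

The second call shows the symptom from the question: the tenant site itself would have returned 200, but the client still receives a 502 because the pool as a whole is marked Unhealthy.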
