
Hi, I have a web app deployed as a Cloud Service on Windows Azure, and I am running load/stress tests against it. In the Azure Management Portal I have configured the web role to scale automatically when CPU usage goes over 40%.

I start the tests with only one instance of this web role. The test is set to increase the number of concurrent users over time, up to 2000.

After starting the test, I connect to the web role instance via Remote Desktop and monitor the CPU usage. After 10 minutes or so, the CPU is constantly at 100% (and the requests in my test take a very long time to complete). But when I check the CPU of the very same web role in the Azure Management Portal (on the dashboard page of my cloud service), it shows 1%, 2% or 6%. There was one peak of 70%, but it dropped back immediately. The portal never shows the values I see in Task Manager over Remote Desktop, and sometimes it displays no value at all, which means the graph is no longer being updated.

Furthermore, and this is the point of my question, NO SCALING of the web role instances is performed whatsoever.

Any ideas where/what I am missing? Feel free to ask if my explanation is incomplete.

Mirko Lugano
  • Auto-scaling is based upon turning on pre-configured VMs. Did you set those up? http://azure.microsoft.com/en-us/documentation/articles/cloud-services-how-to-scale/ – David Peden Feb 17 '15 at 14:38
  • Hi David, I've read the article. I'm not 100% sure I got your point, but on the 'Scale' page of the Management Portal I have set the instance range from a minimum of 1 to a maximum of 6 instances to be scaled up or down – Mirko Lugano Feb 17 '15 at 15:47
  • @DavidPeden - the OP is using web roles in a cloud service, not Virtual Machines. So... the use of pre-configured VMs is not part of the equation here. – David Makogon Feb 17 '15 at 15:50

2 Answers

Autoscaling on the CPU metric for a Cloud Service or Virtual Machine doesn't happen as fast as you are expecting (within ~10 minutes). In this scenario, the CPU metric is averaged across all instances of the service over a period of 1 hour, so your autoscaling actions will not be immediate.
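To make the effect of the one-hour window concrete, here is a rough Python sketch. It is only an illustration under assumed numbers: a single instance, one CPU sample per minute, roughly 2% CPU when idle and 100% under load, and the 40% threshold from the question. The function names are made up for this example and are not part of any Azure API, and the real service may sample and aggregate differently.

```python
def window_average(samples):
    """Mean CPU (%) over the per-minute samples in one evaluation window."""
    return sum(samples) / len(samples)


def minutes_until_trigger(idle_cpu, busy_cpu, threshold, window_minutes=60):
    """Smallest number of busy minutes that lifts the window average above threshold.

    Returns None if even a fully busy window stays at or below the threshold.
    """
    for busy in range(window_minutes + 1):
        samples = [idle_cpu] * (window_minutes - busy) + [busy_cpu] * busy
        if window_average(samples) > threshold:
            return busy
    return None


THRESHOLD = 40.0  # scale-up threshold from the question

# Assumed load test: 50 minutes near idle, then 10 minutes pegged at 100%.
samples = [2.0] * 50 + [100.0] * 10

hour_avg = window_average(samples)          # what a 60-minute window sees
recent_avg = window_average(samples[-10:])  # what a 10-minute window would see
needed = minutes_until_trigger(2.0, 100.0, THRESHOLD)

print(f"60-minute average: {hour_avg:.1f}% (scale up: {hour_avg > THRESHOLD})")
print(f"10-minute average: {recent_avg:.1f}% (scale up: {recent_avg > THRESHOLD})")
print(f"Busy minutes needed within the hour: {needed}")
```

With these assumed numbers the hourly average is only about 18% after 10 minutes at full load, and it takes roughly 24 minutes at 100% before the average crosses 40%, which is why a short test may never trigger a scale-up.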

You can read more about this and some recommendations for configuring your autoscale settings here.

If you want to tighten this up a little more, take a look at this post, where I show how to set the TimeWindow using the Monitoring Service Management Library. You may be able to get closer to what you want by taking this approach.

Rick Rainey
  • Thanks. In fact, I couldn't find any information about the period over which the 'average CPU' is calculated, and I thought it was the "wait time to scale up" parameter on the "Scale" page of the Management Portal. Running the test for 2 hours made it scale as expected. – Mirko Lugano Feb 19 '15 at 12:14

A few things to consider:

1) As Rick pointed out, by default the CPU metric is averaged over an hour.

2) If you start with only 1 server and then autoscale up to 2, your first server will get yanked out of the load balancer during the scale operation. You should really have a minimum of 2 servers at all times.

3) Feel free to check out AzureWatch (link in my profile). It was designed to handle fairly advanced scaling scenarios and lets you configure scaling rules without touching APIs.
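As a rough illustration of point 2, the Python sketch below counts how many servers would still be behind the load balancer if a scale operation pulls servers out one at a time. That one-at-a-time behaviour is taken as an assumption from this answer, not something the code verifies, and the function name is hypothetical.

```python
def in_rotation_during_update(instance_count, taken_out_at_once=1):
    """Servers still receiving traffic while some are out of the load balancer.

    Assumes (as a simplification) that the scale operation removes
    `taken_out_at_once` servers from rotation at any given moment.
    """
    return max(instance_count - taken_out_at_once, 0)


for n in (1, 2, 3):
    serving = in_rotation_during_update(n)
    status = "no server available" if serving == 0 else "still serving"
    print(f"{n} instance(s): {serving} in rotation during the update ({status})")
```

Under that assumption, a single-instance deployment has nothing left to serve requests during the operation, while two or more instances keep at least one in rotation.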

Igorek
  • Hi @Igorek, can you elaborate a bit more about point 2? Why would the first instance be rebooted for the scale operation? – Dirk Boer Jun 07 '15 at 20:53
  • @DirkBoer To clarify slightly: when a scaling operation is called, the ChangeDeployment API method is actually executed. ChangeDeployment will yank all of the servers out of the load balancer one at a time (if you only have one, you will lose connectivity for a little bit). Furthermore, if you handle the RoleEnvironmentChanging event, be sure to prevent reboots from happening when Topology events occur – Igorek Jun 07 '15 at 23:01
  • Hi @Igorek, thanks for taking your time to answer. I'll have to look up some more information about the ChangeDeployment and RoleEnvironmentChanging event. Thanks! :) – Dirk Boer Jun 08 '15 at 08:15