
Background

If this seems like a question with incomplete information, it is. It concerns a large project, handed over to us by a third party, that uses Azure to automatically provision, scale up, and scale down a large number of VMs, and we're having trouble understanding some of its issues. I'll do my best to explain.

We use Python's Azure SDK to launch Azure VMs, kill them, and so on. Here is the code used in the method that launches Azure VMs:

    #Get VMs info. do it via minimum calls to speed up things
    started_on = time.time()
    while True:
        try:
            time.sleep(5)
            vm = compute_client.virtual_machines.get(self.resource_group, vm_name)
            break
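
The excerpt above is cut off before its `except` clause, so how the real loop gives up is not shown. As a minimal sketch only, here is one way such a loop could be bounded by a deadline. The helper name `wait_for_resource`, its parameters, and the `NotReadyError` class are all hypothetical and not part of our codebase or of the Azure SDK.

```python
import time


class NotReadyError(Exception):
    """Hypothetical stand-in for the SDK's 'resource not found' error."""


def wait_for_resource(fetch, timeout=300.0, interval=5.0,
                      not_ready=(NotReadyError,),
                      clock=time.monotonic, sleep=time.sleep):
    """Call fetch() until it succeeds or `timeout` seconds have passed.

    Exceptions listed in `not_ready` mean "try again later"; any other
    exception propagates immediately. When the deadline has passed, the
    last not-ready exception is re-raised to the caller.
    """
    deadline = clock() + timeout
    while True:
        try:
            return fetch()
        except not_ready:
            if clock() >= deadline:
                raise
            sleep(interval)
```

With the SDK, `fetch` would presumably be something like `lambda: compute_client.virtual_machines.get(self.resource_group, vm_name)` and `not_ready` would be `(CloudError,)`, ideally narrowed to the `ResourceNotFound` case so that other failures are not retried.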

Problem

Sometimes when we run this command, we get this error:

aioc.logic.connectors.azure:2017-08-17 08:39:45,709 | ERROR | Stop waiting for VM auto-acfinH-25 to finish
Traceback (most recent call last):
  File "/home/aioc/aioc/aioc/logic/connectors/azure.py", line 126, in start
    vm = compute_client.virtual_machines.get(self.resource_group, vm_name)
  File "/home/aioc/venv/lib/python3.4/site-packages/azure/mgmt/compute/compute/v2016_04_30_preview/operations/virtual_machines_operations.py", line 369, in get
    raise exp
msrestazure.azure_exceptions.CloudError: Azure Error: ResourceNotFound
Message: The Resource 'Microsoft.Compute/virtualMachines/auto-acfinH-25' under resource group 'AIOCBot' was not found.
aioc.logic.main_controller_logic:2017-08-17 08:39:45,978 | ERROR | An error occurred while checking Vm with id '599553cdc1462e3a828c66da' machine id '328'
Traceback (most recent call last):
  File "/home/aioc/aioc/aioc/logic/main_controller_logic.py", line 41, in run_vm_controller
    started_vms = vmc.start(vm.machine_type, 1)
  File "/home/aioc/aioc/aioc/logic/connectors/azure.py", line 135, in start
    vm_net_interface = network_client.network_interfaces.get(self.resource_group, vm_name)
  File "/home/aioc/venv/lib/python3.4/site-packages/azure/mgmt/network/v2017_03_01/operations/network_interfaces_operations.py", line 171, in get
    raise exp
msrestazure.azure_exceptions.CloudError: Azure Error: ResourceNotFound

Upon investigation, it turns out this is happening because our resources at Azure have maxed out (?). The way to solve the problem is to purge those resources using this method:

    def cleanup_all(self):
        """
        Clean up all auto-created resources
        """
        compute_client = ComputeManagementClient(self.credentials, self.subscription_id)
        network_client = NetworkManagementClient(self.credentials, self.subscription_id)
        resource_client = ResourceManagementClient(self.credentials, self.subscription_id)

        # Only resources whose name starts with 'auto-' were created by us
        resources = resource_client.resources.list()
        for r in [r for r in resources if r.name.startswith('auto-')]:
            try:
                if 'publicIPAddresses' in r.type:
                    rs = network_client.public_ip_addresses.delete(self.resource_group, r.name)
                    rs.wait()
                # Match NICs specifically; other Microsoft.Network types are not NICs
                elif 'networkInterfaces' in r.type:
                    rs = network_client.network_interfaces.delete(self.resource_group, r.name)
                    rs.wait()
                elif 'Microsoft.Compute/virtualMachines' in r.type:
                    rs = compute_client.virtual_machines.delete(self.resource_group, r.name)
                    rs.wait()
            except Exception:
                # Log and carry on so one failed delete does not stop the purge
                log.warning("Failed to stop resource: %s with type: %s", r.name, r.type, exc_info=True)

Which is all well and good. However, for business reasons we cannot simply create a cron job that runs this command at a regular interval, or run it in any other automated fashion, because it affects many different environments at once (i.e. prod/demo/stage/dev), which is too big a side effect to accept.

This means we must run it manually every once in a while, once everyone agrees that all environments are clear and ready.
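
A side note on the cleanup method above: as far as I know, Azure refuses to delete a public IP that is still attached to a NIC, or a NIC that is still attached to a VM, so the order in which resources are purged matters. Below is a minimal sketch of ordering the deletions first. `DELETE_ORDER` and `plan_deletions` are hypothetical names used only for illustration, and the sketch only plans the work without calling Azure.

```python
# Dependents are deleted before what they depend on: a VM holds its NIC,
# and a NIC holds its public IP.
DELETE_ORDER = (
    "Microsoft.Compute/virtualMachines",
    "Microsoft.Network/networkInterfaces",
    "Microsoft.Network/publicIPAddresses",
)


def plan_deletions(resources, prefix="auto-"):
    """Return (type, name) pairs for auto-created resources in a safe order.

    `resources` is any iterable of objects with `.name` and `.type`, such as
    the items yielded by resource_client.resources.list(). Resource types
    outside DELETE_ORDER are left out rather than guessed at.
    """
    rank = {rtype.lower(): i for i, rtype in enumerate(DELETE_ORDER)}
    planned = [
        (rank[r.type.lower()], r.type, r.name)
        for r in resources
        if r.name.startswith(prefix) and r.type.lower() in rank
    ]
    # list.sort is stable, so resources of the same type keep their order
    planned.sort(key=lambda item: item[0])
    return [(rtype, name) for _, rtype, name in planned]
```

The returned pairs could then be fed to the matching `delete(...)` calls one by one, VMs first and public IPs last.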

Question

I would like to look at the resources section of my Azure console and find out how much of my allowed resources I have consumed. For example, I'd like to see that I have consumed 45% of my allowed public IPs, so that I know whether I'm safe or need to run the purge command again.

Ideas?

Update

This page discusses the available limits in detail.

But it doesn't say how to gauge how much of each resource is currently in use or how much is left. That's what I'm trying to find out.

Can someone explain what's going on?

abbood
  • Do you want to find the information programmatically, or will a manual process also do? – Gaurav Mantri Aug 18 '17 at 11:43
  • @GauravMantri For starters I'll be happy with manual information, although knowing how to obtain this info programmatically is welcome as well. – abbood Aug 18 '17 at 11:47
  • @abbood Firstly, this is a classic mode limit. In resource manager mode, you can use 60 dynamic public IP addresses by default; if you want to raise the limit, you can create a ticket. Please refer to this [link](https://learn.microsoft.com/en-us/azure/azure-supportability/resource-manager-core-quotas-request). The ticket is free. – Shui shengbao Aug 21 '17 at 02:06
  • @abbood You could use the [SDK](https://learn.microsoft.com/en-us/python/api/azure.mgmt.compute.compute.v2017_03_30.operations.usageoperations?view=azure-python) to list VM usage. – Shui shengbao Aug 21 '17 at 02:13

2 Answers


This is a partial answer to your question. For the manual route, you can find this information in the Azure portal itself: click on "Subscription", select your subscription from the subscriptions list, and then click "Usage + quota".


Gaurav Mantri
  • That was too easy. If you can point me to resources (i.e. APIs) where I can read this programmatically, I'd really appreciate it! :) – abbood Aug 18 '17 at 13:38

You can find them programmatically, but this is dispatched per provider:
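
As a minimal sketch of what that could look like with the Python SDK: the compute client exposes `usage.list(location)` and the network client exposes `usages.list(location)`, each returning entries with a name, a current value, and a limit. The helper names `usage_report` and `collect_usage` are hypothetical, and whether both operations are available depends on the SDK and API versions you have installed, so treat this as an assumption to verify.

```python
def usage_report(usages):
    """Return [(name, current, limit, percent_used)], highest percent first.

    `usages` is any iterable of objects with `.name.value`, `.current_value`
    and `.limit`, which is the shape of the SDK's Usage model.
    """
    rows = []
    for u in usages:
        if not u.limit:  # skip quotas reported with a zero limit
            continue
        percent = 100.0 * u.current_value / u.limit
        rows.append((u.name.value, u.current_value, u.limit, round(percent, 1)))
    return sorted(rows, key=lambda row: row[3], reverse=True)


def collect_usage(compute_client, network_client, location):
    """Query both providers for one region (needs live Azure credentials)."""
    usages = list(compute_client.usage.list(location))
    usages += list(network_client.usages.list(location))
    return usage_report(usages)
```

Calling `collect_usage(compute_client, network_client, 'westeurope')` (the region is just an example) and printing the rows should show which quotas are closest to their limit, which is the kind of "45% of public IPs used" figure the question asks for. Quotas are reported per region, so the call has to be repeated for each location you deploy to.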

For a more event-driven approach, there is most probably a way to plug in Event Grid, Logic Apps, and/or Azure Monitor to be warned automatically.

Laurent Mazuel