Background
If this seems like a question with incomplete information, it is. It concerns a large project, handed over to us by a third party, that uses Azure to automatically provision, scale up, and scale down a large number of VMs, and we are having trouble understanding some issues. I'll do my best to explain.
We use the Azure SDK for Python to launch Azure VMs, kill them, and so on. Here is the code used in the method that launches Azure VMs:
# Get VMs info. Do it via minimum calls to speed things up
started_on = time.time()
while True:
    try:
        time.sleep(5)
        vm = compute_client.virtual_machines.get(self.resource_group, vm_name)
        break
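For reference, the general shape of that loop (retry the lookup until it succeeds or a deadline passes) can be sketched as below. This is a minimal sketch and not the project's actual code: `wait_for`, `timeout`, `interval` and `retry_on` are hypothetical names I made up for illustration.

```python
import time


def wait_for(fetch, timeout=300.0, interval=5.0, retry_on=(Exception,)):
    """Call fetch() until it succeeds or `timeout` seconds have elapsed.

    Hypothetical helper: `fetch` is any zero-argument callable, and
    `retry_on` is the tuple of exception types that mean "not ready yet".
    """
    started_on = time.time()
    while True:
        try:
            return fetch()
        except retry_on:
            # Re-raise the last error once the deadline has passed,
            # so the caller sees why the wait failed.
            if time.time() - started_on >= timeout:
                raise
            time.sleep(interval)
```

In our case the call would presumably look like `wait_for(lambda: compute_client.virtual_machines.get(self.resource_group, vm_name), retry_on=(CloudError,))`, with `CloudError` being the `msrestazure.azure_exceptions.CloudError` that appears in the traceback.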
Problem
Sometimes when we run this command, we get this error:
aioc.logic.connectors.azure:2017-08-17 08:39:45,709 | ERROR | Stop waiting for VM auto-acfinH-25 to finish
Traceback (most recent call last):
File "/home/aioc/aioc/aioc/logic/connectors/azure.py", line 126, in start
vm = compute_client.virtual_machines.get(self.resource_group, vm_name)
File "/home/aioc/venv/lib/python3.4/site-packages/azure/mgmt/compute/compute/v2016_04_30_preview/operations/virtual_machines_operations.py", line 369, in get
raise exp
msrestazure.azure_exceptions.CloudError: Azure Error: ResourceNotFound
Message: The Resource 'Microsoft.Compute/virtualMachines/auto-acfinH-25' under resource group 'AIOCBot' was not found.
aioc.logic.main_controller_logic:2017-08-17 08:39:45,978 | ERROR | An error occurred while checking Vm with id '599553cdc1462e3a828c66da' machine id '328'
Traceback (most recent call last):
File "/home/aioc/aioc/aioc/logic/main_controller_logic.py", line 41, in run_vm_controller
started_vms = vmc.start(vm.machine_type, 1)
File "/home/aioc/aioc/aioc/logic/connectors/azure.py", line 135, in start
vm_net_interface = network_client.network_interfaces.get(self.resource_group, vm_name)
File "/home/aioc/venv/lib/python3.4/site-packages/azure/mgmt/network/v2017_03_01/operations/network_interfaces_operations.py", line 171, in get
raise exp
msrestazure.azure_exceptions.CloudError: Azure Error: ResourceNotFound
Upon investigation, this appears to happen because our resources on Azure have maxed out (?). The way to solve the problem is to purge those resources with this method:
def cleanup_all(self):
    """
    Clean up all auto-created resources
    """
    compute_client = ComputeManagementClient(self.credentials, self.subscription_id)
    network_client = NetworkManagementClient(self.credentials, self.subscription_id)
    resource_client = ResourceManagementClient(self.credentials, self.subscription_id)
    resources = resource_client.resources.list()
    # Only touch resources whose names start with 'auto-'
    for r in [r for r in resources if r.name.startswith('auto-')]:
        try:
            if 'publicIPAddresses' in r.type:
                rs = network_client.public_ip_addresses.delete(self.resource_group, r.name)
                rs.wait()
            elif 'Microsoft.Network' in r.type:
                rs = network_client.network_interfaces.delete(self.resource_group, r.name)
                rs.wait()
            elif 'Microsoft.Compute/virtualMachines' in r.type:
                rs = compute_client.virtual_machines.delete(self.resource_group, r.name)
                rs.wait()
        except Exception:
            # Log and carry on so one failed delete does not abort the whole purge
            log.warning("Failed to stop resource: %s with type: %s", r.name, r.type, exc_info=True)
Which is all well and good. However, for business reasons we cannot simply create a cron job that runs this command at a regular interval, or run it in any other automated fashion, because it affects many different environments at once (i.e. prod/demo/stage/dev), which is too big a side effect to accept.
That means we have to run it manually every once in a while, once everyone agrees that all environments are clear and ready.
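A side note on the cleanup method above: it deletes resources in whatever order the listing returns them, whereas Azure generally refuses to delete a NIC that is still attached to a VM, or a public IP that is still bound to a NIC. A minimal sketch of ordering the deletions accordingly is below. `delete_rank` and `in_deletion_order` are hypothetical helper names, and the `SimpleNamespace` objects are stand-ins for what `resource_client.resources.list()` returns (only `.name` and `.type` are assumed).

```python
from types import SimpleNamespace

# Lower rank is deleted first: the VM, then its NIC, then the public IP.
_DELETE_PRIORITY = (
    ("Microsoft.Compute/virtualMachines", 0),
    ("Microsoft.Network/networkInterfaces", 1),
    ("Microsoft.Network/publicIPAddresses", 2),
)


def delete_rank(resource_type):
    """Return the deletion rank for a resource type string (case-insensitive)."""
    for type_name, rank in _DELETE_PRIORITY:
        if type_name.lower() == resource_type.lower():
            return rank
    # Unknown types go last.
    return len(_DELETE_PRIORITY)


def in_deletion_order(resources):
    """Sort resources so that dependents are deleted before what they depend on."""
    return sorted(resources, key=lambda r: delete_rank(r.type))


# Stand-in objects, for illustration only.
example = [
    SimpleNamespace(name="auto-a", type="Microsoft.Network/publicIPAddresses"),
    SimpleNamespace(name="auto-a", type="Microsoft.Network/networkInterfaces"),
    SimpleNamespace(name="auto-a", type="Microsoft.Compute/virtualMachines"),
]
ordered_types = [r.type for r in in_deletion_order(example)]
```

This does not address the multi-environment problem; it only makes a single purge run less likely to leave resources behind.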
Question
I would like to look at the resources section in my Azure console and find out how much of my allowed resources I have consumed. For example, I'd like to see something like "you have consumed 45% of your allowed public IPs", so that I know whether I'm safe or need to run the purge command again.
Ideas?
Update
This page discusses the available limits in detail, for example:
But it doesn't say how to gauge how much of each resource is currently in use or how much is left. That's what I'm trying to find out.
Can someone explain what's going on?
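To make the ask concrete, here is a rough sketch of the kind of report I'm after. It assumes each usage record exposes `current_value`, `limit` and `name.localized_value`, which as far as I can tell is the shape returned by the SDK's usage listings (`compute_client.usage.list(location)` and `network_client.usages.list(location)`); I have not verified this against our SDK version. `usage_report` is a made-up name, and the records below are fake stand-ins, not real numbers from our subscription.

```python
from types import SimpleNamespace


def usage_report(usages, threshold=0.0):
    """Return (name, current, limit, percent) rows, highest percentage first.

    `usages` is any iterable of objects with `current_value`, `limit`
    and `name.localized_value`. Quotas with a limit of zero are skipped.
    Only rows at or above `threshold` percent are returned.
    """
    rows = []
    for u in usages:
        if not u.limit:
            continue
        percent = 100.0 * u.current_value / u.limit
        if percent >= threshold:
            rows.append((u.name.localized_value, u.current_value, u.limit, percent))
    return sorted(rows, key=lambda row: row[3], reverse=True)


# Fake records for illustration only.
fake_usages = [
    SimpleNamespace(name=SimpleNamespace(localized_value="Public IP Addresses"),
                    current_value=27, limit=60),
    SimpleNamespace(name=SimpleNamespace(localized_value="Network Interfaces"),
                    current_value=10, limit=100),
]
report = usage_report(fake_usages)
```

If the SDK calls behave as assumed, feeding their results into `usage_report(..., threshold=80)` would list only the quotas that are close to exhaustion, which is the signal I need before deciding to run the purge.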