We are using Azure Batch, and we need to use Windows Docker containers on the VMs.

This is how it is done via the Portal (screenshot: containers via portal).

And this is how it is done via C# API:

private static VirtualMachineConfiguration ConfigureVM()
{
    var imageNames = new List<string> { "microsoft/dotnet-framework:4.7" };
    var containerConfig = new ContainerConfiguration
    {
        ContainerImageNames = imageNames
    };

    var offer = "WindowsServer";
    var publisher = "MicrosoftWindowsServer";
    var imageSku = "2016-Datacenter-with-Containers";
    var imageReference = new ImageReference(offer, publisher, imageSku);

    var nodeSku = "batch.node.windows amd64";
    var vmConfig = new VirtualMachineConfiguration(imageReference, nodeSku)
    {
        ContainerConfiguration = containerConfig
    };

    return vmConfig;
}

Now we are automating deployment, so I want to do the same via an ARM template (the pool is a child resource of the Azure Batch account, so name and type are OK):

"resources": [
    {
        "name": "Test",
        "type": "pools",
        "apiVersion": "2017-09-01",
        "properties": {
            "vmSize": "STANDARD_A1",
            "deploymentConfiguration": {
                "virtualMachineConfiguration": {
                    "imageReference": {
                        "publisher": "MicrosoftWindowsServer",
                        "offer": "WindowsServer",
                        "sku": "2016-Datacenter-with-Containers"
                    },
                    "nodeAgentSkuId": "batch.node.windows amd64",
                    "containerConfiguration": {
                        "imageNames": [ "microsoft/dotnet-framework:4.7" ]
                    }
                }
            }
        }
    }
]

And this does not work. When deploying, I get:

Could not find member 'containerConfiguration' on object of type 'VirtualMachineConfiguration'. 
Path 'properties.deploymentConfiguration.virtualMachineConfiguration.containerConfiguration'

Without the containerConfiguration part, things work: I get VMs with Docker, just without the image. I understand why this happens: the template schema does not have this property, unlike the .NET class.

So, is there any workaround? I guess this is not the first time a template has been out of sync with the functionality.

psfinaki
  • Can you tell me the `type` you are using? I am facing the same problem now. I would appreciate if you could post a solution if you found one. – Coke Sep 04 '18 at 17:14
  • Rustam, `type` of what? We have not found a direct solution, our workaround is a crazy command line in the start task, where we login to docker and pull the image. – psfinaki Sep 05 '18 at 07:40
  • `type` of the resource that says `pools` in your case. I have solved the problem, thanks. – Coke Sep 05 '18 at 14:28
  • Okay. FWIW, it is `Microsoft.Batch/batchAccounts`, if I get you right. – psfinaki Sep 05 '18 at 17:05

1 Answer

In a recent update, the ARM provider for Batch was improved to allow creation of a container-enabled pool. The following ARM template creates a container pool. Note that the API version is updated to 2018-12-01.

        {
            "type": "Microsoft.Batch/batchAccounts/pools",
            "name": "[concat(variables('batchAccountName'), '/', parameters('poolID'))]",
            "apiVersion": "2018-12-01",
            "scale": null,
            "properties": {
                "vmSize": "[parameters('virtualMachineSize')]",
                "networkConfiguration": {
                    "subnetId": "[parameters('virtualNetworkSubnetId')]"
                },
                "maxTasksPerNode": 1,
                "taskSchedulingPolicy": {
                    "nodeFillType": "Spread"
                },
                "deploymentConfiguration": {
                    "virtualMachineConfiguration": {
                        "containerConfiguration": {
                            "containerImageNames": "[parameters('dockerImagesToCache')]",
                            "type": "DockerCompatible"
                        },
                        "imageReference": {
                            "publisher": "microsoft-azure-batch",
                            "offer": "ubuntu-server-container",
                            "sku": "16-04-lts",
                            "version": "latest"
                        },
                        "nodeAgentSkuId": "batch.node.ubuntu 16.04"
                    }
                },
                "scaleSettings": {
                    "autoScale": {
                        "evaluationInterval": "PT5M",
                        "formula": "[concat('startingNumberOfVMs = 0;maxNumberofVMs = ', parameters('maxNodeCount'), ';pendingTaskSamplePercent = $PendingTasks.GetSamplePercent(160 * TimeInterval_Second);pendingTaskSamples = pendingTaskSamplePercent < 70 ? startingNumberOfVMs : avg($PendingTasks.GetSample(160 * TimeInterval_Second));$TargetDedicatedNodes=min(maxNumberofVMs, pendingTaskSamples);')]"
                    }
                }
            },
            "dependsOn": [
                "[resourceId('Microsoft.Batch/batchAccounts', variables('batchAccountName'))]"
            ]
        }
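
To catch this before deploying, a small check along these lines could flag pool resources that set containerConfiguration while still declaring an older API version. This is only a sketch: the 2018-12-01 threshold is taken from the template above rather than from an official changelog, the function name pools_with_old_api is made up for illustration, and nested child resources are not walked.

```python
import json

# Assumption: threshold copied from the working template above,
# not confirmed against the Batch management API changelog.
MIN_API_VERSION = "2018-12-01"


def pools_with_old_api(template):
    """Return names of pool resources that set containerConfiguration
    but declare an apiVersion older than MIN_API_VERSION."""
    flagged = []
    for resource in template.get("resources", []):
        if not resource.get("type", "").lower().endswith("pools"):
            continue
        vm_config = (
            resource.get("properties", {})
            .get("deploymentConfiguration", {})
            .get("virtualMachineConfiguration", {})
        )
        # API versions are ISO dates (YYYY-MM-DD), so string comparison orders them.
        is_old = resource.get("apiVersion", "") < MIN_API_VERSION
        if "containerConfiguration" in vm_config and is_old:
            flagged.append(resource["name"])
    return flagged


# Trimmed-down version of the pool resource from the question.
question_template = json.loads("""
{
  "resources": [
    {
      "name": "Test",
      "type": "pools",
      "apiVersion": "2017-09-01",
      "properties": {
        "deploymentConfiguration": {
          "virtualMachineConfiguration": {
            "containerConfiguration": {
              "containerImageNames": ["microsoft/dotnet-framework:4.7"]
            }
          }
        }
      }
    }
  ]
}
""")

print(pools_with_old_api(question_template))
```

For the resource from the question this reports the pool named Test. Once its apiVersion is raised to 2018-12-01, the list comes back empty.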

----- Previous Answer now not needed -----

I've managed to find a workaround using an ACI container along with Managed Service Identities and some Python. It's not pretty but it does work.

The flow of the template is as follows:

  1. An MSI is created
  2. The MSI is assigned contributor rights for the resource group
  3. The Batch account is created
  4. An ACI instance is run, which pulls down a templated pool.json file and uses a Python script to fill in the required parameters. The script logs in to the Azure CLI using the MSI, then creates the pool.

Here is the full setup; you'll likely want to tweak it to fit your scenario.

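The Python script and pool.json file need to be uploaded to a public location, such as blob storage or Git. The _artifactsLocation and _artifactsLocationSasToken parameters then tell the template where to download the files from. Both the template and the script look for them under an azurebatch/ folder at that location.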
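
As a rough illustration of how those two parameters combine into a download URL, here is a minimal sketch. The helper name artifact_url, the storage account and the SAS token are all made up for the example. The token is appended as-is, so it is assumed to already start with "?".

```python
import shlex


def artifact_url(artifacts_location, relative_path, sas_token=""):
    """Join the base location, a relative path and an optional SAS token."""
    base = artifacts_location.rstrip("/")
    path = relative_path.lstrip("/")
    return "{0}/{1}{2}".format(base, path, sas_token)


url = artifact_url(
    "https://example.blob.core.windows.net/artifacts/",  # hypothetical account
    "azurebatch/pool.json",
    "?sv=2018-03-28&sig=abc",  # hypothetical SAS token
)
print(url)

# A SAS token usually contains '&', which a shell treats as a control
# operator, so quoting the URL is worth considering when it ends up
# inside a bash command line.
print(shlex.quote(url))
```

If the files are publicly readable without a token, passing an empty string for the SAS token leaves the URL untouched.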

Main template:

{
    "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "_artifactsLocation": {
            "type": "string",
            "metadata": {
                "description": ""
            }
        },
        "_artifactsLocationSasToken": {
            "type": "string",
            "metadata": {
                "description": ""
            }
        },
        "mountArgs": {
            "type": "string",
            "metadata": {
                "description": "Arguments passed to the mount.py script."
            }
        },
        "virtualNetworkSubnetId": {
            "type": "string",
            "metadata": {
                "description": "The subnet in which Batch will be deployed. Requires the following ports to be enabled via NSG: https://learn.microsoft.com/en-us/azure/batch/batch-virtual-network#network-security-groups-1."
            }
        },
        "maxTasksPerNode": {
            "type": "int",
            "defaultValue": 1
        },
        "maxNodeCount": {
            "type": "int",
            "defaultValue": 3
        },
        "virtualMachineSize": {
            "type": "string",
            "defaultValue": "Standard_F8s_v2",
            "metadata": {
                "description": "Size of VMs in the VM Scale Set."
            }
        },
        "storageAccountSku": {
            "type": "string",
            "defaultValue": "Standard_LRS",
            "allowedValues": [
                "Standard_LRS",
                "Standard_GRS",
                "Standard_ZRS",
                "Premium_LRS"
            ],
            "metadata": {
                "description": "Storage Account type"
            }
        },
        "location": {
            "type": "string",
            "defaultValue": "[resourceGroup().location]",
            "metadata": {
                "description": "Location for all resources."
            }
        },
        "poolId": {
            "type": "string",
            "defaultValue": "defaultpool"
        }
    },
    "variables": {
        "identityName": "batchpoolcreator",
        "storageAccountName": "[concat('batch', uniqueString(resourceGroup().id))]",
        "batchAccountName": "[concat('batch', uniqueString(resourceGroup().id))]",
        "batchEndpoint": "[concat('https://', variables('batchAccountName'), '.' , parameters('location'), '.batch.azure.com')]",

        "_comment": "The role assignment ID is required to be a guid, we use this to generate a repeatable guid",
        "roleAssignmentIdRg": "[guid(concat(resourceGroup().id, 'contributorRG'))]",

        "_comment": "This is the ID used to set the contributor permission on a role.",
        "contributorRoleDefinitionId": "[concat('/subscriptions/', subscription().subscriptionId, '/providers/Microsoft.Authorization/roleDefinitions/', 'b24988ac-6180-42a0-ab88-20f7382dd24c')]"
    },
    "resources": [
        {
            "comments": "Create an identity to use for creating the Azure Batch pool with container support (will be assigned to ACI instance)",
            "type": "Microsoft.ManagedIdentity/userAssignedIdentities",
            "name": "[variables('identityName')]",
            "apiVersion": "2015-08-31-preview",
            "location": "[resourceGroup().location]"
        },
        {
            "comments": "Assign the identity contributor rights to the resource group",
            "type": "Microsoft.Authorization/roleAssignments",
            "apiVersion": "2017-05-01",
            "name": "[variables('roleAssignmentIdRg')]",
            "dependsOn": [
                "[resourceId('Microsoft.ManagedIdentity/userAssignedIdentities', variables('identityName'))]"
            ],
            "properties": {
                "roleDefinitionId": "[variables('contributorRoleDefinitionId')]",
                "principalId": "[reference(resourceId('Microsoft.ManagedIdentity/userAssignedIdentities', variables('identityName')), '2015-08-31-preview').principalId]",
                "scope": "[resourceGroup().id]"
            }
        },
        {
            "comments": "This is the storage account used by Azure Batch for file processing/storage",
            "type": "Microsoft.Storage/storageAccounts",
            "name": "[variables('storageAccountName')]",
            "apiVersion": "2016-01-01",
            "location": "[parameters('location')]",
            "sku": {
                "name": "[parameters('storageAccountSku')]"
            },
            "kind": "Storage",
            "tags": {
                "ObjectName": "[variables('storageAccountName')]"
            },
            "properties": {}
        },
        {
            "type": "Microsoft.Batch/batchAccounts",
            "name": "[variables('batchAccountName')]",
            "apiVersion": "2015-12-01",
            "location": "[parameters('location')]",
            "tags": {
                "ObjectName": "[variables('batchAccountName')]"
            },
            "properties": {
                "autoStorage": {
                    "storageAccountId": "[resourceId('Microsoft.Storage/storageAccounts', variables('storageAccountName'))]"
                }
            },
            "dependsOn": [
                "[resourceId('Microsoft.Storage/storageAccounts', variables('storageAccountName'))]"
            ]
        },
        {
            "type": "Microsoft.ContainerInstance/containerGroups",
            "apiVersion": "2018-10-01",
            "name": "[substring(concat('batchpool', uniqueString(resourceGroup().id)), 0, 20)]",
            "location": "[resourceGroup().location]",
            "dependsOn": [
              "[resourceId('Microsoft.ManagedIdentity/userAssignedIdentities', variables('identityName'))]",
              "[resourceId('Microsoft.Authorization/roleAssignments', variables('roleAssignmentIdRg'))]",
              "[resourceId('Microsoft.Batch/batchAccounts', variables('batchAccountName'))]"
            ],
            "identity": {
              "type": "UserAssigned",
              "userAssignedIdentities": {
                "[resourceId('Microsoft.ManagedIdentity/userAssignedIdentities', variables('identityName'))]": {}
              }
            },
            "properties": {
              "osType": "Linux",
              "restartPolicy": "Never",
              "containers": [
                {
                  "name": "azure-cli",
                  "properties": {
                    "image": "microsoft/azure-cli",
                    "command": [
                      "/bin/bash",
                      "-c",
                      "[concat('curl -fsSL ', parameters('_artifactsLocation'), '/azurebatch/configurepool.py', parameters('_artifactsLocationSasToken'), ' > configurepool.py && python3 ./configurepool.py \"', parameters('poolId'), '\" ', parameters('virtualMachineSize'),  ' \"', parameters('mountArgs'),  '\" ', parameters('_artifactsLocation'),  ' ', parameters('_artifactsLocationSasToken'),  ' ', parameters('virtualNetworkSubnetId'), ' ', parameters('maxNodeCount'), ' ', resourceId('Microsoft.ManagedIdentity/userAssignedIdentities', variables('identityName')), ' ', resourceGroup().name, ' ', variables('batchAccountName'))]"
                    ],
                    "resources": {
                      "requests": {
                        "cpu": 1,
                        "memoryInGB": 1
                      }
                    }
                  }
                }
              ]
            }
        }
    ],
    "outputs": {
        "storageAccountName": {
            "type": "string",
            "value": "[variables('storageAccountName')]"
        },
        "batchAccountName": {
            "type": "string",
            "value": "[variables('batchAccountName')]"
        },
        "batchEndpoint": {
            "type": "string",
            "value": "[variables('batchEndpoint')]"
        },
        "batchAccountKey": {
            "type": "securestring",
            "value": "[listKeys(resourceId('Microsoft.Batch/batchAccounts', variables('batchAccountName')), '2017-09-01').primary]"
        },
        "batchPoolId": {
            "type": "string",
            "value": "[parameters('poolId')]"
        }
    }
}

pool.json:

{
    "id": "POOL_ID_HERE",
    "vmSize": "VM_SIZE_HERE",
    "enableAutoScale": true,
    "autoScaleFormula": "startingNumberOfVMs = 0;maxNumberofVMs = MAX_NODE_COUNT_HERE;pendingTaskSamplePercent = $PendingTasks.GetSamplePercent(160 * TimeInterval_Second);pendingTaskSamples = pendingTaskSamplePercent < 70 ? startingNumberOfVMs : avg($PendingTasks.GetSample(160 * TimeInterval_Second));$TargetDedicatedNodes=min(maxNumberofVMs, pendingTaskSamples);",
    "autoScaleEvaluationInterval": "PT5M",
    "enableInterNodeCommunication": false,
    "startTask": {
        "commandLine": "/usr/bin/python3 mount.py MOUNT_ARGS_HERE",
        "resourceFiles": [
            {
                "blobSource": "ARTIFACT_LOCATION_HERE/examplemountscript/script.pyARTIFACT_SAS_HERE",
                "filePath": "./mount.py",
                "fileMode": "777"
            }
        ],
        "userIdentity": {
            "autoUser": {
                "scope": "pool",
                "elevationLevel": "admin"
            }
        },
        "maxTaskRetryCount": 0,
        "waitForSuccess": true
    },
    "maxTasksPerNode": 1,
    "taskSchedulingPolicy": {
        "nodeFillType": "Spread"
    },
    "virtualMachineConfiguration": {
        "containerConfiguration": {
            "containerImageNames": [
                "ubuntu",
                "python"
            ]
        },
        "imageReference": {
            "publisher": "microsoft-azure-batch",
            "offer": "ubuntu-server-container",
            "sku": "16-04-lts",
            "version": "1.0.6"
        },
        "nodeAgentSKUId": "batch.node.ubuntu 16.04"
    },
    "networkConfiguration": {
        "subnetId": "SUBNET_ID_HERE"
    }
}

configurepool.py:

import subprocess
import sys
import urllib.request


def run_az_command(cmdArray):
    """Run an Azure CLI command and stop the script if it fails."""
    try:
        print("Attempt run {}".format(cmdArray))
        subprocess.check_call(cmdArray)
        print("Command completed successfully")
    except subprocess.CalledProcessError as e:
        print("Failed running: {} error: {}".format(cmdArray, e))
        sys.exit(4)

# Ten arguments are expected, plus the script name in sys.argv[0].
if len(sys.argv) != 11:
    print(
        "Expected 'poolid', 'vm_size', 'mount_args', 'artifact_location', 'artifact_sas', 'subnet_id', 'max_node_count', 'msi_name', 'resource_group_name' , 'batch_account_name'"
    )
    sys.exit(1)

pool_id = str(sys.argv[1])
vm_size = str(sys.argv[2])
mount_args = str(sys.argv[3])
artifact_location = str(sys.argv[4])
artifact_sas = str(sys.argv[5])
subnet_id = str(sys.argv[6])
max_node_count = str(sys.argv[7])
msi_name = str(sys.argv[8])
resource_group_name = str(sys.argv[9])
batch_account_name = str(sys.argv[10])

url = "{0}/azurebatch/pool.json{1}".format(artifact_location, artifact_sas)
response = urllib.request.urlopen(url)
data = response.read()
text = data.decode("utf-8")

# Replace the target string
text = text.replace("POOL_ID_HERE", pool_id)
text = text.replace("VM_SIZE_HERE", vm_size)
text = text.replace("MOUNT_ARGS_HERE", mount_args)
text = text.replace("ARTIFACT_LOCATION_HERE", artifact_location)
text = text.replace("ARTIFACT_SAS_HERE", artifact_sas)
text = text.replace("SUBNET_ID_HERE", subnet_id)
text = text.replace("MAX_NODE_COUNT_HERE", max_node_count)


# Write the file out again
with open("pool.complete.json", "w") as file:
    file.write(text)

# msi_name holds the resource ID of the user-assigned identity passed in by the template.
run_az_command(["az", "login", "--identity", "-u", msi_name])
run_az_command(["az", "batch", "account", "login", "--name", batch_account_name, "-g", resource_group_name])
run_az_command(["az", "batch", "pool", "create", "--json-file", "pool.complete.json"])
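
One thing to watch with the plain text replacement above: if a value such as the mount arguments contains a double quote or a backslash, the resulting pool.complete.json is no longer valid JSON. A possible alternative, sketched here under the assumption that the placeholders only ever appear inside JSON string values, is to parse the file first and substitute inside the parsed strings. The helper name fill_placeholders and the sample values are hypothetical.

```python
import json


def fill_placeholders(node, values):
    """Recursively replace placeholder tokens inside every string value."""
    if isinstance(node, dict):
        return {key: fill_placeholders(child, values) for key, child in node.items()}
    if isinstance(node, list):
        return [fill_placeholders(child, values) for child in node]
    if isinstance(node, str):
        for token, replacement in values.items():
            node = node.replace(token, replacement)
        return node
    # Numbers, booleans and null are returned unchanged.
    return node


# Cut-down stand-in for pool.json.
pool_template = json.loads("""
{
    "id": "POOL_ID_HERE",
    "startTask": {"commandLine": "/usr/bin/python3 mount.py MOUNT_ARGS_HERE"}
}
""")

filled = fill_placeholders(pool_template, {
    "POOL_ID_HERE": "defaultpool",
    "MOUNT_ARGS_HERE": '--share "my share"',  # hypothetical arguments containing quotes
})

# json.dumps escapes the embedded quotes, so the output stays valid JSON.
pool_json = json.dumps(filled, indent=4)
print(pool_json)
```

The serialized text could then be written to pool.complete.json in place of the replaced text.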
lawrencegripper