I can deploy to my local Service Fabric cluster and it works fine. When I attempt to deploy to my Azure Service Fabric cluster, it fails with:

Error event: SourceId='System.Hosting', Property='Download:1.0:1.0:5fb96531-7b75-42d0-8f23-6a9e42f0bda4'.
There was an error during download.System.Fabric.FabricException (-2147017731)
Container image download failed for ImageName=microsoft/aspnet with unexpected error. Exception=System.Exception: Container image history check failed after successful download. ImageName=microsoft/aspnet.
   at Hosting.ContainerActivatorService.ContainerImageDownloader.d__6.MoveNext().

When googling this error, the common answers are that the VM hard drive is full (I checked one of my nodes, over 100 GB available) or that the VM operating system is wrong (I verified on the VM scale set that it is running 2016-Datacenter-with-Containers). I have also seen some people mention not having enough resources on the VMs, so I bumped them up to Standard_D3_v2, which should be plenty.

I did see some people mention increasing the container download timeout. The container is over 5 GB, so this is potentially an issue, and it could work locally because the image is coming from the Docker cache. Unfortunately I'm not sure how to increase the timeout easily.

What else could cause this issue?

Josh
  • Make sure you explicitly set the container tag and don't just use `latest` – Gregory Suvalian Oct 04 '18 at 21:27
  • for windows containers I usually log into nodes and download base image manually with `docker pull`, after that it works fine – 4c74356b41 Oct 05 '18 at 05:12
  • Interesting. I attempted that and got the following error on the VM: failed to register layer: re-exec error: exit status 1: output: ProcessUtilityVMImage \\?\C:\ProgramData\docker\windowsfilter\5ea88d8f98c87d520e1d0771a3348cb3b151b1ce77923455eab151ad2a6da0b1\UtilityVM: The system cannot find the path specified. – Josh Oct 05 '18 at 05:17
  • I think I see the issue. I used microsoft/aspnet as the base of my Docker container and it doesn't support the Windows version of my VMs. Switching the base image should fix it. – Josh Oct 05 '18 at 05:39
  • Is there any way to downgrade my cluster VMs to work with the microsoft/aspnet Docker image? It is very difficult to set up nano or windowsservercore 2016 images, as they don't start out with remote management tools installed. – Josh Oct 05 '18 at 06:18
  • If you are using the default SF deployment from Azure, you can't. You can create a custom cluster with an ARM template and specify the OS image of the VM. Can't you just change the base image tag of your image to one with the same OS version? I think that will be easier for you! – Diego Mendes Oct 06 '18 at 11:22

2 Answers

For an image of this size, it is likely that the download is timing out.

You could try:

  • Use a private registry in the same region as your cluster, such as Azure Container Registry; you might get higher download speeds.
  • If the bottleneck is your network, increase the VM size; bigger VMs have more bandwidth.
  • Configure the cluster to wait longer for the image download. You can try setting ContainerImageDownloadTimeout as described here.

This is set in the cluster configuration, and your cluster manifest will have a section like this:

{
  "name": "Hosting",
  "parameters": [
    {
      "name": "ContainerImageDownloadTimeout",
      "value": "1200"
    }
  ]
}

To change the settings of an existing cluster, you can follow the instructions found here and here.
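
As a rough sanity check on the timeout value, the sketch below estimates how long an image pull might take at a given sustained bandwidth. The function names, the 2x safety factor, and the bandwidth figures are assumptions made up for illustration and are not Service Fabric settings; only the 1200-second value comes from the snippet above.

```python
def estimated_download_seconds(image_size_gb: float, bandwidth_mbps: float) -> float:
    """Rough transfer time for an image at a sustained bandwidth (decimal units)."""
    if bandwidth_mbps <= 0:
        raise ValueError("bandwidth_mbps must be positive")
    size_megabits = image_size_gb * 1000 * 8  # gigabytes -> megabits
    return size_megabits / bandwidth_mbps


def fits_in_timeout(image_size_gb: float, bandwidth_mbps: float,
                    timeout_seconds: float, safety_factor: float = 2.0) -> bool:
    """True if the estimate, padded by a hypothetical safety factor, fits the timeout."""
    estimate = estimated_download_seconds(image_size_gb, bandwidth_mbps)
    return estimate * safety_factor <= timeout_seconds


if __name__ == "__main__":
    # A 5 GB image with a 1200-second timeout, at two assumed bandwidths.
    for mbps in (100, 50):
        seconds = estimated_download_seconds(5, mbps)
        print(mbps, "Mbps:", seconds, "s, fits:", fits_in_timeout(5, mbps, 1200))
```

This ignores layer extraction time, which can be significant for Windows images, so treat the result as a lower bound rather than a prediction.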

Diego Mendes
  • I've updated my value to 48000 and still get the same error. Also, using Azure Container Registry isn't an option because I am working in Azure Government and it is not available there. – Josh Oct 05 '18 at 03:04
  • Any other possible culprits, or maybe a way to debug this error further? The image runs perfectly on my local machine cluster. – Josh Oct 05 '18 at 05:13
  • 1
    Maybe the issue is something else then, you should try to run the container from within the VM using docker and check if any you get any different error. – Diego Mendes Oct 05 '18 at 07:42

Make sure you target the correct version of the (base) image. There are a few to choose from.

The version of the image must be compatible with the version of Windows you're running on the host.
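
One way to make that check concrete is to compare the Windows build number of the host with that of the image. This is a minimal sketch, assuming both versions are available as dotted strings such as `10.0.14393.x` (for example from `docker image inspect`, whose output includes an `OsVersion` field for Windows images). The helper names and the revision numbers in the example are hypothetical.

```python
def windows_build(version: str) -> int:
    """Return the build component (third field) of a dotted Windows version string."""
    parts = version.strip().split(".")
    if len(parts) < 3:
        raise ValueError(f"expected major.minor.build[.revision], got {version!r}")
    return int(parts[2])


def same_build(host_version: str, image_version: str) -> bool:
    """True when host and image share the same Windows build number."""
    return windows_build(host_version) == windows_build(image_version)


if __name__ == "__main__":
    host = "10.0.14393.2485"      # illustrative Windows Server 2016 host
    old_image = "10.0.14393.2430"  # illustrative image built on the 14393 base
    new_image = "10.0.17134.285"   # illustrative image built on a newer base
    print(same_build(host, old_image))
    print(same_build(host, new_image))
```

The sketch only compares build numbers and ignores the revision field; it does not account for Hyper-V isolation, which can relax the matching requirement.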

LoekD
  • I had to start with a new container based on the 14393 build of Windows. I tried to use windowsservercore but had difficulty getting it set up because it is so bare bones. – Josh Oct 06 '18 at 18:52