
We have a microservice-based app that uses Azure Service Bus. We deploy one of the services (a saga manager) as a .NET Core console app in a Linux Docker container. We use docker-compose, and a group of two containers (including the console app) is deployed into an Azure Container Instance.

We use MassTransit to work with Azure Service Bus.

The console app (saga manager) starts the bus via the MassTransit.BusControlExtensions.Start() method.

At this point the console app (saga manager) is more or less primed and ready.
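
For context, here is a minimal sketch of what that startup wiring might look like (the connection string is a placeholder, the saga receive endpoint is elided, and the exact Host overload depends on the MassTransit version in use):

using System;
using System.Runtime.Loader;
using System.Threading;
using MassTransit;

class Program
{
    static void Main()
    {
        var busControl = Bus.Factory.CreateUsingAzureServiceBus(cfg =>
        {
            // placeholder connection string; the saga receive endpoint is omitted here
            cfg.Host("Endpoint=sb://<namespace>.servicebus.windows.net/;SharedAccessKeyName=...;SharedAccessKey=...");
        });

        busControl.Start(); // MassTransit.BusControlExtensions.Start()

        // docker stop (and an ACI stop) delivers SIGTERM; without handling it,
        // the process exits without ever calling Stop() on the bus
        var shutdown = new ManualResetEventSlim();
        AssemblyLoadContext.Default.Unloading += _ => shutdown.Set();
        Console.CancelKeyPress += (_, e) => { e.Cancel = true; shutdown.Set(); };

        shutdown.Wait();
        busControl.Stop(); // closes the Service Bus receivers before exit
    }
}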

Side note: the app in question (saga manager) runs an Automatonymous state machine.

Now, if we stop or delete the container with the saga manager and then restart it, the restarted app stops receiving bus messages.

It seems that a new instance of the saga manager (a thread? a process?) is created with each restart, while the old instance somehow persists.

Even when we delete the Container Instances resource on Azure altogether, we can still observe that the bus messages are picked up by something...

The container's app is still alive after the container group itself has been deleted.

Is there a way to definitively kill the detached instance/thread?

Is anyone familiar with such behavior of Azure Container Instances?

PS: in the described scenarios, the stop/delete operations always complete successfully.

PS2: here is the YAML file used to deploy the problematic container group via the az container create command:

apiVersion: 2018-10-01
location: westeurope
name: #{groupName}#
properties:
  containers:
  - name: saga-aci
    properties:
      image: #{acrLoginServer}#/sagaazure:latest
      resources:
        requests:
          cpu: 1
          memoryInGb: 1.5
      ports:
      - port: 80
      - port: 443
      - port: 9350
  - name: proxymanager-aci
    properties:
      image: #{acrLoginServer}#/proxymanager:latest
      resources:
        requests:
          cpu: 1
          memoryInGb: 1.5
      ports:
      - port: 22999
      - port: 24000
  osType: Linux
  ipAddress:
    type: Public
    ports:
    - protocol: tcp
      port: '80'
    #- protocol: tcp
      #port: '8080'
    - protocol: tcp
      port: '22999'
    - protocol: tcp
      port: '24000'
    - protocol: tcp
      port: '443'
    - protocol: tcp
      port: '9350'
  imageRegistryCredentials:
    - server: #{acrLoginServer}#
      username: #{acrName}#
      password: #{acrPassword}#
tags: null
type: Microsoft.ContainerInstance/containerGroups

Perhaps this is a restartPolicy issue?
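
For reference: restartPolicy sits under properties at the container group level in the ACI YAML schema, and it defaults to Always when omitted, as it is in the file above. A sketch of the change against the same file:

properties:
  restartPolicy: Never   # Always is the default; OnFailure is the third option
  containers:
  - name: saga-aci
    # ...rest of the group definition unchanged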

sztepen
  • nope, I have already tried all of the operation combinations - delete, start; stop, start; etc. Each time the old process persists and can't die. For example, even for az container restart, the old process stays alive. Before restart - 1 saga manager running; after restart - 2 saga managers running; ad infinitum. This process multiplication also occurs for az container delete. After stop or delete, sure, you can't bash into the containers - but the Azure bus process is STILL RUNNING. Crazy. – sztepen May 09 '19 at 08:54
  • Are you sure it's the saga that is still running if you delete the container group, and not other services? – Charles Xu May 09 '19 at 08:58
  • so to put it in imprecise terms - the containers are always killed properly, that's true. However, it seems that a thread/process lingers on and interacts with our message bus – sztepen May 09 '19 at 09:02
  • replying to your question: I will add logging via App Insights to be sure, but for now we are sure that SOMETHING picks up bus messages and processes them - even after killing all possible saga instances (especially the container group - via the Azure portal, via the command line, via Azure DevOps; also by individually stopping containers). I will add more logging and let you know. For now we are reasonably sure it's still running. PS: other services are turned off - we are in dev mode, so we can do that – sztepen May 09 '19 at 09:06

1 Answer


When operating Azure Container Instances, there are a few points to pay attention to.

The stop action: stopping the container group terminates and recycles all of the containers in the group and does not preserve their state. It has no effect if the container group has already terminated.

The start action: starting the container group from a stopped state triggers a new deployment with the same container configuration. If the image has been updated, the new one is pulled. The start action starts the whole container group; it cannot start a single container within it.

For the delete action, I don't think the application inside the container group can still be alive once you delete the container group. The action takes a short time, but in the end the container group is deleted completely.
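
One way to sanity-check the delete behavior with the Azure CLI (the resource group and container group names below are placeholders):

az container delete --resource-group myResourceGroup --name myContainerGroup --yes
az container show --resource-group myResourceGroup --name myContainerGroup

Once the deletion has completed, the show command should fail with a not-found error; if something still consumes messages after that point, it has to be running elsewhere.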

You can see more details in the Azure Container Instances operations documentation. I have no experience with Azure Service Bus; all of the above applies only to Azure Container Instances. If you have any more questions, please let me know.

Charles Xu
  • to check that I'm not insane, here is the log excerpt: https://ibb.co/2Nvhf7P At around 22:02 I deleted the Container Instances resource COMPLETELY; at around 22:07 I sent a triggering bus message. It seems that upon delete a new instance was created in the void. This may be a result of an overeager restart policy – sztepen May 11 '19 at 22:23
  • I added the config YAML used to deploy the container group; please take a look – sztepen May 11 '19 at 22:39
  • OK, for the purposes of troubleshooting I set the restartPolicy to Never. It did the trick - the container can be killed for good... I am not really happy with that solution; I think we encountered a flaw in Container Instances (or at least a feature that should be better documented). Tomorrow I will experiment with restartPolicy=OnFailure – sztepen May 11 '19 at 23:50
  • I had some issues during the last deploy/container upgrade - but let me confirm that. The issue certainly persists with restartPolicy=Always and most likely with restartPolicy=OnFailure. I will reproduce the issue and take note of the resource ID in App Insights. At this moment I would not say that the deploy process is stable. Will confirm. – sztepen May 20 '19 at 18:53
  • @sztepen I am about to deploy some services to Container Instances using MT & Azure SB as well. I would be very interested to hear about any change to this issue or any further workarounds you may have found. – robs Oct 01 '19 at 08:22
  • @sztepen First, if the solution works for you, please accept it as the answer. Second, if you have more questions, you can ask another question and provide more detail to get a specific solution. – Charles Xu Oct 01 '19 at 08:31
  • @CharlesXu I wasn't able to solve the problem based on your recommendations. I am quite certain this was an Azure bug at this point in time. I have yet to open a support ticket - I will do that, thanks for reminding me. There were more pressing matters in the meantime ;) – sztepen Oct 01 '19 at 13:44
  • @robs This issue was put on the back burner. I was asked by the Azure team to open a support ticket, but I didn't get around to it. So at the moment the status is: unsolved/unknown. But yeah, let me reopen that. Feel free to remind me via private message. – sztepen Oct 01 '19 at 13:46