I have an ECS task which has 2 containers using 2 different images, both hosted in ECR. There are 2 GitHub repos for the two images (app and api), and a third repo for my IaC code (infra). I am managing my AWS infrastructure using Terraform Cloud. The ECS task definition is defined there using Cloudposse's ecs-alb-service-task, with the containers defined using ecs-container-definition. Presently I'm using latest as the image tag in the task definition defined in Terraform.

I am using CircleCI to build the Docker containers when I push changes to GitHub. I am tagging each image with latest and the variable ${CIRCLE_SHA1}. Both repos also update the task definition using the aws-ecs orb's deploy-service-update job, setting the tag used by each container image to the SHA1 (not latest). Example:

          container-image-name-updates: "container=api,tag=${CIRCLE_SHA1}"

When I push code to the repo for e.g. api, a new version of the task definition is created, the service's version is updated, and the existing task is restarted using the new version. So far so good.

The problem is that when I update the infrastructure with Terraform, the service isn't behaving as I would expect. The ecs-alb-service-task has a boolean called ignore_changes_task_definition, which is true by default.

  • When I leave it as true, Terraform Cloud successfully creates a new version of the task definition whenever I Apply changes to it. (A recent example was updating environment variables.) BUT it doesn't update the version used by the service, so the service carries on using the old version. Even if I stop a task, it will respawn using the old version. I have to manually go in and use the Update flow, or push changes to one of the code repos, at which point CircleCI creates yet another version of the task definition and updates the service.

  • If I instead set this to false, Terraform Cloud will undo the changes to the service performed by CircleCI. It will reset the task definition version to the last version it created itself!
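
To make the mismatch in the first case concrete, here is a minimal sketch that compares the task definition revision the service is currently using with the latest revision in the family. It assumes a boto3 ECS client is passed in; the function name and the cluster, service, and family names are hypothetical placeholders, not values from my setup.

```python
def service_revision_lag(ecs_client, cluster, service, family):
    """Return (running_arn, latest_arn) for a service and its task family.

    ecs_client is expected to behave like boto3.client("ecs").
    """
    # The service record carries the ARN of the revision it is pinned to.
    svc = ecs_client.describe_services(cluster=cluster, services=[service])
    running_arn = svc["services"][0]["taskDefinition"]

    # Passing only the family name (no ":revision") asks ECS for the
    # latest ACTIVE revision in that family.
    latest = ecs_client.describe_task_definition(taskDefinition=family)
    latest_arn = latest["taskDefinition"]["taskDefinitionArn"]

    return running_arn, latest_arn


# Usage (requires boto3 and AWS credentials; names are placeholders):
#   import boto3
#   running, latest = service_revision_lag(
#       boto3.client("ecs"), "my-cluster", "my-service", "my-family")
#   print("service is behind" if running != latest else "service is current")
```

If the two ARNs differ after an Apply, the service is still pinned to the older revision, which matches the behaviour described in the first bullet.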

So I have three questions:

  1. How can I get Terraform to play nice with the task definitions created by CircleCI, while also updating the service itself if I ever change it via Terraform?

  2. Is it a problem to be making changes to the task definition from THREE different places?

  3. Is it a problem that the image tag is latest in Terraform (because I don't know what the SHA1 is)?

I'd really appreciate some guidance on how to properly set up this CI flow. I have found next to nothing online about how to use Terraform Cloud with CI products.

Old Pro
Nick K9
  • I believe there is a good amount of information on integrating TFCloud with pipeline platforms, and I have done it with JP, Circle, Travis, Concourse, GLCI, GH Actions, and CodeBuild, so it is definitely possible. I believe the primary difficulty here is the TF+ECS integration. You may find it much easier to use an application deployment tool, and not an infrastructure provisioner, to deploy to ECS in a pipeline instead. – Matthew Schuchard May 24 '22 at 18:43
  • Thanks for the reply! This is 1 of 2 questions on TFC/CircleCI/AWS that I see on SO, and the only other resource I've found is [this series](https://circleci.com/blog/learn-iac-part3/) which uses k8s, GCP & a code/IaC monorepo, so I haven't been able to apply it. If you know of any other resource, I'd love to hear it! Most examples use TF not TFC. Can you explain what you mean by an "application deployment tool" as distinct from an "infra provisioner"? Do you mean I'd stop using TFC to manage the service entirely? (I have security groups, IAM roles etc applied, so this seems impractical?) – Nick K9 May 24 '22 at 18:57
  • TF would do well to manage ECS, but not necessarily to deploy to it. In k8s one would use Helm, operators, Ansible, etc. for this. I am unsure what options exist for ECS, but something analogous (or possibly the same in the case of Ansible) may be easier. – Matthew Schuchard May 24 '22 at 19:23
  • I've used Terraform for ECS deployments on multiple projects without an issue. I'm not sure why anyone would say it isn't suited to that task. Now if you want blue-green deployments, or rolling deployments with rollback, you would need a more sophisticated deployment tool like AWS CodeDeploy, but if you are just trying to release your latest docker containers by updating an ECS task definition, and updating the ECS service to use the new task definition, then Terraform works perfectly fine. – Mark B May 24 '22 at 19:48
  • I am successfully using CircleCI to deploy the two containers. I'm just having trouble when TFC either insists on retaining an outdated version of the task definition, or refuses to update the service when I've made a change that created a new task definition. Is my setup the way TFC is supposed to be used with CircleCI? – Nick K9 May 24 '22 at 20:41

1 Answer

I have learned a bit more about this problem. It seems like the right solution is to use a CircleCI workflow to manage Terraform Cloud, instead of having the two services effectively competing with each other. By default Terraform Cloud will expect you to link a repo with it and it will auto-plan every time you push. But you can turn that off and use the terraform orb instead to run plan/apply via CircleCI.

You would still leave ignore_changes_task_definition set to true. Instead, you'd add another step to the workflow after the terraform/apply step has made the change. This would be aws-ecs/run-task, which should relaunch the service using the most recent task definition, which was (possibly) just created by the previous step. (See the task-definition parameter.)
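
As a rough illustration of that follow-up step, here is a minimal sketch that points the service at the newest task definition revision. Note that it swaps in the ECS UpdateService API (via a boto3-style client) rather than the orb's run-task job, and I have not wired it into a real workflow. The function name and the cluster, service, and family names are hypothetical.

```python
def point_service_at_latest(ecs_client, cluster, service, family):
    """Update the service to the latest ACTIVE revision of a task family.

    ecs_client is expected to behave like boto3.client("ecs").
    Returns the task definition ARN the service now points to.
    """
    response = ecs_client.update_service(
        cluster=cluster,
        service=service,
        # A family name without ":revision" resolves to the latest
        # ACTIVE revision, so no revision number has to be looked up.
        taskDefinition=family,
    )
    return response["service"]["taskDefinition"]


# Usage (requires boto3 and AWS credentials; names are placeholders):
#   import boto3
#   arn = point_service_at_latest(
#       boto3.client("ecs"), "my-cluster", "my-service", "my-family")
#   print("service now uses", arn)
```

One caveat: if the newest revision is the one Terraform just created, its containers will use whatever image tag is set in Terraform (latest in the question's setup), not the SHA1 tag that CircleCI last deployed.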

I have decided that this isn't worth the effort for me, at least not at this time. The conflict between Terraform Cloud and CircleCI is annoying, but isn't that acute.

Nick K9