
I'm currently using a Kubernetes cluster running on bare-metal nodes provisioned with Ansible. There are plans to move to the cloud, and I'm reading about Terraform and Packer in preparation. Leaving the data migration aside, it seems like there is a pretty straightforward migration path for us:

  1. Build image with Packer using our existing Ansible scripts
  2. Deploy built image to the cloud with Terraform
  3. Deploy our Kubernetes resources with our current tooling

That's all great. We now have immutable infrastructure, using state-of-the-art tooling.

What I am struggling to find is how images built with Packer are versioned. Somewhere down the line, we will have to upgrade some software in those images. Sometimes the Ansible scripts will change, but sometimes it's just a matter of having the latest security updates in the image. Either way, Packer will have to build a new image for us and we will have to deploy it with Terraform. If the new image ends up causing trouble, we will have to revert to the old one.

I can imagine how this could be done manually by editing the template before running it and then editing the terraform configuration to pick up the new version, but that's not going to work for a CI/CD pipeline. Another issue is that we may move between different regions and vendors. So a version of the image may be present in one region, but not the other, and ideally, the pipeline should create the image if it doesn't exist and use the existing one if it is already there. This may cause images in different regions or clouds to be different, especially since they might be built on different days and have different security updates applied.

All of this is built into the Docker workflow, but with Packer, it's far from obvious what to do. I haven't found any documentation or tutorials that cover this topic. Is there any built-in versioning functionality in Packer and Terraform? Is Terraform able to invoke Packer if an image is missing? Is there any accepted best practice?

I can imagine automating this by using the API from the cloud provider to check for the existence of the required images and invoke Packer for any missing images, before executing Terraform. This would work, but I wouldn't want to write custom integration for each cloud provider and it sounds like something that should already be provided by Terraform. I haven't used Terraform before, so maybe I just don't know where to look, and maybe it's not that difficult to implement in Terraform, but then why aren't there any tutorials showing me how?
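To make that concrete, the per-provider integration I'm imagining would be a wrapper roughly like the following (all names are hypothetical; it assumes the aws CLI is configured and there is a Packer template at packer/node.pkr.hcl):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Return 0 if an AMI with the given name already exists in the current
# account and region, 1 otherwise.
ami_exists() {
  local name="$1"
  local id
  id=$(aws ec2 describe-images --owners self \
        --filters "Name=name,Values=${name}" \
        --query 'Images[0].ImageId' --output text)
  [ "$id" != "None" ]
}

# Invoke Packer only if the image is missing; otherwise reuse it.
ensure_image() {
  local name="$1"
  if ami_exists "$name"; then
    echo "image ${name} already present, skipping build"
  else
    echo "building ${name}"
    packer build -var "image_name=${name}" packer/node.pkr.hcl
  fi
}
```

Running `ensure_image` for every required image before `terraform apply` would give the create-if-missing behaviour, but this script is AWS-specific, which is exactly the custom integration I'd like to avoid.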

Erik B
  • I just found this: https://www.davidbegin.com/packer-and-terraform/ so it seems like it might be an unsolved problem to which people come up with their own solutions. If you are using Terraform and Packer in production, please let me know how you are handling this. – Erik B Jun 02 '20 at 18:28
  • Versioning is not really built into Packer because it integrates multiple providers, provisioners, etc. Each of these has its own specific versioning. For example, you mention Docker has this, because it is one specific provider that Packer can integrate among many others. VirtualBox, AWS, GCP, AZR, VMWare, etc. are other providers with their own versioning. There is a lot more to address in the question, but this explains why you are not finding that functionality in Packer. – Matthew Schuchard Jun 02 '20 at 18:30
  • @MattSchuchard I don't see why Packer couldn't unify the versioning or why Terraform couldn't treat images just like any other cloud resource and consider Packer a provisioner for these images. Then we make our VMs depend on these images, so that they have to be created/present before the VMs are launched. That seems like the obvious workflow to me and that's what I was expecting. It doesn't make sense to me that I would have to use yet another tool to build that dependency graph. – Erik B Jun 02 '20 at 22:24

3 Answers


This is largely provider-dependent, and you haven't specified which cloud provider you are using, but AWS makes a good example here.

Terraform and Packer both have a way of selecting the most recent AMI that matches a filter.

Packer's AWS AMI builder supports a source_ami_filter that can be used to select the most recent image to base your image on. An example is given in the amazon-ebs builder documentation:

{
  "source_ami_filter": {
    "filters": {
      "virtualization-type": "hvm",
      "name": "ubuntu/images/*ubuntu-xenial-16.04-amd64-server-*",
      "root-device-type": "ebs"
    },
    "owners": ["099720109477"],
    "most_recent": true
  }
}

A typical case here is to always build from the latest official Ubuntu image. If you are producing multiple AMIs for different use cases (e.g. Kubernetes worker nodes vs etcd nodes), you can then build up from there: create a golden base image with a known naming scheme (e.g. ubuntu/20.04/base/{{isotime | clean_resource_name}}) that has everything you want in every AMI you produce, and have the other AMIs also use source_ami_filter to select the most recent base AMI you have published.
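For instance, a second-stage template could select your own most recent base image with a filter like this (the name pattern follows the scheme above and is an assumption, as is owning the base AMI in the same account):

```json
{
  "source_ami_filter": {
    "filters": {
      "name": "ubuntu/20.04/base/*"
    },
    "owners": ["self"],
    "most_recent": true
  }
}
```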

Terraform's AWS provider has the aws_ami data source, which works the same way: it automatically selects the latest AMI matching a filter, so publishing a new AMI and then running Terraform will generate a plan to replace any instance or launch configuration/template that references the data source.

An example is given in the aws_instance resource documentation:

data "aws_ami" "ubuntu" {
  most_recent = true

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-trusty-14.04-amd64-server-*"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }

  owners = ["099720109477"] # Canonical
}

resource "aws_instance" "web" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t2.micro"

  tags = {
    Name = "HelloWorld"
  }
}

In general you should be relying on mechanisms like these to automatically select the most recently published AMI and use that instead of hardcoding AMI IDs in code.


Managing the lifecycle of images is beyond the scope of Packer itself; it should be used as part of a larger system. If you want to roll back an image, you have two options available to you:

  • If your build is reproducible and the problem was introduced by a recent change, you can build and register a new image from the old code, so that your newest image is equivalent to the one from two versions ago
  • Deregister the newest image, so that searches for the latest image pick up the old one again. How this is done varies by cloud provider, but with AWS it can be done programmatically, e.g. via the aws ec2 deregister-image CLI command

Packer can automatically copy images to different regions (see ami_regions for AWS) and different accounts (use ami_users to share the created AMI with another account, or a post-processor to create separate copies in each account). However, it can't easily do this conditionally without a separate Packer config file for each combination of ways you want to share things, and it can't stage the roll-out so that you release to a non-production account before you release to a production account.
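As a sketch, the copy and share settings live directly in the builder block; the region names and account ID below are placeholders, and the other required amazon-ebs builder fields are omitted:

```json
{
  "builders": [
    {
      "type": "amazon-ebs",
      "ami_regions": ["eu-west-1", "us-east-1"],
      "ami_users": ["123456789012"]
    }
  ]
}
```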

If you want to roll out AMIs to some accounts and regions but not all, you will need to put that logic somewhere higher up, such as an orchestration mechanism like your CI/CD system.

ydaetskcoR
  • Thanks, this answer is definitely useful. One thing I am still missing is how Packer would know if it needs to build an image. Docker has cached layers and only rebuilds if there are changes, but it seems like Packer would build new images from scratch every time, regardless of whether anything has changed. Another thing is how you promote an image from staging to production. If you have an auto-scaling K8s cluster, you don't want it to spin up nodes with images that haven't been tested by your pipeline yet. Would you have to deal with this by renaming images outside of Terraform/Packer? – Erik B Jun 03 '20 at 14:31
  • It doesn't know whether changes have happened at all so you have to handle that above the usage of Packer if you care about that. And as I mentioned in the answer you also need to handle promoting images separately. Packer is very single focused on building images. It's up to you to tell it when to build an image and where the image should be published. – ydaetskcoR Jun 03 '20 at 14:36
  • I personally publish AMIs to all of our accounts at the same time and our CI/CD system then calls Terraform to apply the changes to our non production systems first and then eventually rolls out to production at a later step (currently manual approval for this step but could be automatic if you had confidence in a testing process). – ydaetskcoR Jun 03 '20 at 14:38
  • I try to keep my CI/CD pipelines free from logic. Ideally each stage just calls a single command. In the case of a Java project I would probably put such logic (one command depending on the output of another command that may or may not need to be executed) in Gradle. It seems odd to introduce Gradle here, but a build tool that lets you build a dependency graph (just like Gradle does), to model the dependency between Packer and Terraform, seems to be what we need. Is there by any chance any tool commonly used for that? – Erik B Jun 03 '20 at 22:20
  • Nope it's entirely up to you. Lots of people (my self included) wrap Terraform in shell scripts to handle things like configuring state and pulling modules so that's a more common pattern than using Gradle. I've also seen Make used as well. – ydaetskcoR Jun 04 '20 at 07:10
  • Reading more about Terraform, it looks like it has all the capabilities to maintain such a dependency graph. I don't understand why an image isn't simply considered a resource that is provisioned with Packer, that your VM resources depend on. That's what I was expecting and I don't see any technical reason to not design it like that, but it seems like I would have to implement a custom provider to make it work. – Erik B Jun 04 '20 at 10:59
  • That's exactly how it works. You can also use the `aws_ami` resource to create that AMI directly in Terraform but it's not as nice as creating the images in Packer and decoupling them. From your mentioned requirements, you're going to need something to handle orchestration logic of rolling out that image progressively in different accounts and regions anyways so that should be encoded somewhere outside of Terraform and at that point you can handle linking Packer and Terraform anyway. – ydaetskcoR Jun 04 '20 at 11:29
  • Ok, seems like some cloud providers consider images resources, but others only make them available as data sources. Even with this knowledge, I can't figure out how to provision them with Packer. From the examples provided, it seems like you are supposed to create the image from a disk of a VM, and not by running Packer. Is that correct? It seems like using Ansible to call Packer (if needed) and then call Terraform is a common practice. At least this talk seems to suggest that: https://www.hashicorp.com/resources/ansible-terraform-better-together/ – Erik B Jun 04 '20 at 21:58
  • @ErikB Please don't go down the path of configuration after provisioning. Packer can launch configuration tools during the build process – one example is the salt provisioner. More or less, most people generate AMIs by taking a base AMI (managed by a DevOps team) and building on top of it with Packer. For example, you take a CentOS 7 AMI with preinstalled common packages and your dependent team uses it as the source_ami for their web application configuration. Please also consider image management, so you don't end up with 25 000 unused AMIs. – Daniel Hajduk Nov 17 '20 at 08:02

I wrote a blog post on this topic, Keeping Packer and Terraform images versioned, synchronized and DRY.

In summary:

Our goals

  • Images built by Packer must be versioned.
  • Image versioning must be DRY (kept in a single place) and shared between Packer and Terraform, to avoid going out-of-sync by mistake.
  • The configuration files for Packer, Terraform and image versioning information must be stored in git, so that checking out a specific commit and doing a terraform apply should be enough to perform a rollback.
  • Terraform must detect automatically, based only on local information, that there is a more recent version of one or multiple images, OR that a more recent version should be built.
  • It must be possible to have N independent development/staging environments, where image versioning is independent from production.
  • Approach must be IaaS agnostic (must work with any Cloud provider).

Summary of the approach

Use a naming convention like

<IMAGE-NAME> ::= <ROLE>__<BRANCH>-<REVISION>

Define the image name variable in a separate file, packer/versions.pkrvars.hcl:

service-a-img-name = "service-a__main-3"

Build the image with:

$ packer build -var-file=versions.pkrvars.hcl minimal.pkr.hcl

On the Terraform side, since the file packer/versions.pkrvars.hcl is HCL, we can read it directly from Terraform:

$ terraform apply -var-file=../../packer/versions.pkrvars.hcl
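On the Terraform side the variable name must match the one in the pkrvars file; a minimal sketch of consuming it (the aws_ami data source and self-owned images are assumptions for AWS, since the approach itself is IaaS-agnostic):

```hcl
variable "service-a-img-name" {
  type = string
}

data "aws_ami" "service_a" {
  owners = ["self"]

  filter {
    name   = "name"
    values = [var.service-a-img-name]
  }
}
```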

All the details are in the blog post mentioned above.

marco.m

For what it's worth, image versioning is useful because you can bake in defaults for things like Kubernetes host nodes (pre-downloaded Docker images, etc.), so by the time an instance passes the AWS health checks, it is already joining the cluster.

I have done this for numerous apps and found it is typically best to use a naming scheme like the one below:

vendor-app-appversion-epoch

This approach allows you to version your AMIs along with your apps, and then you can treat your instances like cattle (to be slaughtered) versus pets (to be cared for throughout their life).
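On the Packer side, a name following that scheme can be generated at build time; for example (the vendor, app, and version values are placeholders, and {{timestamp}} is Packer's build-time Unix epoch function):

```json
{
  "ami_name": "acme-webapp-1.4.2-{{timestamp}}"
}
```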

data "aws_ami" "amazon_linux2" {
  most_recent = true

  filter {
    name   = "name"
    values = ["amzn2-ami-*-x86_64-gp2"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }

  owners = ["amazon"]
}

This will pull the latest Amazon Linux 2 image when you apply Terraform.

hikerspath
  • I'm sorry, but I don't see how this is answering my question. It seems like you are suggesting a naming convention rather than a versioning strategy. You also do not mention anything about the interaction between Terraform and Packer. If you are suggesting that I manually copy the image name from the output of Packer to a variable in Terraform, then this answer doesn't help me at all, because that is what I am trying to avoid. – Erik B Jun 03 '20 at 09:54
  • Well, it won't let me place formatted text here, so I elaborated. Yes, it is a naming structure, but it allows you to always pull latest. – hikerspath Jun 04 '20 at 01:19