I'm using Packer to set up a Windows VM on the free tier of AWS EC2. The image is properly set up and I'm able to launch it, but I can't connect to it with SSM. Here's my Packer template:

{
  "variables": {
    "aws_access_key": null,
    "aws_secret_key": null
  },
  "builders": [
    {
      "name": "windows",
      "type": "amazon-ebs",
      "access_key": "{{user `aws_access_key`}}",
      "secret_key": "{{user `aws_secret_key`}}",
      "region": "us-east-1",
      "source_ami_filter": {
        "filters": {
          "virtualization-type": "hvm",
          "name": "Windows_Server-2019-English-Full-Base-2020.11.11",
          "root-device-type": "ebs"
        },
        "owners": ["amazon"],
        "most_recent": true
      },
      "instance_type": "t2.micro",
      "ami_name": "build-runner-windows {{timestamp}}",
      "communicator": "winrm",
      "force_deregister": true,
      "winrm_insecure": true,
      "winrm_username": "Administrator",
      "winrm_use_ssl": true,
      "user_data_file": "./windows_bootstrap.txt"
    }
  ]
}

No provisioners yet; I'm just trying to get the thing working.

Here's the content of ./windows_bootstrap.txt, as given in the official documentation:

<powershell>
write-output "Running User Data Script"
write-host "(host) Running User Data Script"

Set-ExecutionPolicy Unrestricted -Scope LocalMachine -Force -ErrorAction Ignore

# Don't set this before Set-ExecutionPolicy as it throws an error
$ErrorActionPreference = "stop"

# Remove HTTP listener
Remove-Item -Path WSMan:\Localhost\listener\listener* -Recurse

# Create a self-signed certificate to let ssl work
$Cert = New-SelfSignedCertificate -CertstoreLocation Cert:\LocalMachine\My -DnsName "packer"
New-Item -Path WSMan:\LocalHost\Listener -Transport HTTPS -Address * -CertificateThumbPrint $Cert.Thumbprint -Force

# WinRM
write-output "Setting up WinRM"
write-host "(host) setting up WinRM"

cmd.exe /c winrm quickconfig -q
cmd.exe /c winrm set "winrm/config" '@{MaxTimeoutms="1800000"}'
cmd.exe /c winrm set "winrm/config/winrs" '@{MaxMemoryPerShellMB="1024"}'
cmd.exe /c winrm set "winrm/config/service" '@{AllowUnencrypted="true"}'
cmd.exe /c winrm set "winrm/config/client" '@{AllowUnencrypted="true"}'
cmd.exe /c winrm set "winrm/config/service/auth" '@{Basic="true"}'
cmd.exe /c winrm set "winrm/config/client/auth" '@{Basic="true"}'
cmd.exe /c winrm set "winrm/config/service/auth" '@{CredSSP="true"}'
cmd.exe /c winrm set "winrm/config/listener?Address=*+Transport=HTTPS" "@{Port=`"5986`";Hostname=`"packer`";CertificateThumbprint=`"$($Cert.Thumbprint)`"}"
cmd.exe /c netsh advfirewall firewall set rule group="remote administration" new enable=yes
cmd.exe /c netsh firewall add portopening TCP 5986 "Port 5986"
cmd.exe /c net stop winrm
cmd.exe /c sc config winrm start= auto
cmd.exe /c net start winrm

</powershell>

And here's the output from building an image with it. So far so good.

PS C:\Users\Jesse\Infrastructure> packer build -var-file="template-vars.json" minimal.json
windows: output will be in this color.

==> windows: Force Deregister flag found, skipping prevalidating AMI Name
    windows: Found Image ID: ami-02b5cd5aa444bee23
==> windows: Creating temporary keypair: <redacted>
==> windows: Creating temporary security group for this instance: packer_5fb7fe2a-14c6-e0e1-feb5-1eae06766ef3
==> windows: Authorizing access to port 5986 from [0.0.0.0/0] in the temporary security groups...
==> windows: Launching a source AWS instance...
==> windows: Adding tags to source instance
    windows: Adding tag: "Name": "Packer Builder"
    windows: Instance ID: <redacted>
==> windows: Waiting for instance (<redacted>) to become ready...
==> windows: Waiting for auto-generated password for instance...
    windows: It is normal for this process to take up to 15 minutes,
    windows: but it usually takes around 5. Please wait.
    windows:
    windows: Password retrieved!
==> windows: Using winrm communicator to connect: <redacted>
==> windows: Waiting for WinRM to become available...
    windows: WinRM connected.
==> windows: #< CLIXML
==> windows: <Objs Version="1.1.0.1" xmlns="http://schemas.microsoft.com/powershell/2004/04"><Obj S="progress" RefId="0"><TN RefId="0"><T>System.Management.Automation.PSCustomObject</T><T>System.Object</T></TN><MS><I64 N="SourceId">1</I64><PR N="Record"><AV>Preparing modules for first use.</AV><AI>0</AI><Nil /><PI>-1</PI><PC>-1</PC><T>Completed</T><SR>-1</SR><SD> </SD></PR></MS></Obj><Obj S="progress" RefId="1"><TNRef RefId="0" /><MS><I64 N="SourceId">1</I64><PR N="Record"><AV>Preparing modules for first use.</AV><AI>0</AI><Nil /><PI>-1</PI><PC>-1</PC><T>Completed</T><SR>-1</SR><SD> </SD></PR></MS></Obj></Objs>
==> windows: Connected to WinRM!
==> windows: Stopping the source instance...
    windows: Stopping instance
==> windows: Waiting for the instance to stop...
==> windows: Creating AMI build-runner-windows 1605893672 from instance <redacted>
    windows: AMI: ami-08986fa2707bad0dd
==> windows: Waiting for AMI to become ready...
==> windows: Terminating the source AWS instance...
==> windows: Cleaning up any extra volumes...
==> windows: No volumes to clean up, skipping
==> windows: Deleting temporary security group...
==> windows: Deleting temporary keypair...
Build 'windows' finished after 5 minutes 31 seconds.

==> Wait completed after 5 minutes 31 seconds

==> Builds finished. The artifacts of successful builds are:
--> windows: AMIs were created:
us-east-1: ami-08986fa2707bad0dd

PS C:\Users\Jesse\Infrastructure>

Here's where the trouble starts. When trying to connect through the AWS control panel, I get this error message:

AWS EC2 error message

The problem is:

  • My Packer image is based on a built-in Windows image that should have SSM Agent included.
  • My IAM should have SSM access enabled (although I actually don't know what I'm doing).
  • I followed all required steps of the Session Manager setup.

Here's what my currently-running instance looks like, as described by aws ec2 describe-instances:

{
    "Reservations": [
        {
            "Groups": [],
            "Instances": [
                {
                    "AmiLaunchIndex": 0,
                    "ImageId": "ami-08986fa2707bad0dd",
                    "InstanceId": "<redacted>",
                    "InstanceType": "t2.micro",
                    "KeyName": "test",
                    "LaunchTime": "2020-11-20T17:44:50+00:00",
                    "Monitoring": {
                        "State": "disabled"
                    },
                    "Placement": {
                        "AvailabilityZone": "us-east-1a",
                        "GroupName": "",
                        "Tenancy": "default"
                    },
                    "Platform": "windows",
                    "PrivateDnsName": "<redacted>",
                    "PrivateIpAddress": "<redacted>",
                    "ProductCodes": [],
                    "PublicDnsName": "<redacted>",
                    "PublicIpAddress": "<redacted>",
                    "State": {
                        "Code": 16,
                        "Name": "running"
                    },
                    "StateTransitionReason": "",
                    "SubnetId": "<redacted>",
                    "VpcId": "<redacted>",
                    "Architecture": "x86_64",
                    "BlockDeviceMappings": [
                        {
                            "DeviceName": "/dev/sda1",
                            "Ebs": {
                                "AttachTime": "2020-11-20T17:44:51+00:00",
                                "DeleteOnTermination": true,
                                "Status": "attached",
                                "VolumeId": "<redacted>"
                            }
                        }
                    ],
                    "ClientToken": "",
                    "EbsOptimized": false,
                    "EnaSupport": true,
                    "Hypervisor": "xen",
                    "IamInstanceProfile": {
                        "Arn": "<redacted>",
                        "Id": "<redacted>"
                    },
                    "NetworkInterfaces": [
                        {
                            "Association": {
                                "IpOwnerId": "amazon",
                                "PublicDnsName": "<redacted>",
                                "PublicIp": "<redacted>"
                            },
                            "Attachment": {
                                "AttachTime": "2020-11-20T17:44:50+00:00",
                                "AttachmentId": "<redacted>",
                                "DeleteOnTermination": true,
                                "DeviceIndex": 0,
                                "Status": "attached",
                                "NetworkCardIndex": 0
                            },
                            "Description": "",
                            "Groups": [
                                {
                                    "GroupName": "<redacted>",
                                    "GroupId": "<redacted>"
                                }
                            ],
                            "Ipv6Addresses": [],
                            "MacAddress": "<redacted>",
                            "NetworkInterfaceId": "<redacted>",
                            "OwnerId": "<redacted>",
                            "PrivateDnsName": "<redacted>",
                            "PrivateIpAddress": "<redacted>",
                            "PrivateIpAddresses": [
                                {
                                    "Association": {
                                        "IpOwnerId": "amazon",
                                        "PublicDnsName": "<redacted>",
                                        "PublicIp": "<redacted>"
                                    },
                                    "Primary": true,
                                    "PrivateDnsName": "<redacted>",
                                    "PrivateIpAddress": "<redacted>"
                                }
                            ],
                            "SourceDestCheck": true,
                            "Status": "in-use",
                            "SubnetId": "<redacted>",
                            "VpcId": "<redacted>",
                            "InterfaceType": "interface"
                        }
                    ],
                    "RootDeviceName": "/dev/sda1",
                    "RootDeviceType": "ebs",
                    "SecurityGroups": [
                        {
                            "GroupName": "<redacted>",
                            "GroupId": "<redacted>"
                        }
                    ],
                    "SourceDestCheck": true,
                    "VirtualizationType": "hvm",
                    "CpuOptions": {
                        "CoreCount": 1,
                        "ThreadsPerCore": 1
                    },
                    "CapacityReservationSpecification": {
                        "CapacityReservationPreference": "open"
                    },
                    "HibernationOptions": {
                        "Configured": false
                    },
                    "MetadataOptions": {
                        "State": "applied",
                        "HttpTokens": "optional",
                        "HttpPutResponseHopLimit": 1,
                        "HttpEndpoint": "enabled"
                    },
                    "EnclaveOptions": {
                        "Enabled": false
                    }
                }
            ],
            "OwnerId": "<redacted>",
            "ReservationId": "<redacted>"
        }
    ]
}

What am I doing wrong?

JesseTG

3 Answers

At a minimum, you need to run Quick Setup for SSM and attach the AmazonSSMManagedInstanceCore policy to the EC2 instance's role (or just use the AmazonSSMRoleForInstancesQuickSetup role if you don't need any other policies).

Be advised that Quick Setup takes a bit of time to finish. I've also found that if you didn't set the role when launching the instance, you may sometimes need to open an SSH session to it before SSM will "kick in" (I don't know why this is). While you're at it, check that the SSM Agent is actually running.

https://docs.aws.amazon.com/systems-manager/latest/userguide/systems-manager-quick-setup.html
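
For context, a role can only be used by an EC2 instance if its trust policy lets the EC2 service assume it. The fragment below is a generic sketch of such a trust policy, not the asker's actual role. The AmazonSSMManagedInstanceCore managed policy mentioned above would then be attached to that role as its permissions policy.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ec2.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
```

If the instance profile shown in describe-instances points at a role without this trust relationship or without the SSM policy attached, the agent cannot register with Systems Manager.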

Oscar De León

The question is whether your custom AMI is broken or whether the other settings (network setup, IAM role, etc.) are at fault.

Try to spin up an official Windows AMI with exactly the same configuration as now (same subnet, same IAM role, same security group, etc.) and see if it works. If it does, then the Packer config needs fixing; if it doesn't, then the launch settings are the problem.

Once the cause is established, take it from there.

Update: Since it looks like the problem is in the Packer config, you'll have to do some troubleshooting. Start with a very small Packer config and verify that the image still works. Then add some more changes, verify again, and so on until it breaks. Once it breaks, you will know which change or line did it, and then we can figure out why.

Hope that helps :)

MLu
  • Yes, I can connect to a totally plain (i.e. non-Packer) image with the same IAM and network setup. Taken from the same base AMI and everything. But now what? – JesseTG Nov 20 '20 at 23:09
  • @JesseTG now figure out which packer setting breaks it. Added some hints to the answer. – MLu Nov 21 '20 at 00:24
  • That's what I've been doing...have you used Packer? What information would be useful to you? – JesseTG Nov 21 '20 at 03:19
  • @JesseTG on Linux I’d jump on the instance, check if the ssm agent is running, if not why not, and if yes what’s in the logs. But I’m not a big Windows user so don’t really know how to check all that in your case. Maybe some firewall issue? – MLu Nov 21 '20 at 03:23

It turns out using the windows-restart provisioner to reboot the builder solved my problem. Everything I was doing (including my security group and IAM) was otherwise correct. What do you know?
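
As a minimal sketch of what that looks like (the exact final template was not posted, so treat this as an assumption about its shape), a "provisioners" array containing Packer's built-in windows-restart provisioner can be added at the top level of the template, alongside "builders":

```json
{
  "provisioners": [
    {
      "type": "windows-restart"
    }
  ]
}
```

With this in place, Packer reboots the build instance over WinRM and waits for it to come back before stopping it and creating the AMI. Why the reboot makes SSM work on instances launched from the image is not established in this thread.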

JesseTG