We are applying patches to our Windows instances using the Patch Manager capability of AWS Systems Manager. We have a patch baseline that is applied to a set of Windows instances (each of which is part of a patch group) by a maintenance window, which in turn executes a Run Command against each of the instances. However, we are finding the following:
- The instances in question seem to get patches installed correctly. Executing
  wmic qfe list
  shows that the patches have been installed on the target machines.
- The target instances are then rebooted after patches are installed.
- The run command remains in progress indefinitely
After further investigation, we found that the amazon-ssm-agent fails to start when the instances are rebooted. The error log shows the following:
[devInstanceA]: PS C:\ProgramData\Amazon\SSM\Logs> get-content .\errors.log -tail 20
2020-11-09 09:36:02 ERROR [func1 @ coremanager.go.246] [instanceID=i-04b3ce4e6e53b0b6f] error occurred trying to start core module. Plugin name: StartupProcessor. Error: Internal error occurred by startup processor: runtime error: invalid memory address or nil pointer dereference
Once we manually restarted the amazon-ssm-agent, the Run Command completed successfully. The issue is that we don't want to have to manually start the amazon-ssm-agent on each instance, especially as we have a lot of instances!
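For reference, the manual restart on each instance looks roughly like the following. This is only a sketch: it assumes the agent is registered under its default Windows service name, AmazonSSMAgent, and that the session is elevated. Adjust the name if your installation differs.

```powershell
# Assumption: the SSM Agent runs as the Windows service "AmazonSSMAgent" (the default name).
# Check the current state of the agent service after the reboot.
Get-Service -Name AmazonSSMAgent

# Start the service if it is stopped, or restart it if it is already running.
Restart-Service -Name AmazonSSMAgent
```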
Any ideas on what is causing this, i.e. why the amazon-ssm-agent is not starting up successfully after the automatic reboot?