15

I have a Fargate task scheduled to run via CloudWatch Event rules. On a successful run it writes a timestamp to a database, and it writes a log file to CloudWatch every time it runs.

However, on one occasion the log file was not created and the database was not updated. I suspect the task was never started, or failed to start.

In CloudWatch, the event rule shows trigger and invocation at the time I expected the task to run, so I assume the task at least attempted to start.

My question is: is there any way I can debug or log information about the cluster failing to start a task?

Please let me know if I need to provide more information.

Edit: I should specify I'm looking for a way to read this information in a log file somewhere. I know I can see failed task reason in the web console, but that's only for relatively recent tasks.

I have posted the same question on Reddit: https://www.reddit.com/r/aws/comments/adtqvt/debugging_failed_fargate_task_initialization/ and on the AWS forums: https://forums.aws.amazon.com/thread.jspa?messageID=884638&#884638

user3603567

3 Answers

11
  1. Go to the cluster and choose the Tasks tab
  2. In the lower pane, choose Stopped for the Desired Task Status value
  3. Locate the desired Task and click its GUID
  4. Scroll down to the Containers section and expand the relevant containers that are experiencing errors

You'll see some kind of Status reason for the error. In my case it was:

CannotStartContainerError: API error (500): failed to initialize logging driver: Cannot determine region for awslogs driver

Edit: I can't really take credit for figuring this out - found it here:

https://github.com/aws/amazon-ecs-agent/issues/1654#issuecomment-437178282
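The console steps above can also be scripted, which helps if you want to poll for stopped tasks and save their reasons before they disappear. This is only a minimal sketch, assuming boto3 is installed and AWS credentials are configured. The function name `stopped_task_reasons` and the cluster name `my-cluster` are made up for illustration. ECS only keeps stopped tasks for a limited time, so this has the same "recent tasks only" limitation as the console.

```python
def stopped_task_reasons(ecs_client, cluster):
    """Return one dict per stopped task with task- and container-level reasons.

    ecs_client is expected to behave like boto3.client("ecs").
    """
    results = []
    kwargs = {"cluster": cluster, "desiredStatus": "STOPPED"}
    while True:
        page = ecs_client.list_tasks(**kwargs)
        arns = page.get("taskArns", [])
        if arns:
            # describe_tasks takes at most 100 ARNs per call;
            # a list_tasks page never holds more than 100.
            described = ecs_client.describe_tasks(cluster=cluster, tasks=arns)
            for task in described.get("tasks", []):
                results.append({
                    "taskArn": task["taskArn"],
                    "stoppedReason": task.get("stoppedReason"),
                    "containers": {
                        c["name"]: c.get("reason")
                        for c in task.get("containers", [])
                    },
                })
        token = page.get("nextToken")
        if not token:
            break
        kwargs["nextToken"] = token
    return results


# Example usage (hypothetical cluster name):
#   import boto3
#   for entry in stopped_task_reasons(boto3.client("ecs"), "my-cluster"):
#       print(entry)
```

The client is passed in rather than created inside the function so the logic can be exercised with a stub.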

  • 1
    Thanks for the suggestion, but unfortunately this happens irregularly and I am not notified when the task *doesn't* run, so I have never had the chance to see the task itself. So I'm looking for some kind of metadata that shows how the cluster manages tasks. Like a logfile that shows "cluster received task from Event Rule", etc., to see where the chain breaks – user3603567 Apr 01 '19 at 14:35
9

Try going to "CloudWatch -> Logs -> Insights" and clicking "Run Query".
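The same kind of query can be run from code. This is a rough sketch, assuming boto3 and a log group that your task (or something else) actually writes to. The helper name `run_insights_query`, the log group `/ecs/my-task`, and the sample query string are my own placeholders, not what the answer above used. It will not show anything for a task that never got far enough to emit logs.

```python
import time

# A generic query; adjust the fields and filters to your own log format.
SAMPLE_QUERY = "fields @timestamp, @message | sort @timestamp desc | limit 20"


def run_insights_query(logs_client, log_group, query, start_epoch, end_epoch,
                       poll_seconds=1.0, max_polls=60):
    """Start a Logs Insights query and poll until it reaches a terminal status.

    logs_client is expected to behave like boto3.client("logs").
    start_epoch and end_epoch are Unix timestamps in seconds.
    """
    query_id = logs_client.start_query(
        logGroupName=log_group,
        startTime=int(start_epoch),
        endTime=int(end_epoch),
        queryString=query,
    )["queryId"]

    for _ in range(max_polls):
        response = logs_client.get_query_results(queryId=query_id)
        if response["status"] not in ("Scheduled", "Running"):
            break
        time.sleep(poll_seconds)
    else:
        raise TimeoutError(f"query {query_id} did not finish in time")

    if response["status"] != "Complete":
        raise RuntimeError(f"query {query_id} ended as {response['status']}")

    # Each row is a list of {"field": ..., "value": ...} cells; flatten to dicts.
    return [{cell["field"]: cell["value"] for cell in row}
            for row in response["results"]]


# Example usage (hypothetical log group, last 24 hours):
#   import boto3
#   now = time.time()
#   rows = run_insights_query(boto3.client("logs"), "/ecs/my-task",
#                             SAMPLE_QUERY, now - 86400, now)
```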

Daniel
0

I just faced this problem and the lack of logs did make it quite difficult to resolve.

The problem in my case was that the security group used for the task had been deleted. Hope this helps if anyone has a similar issue.
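One way to get a lasting record, which is not something I used at the time, is an event rule that matches "ECS Task State Change" events for stopped tasks and sends them to a CloudWatch log group. A minimal sketch, assuming boto3. The function names, the rule name, and the target Id are hypothetical. The log group also needs a resource policy that lets EventBridge write to it, which this sketch does not create. It only records tasks that were actually created and then stopped, so a launch request rejected before any task existed would likely not show up here.

```python
import json


def stopped_task_pattern(cluster_arn):
    """Event pattern matching tasks in one cluster that reached STOPPED."""
    return {
        "source": ["aws.ecs"],
        "detail-type": ["ECS Task State Change"],
        "detail": {
            "clusterArn": [cluster_arn],
            "lastStatus": ["STOPPED"],
        },
    }


def create_stopped_task_rule(events_client, rule_name, cluster_arn, log_group_arn):
    """Create the rule and point it at a log group.

    events_client is expected to behave like boto3.client("events").
    """
    events_client.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(stopped_task_pattern(cluster_arn)),
        State="ENABLED",
    )
    events_client.put_targets(
        Rule=rule_name,
        # "stopped-task-log" is an arbitrary target Id chosen for this sketch.
        Targets=[{"Id": "stopped-task-log", "Arn": log_group_arn}],
    )


# Example usage (hypothetical names and ARNs):
#   import boto3
#   create_stopped_task_rule(boto3.client("events"), "log-stopped-tasks",
#                            "<cluster ARN>", "<log group ARN>")
```

The logged events include the `stoppedReason` and per-container `reason` fields, so they stay readable after the task has aged out of the console.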

J. Madeley