We have a team-shared AWS account, which sometimes makes things hard to debug. EMR API throttling in particular happens regularly, so it would be nice if CloudTrail logs could tell us who is overusing EMR. I think our CloudTrail logging is enabled, because I can see these API events with EMR as the event source:

AddJobFlowSteps
RunJobFlow
TerminateJobFlows

I'm pretty sure that I have been calling DescribeCluster many times and caused some throttling, but I'm not sure why those calls are not showing up in my CloudTrail logs...

Can someone help me understand:

  • Is there an additional setting needed for the DescribeCluster EMR API in order to log its events to CloudTrail?
  • And what about other EMR APIs? Can they be configured to log events to CloudTrail, without using an SDK to write to CloudTrail explicitly?

I have read these articles, and it feels like a lot can be done in CloudTrail...

Appreciate any help!

Pen2
  • Describe* calls won't show up in the CloudTrail console. However, they are stored in the S3 logs. You need to set up a tool that parses the CloudTrail logs stored in the S3 bucket. A couple of enterprise tools, for example: Splunk, Sumo Logic, etc. – krishna_mee2004 Jun 06 '18 at 11:39
  • @krishna_mee2004 Thanks for the hint! I parsed my logs and it seems EMR DescribeCluster API is translated into other lower level calls with EC2 in my logs... – Pen2 Jun 07 '18 at 18:20
  • For instance, one of my log entries has eventSource:"ec2.amazonaws.com", eventName:"DescribeInstances", sourceIPAddress:"elasticmapreduce.amazonaws.com", userAgent:"elasticmapreduce.amazonaws.com"... I think that's my "DescribeCluster", but I don't know for sure how that happened under the hood. – Pen2 Jun 07 '18 at 18:23
  • Each AWS service emits CloudTrail events whenever its public APIs are called. The `eventSource` tells you which service the event came from. EMR uses EC2 instances for its clusters, so EMR **will** call EC2 on your behalf to add/remove slaves from your clusters. If you're interested in EMR limits, you can safely ignore events coming in from `ec2.amazonaws.com`. – Gaston May 28 '20 at 01:44
  • @krishna_mee2004 — CloudTrail now supports Read events in the Event History. – Gaston May 28 '20 at 01:48

1 Answer

A quick summary of AWS CloudTrail: the events it records are of two types, management events and data events. Management events cover actions such as stopping an instance or deleting a bucket. Data events, at the time of writing, are available for only two services (S3 and Lambda) and cover actions such as: object 'abc.txt' was read from an S3 bucket.

For management events, a trail can again be set to one of four read/write types:

  1. Write-only

  2. Read-only

  3. All (both reads and writes)

  4. None

The DescribeCluster event that you are looking for falls under the 'Read-only' management event type.

[Image: DescribeCluster event in CloudTrail]

Please make sure that you have selected "All" or "ReadOnly" as the management event type in your CloudTrail trail. With "WriteOnly" selected, the trail will not record DescribeCluster. There is no other service-specific setting that you can enable in CloudTrail.
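If you would rather check this from a script than in the console, here is a minimal sketch of the rule described above. It assumes the selectors are plain dicts shaped like the `EventSelectors` list returned by CloudTrail's GetEventSelectors API (for example via boto3's `get_event_selectors(TrailName=...)`). The function name and the sample selectors are hypothetical, and the sketch ignores advanced event selectors.

```python
def records_read_only_management_events(event_selectors):
    """Return True if any basic event selector would record read-only
    management events such as DescribeCluster."""
    for selector in event_selectors:
        # A selector with management events switched off records none of them.
        # Assumption: a missing key is treated as True.
        if not selector.get("IncludeManagementEvents", True):
            continue
        # "All" and "ReadOnly" both cover Describe*/List*/Get* style calls.
        # Assumption: a missing key is treated as "All".
        if selector.get("ReadWriteType", "All") in ("All", "ReadOnly"):
            return True
    return False


# Hypothetical selectors for two differently configured trails.
write_only_trail = [{"ReadWriteType": "WriteOnly", "IncludeManagementEvents": True}]
all_events_trail = [{"ReadWriteType": "All", "IncludeManagementEvents": True}]

print(records_read_only_management_events(write_only_trail))  # False
print(records_read_only_management_events(all_events_trail))  # True
```

A trail for which this returns False would explain seeing RunJobFlow and TerminateJobFlows but never DescribeCluster.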

Also note that the 'Event history' tab in the AWS CloudTrail console keeps management events of all types (including ReadOnly) for a period of 90 days. You can see the DescribeCluster event there too.
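Once the read-only events are being recorded, finding out who is hammering the EMR APIs is mostly a matter of counting. Below is a rough sketch that tallies EMR calls per caller from the log files CloudTrail delivers to S3, which are gzipped JSON documents with a top-level `Records` array. The helper names, the file path argument and the sample records are made up for illustration; only the field names (`eventSource`, `eventName`, `userIdentity`) come from the CloudTrail record format.

```python
import gzip
import json
from collections import Counter

EMR_EVENT_SOURCE = "elasticmapreduce.amazonaws.com"


def load_records(path):
    """Read one CloudTrail log file (gzipped JSON) and return its records."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return json.load(handle).get("Records", [])


def count_emr_calls(records):
    """Count EMR API calls per (caller, eventName) pair."""
    counts = Counter()
    for record in records:
        # Skip everything that is not a direct EMR API call, including the
        # EC2 calls that EMR makes on your behalf.
        if record.get("eventSource") != EMR_EVENT_SOURCE:
            continue
        identity = record.get("userIdentity", {})
        caller = identity.get("arn") or identity.get("principalId") or "unknown"
        counts[(caller, record.get("eventName", "unknown"))] += 1
    return counts


# Hypothetical records, trimmed down to the fields used above.
sample_records = [
    {"eventSource": EMR_EVENT_SOURCE, "eventName": "DescribeCluster",
     "userIdentity": {"arn": "arn:aws:iam::111111111111:user/alice"}},
    {"eventSource": EMR_EVENT_SOURCE, "eventName": "DescribeCluster",
     "userIdentity": {"arn": "arn:aws:iam::111111111111:user/alice"}},
    {"eventSource": "ec2.amazonaws.com", "eventName": "DescribeInstances",
     "userIdentity": {"arn": "arn:aws:iam::111111111111:user/bob"}},
]

for (caller, event_name), total in count_emr_calls(sample_records).most_common():
    print(caller, event_name, total)
```

In practice you would download the log files from the trail's S3 bucket, pass each one through `load_records`, and sum the resulting counters.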