The code below exports all findings from Security Hub to an S3 bucket using a Lambda function. The filter restricts the export to the CIS AWS Foundations Benchmark. More than 20 accounts are added as members in Security Hub. The issue I'm facing is that, even though I'm using NextToken, the output doesn't include information about all the accounts. Instead, it contains data from just one of the accounts, seemingly at random.

Can somebody look into the code and let me know what could be the issue, please?

import boto3
import json
from botocore.exceptions import ClientError
import time
import glob
 
client = boto3.client('securityhub')
s3 = boto3.resource('s3')
 
storedata = {}
_filter = Filters={
'GeneratorId': [
{
'Value': 'arn:aws:securityhub:::ruleset/cis-aws-foundations-benchmark',
'Comparison': 'PREFIX'
}
],
}
 
def lambda_handler(event, context):
    response = client.get_findings(
    Filters={
        'GeneratorId': [
            {
                'Value': 'arn:aws:securityhub:::ruleset/cis-aws-foundations-benchmark',
                'Comparison': 'PREFIX'
            },
        ],
    },
    )
    results = response["Findings"]
    while "NextToken" in response:
        response = client.get_findings(Filters=_filter,NextToken=response["NextToken"])
        results.extend(response["Findings"])
        storedata = json.dumps(response)
    print(storedata)
 
    save_file = open("/tmp/SecurityHub-Findings.json", "w")
    save_file.write(storedata)
    save_file.close()
 
    for name in glob.glob("/tmp/*"):
      s3.meta.client.upload_file(name, "xxxxx-security-hubfindings", name)

I'm also getting a TooManyRequestsException error now.

John Rotenstein
Sidin
  • Are you saying that `results` doesn't have what you expect or that your print of `storedata` doesn't print what you expect? Does your code actually invoke get_findings multiple times with NextToken? Does the equivalent awscli command yield something different and is it what you expected to see? Let's do some debugging here. – jarmod Jul 09 '21 at 23:42
  • storedata doesn't print the expected data. Also not sure whether the code invokes get_findings multiple times with NextToken. The data is fully available when run through AWS CLI. – Sidin Jul 09 '21 at 23:45
  • If you read the code, you'll see that storedata is reassigned each time through the loop so its value after the loop will be the findings from the very last page of results. To find out what's happening in your Lambda function, print progress at various stages e.g. in the loop and then check CloudWatch Logs afterwards. – jarmod Jul 09 '21 at 23:48
  • I'm a newbie to coding, I wrote this to get the whole findings exported to an S3 bucket. Is it possible to have all the data in the storedata? – Sidin Jul 09 '21 at 23:50
  • You're actually accumulating the findings already in `results`. Isn't that what you want? – jarmod Jul 09 '21 at 23:53
  • The output from the CLI is close to 600 KB, while the output from this code is less than 10 KB. That means the result has data from only one member account in Security Hub, whereas the CLI output covers 22 accounts. – Sidin Jul 09 '21 at 23:57
  • When you say "the output from this", are you referring to `storedata` that you dump to file? We've already seen that that's wrong because it's only the last page of results. You should write `json.dumps(results)`, not `storedata`. – jarmod Jul 10 '21 at 00:06
  • I've posted the entire code above for reference. – Sidin Jul 10 '21 at 00:15
  • @jarmod Thank You. I have updated the code as "storedata = json.dumps(results)" and now it's working. – Sidin Jul 10 '21 at 00:34
  • @jarmod Can you please post the answer here? – Sidin Jul 12 '21 at 08:09

1 Answer

The problem is in this code that paginates the security findings results:

while "NextToken" in response:
    response = client.get_findings(Filters=_filter,NextToken=response["NextToken"])
    results.extend(response["Findings"])
    storedata = json.dumps(response)

print(storedata)

The value of storedata after the while loop has completed is the last page of security findings, rather than the aggregate of the security findings.
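
To see the effect in isolation, here is a small sketch that replaces the API calls with three made-up pages. The `pages` list and its contents are hypothetical stand-ins for `get_findings` responses, not real Security Hub output. The loop body does the same two things as the code in the question: it extends `results` and it reassigns `storedata`.

```python
import json

# Hypothetical stand-ins for three paginated get_findings responses.
pages = [
    {"Findings": [{"Id": "a"}, {"Id": "b"}], "NextToken": "t1"},
    {"Findings": [{"Id": "c"}, {"Id": "d"}], "NextToken": "t2"},
    {"Findings": [{"Id": "e"}]},  # last page: no NextToken
]

results = []
storedata = None
for response in pages:
    results.extend(response["Findings"])  # accumulates across pages
    storedata = json.dumps(response)      # overwritten on every pass

last_page_only = json.loads(storedata)["Findings"]
print(len(last_page_only))  # number of findings in the final page only
print(len(results))         # number of findings across every page
```

After the loop, `storedata` holds only what the final page contained, while `results` holds everything.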

However, you're already aggregating the security findings in results, so you can use that:

save_file = open("/tmp/SecurityHub-Findings.json", "w")
save_file.write(json.dumps(results))
save_file.close()
jarmod