34

BACKGROUND:

The AWS operation to list IAM users returns a max of 50 by default.

After reading the docs (linked below), I ran the following code, which returned the complete data set with "MaxItems" set to 1000.

paginator = client.get_paginator('list_users')
response_iterator = paginator.paginate(
 PaginationConfig={
     'MaxItems': 1000,
     'PageSize': 123})
for page in response_iterator:
    u = page['Users']
    for user in u:
        print(user['UserName'])

http://boto3.readthedocs.io/en/latest/guide/paginators.html https://boto3.readthedocs.io/en/latest/reference/services/iam.html#IAM.Paginator.ListUsers

QUESTION:

If the "MaxItems" was set to 10, for example, what would be the best method to loop through the results?

I tested with the following, but it only loops for 2 iterations before 'IsTruncated' == False and then fails with "KeyError: 'Marker'". I am not sure why this happens, because I know there are over 200 results.

marker = None

while True:
    paginator = client.get_paginator('list_users')
    response_iterator = paginator.paginate( 
        PaginationConfig={
            'MaxItems': 10,
            'StartingToken': marker})
    #print(response_iterator)
    for page in response_iterator:
        u = page['Users']
        for user in u:
            print(user['UserName'])
            print(page['IsTruncated'])
            marker = page['Marker']
            print(marker)
        else:
            break
BMW
user45097
  • 1
    In boto3.client.get_paginator, MaxItems seems to act as a threshold/limiter on the data listed; it is not used for pagination. You need to use `PageSize` for pagination – mootmoot Aug 30 '16 at 08:35
  • See also: https://stackoverflow.com/a/59816089/75033 – John Mee Sep 02 '22 at 05:31

5 Answers

30

(Answer rewrite) **NOTE**: the paginator contains a bug, or at least behaves differently from the documentation. With MaxItems set, no Marker or NextToken is returned even when the total number of items exceeds MaxItems. It is PageSize that controls whether the Marker/NextToken indicator is returned.

import boto3
iam = boto3.client("iam")
paginator = iam.get_paginator('list_users')
# PageSize sets how many users each underlying API call returns.
# The paginator follows the Marker between calls by itself, so no
# outer loop or manual Marker handling is needed.
response_iterator = paginator.paginate(
    PaginationConfig={
        'PageSize': 10})
for page in response_iterator:
    # IsTruncated is False on the last page
    print("Next Page : {} ".format(page['IsTruncated']))
    u = page['Users']
    for user in u:
        print(user['UserName'])

It is not your mistake that your code doesn't work. In the paginator, MaxItems seems to act as a "threshold" on the total number of items returned. Ironically, the MaxItems parameter of the original boto3 IAM list_users call still works as documented.

If you check the list_users documentation, you will notice that you must either omit Marker entirely or pass it a real value; None is not accepted. Apparently, the paginator is NOT a plain wrapper around every boto3 list_* method.

import sys
import boto3
iam = boto3.client("iam")
marker = None
while True:
    if marker:
        response_iterator = iam.list_users(
            MaxItems=10,
            Marker=marker
        )
    else:
        response_iterator = iam.list_users(
            MaxItems=10
        )
    print("Next Page : {} ".format(response_iterator['IsTruncated']))
    for user in response_iterator['Users']:
        print(user['UserName'])

    try:
        marker = response_iterator['Marker']
        print(marker)
    except KeyError:
        sys.exit()
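
The manual Marker loop above can also be packaged as a generator that builds the keyword arguments once instead of duplicating the list_users call. This is only a minimal sketch: `iter_iam_users` is a hypothetical helper name, and `client` is assumed to be a `boto3.client("iam")` or any object exposing the same `list_users` signature.

```python
def iter_iam_users(client, page_size=10):
    """Yield every IAM user dict, fetching page_size users per API call.

    iter_iam_users is a hypothetical helper; client is assumed to behave
    like boto3.client("iam").
    """
    # Marker is left out of the first call because None is not accepted.
    kwargs = {"MaxItems": page_size}
    while True:
        response = client.list_users(**kwargs)
        yield from response["Users"]
        # Marker is only present while IsTruncated is True, so stop
        # as soon as the service reports that nothing is left.
        if not response.get("IsTruncated"):
            break
        kwargs["Marker"] = response["Marker"]
```

With a real client this could be used as `for user in iter_iam_users(boto3.client("iam")): print(user["UserName"])`.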

You can follow the issue I filed on the boto3 GitHub. According to a project member, calling build_full_result() after paginate() gives the desired behavior.
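
To come back to the original question (processing users in batches of MaxItems), one option is to let build_full_result() collect each batch and hand back a resume token. The sketch below assumes that the aggregated result contains a `NextToken` key whenever MaxItems cut the listing short, and that this value can be passed back as `StartingToken`. `iter_user_batches` is a hypothetical helper name, and `client` is assumed to be a `boto3.client("iam")`.

```python
def iter_user_batches(client, batch_size=10):
    """Yield lists of at most batch_size IAM user dicts until exhausted.

    iter_user_batches is a hypothetical helper; client is assumed to
    behave like boto3.client("iam").
    """
    paginator = client.get_paginator("list_users")
    token = None  # None means "start from the beginning"
    while True:
        result = paginator.paginate(
            PaginationConfig={"MaxItems": batch_size, "StartingToken": token}
        ).build_full_result()
        yield result.get("Users", [])
        # Assumption: NextToken is present only when more items remain.
        token = result.get("NextToken")
        if token is None:
            break
```

Each yielded list is one batch, so `for batch in iter_user_batches(iam, 10): ...` would process the users ten at a time.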

Or Arbel
mootmoot
  • Thanks for the suggestion above but it doesn't seem to work. I have run this code and only get the "MaxItems" + 1. So if MaxItems is set to 10, I'll get 11 users returned before it stops. – user45097 Aug 29 '16 at 21:18
  • For each page returned by "response_iterator = paginator.paginate" there is a bunch of data included in the dict outside of the 'Users' list. For example: `RequestId, HTTPStatusCode, HTTPHeaders` and a string called `'Marker'`. You can see this by dropping in `print(page)` – user45097 Aug 29 '16 at 21:30
  • @user45097: I just realised the mistake; list_users is slightly different from the S3 list_objects next-token handling. I will update my answer. – mootmoot Aug 30 '16 at 07:40
  • 6
    Example of using build_full_result: `paginator = client.get_paginator('list_users'); users = paginator.paginate().build_full_result()` – Putnik Feb 26 '20 at 12:54
  • 1
    You can deal with the extra arg as follows: `kwargs = {'MaxItems': 10}`, then `if marker: kwargs['Marker'] = marker`, then `response_iterator = iam.list_users(**kwargs)` – MikeW Aug 04 '21 at 14:57
  • This library is bugged to the point of being unusable. If it does not fail in delivering all the elements in a paginated format (so you think you processed all elements when in reality you haven't) it will surely deliver the same elements multiple times, making you process duplicates (in a 21 elements list it actually returned 43, some elements being duplicated 4 times while others actually being unique). Beware of these problems before using it. – Dan Nemes Feb 21 '23 at 08:14
5

This code wasn't working for me. It always drops the remaining items on the last page and doesn't include them in the results: I get 60 accounts when I know I have 68. The last result page never gets appended to my list of account UserNames. I am concerned that the examples above do the same thing and that people aren't noticing it in their results.

Besides, paginating with an arbitrary page size seems overly complex here; what purpose does it serve?

This is simpler and gives you a complete listing.

import boto3
iam = boto3.client("iam")
paginator = iam.get_paginator('list_users')
response_iterator = paginator.paginate()
accounts=[]
for page in response_iterator:
    for user in page['Users']:
        accounts.append(user['UserName'])
print(len(accounts))  # printed 68 for my account
Jeff S
1

This post is pretty old, but given the lack of concise documentation I want to share my code for anyone who is struggling with this.

Here are two simple examples of how I solved it using Boto3's paginator; I hope they help you understand how it works.

Boto3 official pagination documentation: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/paginators.html

AWS API specifying that the first token should be $null (None in Python): https://docs.aws.amazon.com/powershell/latest/reference/items/Get-SSMParametersByPath.html

Examples:

First example with little complexity for people like me who struggled to understand how this works:

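import boto3

ssm = boto3.client('ssm')
paginator = ssm.get_paginator('get_parameters_by_path')

def read_ssm_parameters():
    # First batch: no StartingToken, so the listing starts from the beginning
    page_iterator = paginator.paginate(
        Path='path_to_the_parameters',
        Recursive=True,
        PaginationConfig={
        'MaxItems': 10,
        'PageSize': 10,
        }
    )

    myNextToken = True
    while myNextToken:
        for page in page_iterator:
             print('# This is new page')
             print(page['Parameters'])
             if 'NextToken' in page.keys():
                 print(page['NextToken'])
                 myNextToken=page['NextToken']
             else:
                 # The last page carries no NextToken
                 myNextToken=False

        if myNextToken:
            # Request the next batch, resuming from the last token seen
            page_iterator = paginator.paginate(
                Path='path_to_the_parameters',
                Recursive=True,
                PaginationConfig={
                    'MaxItems': 10,
                    'PageSize': 10,
                    'StartingToken': myNextToken
                }
            )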

Second example with reduced code but without the complexity of using recursion

import boto3

ssm = boto3.client('ssm')
paginator = ssm.get_paginator('get_parameters_by_path')

def read_ssm_parameters(myNextToken=None):
    while True:
        page_iterator = paginator.paginate(
            Path='path_to_the_parameters',
            Recursive=True,
            PaginationConfig={
                'MaxItems': 10,
                'PageSize': 10,
                # None (not the string 'None') starts from the beginning
                'StartingToken': myNextToken
            }
        )

        for page in page_iterator:
            print('# This is a new page')
            # Print every page, including the last one, which has no NextToken
            print(page['Parameters'])
            myNextToken=page.get('NextToken')

        if not myNextToken:
            # Exit if there are no more pages to read
            break

Hope this helps!

Alf
1

The correct answer was stated in one of the comments above. I have run into this issue myself before. I manage 56 AWS accounts and, while looping through all of them, noticed that 'list_users' would only return 100 users. After digging into the issue I found this thread and the related boto3 GitHub issue. Calling build_full_result() returns the complete list of users.

paginator = client.get_paginator('list_users') 
users = paginator.paginate().build_full_result()
-1

I will post my solution here and hopefully help other people do their job faster instead of fiddling around with the amazingly written boto3 api calls.

My use case was to list all the Security Hub ControlIds using the SecurityHub.Client.describe_standards_controls function.


controlsResponse = sh_client.describe_standards_controls(
    StandardsSubscriptionArn=enabledStandardSubscriptionArn)

controls = controlsResponse.get('Controls')

# This is the token for the 101st item in the list.
# It is None when the first response already holds everything.
nextToken = controlsResponse.get('NextToken')

if nextToken:
    # Call describe_standards_controls with the token set at item 101 to get the next 100 results
    controlsResponse1 = sh_client.describe_standards_controls(
        StandardsSubscriptionArn=enabledStandardSubscriptionArn, NextToken=nextToken)

    controls1 = controlsResponse1.get('Controls')

    # And make the two lists into one
    controls.extend(controls1)

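Now you have a list of all the SH standards controls for the specified subscription standard (e.g., AWS Foundational Standard), provided there are at most 200 of them; for longer lists, repeat the NextToken step until the response no longer contains a token.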

For example if I want to get all the ControlIds I can just iterate the 'controls' list and do

controlId=control.get("ControlId")

The same goes for the other fields in the response, as described here
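
The two calls above cover at most two pages of results. A sketch of the same idea generalised to any number of pages follows NextToken until the response no longer contains one. `collect_controls` is a hypothetical helper name, and `sh_client` is assumed to be a `boto3.client("securityhub")`.

```python
def collect_controls(sh_client, subscription_arn):
    """Return every control for one standards subscription.

    collect_controls is a hypothetical helper; sh_client is assumed to
    behave like boto3.client("securityhub").
    """
    controls = []
    # NextToken is added only once the service has actually returned one.
    kwargs = {"StandardsSubscriptionArn": subscription_arn}
    while True:
        response = sh_client.describe_standards_controls(**kwargs)
        controls.extend(response.get("Controls", []))
        next_token = response.get("NextToken")
        if not next_token:
            return controls
        kwargs["NextToken"] = next_token
```

The ControlIds can then be read with `[c.get("ControlId") for c in collect_controls(sh_client, enabledStandardSubscriptionArn)]`.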

furydrive