Recently, I ran into an issue when retrieving data from Notion through its API. The database holds more than 300 values, and Notion returns at most 100 elements per request. I found a way to fetch more than one page by using the cursors in the JSON response. This works for the first three pages, but once the total goes past three pages (300 elements), I get something like the following.

{'errorId': '94d72e18-3501-4976-8c61-1c66177045d3',
 'name': 'PayloadTooLargeError',
 'message': 'Request body too large.'}

    

The error above is the value of data_hidden at the end of the loop in the following code.

    def next_page(data):
        readUrl = f"https://api.notion.com/v1/databases/{databaseId}/query"
        next_cur = data['next_cursor']
        try:
            while data['has_more']:
                data['start_cursor'] = next_cur
                data_hidden = json.dumps(data)
                # Gets the next 100 results
                data_hidden = requests.post(
                    readUrl, headers=headers, data=data_hidden).json()
                
                data["results"] += data_hidden["results"]
                next_cur = data_hidden['next_cursor']
                if next_cur is None:
                    break
        except:
            pass
        return data

I'm not sure why this happens at the 300 mark. After a lot of trial and error, I am almost convinced that Notion won't let you get more than 300 values from a database. Hopefully I am wrong. If you have any ideas or a possible solution, please let me know, and if I've left out any information, please comment below.

Andy Lee

1 Answer

You are sending all of data as the request body (data_hidden) on every request, including every result found so far: data["results"] accumulates each new page, and data_hidden is just data serialized with json.dumps. The body therefore grows by about 100 results per iteration until Notion rejects it as too large. Instead, keep the request body (any filter or sort) unchanged between requests and accumulate the results in a separate variable. If you are not filtering or sorting, you can simply send {"start_cursor": next_cur} as the data parameter for every request after the first.
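As a rough sketch of that idea, the loop below keeps the request body small and collects results in a separate list. The names fetch_all_results and post_query are made up for this example: post_query is assumed to be any callable that takes the request-body dict and returns the decoded JSON response, so the loop can be tried without hitting the API.

```python
def fetch_all_results(post_query, base_payload=None):
    """Collect the results of every page of a paginated query.

    post_query (hypothetical helper): takes the request-body dict and
    returns the decoded JSON response as a dict.
    base_payload: optional filter/sort settings that stay the same
    for every request.
    """
    payload = dict(base_payload or {})  # copy so the caller's dict is untouched
    results = []                        # accumulated separately, never sent back

    while True:
        page = post_query(payload)
        if "results" not in page:
            # Error responses (like PayloadTooLargeError) have no "results" key.
            raise RuntimeError(f"Query failed: {page}")
        results.extend(page["results"])
        if not page.get("has_more"):
            break
        # Only the cursor changes between requests.
        payload["start_cursor"] = page["next_cursor"]

    return results


# Possible wiring with the question's readUrl and headers (untested here):
#
#   import requests
#
#   def post_query(payload):
#       return requests.post(readUrl, headers=headers, json=payload).json()
#
#   all_rows = fetch_all_results(post_query)
```

Raising on a response without "results" also replaces the bare except, which would otherwise hide an error like the one in the question.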

Simon