I originally found this using httpx, but it also applies to requests, which is better known, so I'll use the latter in my examples.
I am making multiple requests in parallel, with each request body coming from a generator that produces byte chunks just in time by reading a file. I set up counters/locks to limit the number of concurrent requests, to avoid accumulating too much data in memory. The responses are accumulated in a list, since I need to pull some data from the headers once all requests are done.
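To make the setup concrete, here is a minimal sketch of that pattern. All the names are mine, and the actual HTTP call is stubbed out (no network), so this only illustrates the concurrency-limiting and accumulation structure, not my real upload code:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

MAX_IN_FLIGHT = 4  # assumed cap on concurrent requests
slots = threading.Semaphore(MAX_IN_FLIGHT)

def chunk_generator(total=10, size=1024):
    """Yield byte chunks just in time instead of materialising the payload."""
    for _ in range(total):
        yield b"\x00" * size

def upload(part_id):
    # Stand-in for requests.put(url, data=chunk_generator());
    # here we just consume the generator and fake a response dict.
    body = b"".join(chunk_generator())
    return {"part": part_id, "size": len(body)}

def bounded_upload(part_id):
    with slots:  # blocks while MAX_IN_FLIGHT uploads are already running
        return upload(part_id)

with ThreadPoolExecutor(max_workers=8) as pool:
    # Responses are accumulated so header data can be read afterwards.
    results = list(pool.map(bounded_upload, range(10)))
```

The semaphore is acquired inside the worker, so at most `MAX_IN_FLIGHT` request bodies should be materialised at once, regardless of the pool size.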
This did not work as expected: memory usage kept increasing as requests were executed. I found that this is because requests keeps a reference to the request body on the response object, which prevents the refcount from reaching 0, so the data is never garbage-collected.
A minimal reproducible example:
import os
import sys
import requests
response = requests.put("https://httpbin.org/put", data=os.urandom(100000))
response.request.body # a bunch of bytes
sys.getrefcount(response.request.body) # I get 2
For httpx, the reference is stored in httpx.Response.request.stream._body.
I resolved my issue by deleting the reference to the request before accumulating the response, but is this expected behaviour? It was pretty hard to track down, and it seems like it could lead to a lot of unintentional memory leakage like I was experiencing.
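For reference, the workaround looks roughly like this. To keep it runnable without hitting the network, I construct the request/response pair locally; the `strip_body` helper name is mine:

```python
import os
import requests

def strip_body(response: requests.Response) -> requests.Response:
    """Drop the reference to the request body so the payload can be
    garbage-collected while the response (and its headers) is kept."""
    if response.request is not None:
        response.request.body = None
    return response

# Simulate a completed request/response pair without any network I/O.
prepared = requests.Request(
    "PUT", "https://example.com/upload", data=os.urandom(100_000)
).prepare()
response = requests.Response()
response.request = prepared
response.status_code = 200

responses = []
responses.append(strip_body(response))  # body reference dropped before accumulating
```

Setting `response.request.body = None` (or `del response.request` entirely, if you don't need the request at all) releases the last reference to the payload once your own variables go out of scope.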