
I have a loop (like the one below) that leaks memory:

import gc
import logging

from google.appengine.api.runtime import memory_usage
from google.appengine.ext import deferred, ndb

api = api()  # class handling API requests; not the source of the leak
page_token = ""
page_size = 20
ndb.get_context().set_cache_policy(False)  # disable the in-context cache
while True:
  logging.debug(memory_usage().current())  # current memory usage in MB
  api_list = api.list(pageToken=page_token)
  ele = None
  for ele in api_list["eles"]:
    query = Ndb_element.query(Ndb_element.atribute == ele["atribute"]).order(Ndb_element.key)
    ndb_ele = None
    ndb_eles, curs, more = query.fetch_page(page_size=page_size)
    # Merge every page of matching entities into a single ndb_ele.
    while True:
      if len(ndb_eles) > 0:
        if ndb_ele:
          ndb_ele = merge_objs(ndb_ele, ndb_eles)
        else:
          ndb_ele = merge_objs(ndb_eles[0], ndb_eles[1:])
      if more:
        ndb_eles, curs, more = query.fetch_page(page_size=page_size, start_cursor=curs)
      else:
        break
    if ndb_ele and ndb_ele.same_as(ele):
      deferred.defer(do_stuff, ele, ndb_ele)
  if "nextPageToken" in api_list:
    page_token = api_list["nextPageToken"]
  else:
    break
  ndb.get_context().clear_cache()
  gc.collect()

I have not been able to find the leak. The logged memory usage (in MB) grows like this:

44.640625
52.546875
56.8203125
60.10546875
63.62890625
68.30078125
72.45703125
75.6640625
....
136.19921875
139.0546875
141.56640625
145.0703125
147.2265625
148.21484375
150.4609375

[!!!] Exceeded soft private memory limit of 128 MB with 153 MB after servicing 0 requests total

I fetch the entities in pages and have checked the size of the data structures. The amount of memory leaked grows with the page size.
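One way to check which objects accumulate between iterations is to compare counts of live objects by type. The following is only a minimal sketch: `count_objects_by_type`, `growth_since`, and `Record` are hypothetical names made up for illustration, and `gc.get_objects()` only sees objects tracked by the garbage collector, so it will not account for every byte.

```python
import gc
from collections import Counter


def count_objects_by_type():
    """Return a Counter mapping type name -> number of live, GC-tracked objects."""
    return Counter(type(obj).__name__ for obj in gc.get_objects())


def growth_since(before, top=10):
    """Return the types whose live-object count grew since `before`, largest first."""
    gc.collect()  # free unreachable cycles first so only live objects are counted
    after = count_objects_by_type()
    diff = after - before  # Counter subtraction keeps only positive differences
    return diff.most_common(top)


class Record(object):
    """Hypothetical stand-in for a fetched entity."""

    def __init__(self, value):
        self.value = value


if __name__ == "__main__":
    baseline = count_objects_by_type()
    retained = [Record(i) for i in range(1000)]  # simulate objects that stay referenced
    for name, extra in growth_since(baseline, top=5):
        print(name, extra)
```

In the loop above, taking a baseline before the first page and calling `growth_since(baseline)` after each page would show whether a particular type keeps growing, assuming the retained objects are tracked by the collector.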

Is there some core Python concept I've missed?

Niklas Ternvall
  • Is this really a leak? That loop runs in a single request, so more memory will be used on each iteration of the list. Where is the opportunity for things to fall out of scope and be garbage collected within this single request? Have you tried calling gc.collect on each iteration of the loop? The memory leaks reported with ndb typically involve an instance leaking over multiple requests. – Tim Hoffman Jan 22 '16 at 23:23
  • Also, what is going on in `merge_objs`? There is really too much code here to eyeball for issues. – Tim Hoffman Jan 22 '16 at 23:25
  • Make sure to turn off the in-context cache, which stores every single entity for each request: ndb.get_context().set_cache_policy(False) – Patrick Costello Jan 23 '16 at 17:48
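The point about scope and `gc.collect` raised in the comments can be illustrated with a small standalone sketch. It assumes CPython's reference-counting behaviour, and `Node` and `make_cycle` are hypothetical names used only for this example. It shows that objects in an unreachable reference cycle survive until the collector runs, while an object that is still referenced is never freed, no matter how often `gc.collect()` is called.

```python
import gc
import weakref


class Node(object):
    """Hypothetical object that can take part in a reference cycle."""

    def __init__(self):
        self.other = None


def make_cycle():
    """Create two nodes that reference each other; return a weak reference to one."""
    a, b = Node(), Node()
    a.other, b.other = b, a
    return weakref.ref(a)


gc.disable()  # turn off automatic collection so the effect is deterministic here
try:
    probe = make_cycle()
    alive_before = probe() is not None  # the cycle keeps both nodes alive
    gc.collect()
    alive_after = probe() is not None  # the unreachable cycle has been freed

    kept = Node()
    kept_probe = weakref.ref(kept)
    gc.collect()
    kept_alive = kept_probe() is not None  # still referenced by `kept`, so not freed
finally:
    gc.enable()

print(alive_before, alive_after, kept_alive)
```

If memory keeps growing even with `gc.collect()` on every iteration, this suggests something still holds references to the objects (the second case), rather than cycles waiting to be collected.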
