
I have a complicated Python server app that runs constantly. Below is a very simplified version of it.

When I run the app below with CPython ("python Main.py"), it uses 8mb of RAM straight away and stays at 8mb, as it should.

When I run it with PyPy ("pypy Main.py"), it starts out using 22mb of RAM and the usage grows over time. After 30 seconds it's at 50mb; after an hour it's at 60mb.

If I change "b.something()" to "pass", it doesn't gobble up memory like that.

I'm using PyPy 1.9 on OS X 10.7.4. I'm okay with PyPy using more RAM than CPython.

Is there a way to stop PyPy from eating up memory over long periods of time?

import sys
import time
import traceback

class Box(object):
    def __init__(self):
        self.counter = 0
    def something(self):
        self.counter += 1
        if self.counter > 100:
            self.counter = 0

try:
    print 'starting...'
    boxes = []
    for i in range(10000):
        boxes.append(Box())
    print 'running!'
    while True:
        for b in boxes:
            b.something()
        time.sleep(0.02)

except KeyboardInterrupt:
    print ''
    print '####################################'
    print 'KeyboardInterrupt Exception'
    sys.exit(1)

except Exception as e:
    print ''
    print '####################################'
    print 'Main Level Exception: %s' % e
    print traceback.format_exc()
    sys.exit(1)

Below is a list of times and the RAM usage at each (I left it running overnight).

Wed Sep  5 22:57:54 2012, 22mb ram 
Wed Sep  5 22:57:54 2012, 23mb ram 
Wed Sep  5 22:57:56 2012, 24mb ram 
Wed Sep  5 22:57:56 2012, 25mb ram 
Wed Sep  5 22:57:58 2012, 26mb ram 
Wed Sep  5 22:57:58 2012, 27mb ram 
Wed Sep  5 22:57:59 2012, 29mb ram 
Wed Sep  5 22:57:59 2012, 30mb ram 
Wed Sep  5 22:58:00 2012, 31mb ram 
Wed Sep  5 22:58:02 2012, 32mb ram 
Wed Sep  5 22:58:03 2012, 33mb ram 
Wed Sep  5 22:58:05 2012, 34mb ram 
Wed Sep  5 22:58:08 2012, 35mb ram 
Wed Sep  5 22:58:10 2012, 36mb ram 
Wed Sep  5 22:58:12 2012, 38mb ram 
Wed Sep  5 22:58:13 2012, 39mb ram 
Wed Sep  5 22:58:16 2012, 40mb ram 
Wed Sep  5 22:58:19 2012, 41mb ram 
Wed Sep  5 22:58:21 2012, 42mb ram 
Wed Sep  5 22:58:23 2012, 43mb ram 
Wed Sep  5 22:58:26 2012, 44mb ram 
Wed Sep  5 22:58:28 2012, 45mb ram 
Wed Sep  5 22:58:31 2012, 46mb ram 
Wed Sep  5 22:58:33 2012, 47mb ram 
Wed Sep  5 22:58:35 2012, 49mb ram 
Wed Sep  5 22:58:35 2012, 50mb ram 
Wed Sep  5 22:58:36 2012, 51mb ram 
Wed Sep  5 22:58:36 2012, 52mb ram 
Wed Sep  5 22:58:37 2012, 54mb ram 
Wed Sep  5 22:59:41 2012, 55mb ram 
Wed Sep  5 22:59:45 2012, 56mb ram 
Wed Sep  5 22:59:45 2012, 57mb ram 
Wed Sep  5 23:00:58 2012, 58mb ram 
Wed Sep  5 23:02:20 2012, 59mb ram 
Wed Sep  5 23:02:20 2012, 60mb ram 
Wed Sep  5 23:02:27 2012, 61mb ram 
Thu Sep  6 00:18:00 2012, 62mb ram 
DavidColquhoun
    Hmm. I can't reproduce this. With pypy 1.9 (from Macports) and OS X 10.6.8, I see the memory usage (as reported by 'top', from the RSIZE column) stay at around 46M. It may be worth a bug report. – Mark Dickinson Sep 05 '12 at 20:11
  • I've still got that process running, new data point for it: Thu Sep 6 09:02:26 2012, 63mb ram – DavidColquhoun Sep 05 '12 at 23:14
  • 2
    I can reproduce this using pypy 1.8, but 1.9 seems to have corrected this problem – loopbackbee Sep 10 '12 at 02:05

2 Answers


http://doc.pypy.org/en/latest/gc_info.html#minimark-environment-variables shows how to tweak the GC.
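
Concretely, those variables are set in the environment when launching the app. The values below are illustrative, not recommendations; tune them to your workload (the page linked above documents each variable):

```shell
# PYPY_GC_MIN: don't run a major collection while the heap is below this size.
# PYPY_GC_MAX: upper bound on the heap; PyPy collects more aggressively near it.
PYPY_GC_MIN=200MB PYPY_GC_MAX=1GB pypy Main.py
```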

  • Oops, commented on the wrong reply, moved my comment to the above answer. – DavidColquhoun Oct 06 '12 at 23:03
  • 2
Four months later, I've realized that what Ronny linked to is an even better solution. Setting PYPY_GC_MIN=1GB and PYPY_GC_MAX=3GB works much better, keeping the RAM usage between 1 and 3 GB. And I found that the gc.collect() calls were taking around 50ms, slowing my application down too much. So yeah, those environment variables are a much better way to go. :) – DavidColquhoun Feb 22 '13 at 00:59
  • @DavidColquhoun that's `PYPY_GC_MIN=1GB` and `PYPY_GC_MAX=3GB`, not `PYPY_GC_MIN="1GB"` and `PYPY_GC_MAX="3GB"`, correct? – noɥʇʎԀʎzɐɹƆ Aug 06 '16 at 17:50
both should work the same; quoting is needed only when a value contains separators or whitespace that the shell would interpret, so foo and "foo" tend to be the same –  Aug 31 '16 at 21:23
I set this up on my side, but it's not working. What's the problem? – Cherry May 27 '20 at 15:41

Compared to CPython, PyPy uses different garbage-collection strategies. If the increase in memory is due to something in your program, you could try running a forced garbage collection every now and then, using the collect function from the gc module. In that case, it may also help to explicitly del large objects that you no longer need and that don't go out of scope.

If it is due to the internal workings of PyPy, it might be worth submitting a bug report, as Mark Dickinson suggested.

Roland Smith
  • Thank you! gc.collect() was what I needed. I run it every few seconds and it keeps the memory usage flat... check out the graph here to see the difference that one command made: http://datasmugglers.com/2012/10/06/server-ram-usage/ – DavidColquhoun Oct 06 '12 at 23:05