2

My Python application reads data from files at startup and stores it in dictionaries (the dictionaries are attributes of data reader classes). Once the application has started and the data has been used, the dictionaries are no longer needed, but they still consume a large amount of memory. How do I delete these dictionaries to free the memory?

For example:

class DataReader:
    def __init__(self, data_file):
        self.data_file = data_file

    def read_data_file_and_store_data_in_dictionary(self):
        # Build the dictionary from (name, data) pairs in the file
        self.data_dictionary = {}
        for data_name, data in self.data_file:
            self.data_dictionary[data_name] = data

class Application:
    def __init__(self, data_file):
        self.data_reader = DataReader(data_file)
        self.data_reader.read_data_file_and_store_data_in_dictionary()

    def start_app(self):
        # use_read_data() is defined elsewhere in the application
        self.use_read_data()

After the application has started, self.data_dictionary is no longer needed. How do I delete self.data_dictionary permanently?

alwbtc

4 Answers

4

Use the del statement

del self.data_dictionary  # or del obj.data_dictionary

Note this will only delete this reference to the dictionary. If any other references still exist for the dictionary (say if you had done d = data_reader.data_dictionary and d still references data_dictionary) then the dictionary will not be freed from memory. This also includes any references to d.keys(), d.values(), d.items().

Only when all references have been removed will the dictionary finally be released.
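As a minimal sketch of the point above, the following tracks a dictionary with a weak reference and shows that it survives until the last strong reference is gone. TrackedDict and the variable names are hypothetical, introduced only because plain dict objects cannot be weakly referenced; the immediate release after the last del relies on CPython's reference counting.

```python
import weakref


class TrackedDict(dict):
    """Hypothetical dict subclass: plain dicts do not support weak references."""


class DataReader:
    def __init__(self):
        self.data_dictionary = TrackedDict(a=1, b=2)


reader = DataReader()
probe = weakref.ref(reader.data_dictionary)  # does not keep the dict alive
extra = reader.data_dictionary               # a second strong reference

del reader.data_dictionary
# The dict is still reachable through `extra`, so it has not been freed.
alive_after_first_del = probe() is not None

del extra
# No strong references remain; CPython releases the dict right away.
freed_after_last_del = probe() is None
```

Here alive_after_first_del and freed_after_last_del should both be True on CPython.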

FHTMitchell
  • Will memory be freed if I delete `self.data_reader`? Will it delete everything (dictionary, keys, values etc)? – alwbtc Aug 24 '18 at 15:07
  • yes assuming you don't hold references to any of those – FHTMitchell Aug 24 '18 at 15:12
  • Is there a way to know the all references to an object in a python project? How do advanced python programmers handle it? – alwbtc Aug 24 '18 at 15:14
  • They don't. The best way is to never hold a reference to something that is too big to handle. They use things like lazy iterators instead. In fact that's exactly what your `self.data_file` object is -- the built-in objects are already made in a way to reduce memory. – FHTMitchell Aug 24 '18 at 15:16
  • last question: after using `del`, is `gc.collect()` necessary to free the memory? – alwbtc Aug 24 '18 at 15:19
  • Shouldn't be. The python interpreter is clever enough to handle it itself. – FHTMitchell Aug 24 '18 at 15:31
3

In Python, you should not normally have to care about memory management.

Python has an excellent garbage collector, which keeps a count of the references to each object.
If there are no references left, the object is deallocated.
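The counting described here can be observed with sys.getrefcount, at least on CPython. This is only a rough sketch: the absolute numbers are implementation details (the call itself adds a temporary reference), so the example compares counts relative to a baseline. The variable names are made up for illustration.

```python
import sys

data = {"name": "value"}

baseline = sys.getrefcount(data)      # absolute value is an implementation detail

alias = data                          # bind a second name to the same dict
count_with_alias = sys.getrefcount(data)

del alias                             # remove the extra name again
count_after_del = sys.getrefcount(data)
```

On CPython, count_with_alias is expected to be baseline + 1, and count_after_del drops back to baseline.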

In your case, if the memory is not freed after you're done using the data, it means that the object is still reachable from somewhere in your program. If you delete it and then try to use it, you will get a NameError (or an AttributeError for a deleted attribute).

Other answers suggest using del, but it only removes the name (the reference), not the object itself.

My suggestion is to ensure that your code does not actually use the object anymore, and if it does, handle your data accordingly (use a lightweight database, save it to the local hard drive, ...) and retrieve it when needed. If your big dictionaries are attributes of a class that is still in use but no longer needs the dicts, you should move those dicts out of the class (maybe into a new class that only manages the dicts). In this Q&A you will find useful tips for optimizing memory usage.
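One possible way to keep the data on disk and fetch entries only when needed is the standard library's shelve module. This is a sketch under assumptions: the helper names spill_to_disk and lookup, the file name, and the sample data are all hypothetical, and shelve requires string keys.

```python
import os
import shelve
import tempfile


def spill_to_disk(data, path):
    """Write every entry of an in-memory dict to a shelf file (hypothetical helper)."""
    with shelve.open(path) as db:
        db.update(data)


def lookup(path, key):
    """Load a single entry back from disk on demand (hypothetical helper)."""
    with shelve.open(path, flag="r") as db:
        return db[key]


with tempfile.TemporaryDirectory() as tmp_dir:
    store_path = os.path.join(tmp_dir, "reader_data")

    big_dictionary = {"alpha": [1, 2, 3], "beta": [4, 5, 6]}
    spill_to_disk(big_dictionary, store_path)
    del big_dictionary  # drop the in-memory copy; the data now lives on disk

    value = lookup(store_path, "beta")
```

Only the requested entry is unpickled, so the whole dictionary never has to sit in memory again.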

You can read this article for a really interesting deep dive into Python's garbage collector.

Gsk
1

How about having the data in a smaller scope?

class Application():
    def __init__(self, data_file):
        self.use_read_data(DataReader(data_file).read())

After application is started, self.data_dictionary is no longer needed

If you do not need the data for the whole lifetime of the application then you shouldn't be storing it in an instance attribute.

Choose the right scope and you won't need to care about deleting variables.
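A small sketch of this scoping idea follows. The read() method, the use_read_data() body, and the live_readers set are assumptions added for illustration; the weak set exists only to observe whether any DataReader instance is still alive after the Application has been built.

```python
import weakref


class DataReader:
    # Hypothetical bookkeeping: weakly tracks instances without keeping them alive.
    live_readers = weakref.WeakSet()

    def __init__(self, data_file):
        self.data_file = data_file
        DataReader.live_readers.add(self)

    def read(self):
        # Return the dictionary instead of storing it on the instance.
        return {name: data for name, data in self.data_file}


class Application:
    def __init__(self, data_file):
        # The reader and its dictionary exist only for the duration of this call.
        self.use_read_data(DataReader(data_file).read())

    def use_read_data(self, data):
        # Keep only what is needed later (assumed here: the number of entries).
        self.item_count = len(data)


app = Application([("a", 1), ("b", 2)])
readers_still_alive = len(DataReader.live_readers)
```

On CPython, readers_still_alive is expected to be 0 once the constructor returns, because nothing holds a reference to the temporary DataReader or to the dictionary it produced.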

Stop harming Monica
  • With this method, won't `DataReader` continue living as long as `Application` lives? – alwbtc Aug 24 '18 at 15:08
  • @alwbtc Objects without references are ready to be garbage collected. What happens then depends on the implementation. CPython will deallocate it immediately. – Stop harming Monica Aug 24 '18 at 16:31
0

del deletes the reference to the object, but the object may still be in memory (for example, if it is part of a reference cycle). In that case, running the garbage collector with gc.collect([generation]) will free the memory:

https://docs.python.org/2.7/library/gc.html

import gc

[...]
# Delete the reference ("obj" is a placeholder for your own variable)
del obj
# Run the garbage collector explicitly
gc.collect()
[...]
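The case where an explicit gc.collect() actually makes a difference is a reference cycle. The sketch below illustrates this on CPython; the Node class and its attributes are hypothetical, and automatic collection is switched off only so that the demonstration is deterministic.

```python
import gc
import weakref


class Node:
    """Hypothetical object that holds some data and points at a partner."""

    def __init__(self):
        self.data = {"payload": list(range(1000))}
        self.partner = None


gc.disable()  # keep the automatic cycle collector out of the demonstration

a, b = Node(), Node()
a.partner, b.partner = b, a   # a and b now reference each other (a cycle)
probe = weakref.ref(a)

del a, b
# The names are gone, but each object is still referenced by the other one.
alive_before_collect = probe() is not None

gc.collect()
# The cycle collector finds the unreachable pair and frees it.
freed_after_collect = probe() is None

gc.enable()
```

Without a cycle, reference counting alone frees the object as soon as the last reference disappears, so the explicit call adds nothing in that case.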
Jose
  • this won't work: `obj1 = dict; obj2 = obj1; del obj1; gc.collect()`. `obj2` is still there, referencing the same object. Interesting suggestion, by the way, to force the garbage collector to collect in a specific moment – Gsk Aug 21 '18 at 11:29