I'm working in a memory-constrained environment and use a Python script with the tarfile library (http://docs.python.org/2/library/tarfile.html) to continuously make backups of log files.
As the number of log files has grown (~74,000), I noticed that the system now effectively kills this backup process when it runs. It consumes an awful lot of memory (~192 MB before it gets killed by the OS).
I can make a gzip tar archive ($ tar -czf) of the log files without a problem or high memory usage.
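For reference, the GNU tar invocation I'm comparing against is roughly the following (the archive name is just for illustration, matching the Python snippet below):
$ tar -czf asdf.tar.gz asdf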
Code:
import tarfile
t = tarfile.open('asdf.tar.gz', 'w:gz')
t.add('asdf')
t.close()
The dir "asdf" consists of 74407 files with filenames of length 73. Is it not recommended to use Python's tarfile when you have a huge amount of files ?
I'm running Ubuntu 12.04.3 LTS and Python 2.7.3 (tarfile version seems to be "$Revision: 85213 $").
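In case it matters, here is a minimal sketch of a workaround I'm considering: add the files one at a time and clear the list of TarInfo objects that TarFile keeps internally (the undocumented members attribute), on the guess that this list is where the memory goes. I haven't verified that assumption.
import os
import tarfile

t = tarfile.open('asdf.tar.gz', 'w:gz')
t.add('asdf', recursive=False)           # add the directory entry itself, but not its contents
for name in sorted(os.listdir('asdf')):
    t.add(os.path.join('asdf', name))    # add each log file individually
    # Guess: drop the TarInfo objects TarFile keeps for every added member.
    # 'members' is an undocumented attribute, and getmembers() is never called here.
    del t.members[:]
t.close()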