I have a rather large list: 19 million items in memory that I am trying to save to disk (Windows 10 x64 with plenty of space).
    import pickle

    pickle.dump(my_list, open('list.p', 'wb'))
Background: the original data was read in from a CSV file (2 columns, the same 19 million rows) and converted into a list of tuples.
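In case it's relevant, the read-in step looked roughly like this (simplified; the file name and the lack of any per-column conversion here are placeholders, not the exact code):

    import csv

    # build one 2-tuple of strings per CSV row, ~19 million in total
    with open('data.csv', newline='') as f:
        my_list = [tuple(row) for row in csv.reader(f)]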
The original CSV file was 740 MB. The file "list.p" shows up in my directory at 2.5 GB, but the Python process never seems to finish (I was debugging and stepping through line by line), and memory utilization at last check was 19 GB and still climbing.
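Part of the in-memory size makes sense to me: each row is a separate tuple object plus two separate string objects, each with its own per-object overhead, so 19 million rows add up quickly. A rough way to see that per-row cost (CPython-specific; note that sys.getsizeof counts only the container itself, not the objects it references):

    import sys

    row = ('0001', 'example value')   # stand-in for one of the 19 million rows
    per_row = sys.getsizeof(row) + sum(sys.getsizeof(x) for x in row)
    print(per_row, 'bytes per row, before any pickling overhead')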
Can anyone shed some light on what pickle is doing here, and why it needs so much memory and produces a file so much larger than the original CSV?
PS: I understand that pickle.HIGHEST_PROTOCOL is now protocol version 4, which was added in Python 3.4 and adds support for very large objects.
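In other words, I could pass the protocol explicitly when dumping, something along these lines (my_list as above; the small placeholder list is just to keep the snippet self-contained):

    import pickle

    my_list = [('0001', 'example value')] * 3   # placeholder; the real list has ~19 million tuples

    # dump with the highest protocol available instead of the library default
    with open('list.p', 'wb') as f:
        pickle.dump(my_list, f, protocol=pickle.HIGHEST_PROTOCOL)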