I have a CSV file (containing only numeric data) of size 18 MB. When I read it, convert it to a NumPy array, and save it in HDF5 format or as a pickle, it takes around 48 MB of disk space. Shouldn't the data be compressed when we use pickle or HDF5? Is it better to save the data in HDF5 format if it is going to be consumed by TensorFlow? The CSV data is of the form:
2,3,66,184,2037,43312,0,0,9,2,0,1,8745,1,0,2,6,204,27,97
2,3,66,184,2037,43312,0,0,9,2,0,1,8745,1,0,2,6,204,27,78
2,3,66,184,2037,43312,0,0,9,2,0,1,8745,1,0,1,6,204,27,58
The dimensions of the data are 310584 x 20 (rows x columns).
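
For reference, here is a rough back-of-the-envelope estimate of the raw array size for these dimensions. It is only a sketch: it assumes each value is stored in 8 bytes (for example int64 or float64), which I have not verified for my array, and `raw_size_bytes` is just a helper name made up for this illustration.

```python
# Back-of-the-envelope size estimate; no data is loaded.
ROWS, COLS = 310584, 20  # dimensions stated in the question


def raw_size_bytes(rows, cols, bytes_per_value):
    """Uncompressed size of a dense rows x cols array."""
    return rows * cols * bytes_per_value


# Assumption: values are stored as 64-bit numbers (8 bytes each).
size_64 = raw_size_bytes(ROWS, COLS, 8)
# For comparison: the largest value in the sample rows (43312)
# would also fit in a 32-bit integer (4 bytes each).
size_32 = raw_size_bytes(ROWS, COLS, 4)

print(size_64, round(size_64 / 2**20, 1))  # 49693440 bytes, 47.4 MiB
print(size_32, round(size_32 / 2**20, 1))  # 24846720 bytes, 23.7 MiB
```

Under the 8-byte assumption, the uncompressed array alone comes to roughly the 48 MB I see on disk, so the saved file looks like it holds the raw values without any compression.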