
I am trying to load a 4 GB CSV file into a DataFrame and I get a MemoryError. I tried the chunksize solution but the problem persists. I have also read that it could be related to the Python build, but I am running Anaconda Python 3.6, 64-bit (at least that is what it is supposed to be).

Here is my function code:

import os
import pandas as pd

def dataFrameCreationCsv(path):
    print('Loading Csv file...')
    if os.path.isfile(path):
        # dataFrameCsv = pd.read_csv(path, encoding='latin1', delimiter='|')  # MEMORY ISSUES ON LARGE SCALE DATASET
        # gives a TextFileReader, which is iterable with chunks of 1000 rows
        tp = pd.read_csv(path, encoding='latin1', delimiter='|', iterator=True, chunksize=1000)
        dfCsv = pd.concat(tp, ignore_index=True)
        return dfCsv
    else:
        print('wrong path used')
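
For reference, my understanding of the chunksize approach is that the chunks can also be processed one at a time instead of being concatenated back into a single DataFrame; a minimal sketch of that pattern (the per-chunk work here is only a placeholder):

    import pandas as pd

    def processCsvInChunks(path):
        # Only one 1000-row chunk is held in memory at a time;
        # the work done on each chunk is just an illustration.
        total_rows = 0
        for chunk in pd.read_csv(path, encoding='latin1', delimiter='|', chunksize=1000):
            total_rows += len(chunk)  # placeholder per-chunk processing
        return total_rows

Even so, I need the full DataFrame in the end, which is why my function concatenates the chunks back together.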