I'm trying to create a word-frequency dictionary file for a large CSV file that is divided into chunks for processing, but the dictionary is only being built for one chunk, and when I try to append it, an empty DataFrame gets passed to the new df. This is the code I used:
wdata = pd.read_csv(fileinput, nrows=0).columns[0]
skip = int(wdata.count(' ') == 0)
dic = pd.DataFrame()
for chunk in pd.read_csv(fileinput, names=['sentences'], skiprows=skip, chunksize=1000):
    dic_tmp = (chunk['sentences'].str.split(expand=True).stack().value_counts().rename_axis('word').reset_index(name='freq'))
    dic.append(dic_tmp)
dic.to_csv('newwww.csv', index=False)
If I save dic_tmp, it holds the dictionary for just one chunk, not the whole set, and dic takes a lot of time to process but comes back as an empty DataFrame at the end. Is there an error in my code?
The input CSV looks like:
The output CSV I get looks like:
The expected output should be:
So it isn't adding the chunks together; it just pastes in the newest chunk regardless of what is in the previous chunks or the CSV.
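To make the expected behavior concrete, here is a minimal sketch of what I'm after, using two toy in-memory DataFrames (chunk1 and chunk2 are hypothetical stand-ins for two chunks of the real CSV): per-chunk counts should be combined, with frequencies summed for words that appear in more than one chunk.

```python
import pandas as pd

# Hypothetical stand-ins for two chunks read from the real CSV
chunk1 = pd.DataFrame({'sentences': ['hello world', 'hello there']})
chunk2 = pd.DataFrame({'sentences': ['world again']})

parts = []
for chunk in (chunk1, chunk2):
    # Same per-chunk counting step as in my code above
    counts = (chunk['sentences'].str.split(expand=True)
              .stack().value_counts()
              .rename_axis('word').reset_index(name='freq'))
    parts.append(counts)

# Combine the per-chunk counts and sum frequencies per word,
# so a word seen in several chunks ends up with one total row
dic = pd.concat(parts).groupby('word', as_index=False)['freq'].sum()
print(dic)
```

With the toy data, 'hello' and 'world' each end up with a combined frequency of 2, which is the kind of merged result I expect from the full file.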