I exported a WhatsApp chat to a .txt file. I need to create four columns (Date, Time, Name, and Message) in my output file.
import pandas as pd
# read file by lines
with open(r'D:\Analysis\example_chat_whatsapp.txt', encoding="utf-8") as f:
    data = f.readlines()
# sanity stats
print('num lines: %s' %(len(data)))
# parse text and create list of lists structure
# remove first whatsapp info message
dataset = data[1:]
cleaned_data = []
for line in dataset:
    # grab the info and cut it out
    date = line.split(" ")[0]
    line2 = line[len(date):]
    time = line2.split(" ")[0][:2]
    line3 = line2[len(time):]
    name = line3.split(":")[0][:4]
    line4 = line3[len(name):]
    message = line4[6:-1]  # strip newline character
    #print(date, time, name, message)
    cleaned_data.append([date, time, name, message])
# Create the DataFrame
df = pd.DataFrame(cleaned_data, columns=['Date', 'Time', 'Name', 'Message'])
df
The problem is that the Time column comes out empty and the Name column contains the wrong values. Date and Message match the expected output.
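For reference, the empty Time follows directly from the slicing: `line[len(date):]` still begins with the space that separated the date from the rest of the line, so `line2.split(" ")[0]` is an empty string. Because `time` is empty, `line3` equals `line2`, and Name is then cut from the wrong position. Below is a minimal sketch of one alternative, a regular expression that captures all four fields at once. The line layout (`12/31/19, 10:15 - John: Hello`), the sample lines, and the names `LINE_RE` and `parse_line` are assumptions for illustration only; the real export may use a different date or time format, in which case the pattern would need adjusting.

```python
import re

# Hypothetical sample lines; the real export layout may differ.
sample_lines = [
    "12/31/19, 10:15 - John: Hello\n",
    "12/31/19, 10:16 - Mary Ann: Hi, how are you?\n",
]

# Why Time was empty: the remainder after the date starts with a space,
# so the first element of split(" ") is the empty string.
first = sample_lines[0]
date_part = first.split(" ")[0]
remainder = first[len(date_part):]
empty_time = remainder.split(" ")[0]

# Assumed layout: "<date>, <time> - <name>: <message>"
LINE_RE = re.compile(
    r"^(\d{1,2}/\d{1,2}/\d{2,4}), (\d{1,2}:\d{2}) - ([^:]+): (.*)$"
)


def parse_line(line):
    """Return [date, time, name, message], or None if the line does not match."""
    match = LINE_RE.match(line.rstrip("\n"))
    return list(match.groups()) if match else None


# Keep only the lines that match the assumed layout.
rows = [parsed for parsed in map(parse_line, sample_lines) if parsed is not None]
for row in rows:
    print(row)
```

The resulting `rows` list has the same shape as `cleaned_data`, so it can be passed to `pd.DataFrame(rows, columns=['Date', 'Time', 'Name', 'Message'])` in the same way. Lines that do not match the assumed layout (system notices, or the continuation lines of a multi-line message) are skipped in this sketch rather than merged into the previous message.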