I have a file of WhatsApp messages that I want to save in CSV format. The file looks like this:
[04/02/2018, 20:56:55] Name1: Messages to this chat and calls are now secured with end-to-end encryption.
[04/02/2018, 20:56:55] Name1: Content1.
More content.
[04/02/2018, 23:24:44] Name2: Content2.
I want to parse the messages into date, sender, and text columns. My code:
with open('chat.txt', "r") as infile, open("Output.txt", "w") as outfile:
    for line in infile:
        date = datetime.strptime(
            re.search('(?<=\[)[^]]+(?=\])', line).group(),
            '%d/%m/%Y, %H:%M:%S')
        sender = re.search('(?<=\] )[^]]+(?=\:)', line).group()
        text = line.rsplit(']', 1)[-1].rsplit(': ', 1)[-1]
        new_line = str(date) + ',' + sender + ',' + text
        outfile.write(new_line)
I have trouble handling multi-line messages. (I sometimes inserted a line break in a message; in that case the next line contains only text and is supposed to be part of the previous message.) I'm also open to a more Pythonic way of parsing the date, sender, and text. My code parses the date, sender, and text correctly for lines that start a message, but it raises an error because not every line contains all three parts:
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-33-efbcb430243d> in <module>()
      3     for line in infile:
      4         date = datetime.strptime(
----> 5             re.search('(?<=\[)[^]]+(?=\])', line).group(),
      6             '%d/%m/%Y, %H:%M:%S')
      7         sender = re.search('(?<=\] )[^]]+(?=\:)', line).group()

AttributeError: 'NoneType' object has no attribute 'group'
Idea: maybe use try/except and then somehow append a text-only line to the previous message? (That doesn't sound Pythonic.)
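One possible alternative to try/except, as a rough sketch only: test each line against a single pattern for a message start, and treat any line that doesn't match as a continuation of the previous message. The names `MESSAGE_START`, `parse_chat`, and `chat_to_csv` are made up for illustration, and the sketch assumes that every message starts exactly like the sample above (`[dd/mm/yyyy, HH:MM:SS] sender: text`), that sender names contain no colon, and that the file is UTF-8.

```python
import csv
import re
from datetime import datetime

# Assumed message header: "[dd/mm/yyyy, HH:MM:SS] sender: text".
# Sender names are assumed not to contain a colon.
MESSAGE_START = re.compile(
    r'^\[(?P<date>\d{2}/\d{2}/\d{4}, \d{2}:\d{2}:\d{2})\] '
    r'(?P<sender>[^:]+): (?P<text>.*)$'
)


def parse_chat(lines):
    """Yield (date, sender, text) tuples, merging continuation lines."""
    current = None
    for raw in lines:
        line = raw.rstrip('\n')
        match = MESSAGE_START.match(line)
        if match:
            # A new message starts, so emit the one collected so far.
            if current is not None:
                yield tuple(current)
            date = datetime.strptime(match.group('date'), '%d/%m/%Y, %H:%M:%S')
            current = [date, match.group('sender'), match.group('text')]
        elif current is not None:
            # No header on this line: append it to the previous message text.
            current[2] += '\n' + line
    if current is not None:
        yield tuple(current)


def chat_to_csv(in_path, out_path):
    """Write the parsed messages to a CSV file with a header row."""
    with open(in_path, encoding='utf-8') as infile, \
            open(out_path, 'w', newline='', encoding='utf-8') as outfile:
        writer = csv.writer(outfile)
        writer.writerow(['date', 'sender', 'text'])
        writer.writerows(parse_chat(infile))
```

It could be called as `chat_to_csv('chat.txt', 'Output.csv')`. Writing through `csv.writer` instead of joining with `','` means that commas and line breaks inside a message are quoted rather than breaking the columns.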