I have a CSV file with two columns, 'title' and 'description'. The 'description' column contains HTML elements. I am trying to replace 'InterviewNotification' with 'InterviewAlert'.

[Screenshot of the CSV file]

This is the code I wrote:

text = open("data.csv", "r")
text = ''.join([i for i in text]).replace("InterviewNotification", "InterviewAlert")
x = open("output.csv","w")
x.writelines(text)
x.close()

But I'm getting this error:

      File "C:\Users\Zed\AppData\Local\Programs\Python\Python39\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 5786: character maps to <undefined>

I also tried pandas. Here is the code:

import pandas as pd

dataframe = pd.read_csv("data.csv")
# using the replace() method
dataframe.replace(to_replace="InterviewNotification", value="InterviewAlert", inplace=True)

Still no luck. Please help.

  • As the raised exception says, you have a UnicodeDecodeError. That means your original data looks "malformed" to the decoder, i.e. it contains bytes which cannot be decoded with the encoding you're using, which defaults to the platform's locale encoding (cp1252 here, according to the traceback). Your data is probably not using it, so you should check the original document's encoding, then read the file with `open(path, 'rb')` and decode the resulting bytes with the correct encoding. – nxet Dec 09 '20 at 18:28
  • Please do not mark your own question text as citations with the vertical bar in front created by `>` when editing. – bjhend Dec 09 '20 at 18:47
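
The approach from the first comment (read the raw bytes, then decode them explicitly) could be sketched as follows. This is only a sketch: the function name `replace_via_bytes` is made up for illustration, and `utf-8` is an assumed guess at the file's real encoding. The guess is plausible because byte 0x9d is, for example, the last byte of the UTF-8 form of the right curly quote U+201D, but it should be checked against the actual file.

```python
from pathlib import Path


def replace_via_bytes(src, dst, old, new, encoding="utf-8"):
    """Read src as bytes, decode explicitly, replace old with new, write dst.

    `encoding` is an assumption; confirm the real encoding of your file.
    Returns the number of occurrences that were replaced.
    """
    raw = Path(src).read_bytes()      # no decoding happens at this point
    text = raw.decode(encoding)       # raises UnicodeDecodeError on a wrong guess
    count = text.count(old)
    updated = text.replace(old, new)
    # Writing bytes back keeps the original line endings untouched.
    Path(dst).write_bytes(updated.encode(encoding))
    return count
```

Calling `replace_via_bytes("data.csv", "output.csv", "InterviewNotification", "InterviewAlert")` would then write the modified copy. If the decode step still fails, the encoding guess is wrong and a different one needs to be tried.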

2 Answers

Have you tried specifying the encoding as "utf-8" in your first line? For example:

text = open("data.csv", encoding="utf8")

It seems that your issue may be related to this question
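
Putting the explicit encoding together with the replacement, a minimal sketch might look like this. It assumes the file really is UTF-8, and the helper name `replace_in_csv` is hypothetical. Passing `newline=""` is an optional choice made here so that Python does not translate line endings on the way in or out.

```python
def replace_in_csv(src, dst, old, new, encoding="utf-8"):
    """Copy src to dst with every occurrence of old replaced by new.

    `encoding` is assumed to match the file; utf-8 is only a guess.
    """
    # newline="" disables newline translation, so "\r\n" stays "\r\n".
    with open(src, "r", encoding=encoding, newline="") as infile:
        text = infile.read()
    with open(dst, "w", encoding=encoding, newline="") as outfile:
        outfile.write(text.replace(old, new))
```

The output file is opened with the same encoding so that characters outside cp1252 do not trigger an encoding error on the write side either.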

Matthew Cox

You open the file but never close it, and joining its lines one by one is more work than needed. To get the whole text in a single call, do this:

textFile = open("data.csv", "r")
text = textFile.read()
textFile.close()

Or, to improve the code, use a context manager:

with open("data.csv", "r") as textFile:
    text = textFile.read()

This ensures that the file is properly closed even if the intermediate code raises an exception.
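
The same context-manager idea extends to the output file. The sketch below opens both files in one `with` statement and processes the input line by line, so the whole file never has to sit in memory. The function name `replace_streaming` and the `utf-8` default are assumptions for illustration, not something taken from the question.

```python
def replace_streaming(src, dst, old, new, encoding="utf-8"):
    """Stream src into dst line by line, replacing old with new.

    `encoding` is an assumed value; adjust it to the file's real encoding.
    """
    # Both files are closed automatically, even if an exception is raised.
    with open(src, "r", encoding=encoding, newline="") as infile, \
         open(dst, "w", encoding=encoding, newline="") as outfile:
        for line in infile:
            outfile.write(line.replace(old, new))
```

One limitation of working per line is that a search string containing a line break would not be matched. For a single word such as `InterviewNotification` that does not matter.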

bjhend