I am loading data from an Excel sheet into a pandas dataframe, and many columns contain non-display characters that I want to convert.
The most prevalent is the apostrophe in a contraction, e.g. doesn't, which comes out as doesn’t.
In the past I have used:
str.encode('ascii', errors='ignore').decode('utf-8')
but this required me to know which columns I needed to fix.
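
For context, here is a minimal sketch of that per-column approach. The sample frame and the column name `comments` are made up for illustration; my real data comes from the Excel sheet. Note that `errors='ignore'` drops the offending character rather than converting it.

```python
import pandas as pd

# Hypothetical sample data standing in for the frame loaded from Excel.
df = pd.DataFrame({"comments": ["doesn\u2019t work", "fine"], "count": [1, 2]})

# Per-column cleanup: encode to ASCII, silently dropping anything that
# is not ASCII, then decode the resulting bytes back to str.
df["comments"] = (
    df["comments"].str.encode("ascii", errors="ignore").str.decode("utf-8")
)

print(df["comments"].tolist())
```

This has to be repeated for every column I name explicitly, which is the part I want to avoid.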
In this case I have 103 columns, any of which could contain this or similar issues.
I am looking for a way to replace any and all such issues across the entire dataframe.
Is there a quick and easy way to do this over the entire dataframe without having to pass each column to a function?