I have tried the above with no success. Note: this is specific to the column headings, not the column values.

df.columns = [x.lower().replace(" ","").replace("?","").replace("_","").replace( "Â" , "") for x in df.columns]

I expected this to replace the non-printable character, but it failed.

Can anyone help?

csv export post the suggested solution

Peter R
  • Usually this means that you have [mojibake](https://en.wikipedia.org/wiki/Mojibake) or other corruption in your input, or are reading it incorrectly. A much better fix is to repair the upstream source so that the root cause gets addressed. – tripleee Feb 15 '23 at 14:07
  • Consider this [answer](https://stackoverflow.com/a/32201665/3155240), which uses regex to replace text. – Shmack Feb 15 '23 at 19:21

1 Answer

First of all, remember that replace is case-sensitive, and that order matters when chaining methods. In your code, lower() runs first and turns "Â" into "â", so the later replace("Â", "") has nothing left to match.

"Â".lower().replace("Â", "") # "â"
"Â".replace("Â", "").lower() # ""
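Applied to the list comprehension from the question, the same reordering might look like the sketch below. The helper name clean_header and the sample header names are hypothetical, chosen only for illustration; the characters being removed are the ones from the question.

```python
def clean_header(name: str) -> str:
    """Strip unwanted characters first, then lowercase."""
    # Remove the case-sensitive characters before lower() changes their case.
    for unwanted in (" ", "?", "_", "Â"):
        name = name.replace(unwanted, "")
    return name.lower()


# Hypothetical header names, standing in for df.columns.
headers = ["Order_ID", "Â Price?", "Customer Name"]
cleaned = [clean_header(h) for h in headers]
print(cleaned)  # ['orderid', 'price', 'customername']
```

With a DataFrame, the equivalent assignment would be df.columns = [clean_header(c) for c in df.columns].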

If the cause is a mojibake encoding/decoding issue, you can try a quick fix with the ftfy library, applied to the column names through the rename function.

import ftfy

def _change_column_name(val):
    # fix mojibake
    val = ftfy.fix_text(val)
    # whatever data processing you need
    return val.replace("Â", "").lower()

df.rename(columns=_change_column_name, inplace=True)

@tripleee is right, though: rather than relying on a quick fix, you may want to fix the encoding/decoding errors in your source data.
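To illustrate what such an upstream error can look like, here is a minimal sketch of one common way a stray "Â" appears: a non-breaking space stored as UTF-8 but decoded as Latin-1. Whether this is what happened to your file is an assumption, and the header text is made up.

```python
# A non-breaking space (U+00A0) followed by text, as it might appear in a header.
original = "\u00a0Price"

# UTF-8 encodes U+00A0 as the two bytes 0xC2 0xA0.
raw = original.encode("utf-8")

# Decoding those bytes with the wrong codec turns 0xC2 into a visible "Â".
wrong = raw.decode("latin-1")
right = raw.decode("utf-8")

print(repr(wrong))  # 'Â\xa0Price'
print(repr(right))  # '\xa0Price'
```

If this matches your case, passing the file's real encoding when loading it (for example the encoding parameter of pd.read_csv) should keep the "Â" from appearing in the first place. The non-breaking space itself would still be in the header and may need stripping separately.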

Pawel Kam