I'm trying to read CSV files that use the Western European (Windows) encoding:

df = pd.read_csv(FileName, encoding='mbcs', usecols=[1], header=4)

This code works well on Windows but not on Linux 18.04 (error: unknown encoding: mbcs). Indeed, the Python codecs documentation says:

mbcs is for Windows only: Encode the operand according to the ANSI codepage (CP_ACP).

Is there another way, or another codec name, to decode my files in Python on Linux? (I have thousands of files, so I can't re-save each one from Excel.)

Joachimhgg

1 Answer

If your system uses a Western European encoding on Windows, the mbcs encoding (the ANSI code page) is cp1252. So you should use:

df = pd.read_csv(FileName, encoding='cp1252', usecols=[1], header=4)

on both systems to have a compatible code base.
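As a quick sanity check of the point above, here is a minimal sketch using only the standard library. The helper name `codec_available` and the sample bytes are made up for illustration; the idea is simply that `cp1252` can be looked up by name on any platform, whereas `mbcs` can only be found on Windows.

```python
import codecs


def codec_available(name):
    """Return True if this Python installation knows a codec called `name`."""
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        return False


# Hypothetical sample: "café coûte 5€" encoded as Windows-1252 bytes.
sample = b"caf\xe9 co\xfbte 5\x80"
text = sample.decode("cp1252")

print("cp1252 available:", codec_available("cp1252"))
print("mbcs available:", codec_available("mbcs"))  # expected to differ by platform
print(text)
```

Because the codec is named explicitly, the decoded text should not depend on the machine's locale settings.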

Serge Ballesta
  • Thank you for your answer. My files are ANSI-encoded, but with 'cp1252' I get a UnicodeDecodeError: `'charmap' codec can't decode byte 0x8d in position 164956: character maps to <undefined>`. However, it works with 'cp1252' on Windows. – Joachimhgg Apr 28 '20 at 14:30
  • How is the `b'\x8d'` byte represented? Could it be an `'ì'` (LATIN SMALL LETTER I WITH GRAVE)? – Serge Ballesta Apr 28 '20 at 14:42
  • My files contain 4 lines of "classic" text. Below these lines (`header=4`) I have 2 columns, `Time` and `Ampl`, and then only numerical values, but no `'ì'`. The issue was caused by 2-3 files, which look the same as the others (maybe NaN values). Because it's only data, I can delete those files, and then it works well, thx! – Joachimhgg Apr 28 '20 at 15:24
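For anyone who hits the same `0x8d` error and wants to find the offending files before deleting anything, the following is a minimal sketch, not a definitive tool. The function name `first_undecodable` is hypothetical; it relies only on the `start` attribute that `UnicodeDecodeError` carries.

```python
def first_undecodable(data, encoding="cp1252"):
    """Return (offset, byte) for the first byte that fails to decode, or None."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError as exc:
        # exc.start is the index of the first byte the codec rejected.
        return exc.start, data[exc.start:exc.start + 1]
    return None


# Python's cp1252 codec leaves byte 0x8d unmapped, as in the error quoted above.
print(first_undecodable(b"Time;Ampl\n0.1;\x8d\n"))
print(first_undecodable(b"caf\xe9"))
```

To scan a folder, one could call it on `pathlib.Path(p).read_bytes()` for each file and list the paths where the result is not `None`. The reported offset makes it possible to inspect the surrounding bytes and decide whether the file is really in another encoding or merely corrupted.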