-1

I've got a CSV file whose character encoding I can't identify. From its content (German-language entries) I could work out the following byte-to-character mappings, which point to some single-byte character encoding:

  • 0x81 = ü
  • 0x94 = ö
  • 0x9A = Ü

Which code page is this? Is there a website where you can look up code pages by known entries?

I was assuming this could be WINDOWS-1252 or ISO-8859-1, but it's neither of them.

SDwarfs

1 Answer

0

After some more trial and error I found out that the encoding is "CP 437", also called "DOS". It is weird to see such an encoding still in use nowadays.
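The trial and error can also be automated. The following is only a minimal sketch, assuming Python 3 and its bundled codecs; the names `KNOWN` and `matching_codecs` are made up for this illustration. It tries every codec Python knows about and keeps those that decode all three known bytes as expected. Note that these three bytes alone do not single out one code page: CP850, for example, maps them to the same characters as CP437, so more known byte/character pairs are needed to narrow the list down.

```python
import encodings.aliases

# Known byte -> character pairs taken from the question.
KNOWN = {0x81: "ü", 0x94: "ö", 0x9A: "Ü"}


def matching_codecs(known):
    """Return the names of all Python codecs that decode every known byte as expected."""
    matches = []
    # The alias table maps many alias names onto a smaller set of codec names.
    for name in sorted(set(encodings.aliases.aliases.values())):
        try:
            if all(bytes([b]).decode(name) == ch for b, ch in known.items()):
                matches.append(name)
        except Exception:
            # Not a text codec, not available on this platform,
            # or the byte is undefined in this code page.
            continue
    return matches


print(matching_codecs(KNOWN))
```

Adding further pairs to `KNOWN` (for example bytes of characters that differ between the DOS code pages) shrinks the printed list.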

SDwarfs
    For German I would expect something like CP850. In any case, for data that was also collected before the Windows era, there are only two possibilities: CP437 (or CP850) or Unicode. ISO-8859-1 (Latin-1) is a subset, and CP1252 is an extension of Latin-1, but neither contains all the DOS characters. So it is common to keep the old encoding (until converting to Unicode), but the latter may require changes in the tools (multibyte and/or variable-length characters, and all the normalization requirements). – Giacomo Catenazzi Feb 09 '23 at 15:06
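If the file is to be converted to Unicode, a re-encoding step could look like the sketch below. It assumes Python 3 and that the source really is CP437; the function name `convert_to_utf8` and its parameters are hypothetical, and `src_encoding` can be set to `"cp850"` if that turns out to be the better match.

```python
def convert_to_utf8(src_path, dst_path, src_encoding="cp437"):
    """Re-encode a text file from src_encoding (assumed: CP437) to UTF-8."""
    # newline="" disables newline translation, so the CSV line endings
    # are written out exactly as they were read.
    with open(src_path, "r", encoding=src_encoding, newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(line)
```

Writing to a separate destination file keeps the original bytes around in case the guessed encoding was wrong.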