Which codepage is 0x81 = ü, 0x94 = ö, 0x9A = Ü?

Question

I have a CSV file whose character encoding I can't identify. From its content (German-language entries), I could work out the following byte-to-character mappings, which point to some single-byte encoding:

0x81 = ü
0x94 = ö
0x9A = Ü

Which code page is this? Is there a website where you can look up code pages by known entries?

I was assuming this could be WINDOWS-1252 or ISO-8859-1, but it's neither of them.

@GiacomoCatenazzi Just wanted to post my finding as the result. It's CP 437 (aka "DOS"). But you were faster... Thank you! — SDwarfs, Feb 09 '23 at 14:23
Any of `CP437` or `CP850` or `CP852` or `CP775` or `CP857`… — JosefZ, Feb 09 '23 at 17:53

score 0 · Accepted Answer · answered Feb 09 '23 at 14:25


As I found out through some more trial and error, the encoding is "CP 437", also called "DOS". It's odd to see such an encoding still in use nowadays.
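The trial-and-error search can be automated. The following is only a minimal sketch, assuming Python's built-in codecs: it decodes each known byte with every candidate codec and keeps the codecs that reproduce all three characters. The function name `matching_codecs` and the candidate list are illustrative choices, not part of the original answer.

```python
# Known byte -> character pairs taken from the question.
KNOWN = {0x81: "ü", 0x94: "ö", 0x9A: "Ü"}

# Candidate codec names (an assumption; extend the list as needed).
CANDIDATES = ["cp437", "cp850", "cp852", "cp775", "cp857", "cp1252", "latin-1"]


def matching_codecs(known, candidates):
    """Return the codecs that decode every known byte to its expected character."""
    matches = []
    for name in candidates:
        try:
            ok = all(bytes([b]).decode(name) == ch for b, ch in known.items())
        except (UnicodeDecodeError, LookupError):
            # Byte undefined in this codec, or codec unknown to Python.
            ok = False
        if ok:
            matches.append(name)
    return matches


print(matching_codecs(KNOWN, CANDIDATES))
```

Three pairs are usually not enough to single out one code page, because several DOS code pages share these positions. Adding more known byte/character pairs to `KNOWN` narrows the result.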


SDwarfs



For German I would expect something like CP850. In any case, for data that was also collected before the Windows era, there are only two possibilities: CP437 (or CP850) or Unicode. ISO-8859-1 (Latin-1) is a subset, and CP1252 is an extension of Latin-1, but it still does not cover all DOS characters. So it is common to keep the old encoding until converting to Unicode, but the latter may require changes in tools (multibyte and/or variable-length encodings, plus all the normalization requirements). – Giacomo Catenazzi Feb 09 '23 at 15:06
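To decide between CP437 and CP850, it can help to know which byte values the two code pages decode differently, and then check whether any of those bytes occur in the file. This is a rough sketch assuming Python's built-in `cp437` and `cp850` codecs; the helper name `differing_bytes` is made up for illustration.

```python
def differing_bytes(codec_a, codec_b):
    """Map each high byte (0x80-0xFF) to its two decodings where they differ."""
    diffs = {}
    for b in range(0x80, 0x100):
        raw = bytes([b])
        # errors="replace" avoids exceptions for bytes a codec leaves undefined.
        char_a = raw.decode(codec_a, errors="replace")
        char_b = raw.decode(codec_b, errors="replace")
        if char_a != char_b:
            diffs[b] = (char_a, char_b)
    return diffs


diffs = differing_bytes("cp437", "cp850")
for b, (char_a, char_b) in sorted(diffs.items()):
    print(f"0x{b:02X}: cp437={char_a!r} cp850={char_b!r}")
```

The three bytes from the question do not appear in this list, so they cannot distinguish the two code pages. A byte that does appear, in a context where only one of the two decodings makes sense, would settle it.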

