What you are trying to do is compose valid UTF-8 byte sequences out of several consecutive Windows-1252 characters. For example, for ü: the Windows-1252 code of Ã is C3 and that of ¼ is BC. Together, the byte sequence C3 BC happens to be the UTF-8 encoding of ü.
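The round trip above can be checked with a minimal Python sketch (using the standard cp1252 codec):

```python
# Mojibake round trip: the Windows-1252 characters "Ã¼" encode to the
# bytes C3 BC, which are exactly the UTF-8 encoding of "ü".
mojibake = "\u00c3\u00bc"            # "Ã¼" as it appears in the broken text
raw = mojibake.encode("windows-1252")
print(raw.hex())                     # c3bc
print(raw.decode("utf-8"))           # ü
```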
Now, for Ã?, the Windows-1252 bytes are C3 3F, which is not a valid UTF-8 sequence (because the second byte does not start with the bits 10).
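A strict UTF-8 decode makes this failure visible:

```python
# "Ã?" in Windows-1252 is C3 3F; 3F (0b00111111) does not start with
# the bits 10, so it cannot be a UTF-8 continuation byte.
raw = "\u00c3?".encode("windows-1252")
print(raw.hex())                     # c33f
try:
    raw.decode("utf-8")
except UnicodeDecodeError as e:
    print("not valid UTF-8:", e.reason)
```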
Are you sure this sequence occurs in your text? For example, for à, the Windows-1252 decoding of its UTF-8 bytes (C3 A0) is Ã followed by a non-breaking space (A0), which is usually not visibly printed. So, if that second character is invisible, the ? might simply be a regular character of the text, not part of the mojibake.
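A short sketch of the à case, showing that the second Windows-1252 character is a non-breaking space rather than a printable letter:

```python
# "à" is C3 A0 in UTF-8; read as Windows-1252 that is "Ã" followed by
# a non-breaking space (U+00A0), which typically renders as blank.
raw = "\u00e0".encode("utf-8")       # "à"
print(raw.hex())                     # c3a0
garbled = raw.decode("windows-1252")
print([hex(ord(c)) for c in garbled])  # ['0xc3', '0xa0']
```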
For ¶, the UTF-8 encoding is C2 B6, which decodes in Windows-1252 as Â¶. Shouldn't it be ö, whose UTF-8 encoding C3 B6 decodes in Windows-1252 as Ã¶?
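The two candidates side by side, as a quick Python check:

```python
# "¶" is C2 B6 in UTF-8 and appears as "Â¶" when misread as
# Windows-1252, while "ö" is C3 B6 and appears as "Ã¶".
for ch in ("\u00b6", "\u00f6"):      # ¶, ö
    raw = ch.encode("utf-8")
    print(ch, raw.hex(), raw.decode("windows-1252"))
```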