I am currently trying to make my screen reader work better with Becky! Internet Mail. The problem I am facing is related to its list view. This control is not Unicode aware, but the items are custom drawn on screen, so when someone looks at it, the content of all fields looks okay regardless of encoding. When accessed via MSAA or UIA, however, basic ANSI characters and mails encoded with the code page set for non-Unicode programs have their text correct, whereas mails encoded in Unicode do not. Samples of the text:
Zażółć gęślą jaźń
is represented by:
Zażółć gęślÄ… jaźń
In this case it is damaged CP1250, as per the answer below. However:
⚠️ is represented by: ⚠️
⏰ is represented by: ⏰
高生旺 is represented by: é«ç”źć—ş
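A minimal sketch of the CP1250 case, assuming the damage is UTF-8 bytes that were decoded as CP1250. The function name `fix_cp1250_mojibake` is hypothetical, and this is an illustration of the round trip rather than a complete fix:

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals  # same string semantics on Python 2 and 3


def fix_cp1250_mojibake(text):
    """Try to undo UTF-8 bytes that were wrongly decoded as CP1250.

    Assumption: `text` is a unicode string. If the round trip is not
    possible, the input is returned unchanged.
    """
    try:
        # Recover the original bytes, then decode them as UTF-8.
        return text.encode("cp1250").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


# Simulate the damage on a fully mis-decoded string, then repair it.
original = "Zażółć gęślą jaźń"
damaged = original.encode("utf-8").decode("cp1250")
repaired = fix_cp1250_mojibake(damaged)
```

Text that is already correct, or that mixes correct and damaged characters in one string, fails the round trip and comes back unchanged, so this sketch only helps when the whole string was mis-decoded the same way.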
I had assumed that these strings are damaged beyond repair; however, when the beta Unicode support in Windows 10 is enabled, they are exposed correctly.
Is it possible to simulate this behavior in Python?
The solution needs to work in both Python 2 and 3.
At the moment I am simply replacing known combinations of these characters with their proper representations, but this is not a very good solution, because the lists of damaged sequences and their replacements need to be updated with each newly discovered character.
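For illustration, the workaround described above amounts to something like the following sketch. The table name `KNOWN_REPLACEMENTS` and its two entries are hypothetical stand-ins, not the real list:

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals  # same string semantics on Python 2 and 3

# Hypothetical table: damaged sequence -> intended character.
# Every newly discovered character needs another entry here.
KNOWN_REPLACEMENTS = {
    "Ä…": "ą",
    "Ä™": "ę",
}


def replace_known(text):
    """Replace each known damaged sequence with its intended character."""
    for damaged, intended in KNOWN_REPLACEMENTS.items():
        text = text.replace(damaged, intended)
    return text
```

Anything missing from the table passes through untouched, which is why the table keeps growing.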