
I have this function:

def decomposition():
    """
        Вызов модуля oval_decomposition.py для разложения OVAL xml на
        составные части - определения, объекты и т.д.

        Для корректного сбора модулем build необходима следующая секция
        внутри каждого <definition>:
        <oval_repository>
            <dates>
                <submitted date="YYYY-MM-DDTHH:MM:SS.000+00:00">
                    <contributor organization="ORGANISATION">JOHN WICK</contributor>
                </submitted>
            </dates>
        </oval_repository>

    """
    oval_decomposition.main()

And this is what I get in PowerShell when I call help(decomposition):

┬√чют ьюфєы  oval_decomposition.py фы  Ёрчыюцхэш  OVAL xml эр
ёюёЄртэ√х ўрёЄш - юяЁхфхыхэш , юс·хъЄ√ ш Є.ф.

─ы  ъюЁЁхъЄэюую ёсюЁр ьюфєыхь build эхюсїюфшьр ёыхфє■∙р  ёхъЎш 
тэєЄЁш ърцфюую <definition>:
<oval_repository>
    <dates>
        <submitted date="YYYY-MM-DDTHH:MM:SS.000+00:00">
            <contributor organization="ORGANISATION">JOHN WICK</contributor>
        </submitted>
    </dates>
</oval_repository>

When I use the Cyrillic alphabet in print() it works. It also works normally on Linux when I add "# coding: utf-8" at the beginning of the file. However, this does not help on Windows. I also tried changing the PowerShell encoding like this:

PS C:\Users\denis\Documents\dev\OVALRepo> "$OutputEncoding = [Console]::OutputEncoding = [Text.UTF8Encoding]::UTF8"
System.Text.UTF8Encoding = [Console]::OutputEncoding = [Text.UTF8Encoding]::UTF8

I can't find a way to set the encoding of the help() function manually, the way we can for print().

DenisNovac
  • I also tried `[Console]::OutputEncoding = [System.Text.Encoding]::GetEncoding("utf-8")` and `[Console]::OutputEncoding=[Text.Encoding]::UTF8`. Both actually change the encoding shown in the Properties of PS, but neither has any effect on the code... – DenisNovac Aug 12 '19 at 18:25
  • `[Console]::InputEncoding = [System.Text.Encoding]::GetEncoding(1251)` works for me if I set _Language for non-Unicode programs_ to `Russian (Russia)` in _Administrative language settings_ – JosefZ Aug 12 '19 at 21:41
  • It actually worked! I have no idea why Python's print() works without that, or why PowerShell does not use Unicode by default so that apps like Python could stop guessing the encoding. Is there a way to make it all work in Unicode? – DenisNovac Aug 13 '19 at 06:18
  • Sorry, I'm _not_ ready to read source code of Python's built-in functions – JosefZ Aug 13 '19 at 07:52

1 Answer


Solved for good after rigorously applying UTF-8 Everywhere in Windows.

I tested a combination of different scripts (Latin, Cyrillic and Greek) with the following file, shown here with type .\testHelp.py:

# -*- coding: utf-8 -*-

def foo():
    """
    help in Czech, Greek, Russian
    nápověda česky, řecky, rusky
    βοήθεια στα Τσεχικά, Ελληνικά, Ρωσικά (¹)
    помощь на чешском, греческом, русском языках (¹)
    (¹) from Google Translate
    """
    return ''

help(foo)

Result when run from plain cmd, from PowerShell (version 5.1), or from pwsh (version 7):

python .\testHelp.py
Help on function foo in module __main__:

foo()
    help in Czech, Greek, Russian
    nápověda česky, řecky, rusky
    βοήθεια στα Τσεχικά, Ελληνικά, Ρωσικά (¹)
    помощь на чешском, греческом, русском языках (¹)
    (¹) from Google Translate

(Alternatively, copy and paste the code above into an open Python prompt.)

Settings: Language for non-Unicode programs (in Administrative language settings).
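If changing the system-wide setting is not an option, one possible workaround is to avoid the pager altogether. As far as I can tell, help() hands its text to a pager (on a Windows console this is typically the external more command), whereas print() writes straight to the console, which would explain why the two behave differently. The sketch below renders the help text as an ordinary string with pydoc.render_doc so it can be printed like any other text. The names doc_text, console_encodings and foo are hypothetical helpers made up for this illustration, not part of pydoc.

```python
import locale
import pydoc
import sys


def doc_text(obj):
    """Return roughly the text help(obj) would show, as a plain str.

    No pager is involved, so the caller decides how to output it.
    """
    # pydoc.plaintext renders without the backspace-based bold markup.
    return pydoc.render_doc(obj, renderer=pydoc.plaintext)


def console_encodings():
    """Collect the encodings Python reports, for comparison with the
    console code page (hypothetical diagnostic helper)."""
    return {
        "stdout": getattr(sys.stdout, "encoding", None),
        "preferred": locale.getpreferredencoding(False),
    }


def foo():
    """помощь на русском языке"""
    return ''
```

Calling print(doc_text(foo)) then goes through the same path as any other print(), and print(console_encodings()) shows which encodings Python thinks are in effect. The title line differs slightly from what help() prints, since render_doc uses its own default title.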

Addendum: You ran into a case of mojibake. For instance, Вызов (the first word in your example) is shown as ┬√чют because the text was encoded as cp1251 and then decoded as cp866:

>>> 'Вызов'.encode('cp1251').decode('cp866') == '┬√чют'
True
>>> 'Вызов' == '┬√чют'.encode('cp866').decode('cp1251')
True
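When the codec pair is not known in advance, the same check can be turned into a small search. This is only a sketch: the candidate list and the function names find_mojibake_pairs and undo_mojibake are my own assumptions for illustration, and the list should be extended with whatever code pages are plausible on the machine in question.

```python
from itertools import permutations

# Assumed candidate code pages; adjust to the locales actually involved.
CANDIDATES = ["cp866", "cp1251", "cp1252", "koi8_r", "utf-8", "latin-1"]


def find_mojibake_pairs(original, garbled, codecs=CANDIDATES):
    """Return every (encode_with, decode_with) pair that turns
    `original` into `garbled`."""
    pairs = []
    for enc, dec in permutations(codecs, 2):
        try:
            if original.encode(enc).decode(dec) == garbled:
                pairs.append((enc, dec))
        except (UnicodeEncodeError, UnicodeDecodeError):
            # This pair cannot represent the text at all; skip it.
            continue
    return pairs


def undo_mojibake(garbled, enc, dec):
    """Reverse the damage: re-encode with the wrong decoder's codec,
    then decode with the codec the text was originally encoded in."""
    return garbled.encode(dec).decode(enc)
```

For the example above, find_mojibake_pairs('Вызов', '┬√чют') includes the pair ('cp1251', 'cp866'), and undo_mojibake('┬√чют', 'cp1251', 'cp866') gives back 'Вызов'.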
JosefZ