Python ast.parse method converts Latin letters to Greek (utf-8)

Asked Mar 12 '23 at 08:46

Active Mar 12 '23 at 08:46

Viewed 16 times

While playing with the ast module, I noticed the following:

>>> import ast
>>> mu = 'µ'
>>> mu.encode()
b'\xc2\xb5'
>>> root = ast.parse(mu, "<string>", mode="eval")
>>> root.body.id.encode()
b'\xce\xbc'

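It seems that ast.parse converts 'µ' (U+00B5 MICRO SIGN, from the Latin-1 Supplement block, UTF-8 b'\xc2\xb5') into 'μ' (U+03BC GREEK SMALL LETTER MU, from the Greek and Coptic block, UTF-8 b'\xce\xbc'). Both are valid UTF-8; it is the code point itself that changes, not the encoding.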
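To make the change easier to see, here is a small sketch that prints the code point, Unicode name, and UTF-8 bytes before and after parsing. The helper name `describe` is made up for illustration. The last line tests a guess of mine, namely that the result matches NFKC normalization of the identifier (the language reference mentions that identifiers are normalized to NFKC while parsing); I have not confirmed where in CPython this would happen.

```python
import ast
import unicodedata


def describe(ch):
    """Return (code point, Unicode name, UTF-8 bytes) for one character.

    Hypothetical helper, only used for this illustration.
    """
    return f"U+{ord(ch):04X}", unicodedata.name(ch), ch.encode("utf-8")


# The micro sign used in the session above, written as an escape
# so the exact code point is unambiguous.
source = "\u00b5"

# Parse it as an expression and read back the identifier stored in the AST.
parsed = ast.parse(source, "<string>", mode="eval").body.id

print(describe(source))
print(describe(parsed))

# Assumption being tested, not a confirmed explanation:
# the parsed identifier equals the NFKC-normalized input.
print(parsed == unicodedata.normalize("NFKC", source))
```

If the last line prints True, the conversion is at least consistent with identifier normalization rather than with anything specific to the ast module.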

When looking at the CPython source for ast.parse(), I couldn't find any reference to such a conversion.

Can someone help find the reason for the above conversion?

Carmel David

0 Answers