Python converts all identifiers to their NFKC normal form; from the Identifiers section of the reference documentation:
All identifiers are converted into the normal form NFKC while parsing; comparison of identifiers is based on NFKC.
The NFKC form of both the superscript and subscript characters is the lowercase u:
>>> import unicodedata
>>> unicodedata.normalize('NFKC', 'Xᵘ Xᵤ')
'Xu Xu'
So in the end, all you have is a single identifier, Xu:
>>> import dis
>>> dis.dis(compile('Xᵘ = 42\nprint((Xu, Xᵘ, Xᵤ))', '', 'exec'))
  1           0 LOAD_CONST               0 (42)
              2 STORE_NAME               0 (Xu)

  2           4 LOAD_NAME                1 (print)
              6 LOAD_NAME                0 (Xu)
              8 LOAD_NAME                0 (Xu)
             10 LOAD_NAME                0 (Xu)
             12 BUILD_TUPLE              3
             14 CALL_FUNCTION            1
             16 POP_TOP
             18 LOAD_CONST               1 (None)
             20 RETURN_VALUE
The disassembly above shows that the identifiers were normalised during compilation. More precisely, this happens during parsing: identifiers are normalised when the AST (Abstract Syntax Tree) is created, and the compiler then uses that AST to produce bytecode.
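To see the normalisation at the AST stage without reading bytecode, you can inspect the parsed tree directly. This is a minimal sketch, assuming CPython 3; the variable names `source`, `tree` and `names` are mine, and the superscript and subscript characters are written as escapes (U+1D58 and U+1D64) so they are unambiguous:

```python
import ast

# "\u1d58" is the superscript u and "\u1d64" the subscript u from the example above.
source = "X\u1d58 = 42\nprint((Xu, X\u1d58, X\u1d64))"

# Parse only; nothing is compiled or executed here.
tree = ast.parse(source)

# Collect the distinct identifiers stored on Name nodes in the tree.
names = sorted({node.id for node in ast.walk(tree) if isinstance(node, ast.Name)})
print(names)
```

Only `Xu` and `print` should appear among the names, with no trace of the original superscript or subscript spellings.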
Identifiers are normalised to avoid many potential 'look-alike' bugs, where you could otherwise end up using both ﬁnd() (spelled with the U+FB01 LATIN SMALL LIGATURE FI character followed by the ASCII characters nd) and find(), and wonder why your code has a bug.
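The ligature case can be checked the same way. The following is a small sketch rather than anything from the documentation; `ligature_name`, `ascii_name` and `namespace` are illustrative names, and the ligature is written as the escape `\ufb01`:

```python
import unicodedata

ligature_name = "\ufb01nd"   # U+FB01 LATIN SMALL LIGATURE FI followed by "nd"
ascii_name = "find"

# As plain strings, the two spellings are different...
print(ligature_name == ascii_name)
# ...but their NFKC normal forms are identical.
print(unicodedata.normalize("NFKC", ligature_name) == ascii_name)

# Define a function using the ligature spelling, then look it up
# under the plain ASCII spelling.
namespace = {}
exec("def \ufb01nd():\n    return 'found'\n", namespace)
print("find" in namespace)
print(namespace["find"]())
```

Because the parser normalises the name in the `def` statement, the function is stored under the ASCII key `find`, so both spellings refer to the same object in source code.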