I came across the built-in Python function chr(), which takes a number as input and returns the corresponding Unicode character (the opposite of ord()). I ran a loop from 0 to 300 to check the Unicode characters, and for a lot of numbers the returned characters look the same (a question mark in a box). For the 0th character it is a question mark enclosed in a diamond. Can anyone explain why some characters repeat while others are missing altogether? P.S.: I used a Google Colab notebook for the coding. I am attaching a picture of my code for reference.

I tried this code:

# using chr()
for i in range(0, 300):
    print(i, ":", chr(i))
Mark Tolonen
    The ?s are substitution characters for characters that have no visual definition or are missing from the font. The blanks are white space characters, such as new line, tab, or carriage return. – Mark Tolonen Jun 04 '23 at 14:07

1 Answer

It is not that the characters represented by these code points (numbers) are "question marks": they are either undefined or undisplayable characters. Sometimes they are even defined characters for which your current display does not have the proper symbol.

If you want a question mark, use only code point 63: that is the code for the question mark.

All the others are just characters that can't be displayed due to one of the reasons above.
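
To check this for yourself, here is a minimal sketch (the code points 0 and 129 are just examples picked for illustration). It relies on ascii(), which prints an escaped form that does not depend on the fonts available to your display, and on str.isprintable(), which reports whether Python considers the character printable:

```python
# Compare the real question mark (63) with two code points that a
# notebook may draw as a substitution glyph.
for codepoint in (63, 0, 129):
    ch = chr(codepoint)
    # ascii() escapes everything outside printable ASCII, so the result
    # is the same no matter how the front end renders the character.
    print(codepoint, ascii(ch), ch.isprintable())
```

Only 63 comes out as a literal ?; the other two show up as escapes such as \x00, however the notebook happens to draw them.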

Python can show you the name of a character, with unicodedata.name: that allows you to see unequivocally which is which:

In [2]: import unicodedata

In [3]: for i in range(0, 300):
   ...:     try:
   ...:         print(i, chr(i), unicodedata.name(chr(i)))
   ...:     except ValueError:
   ...:         print(i, "undefined character")
   ...: 
0 undefined character
1 undefined character
2 undefined character
...
30 undefined character
31 undefined character
32   SPACE
33 ! EXCLAMATION MARK
34 " QUOTATION MARK
35 # NUMBER SIGN
36 $ DOLLAR SIGN
37 % PERCENT SIGN
38 & AMPERSAND
39 ' APOSTROPHE
40 ( LEFT PARENTHESIS
41 ) RIGHT PARENTHESIS

Here is the copy-pasteable code:

import unicodedata

for i in range(0, 300):
    try:
        print(i, chr(i), unicodedata.name(chr(i)))
    except ValueError:
        print(i, "undefined character")
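
One caveat about the label in that loop: unicodedata.name raises ValueError for any character without a name, and that includes control characters such as tab or newline, which are assigned code points. As a rough sketch of how the two cases could be told apart, the variant below uses unicodedata.category, where "Cc" marks a control character and "Cn" an unassigned code point. The describe helper is a hypothetical name used only for this illustration:

```python
import unicodedata


def describe(codepoint):
    """Hypothetical helper: return a short description of a code point."""
    ch = chr(codepoint)
    category = unicodedata.category(ch)
    if category == "Cc":
        # Assigned, but it has no name and no glyph of its own.
        return "control character"
    if category == "Cn":
        # Not assigned to any character in this Unicode version.
        return "unassigned code point"
    # The second argument is a fallback used instead of raising ValueError.
    return unicodedata.name(ch, "unnamed character")


for i in range(0, 300):
    # repr() escapes non-printable characters instead of emitting them raw.
    print(i, repr(chr(i)), describe(i))
```

With this variant, the entries for 0 to 31 are reported as control characters rather than undefined ones, which matches what the comment under the question says about new line, tab, and carriage return.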
jsbueno