
So, I've got an algorithm whereby I take a character, take its character code, increase that code by a variable, and then print that new character. However, I'd also like it to work for characters not in the default ASCII table. Currently it's not printing 'special' characters like € (for example). How can I make it print certain special characters?

#!/usr/bin/python3
# -*- coding: utf-8 -*-

def generateKey(name):

    i = 0
    result = ""

    for char in name:

        newOrd = ord(char) + i
        newChar = chr(newOrd)
        print(newChar)

        result += newChar

        i += 1

    print("Serial key for name: ", result)

generateKey(input("Enter name: "))

Whenever I give an input that forces special characters (like |||||), it works fine for the first four characters (including DEL, where it gives the transparent rectangle icon), but the fifth character (meant to be €) is also an error char, which is not what I want. How can I fix this?

Here's the output from |||||:

Enter name: |||||
|
}
~


Serial key for name:  |}~

But the last char should be €, not a blank. (BTW the fourth char, DEL, becomes a transparent rectangle when I copy it into Windows)

Jack Bashford

1 Answer


In the default encoding (utf-8), chr(128) is not the euro symbol (€). It's a control character, U+0080. See this Unicode table. So indeed it should be blank, not €.

You can verify the default encoding with sys.getdefaultencoding().
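
As a quick check of the point above, here is a minimal sketch using only the standard library. It looks at what chr(128) actually is in Python 3; chr() returns a Unicode code point, so the result does not depend on the default encoding. The variable name ch is just for illustration.

```python
import sys
import unicodedata

# Python 3 reports 'utf-8' as its default string encoding.
print(sys.getdefaultencoding())

ch = chr(128)  # the fifth character produced from the input |||||

# repr() shows the escape rather than sending a control character to the console.
print(repr(ch))

# Unicode general category 'Cc' means "control character".
print(unicodedata.category(ch))
```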

If you want to reinterpret chr(128) as the euro symbol, you should use the windows-1252 encoding. There, it is indeed the euro symbol. (Different encodings disagree on how to represent values beyond ASCII's 0–127.)
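
If the goal is for the shifted codes to follow Windows-1252, one possible approach is to treat each shifted code as a single cp1252 byte and decode it. This is only a sketch under that assumption, not the only fix. The helper names shift_char_cp1252 and generate_key are hypothetical, and replacing undefined bytes with U+FFFD is my own choice here.

```python
def shift_char_cp1252(char, offset):
    """Shift one character by `offset` within the Windows-1252 byte range."""
    # encode() raises UnicodeEncodeError if `char` has no cp1252 byte.
    code = char.encode("cp1252")[0] + offset
    if code > 255:
        raise ValueError("shifted code does not fit in a single byte")
    # A few bytes (such as 0x81) are undefined in cp1252;
    # errors="replace" turns those into U+FFFD instead of raising.
    return bytes([code]).decode("cp1252", errors="replace")


def generate_key(name):
    """Shift each character by its position, as in the question's loop."""
    return "".join(shift_char_cp1252(c, i) for i, c in enumerate(name))


# ascii() keeps the output printable on any console.
print(ascii(generate_key("|||||")))  # '|}~\x7f\u20ac'
```

Here the fifth character comes out as U+20AC (€), because byte 128 (0x80) is the euro sign in cp1252. Whether the console then displays it correctly still depends on the terminal's own encoding and font.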

Arya McCarthy
  • There are online references that do indeed claim that U+0080 is the Euro symbol. Unicode uses the Latin-1 character set for its first 256 codepoints, and windows-1252 is a superset of that. – Mark Ransom May 05 '20 at 02:44
  • Thank you! I didn't get that it was a windows-style encoding (the program runs in windows) – Jack Bashford May 05 '20 at 02:48
  • Even in latin1, [it's a control character](https://en.wikipedia.org/wiki/ISO/IEC_8859-1#Code_page_layout). Below, it says that Windows-1252 reassigns those to printable characters. – Arya McCarthy May 05 '20 at 15:49
  • I wish I knew where to find an official list of Unicode codepoints, because there's inconsistency in the ones on the web. There's not even agreement on what is in ISO-8859-1. – Mark Ransom May 05 '20 at 15:58