I'm working on a simple Caesar cipher in Python using chr() and ord().

Here's my code:

 key = 13
 newString = ''
 if mode == 'decrypt':
     key = -key
 for c in message:
     newString += chr(ord(c) + key)
 print newString

But something funny happens!

When I input: "Hello world!", I get back "Uryy|-?|yq."

Looks about right, right?

But when I try deciphering it,

I get: Hello 2old!

Any insights? I'm thinking it has to do with chr() returning something like this: '\x84'

Artjom B.
Mickey
  • 1
    ok, sounds like you have a hypothesis as to what is going wrong... why don't you test it? – thebjorn Aug 06 '14 at 00:51
  • 2
    Have you considered overflow? – lewisjb Aug 06 '14 at 00:57
  • possible duplicate of [Caesar cypher in Python](http://stackoverflow.com/questions/22059435/caesar-cypher-in-python) – thebjorn Aug 06 '14 at 01:01
  • One curiosity is that `Hello World!` has 12 characters, but the encrypted text (`Uryy|-?|yq.`) only has 11 characters. Then the decrypted text has 11 too. So, the first issue is 'which character went missing and why'? – Jonathan Leffler Aug 06 '14 at 01:16
  • @JonathanLeffler the `w`, because it has a character code of 119, and add 13 is 132, which is not a normal character – lewisjb Aug 06 '14 at 01:20
  • 1
    That, I guess, depends on your definition of 'normal' character. It is a perfectly normal byte, but in the 8859-x codesets corresponds to a C1 control character (IND, to be precise; C1 Controls (0x80 - 0x9F) are from ISO/IEC 6429:1992). In some of the Windows code pages (CP1252, for example), 0x84 corresponds to a printing character (U+201E DOUBLE LOW-9 QUOTATION MARK for CP1252). – Jonathan Leffler Aug 06 '14 at 01:27

1 Answer

"Hello world!" is 12 characters, but "Uryy|-?|yq." is 11 (and so is "Hello 2old!").

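The cause is that shifting w (character code 119) by 13 gives 132, which is '\x84' and lies outside the ASCII range of 0 to 127. Likewise, r (114) becomes 127, the non-printing DEL character, which is why one character seems to vanish from the printed output.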
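To see which characters leave the printable range, you can list the shifted character codes. This is a small check, assuming the same message and key as in the question; the names `shifted` and `unprintable` are just for illustration:

```python
message = "Hello world!"
key = 13

# Pair each original character with its shifted character code.
shifted = [(c, ord(c) + key) for c in message]

# Keep only the codes that fall outside printable ASCII (32 to 126).
unprintable = [(c, n) for c, n in shifted if not 32 <= n <= 126]

print(unprintable)  # [('w', 132), ('r', 127)]
```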

If you run this in IDLE and evaluate the variable instead of printing it, you see the string with the \x84 escape; print shows a placeholder for the invalid character instead. If you feed the exact string (including the \x84) back in, decryption returns "Hello world!". If the \x84 notation is unfamiliar, I suggest you read up on character codes and hexadecimal.
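The difference between printing and inspecting the string can be reproduced with `repr()`. The sketch below assumes the same naive shift as in the question and never passes the text through the terminal, so no character is lost on the way back:

```python
key = 13

# Naive shift from the question: no wrap-around, every character is moved.
encrypted = "".join(chr(ord(c) + key) for c in "Hello world!")

# repr() shows non-printing characters as escapes such as \x84.
print(repr(encrypted))
print(len(encrypted))  # still 12 characters, even if the terminal shows fewer

# Decrypting the untouched string recovers the original message.
decrypted = "".join(chr(ord(c) - key) for c in encrypted)
print(decrypted)
```

Copying the printed output by hand is what loses information: the terminal's placeholder gets pasted as an ordinary character, and non-printing characters are dropped.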


Also, a traditional Caesar shift maps letters only to other letters; it never produces punctuation marks, pipes, or character code 132.

  • A has the character code of 65 (in decimal)

  • a is 97

According to http://en.wikipedia.org/wiki/Caesar_cipher, encryption and decryption are

E_n(x) = (x + n) mod 26

and

D_n(x) = (x - n) mod 26

respectively.

Subtract the character offset (65 for uppercase, 97 for lowercase) to get x in the range 0 to 25, apply the formula from the Wikipedia article, and then add the offset back.
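Putting the offsets and the modulo formula together might look like the sketch below. The function name `caesar` and the choice to pass non-letters through unchanged are assumptions made for illustration, not something the question specifies:

```python
def caesar(message, key, mode="encrypt"):
    """Shift letters by key, wrapping within the alphabet (hypothetical helper)."""
    if mode == "decrypt":
        key = -key
    result = []
    for c in message:
        if "A" <= c <= "Z":
            base = 65   # ord('A')
        elif "a" <= c <= "z":
            base = 97   # ord('a')
        else:
            # Assumption: leave spaces and punctuation untouched.
            result.append(c)
            continue
        # Map to 0..25, shift with wrap-around, map back to a letter.
        result.append(chr((ord(c) - base + key) % 26 + base))
    return "".join(result)


encrypted = caesar("Hello world!", 13)
print(encrypted)                                # Uryyb jbeyq!
print(caesar(encrypted, 13, mode="decrypt"))    # Hello world!
```

Python's `%` operator returns a non-negative result for a positive modulus, so the same expression works for the negative key used when decrypting.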

lewisjb