
OK, I'm writing a Python script to convert text to binary.
I'm using easygui to quickly convert short phrases and test for issues. Here's my issue.
The main workhorse is:

BText = bin(int(binascii.hexlify(DText),16))

I also have the value returned through an easygui dialog. But when I type in a single character, I get a 15-character response.
So a) I'm getting an extra character somewhere (before the workhorse?) and
b) why isn't the returned value 16 characters?
I've also tried 4-letter words and various other sizes, and I always end up 7 characters too long. So I'm getting an extra input character somewhere, and the result is always one character short of a full multiple of 8.
I don't know a thing about the underlying processes that make this happen, but it's something I should know. Thanks.

Alright, I tried for an hour to post my code and I guess it wasn't properly formatted. I run Python 2.7.8.
I use easygui.textbox to receive the input and to show the output.
The input is run through the workhorse above. The 0b prefix is then stripped from the result using BText = str(BText)[2:]. The resulting string is then returned and shown to the user via easygui.textbox.

EasyGui

#Imports
import OTPModule as TP
import easygui as EG

Plain = EG.textbox(msg='Enter Message', title='OTP', text='Hi', codebox=1)
XORD, Key = TP.Gather(Plain)
EG.textbox(msg='XORD', title='OTP - XOR Message', text=XORD, codebox=1)
EG.textbox(msg='Key', title='OTP - Key', text=Key, codebox=1)


raw_input("Press Enter To Decrypt")

XOrd = EG.textbox(msg='Enter XOR Message', title='OTP', text='01', codebox=1)
Key = EG.textbox(msg='Enter Key', title='OTP', text='10', codebox=1)
Plain = TP.Release(XORD, Key)
EG.textbox(msg='ASCII', title='OTP', text=Plain, codebox=1)

raw_input("Press Enter To Exit")

Module..

#################
#  One Time Pad #
#    (Module)   #
#  Python 2.7.8 #
#    Nov 2014   #
#Retler & Amnite#
#################


    #imports
import binascii
import random

def Gather(DText):

  print(DText)#Debug

  #First Things First... Convert To Binary
  BText = bin(int(binascii.hexlify(DText),16))

  #Strip 0b
  BText = str(BText)[2:]

  print(BText)#Debug

  #Generate Key
  KText = []
  a = 0

  while a < len(BText):
    b = random.randint(0,1)
    KText.append(b)
    a = a+1

  KText = ''.join(map(str,KText))
  print(KText)#Debug
  print a

  #So apparently we have to define the XOR ourselves
  #0^0=0, 0^1=1, 1^0=1, 1^1=0
  EText = []
  a = 0

  while a < len(BText):
    if BText[a] == KText[a]:
      EText.append(0)
    else:
      EText.append(1)
    a = a+1

  EText = ''.join(map(str,EText))

  return(EText, KText)

######The Other Half#######

def Release(EText, KText):

  print(EText)#Debug
  print(KText)#Debug

  #XOR
  BText = []
  a = 0

  while a < len(EText):
    if EText[a] == KText[a]:
      BText.append(0)
    else:
      BText.append(1)
    a = a+1

  BText = ''.join(map(str,BText))

  print(BText)#Debug

  #Binary To ASCII (Re-Add 0b)
  DText = int('0b'+BText,2)
  DText = binascii.unhexlify('%x' % DText)

  return(DText)
  • Please post your code, otherwise others can only guess what is wrong. – SSC Dec 10 '14 at 05:16
  • Change your debug print statements so that they print the representation of the data, eg `print(repr(DText))` or surround DText with backticks ` . That way you'll be able to see newlines and other control characters. – PM 2Ring Dec 10 '14 at 07:38
  • I guess I ought to mention that there's an easier way to do XOR on text, [eg](http://stackoverflow.com/a/26080586/4014959) – PM 2Ring Dec 10 '14 at 08:03
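As a rough illustration of that last comment (not necessarily the approach taken in the linked answer), XOR can be applied to the bytes themselves with Python's `^` operator, skipping the detour through strings of 0s and 1s. The names `xor_bytes` and `make_key` are hypothetical helpers invented for this sketch. It assumes the message is already a byte string with no stray trailing newline, and it draws the key from `os.urandom()` because the `random` module used in the question is not intended for cryptographic use.

```python
import os


def make_key(length):
    """Return `length` random bytes from the operating system's source."""
    return os.urandom(length)


def xor_bytes(data, key):
    """XOR two equal-length byte strings and return the result as bytes."""
    if len(data) != len(key):
        raise ValueError("data and key must have the same length")
    # bytearray yields integers on both Python 2.7 and Python 3,
    # so the ^ operator can be applied byte by byte.
    pairs = zip(bytearray(data), bytearray(key))
    return bytes(bytearray(a ^ b for a, b in pairs))


message = b'Hi'
key = make_key(len(message))
encrypted = xor_bytes(message, key)
decrypted = xor_bytes(encrypted, key)  # XOR with the same key undoes it
```

Because XOR is its own inverse, the same function serves for both directions, which is what the two hand-written loops in `Gather` and `Release` do one bit at a time.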

1 Answer

Edit

Having installed easygui and trying textbox(), unicode strings are returned with a trailing new line character...

>>> Plain = EG.textbox(msg='Enter Message', title='OTP', text='Hi', codebox=1)
# hit OK in text box
>>> Plain
u'Hi\n'

That's the source of the additional character. You can get rid of the newline with:

>>> Plain = Plain.rstrip()
>>> Plain
u'Hi'

Note also that a unicode string is returned. You may run into encoding issues if you enter non-ASCII data, e.g. u'\u4000' (= 䀀): hexlify() will blow up, but that's another problem.
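A minimal sketch of that clean-up, assuming the textbox value arrives as a text (unicode) string like the `u'Hi\n'` shown above. The helper name `textbox_to_bits` is made up for illustration and is not part of easygui. It strips only trailing newlines, so deliberate trailing spaces survive (plain `rstrip()` would remove those too), and it encodes to UTF-8 before calling `hexlify()`. UTF-8 is an assumed choice here; any encoding would do as long as the decoding side uses the same one.

```python
import binascii


def textbox_to_bits(raw_text):
    """Convert textbox text to the unpadded bit string used in the question."""
    # Remove the newline(s) the textbox appends; keep other whitespace.
    cleaned = raw_text.rstrip('\n')
    # hexlify() works on bytes, so encode explicitly instead of relying
    # on an implicit ASCII conversion.
    data = cleaned.encode('utf-8')
    if not data:
        return ''
    return bin(int(binascii.hexlify(data), 16))[2:]


bits = textbox_to_bits(u'Hi\n')  # '100100001101001'
```

With the newline gone, `'Hi'` gives 15 bits: 16 bits for two bytes, minus the one leading zero that `bin()` drops.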

Original answer

I'm not familiar with easygui but I am guessing that it's producing UTF-16 output or some other multi-byte encoded data. Try printing the input character using repr(input_string) or similar. That could be why you are apparently seeing an additional character when inputting only a single character:

>>> bin(int(hexlify('a'), 16))[2:]
'1100001'
>>> bin(int(hexlify('a'.encode('utf-16-le')),16))[2:]
'110000100000000'

In the first example, a single character is translated to 7 bits (leading zeros not emitted by bin()). In the second example, the UTF-16 encoding is 2 bytes long:

>>> 'a'.encode('utf-16-le')
'a\x00'

and hence the result is the 15 bit string - again any leading zero bits are not emitted.
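The lengths reported in the question follow the same arithmetic: every byte contributes 8 bits, and `bin()` drops the leading zero bits of the first byte only. The sketch below checks this for a few inputs, including the trailing-newline case identified in the edit above. `unpadded_bits` and `expected_length` are hypothetical helper names, and `expected_length` assumes non-empty data whose first byte is non-zero.

```python
import binascii


def unpadded_bits(data):
    """Bit string as the question builds it: bin() output minus '0b'."""
    return bin(int(binascii.hexlify(data), 16))[2:]


def expected_length(data):
    """8 bits per byte, minus the leading zero bits of the first byte."""
    first = bytearray(data)[0]
    return 8 * len(data) - (8 - first.bit_length())


len(unpadded_bits(b'a'))       # 7:  one byte, one leading zero dropped
len(unpadded_bits(b'a\n'))     # 15: 'a' plus a newline
len(unpadded_bits(b'a\x00'))   # 15: 'a' encoded as UTF-16-LE
len(unpadded_bits(b'word\n'))  # 39: 4 letters (32 bits) plus 7 more
```

An ASCII letter always has a zero as its top bit, which is why the total comes out exactly one bit short of a multiple of 8, and why a 4-letter word plus a newline looks "7 characters too long".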

mhawke
  • Is 7 bits standard binary for Linux? Are there different standards? Honestly, you lost me at encoding. I understand the 15 bits for 2 bytes, but wouldn't that rise exponentially as the input grows? I end up with almost the right number of bytes no matter how big my input is. It's always 7 extra 0s and 1s after assuming each character equals 8. Would it be a newline or something passed down by Tk? – Jeremiah Schmidt Dec 10 '14 at 07:19
  • @mhawke: I also suspect it might be a unicode / UTF issue. It looks like EasyGUI has unicode capability, using some form of UTF-16 according to [this page](http://www.easygui.com/main.asp?sc=7&me=2&sub=13). – PM 2Ring Dec 10 '14 at 07:19
  • No, 7bits is because `bin()` does not display any leading zeros. The extra byte could be a new line character. Try printing the data entering your script - `repr(input_string)` or similar. I was using `encode()` to simulate utf16 input. – mhawke Dec 10 '14 at 07:23
  • @JeremiahSchmidt: The string returned by `bin()` is not fixed length because it can be used on `long` as well as `int`. So if you need fixed length bit strings you need to do that yourself, eg using the `.zfill()` method. – PM 2Ring Dec 10 '14 at 07:32
  • @PM 2Ring: If I use zfill to pad 0s on the left side of my "binary", will this still return the same character? Honestly, at the end of what I'm doing, divisibility by 4 would be most excellent. – Jeremiah Schmidt Dec 10 '14 at 07:47
  • @JeremiahSchmidt: `unhexlify()` ignores leading zeroes. Also, `int(bitstring, 2)` doesn't need the bitstring to have a leading `0b` – PM 2Ring Dec 10 '14 at 08:00
  • @JeremiahSchmidt - see updated answer - `easygui.textbox()` returns new line character which is the source of your additional character. – mhawke Dec 10 '14 at 09:07
  • Hey, not sure if it's appropriate, so please message me if not: I've removed that last \n and the fix will appear in 0.97 of easygui. – Robert Lugg Dec 17 '14 at 05:25
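The `zfill()` and `int(bitstring, 2)` suggestions from the comments can be combined into a fixed-width round trip. This is only a sketch with hypothetical helper names (`bytes_to_bits`, `bits_to_bytes`), and it assumes the input is already a byte string. Padding on the left does not change the value, because `int(bitstring, 2)` disregards leading zeros. On the way back the hex text is padded to a whole number of bytes, since `unhexlify()` raises an error on odd-length input, which a bare `'%x'` can produce when the first byte is below 0x10.

```python
import binascii


def bytes_to_bits(data):
    """Return a bit string that is exactly 8 characters per input byte."""
    if not data:
        return ''
    raw = bin(int(binascii.hexlify(data), 16))[2:]
    # Restore the leading zeros that bin() leaves out.
    return raw.zfill(8 * len(data))


def bits_to_bytes(bits):
    """Inverse of bytes_to_bits(); no '0b' prefix is needed."""
    if not bits:
        return b''
    n_bytes = (len(bits) + 7) // 8
    # '%0*x' pads the hex text to two digits per byte.
    hex_text = '%0*x' % (2 * n_bytes, int(bits, 2))
    return binascii.unhexlify(hex_text)


padded = bytes_to_bits(b'a')      # '01100001'
restored = bits_to_bytes(padded)  # b'a'
```

Since the padded length is always a multiple of 8, it is also divisible by 4, and a message that starts with a low-valued byte such as a newline survives the round trip.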