I am creating a simple database-style program for a school project, and I am using a simple hash-and-salt scheme for storing passwords. I would like to use the Unit Separator and Record Separator ASCII codes to separate attributes and records in the database, but I can't find anything on them in Python. Here is my code so far:

import os
import hashlib
import sys

users = {}

def new_user():
    while True:
        username = str(input('Please Enter Your Desired Username > '))
        if username in users.keys():
            print('Username already exists, please try again')
        else:
            break
    while True:
        password = str(input('Please Enter a Password Longer than 5 Characters > '))
        if len(password) < 6:
            print('Too Short, Try Again')
        else:
            break
    password = str.encode(password)
    salt = os.urandom(256)
    hashed = hashlib.sha256(password+salt).hexdigest()
    new = {'username' : username,
           'password' : hashed,
           'salt' : salt }
    users[username] = new
    towrite = new['username']+'\x1f'+new['password']+'\x1f'+str(new['salt'])+'\x1e'
    with open('users.txt', 'a') as userfile:
        userfile.write(towrite)
    print('New User Created: Welcome %s' % username)
    return()

def login():
    while True:
        with open('users.txt', 'r') as userfile:
            users = userfile.read().split('\x1e')
            for user in users:
                user = user.split('\x1f')
        username = str(input('Please Enter Your Username > '))
        password = str(input('Please Enter Your Password > '))
        salt = user[2]
        password = hashlib.sha256(password + salt).hexdigest()
        for user in users:
            if (username == user[0]) and (password == user[1]):
                print('You\'re In!')
                return()
        print('Invalid Username or Password')

while True:
    choice = str(input('Continue Adding? > '))
    if choice == 'n':
        break
    else:
        new_user()

while True:
    choice = str(input('Continue Logging? > '))
    if choice == 'n':
        break
    else:
        login()

The problem here is that when joining the attributes in the new_user() function, the string that is written to the file does not contain '\x1e' or '\x1f'. How can I fix this or otherwise implement these control codes?
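
One way to check what actually lands in the file is to read it back in binary mode and look at its `repr`, since most text editors do not display control characters. This is only a minimal sketch: the file name `sep_check.txt` and the sample values are made up for illustration and are not part of the program above.

```python
import os
import tempfile

US = '\x1f'  # Unit Separator: goes between attributes
RS = '\x1e'  # Record Separator: goes between records

# Hypothetical sample record, written to a throwaway file
path = os.path.join(tempfile.gettempdir(), 'sep_check.txt')
record = 'alice' + US + 'abc123' + RS
with open(path, 'w') as f:
    f.write(record)

# Read the raw bytes back; repr() makes the control codes visible
with open(path, 'rb') as f:
    raw = f.read()
print(repr(raw))  # b'alice\x1fabc123\x1e'
os.remove(path)
```

If `\x1f` and `\x1e` show up in the printed bytes, the write side is fine and the problem is more likely in how the file is read back and split.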

PySheep
  • Using control characters as separators is from a much simpler and more primitive time. Try to keep the fields separated in a different way. P.S. opening the files in binary mode might fix your problems. – Mark Ransom Jan 25 '16 at 20:18
  • How would you suggest? Using other characters can be awkward as they might be included in the username or even in the string representation of the salt. – PySheep Jan 25 '16 at 20:21
  • Since this is a school project it might be appropriate to keep with a simple and primitive scheme. Anything that replaces it is likely to be more complex. A true database would be a good choice, or something like JSON. – Mark Ransom Jan 25 '16 at 20:44
  • If I do `open("myfile.txt", "w").write('hello\x1fthere')` then hexdump the file, I see: `68 65 6c 6c 6f 1f 74 68 65 72 65`, so are you sure your output doesn't contain the control chars? – Alastair McCormack Jan 25 '16 at 20:46
  • I'm looking in the file (it's just plain text) and there are no control characters, unless I'm missing something from the code? – PySheep Jan 25 '16 at 20:50
  • Control chars don't print. Use a hex editor, or 'hexdump' on Ubuntu/OS X. – Alastair McCormack Jan 25 '16 at 20:52
  • Your code works as written. How did you determine that the control characters were not written to the file? I'm 99% certain that Windows doesn't muck about with these codes even in text mode, but just in case, what platform are you testing on here? – Martijn Pieters Jan 25 '16 at 20:53
  • I've just tested your code. Adding a user creates `\x1f` in the right place. The risk you have is that the hash may also contain '\x1f'. You might want to consider using multiple `\x1f`s to reduce the risk of it appearing randomly. Or try an encoding mechanism like Protobuf or ASN.1, where data is stored in Tag-Length-Value blocks. – Alastair McCormack Jan 25 '16 at 20:54
  • @AlastairMcCormack: or storing a hexadecimal or base64 encoding of the hash. – Martijn Pieters Jan 25 '16 at 21:06
  • @AlastairMcCormack: actually, they already do that, so only letters and digits are used. So the chance of the hash containing a control character is exactly 0. Note the `.hexdigest()` in `hashed = hashlib.sha256(password+salt).hexdigest()`. – Martijn Pieters Jan 25 '16 at 21:07
  • @MartijnPieters, good point re `hexdigest` and the simple approach :) – Alastair McCormack Jan 25 '16 at 21:12
  • I managed to figure out what was going wrong. You're right, the program was writing the control characters; the problem lay in the splitting of the data in the login() function. Once that was done, it was just a matter of ironing out the storage of the salt, which didn't convert back from string to bytes in the program, but I got it sorted, so thanks guys :) – PySheep Jan 25 '16 at 21:12
  • Sorry it ended up being such a simple problem in the end – PySheep Jan 25 '16 at 21:13
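
For anyone landing here later, the two fixes described in the last comments (keeping the split records, and storing the salt in a form that converts back to bytes) might look roughly like the sketch below. It is an assumption about what the fix looked like, not the asker's actual code: the helper names `make_record`, `parse_records` and `check_login`, the 32-byte salt, and the sample users are all made up for illustration.

```python
import hashlib
import os

US, RS = '\x1f', '\x1e'  # attribute and record separators

def make_record(username, password, salt=None):
    """Serialize one user as: username US hash US salt-as-hex RS."""
    salt = os.urandom(32) if salt is None else salt
    hashed = hashlib.sha256(password.encode() + salt).hexdigest()
    # salt.hex() round-trips through bytes.fromhex(); str(salt) does not
    return US.join([username, hashed, salt.hex()]) + RS

def parse_records(text):
    """Return {username: (hash, salt_bytes)} from the file contents."""
    users = {}
    for record in text.split(RS):
        if not record:  # the trailing RS leaves an empty final piece
            continue
        username, hashed, salt_hex = record.split(US)
        users[username] = (hashed, bytes.fromhex(salt_hex))
    return users

def check_login(users, username, password):
    """Re-hash the password with that user's own salt and compare."""
    if username not in users:
        return False
    hashed, salt = users[username]
    return hashlib.sha256(password.encode() + salt).hexdigest() == hashed

# Stand-in for the contents of users.txt
data = make_record('alice', 'secret1') + make_record('bob', 'hunter22')
users = parse_records(data)
print(check_login(users, 'alice', 'secret1'))  # True
print(check_login(users, 'alice', 'wrong'))    # False
```

The key differences from the question's `login()` are that the result of each `split('\x1f')` is kept rather than discarded, and that the salt is looked up per user instead of being taken from whichever record happened to be split last.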
