140

What is the easiest way to generate a random hash (MD5) in Python?

SilentGhost
mistero
  • 1
    Random as in for anything? Or for an object? If you just want a random MD5, just pick some numbers. – samoz Jun 10 '09 at 16:06
  • I am renaming files before uploading and want a filename like this: timestamp_randommd5.extension Cheers! – mistero Jun 10 '09 at 16:15
  • 6
    You could just rename them to timestamp_randomnumber.ext. There really isn't a reason why md5(randomnumber) would be any better than randomnumber itself. – sth Jun 11 '09 at 01:36
  • best answer for Python 3 is the last one `import uuid; uuid.uuid4().hex` http://stackoverflow.com/a/20060712/3218806 – maxbellec Jul 19 '16 at 08:39
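
For the use case described in the comments (a name of the form timestamp_random.extension), a minimal sketch might look like the following. The helper name `upload_name`, its `now` parameter, and the choice of an integer Unix timestamp are assumptions for illustration; the question does not specify them. It relies on the secrets module, so it needs Python 3.6 or later.

```python
import os
import secrets
import time


def upload_name(original_name, now=None):
    """Return '<unix timestamp>_<32 hex chars><original extension>'."""
    # Hypothetical helper: the integer timestamp format is an assumption.
    timestamp = int(time.time() if now is None else now)
    # 16 random bytes -> 32 hex characters, the same length as an MD5 digest.
    token = secrets.token_hex(16)
    # splitext keeps the leading dot, e.g. '.jpg' (or '' if there is none).
    extension = os.path.splitext(original_name)[1]
    return "%d_%s%s" % (timestamp, token, extension)


print(upload_name("photo.jpg"))
```

As one of the comments points out, hashing the random value would add nothing here: the random hex string is already as unique as its MD5 would be.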

10 Answers

174

An MD5 hash is just a 128-bit value, so if you want a random one:

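import random

# 128 random bits, the same size as an MD5 digest
value = random.getrandbits(128)

# %032x zero-pads the result to the full 32 hex digits
print("hash value: %032x" % value)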

I don't really see the point, though. Maybe you should elaborate why you need this...

sth
151

I think what you are looking for is a universally unique identifier (UUID). Python's uuid module provides exactly that.

import uuid
uuid.uuid4().hex

uuid4() gives you a random unique identifier that has the same length as an MD5 sum. The .hex attribute returns it as a 32-character hex string instead of a UUID object.

http://docs.python.org/2/library/uuid.html

https://docs.python.org/3/library/uuid.html

sebs
90

The secrets module was added in Python 3.6. It provides cryptographically secure random values with a single call. The token functions take an optional nbytes argument; the default is 32 bytes (32 * 8 = 256-bit tokens). MD5 hashes are 128 bits, so pass 16 for "MD5-like" tokens.

>>> import secrets

>>> secrets.token_hex(nbytes=16)
'17adbcf543e851aa9216acc9d7206b96'

>>> secrets.token_urlsafe(16)
'X7NYIolv893DXLunTzeTIQ'

>>> secrets.token_bytes(128 // 8)
b'\x0b\xdcA\xc0.\x0e\x87\x9b`\x93\\Ev\x1a|u'
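
As a quick sanity check of the sizes involved (a sketch that is not part of the original answer; the variable names are illustrative), the three calls can be compared by the length of what they return for the same nbytes:

```python
import secrets

nbytes = 128 // 8  # an MD5 digest is 128 bits, i.e. 16 bytes

hex_token = secrets.token_hex(nbytes)      # 2 hex characters per byte
raw_token = secrets.token_bytes(nbytes)    # raw bytes, not printable text
url_token = secrets.token_urlsafe(nbytes)  # Base64 text with the padding stripped

print(len(hex_token), len(raw_token), len(url_token))  # prints: 32 16 22
```

Only token_hex produces something that looks like an MD5 hex digest; the other two carry the same amount of randomness in a different encoding.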
Nick T
50

This works for both Python 2.x and 3.x:

import os
import binascii
# hexlify returns bytes on Python 3, so decode to get a plain string
print(binascii.hexlify(os.urandom(16)).decode('ascii'))
4a4d443679ed46f7514ad6dbe3733c3d
Zitrax
Buttons840
25

Yet another approach. You won't have to format an int to get it.

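import random
import string

def random_string(length):
    # ascii_letters and range work on both Python 2 and 3
    # (string.letters and xrange are Python 2 only)
    pool = string.ascii_letters + string.digits
    return ''.join(random.choice(pool) for _ in range(length))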

Gives you flexibility on the length of the string.

>>> random_string(64)
'XTgDkdxHK7seEbNDDUim9gUBFiheRLRgg7HyP18j6BZU5Sa7AXiCHP1NEIxuL2s0'
Matthew Taylor
6

Another approach to this specific question:

import random, string

def random_md5like_hash():
    # string.hexdigits[:16] is '0123456789abcdef'
    available_chars = string.hexdigits[:16]
    return ''.join(
        random.choice(available_chars)
        for dummy in range(32))

I'm not saying it's faster or preferable to any other answer; just that it's another approach :)

tzot
5
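import uuid
from hashlib import md5  # the standalone md5 module was removed in Python 3

# md5() needs bytes, so encode the UUID string first
print(md5(str(uuid.uuid4()).encode('ascii')).hexdigest())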
Sam
5
import os, hashlib
hashlib.md5(os.urandom(32)).hexdigest()
gizzmole
2

The most appropriate way is to use the random module:

import random
format(random.getrandbits(128), '032x')

Using secrets is overkill: it generates cryptographically strong randomness at the cost of performance.

All answers that suggest using a UUID are intrinsically wrong, because UUIDs (even UUID4) are not totally random. At the very least, they include a fixed version number that never changes.

>>> import uuid
>>> uuid.uuid4()
UUID('8a107d39-bb30-4843-8607-ce9e480c8339')
>>> uuid.uuid4()
UUID('4ed324e8-08f9-4ea5-bc0c-8a9ad53e2df6')

Any MD5-like value with something other than 4 in the 13th position from the left is unreachable this way.
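
The fixed digits can be checked directly. The sketch below assumes only the standard uuid module; the sample size of 1000 is arbitrary. Besides the version digit, the 17th hex digit is also constrained: its top two bits hold the UUID variant, so it can only be 8, 9, a or b. In total 6 of the 128 bits are fixed and 122 are random.

```python
import uuid

samples = [uuid.uuid4().hex for _ in range(1000)]

# Index 12 is the 13th hex digit: the UUID version, always '4' for uuid4().
version_digits = {s[12] for s in samples}
# Index 16 is the 17th hex digit: its top two bits are the fixed variant '10'.
variant_digits = {s[16] for s in samples}

print(version_digits)          # {'4'}
print(sorted(variant_digits))  # drawn only from '8', '9', 'a', 'b'
```

Whether 122 random bits instead of 128 matters depends on the use; for generating unique file names it is unlikely to.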

  • `os.urandom(128//8)` takes me 5x as long, or 0.25 microseconds longer to compute. If you care enough that you need to save 1 second every 4 million hashes you generate, you should get a faster RNG like PCG. If you think it will *ever* matter that your 'hashes' are unguessable/unpredictable, you should use a CSPRNG as it's trivially easy to [reconstruct the RNG state](https://github.com/eboda/mersenne-twister-recover) after using your call ~160 times. – Nick T Aug 21 '22 at 22:49
  • I don't understand your MD5 comment. Please explain. – not2qubit Aug 24 '22 at 14:02
  • With UUID4 you will generate a string that always contains `4` at 13th position. It is an issue. – Pasha Podolsky Aug 24 '22 at 14:33
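
If unpredictability matters but the random module's interface is preferred, one option (a sketch, not something proposed in the thread) is random.SystemRandom. It exposes the same methods as the module-level functions but draws from the operating system's entropy source, as os.urandom does, rather than from the Mersenne Twister. The names `secure_rng` and `token` are illustrative.

```python
import random

# SystemRandom reads from the OS entropy source instead of keeping a
# Mersenne Twister state, so there is no internal state to recover
# from previously observed outputs.
secure_rng = random.SystemRandom()

# Zero-pad to 32 hex digits so the result always has MD5-like length.
token = format(secure_rng.getrandbits(128), "032x")
print(token)
```

This trades some speed for unpredictability, which is the same trade-off the comment above describes for os.urandom.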
1

from hashlib import md5
text = input('Enter the data to be hashed: ')  # a str; md5() needs bytes, so it is encoded below
digest = md5(text.encode('utf-8')).hexdigest()
print(digest)

Note that MD5 is a very weak hash function: collisions have been found (two different inputs that produce the same hash). For a random hash, just use a random value as the input.

save_jeff
Eric Jin