I need to store a batch of UUIDs as integers in a NumPy array. Storing each UUID as a single 128-bit integer does not work, because the largest native integer dtype NumPy offers is 64 bits. I am therefore trying to store each UUID as 6 separate integers, using the fields attribute.

However, I am unable to recreate the UUID from the values in the NumPy array. Here is an example of the problem:

import uuid

import numpy as np

# generate a random uuid4
my_uuid = uuid.uuid4()
my_uuid
# UUID('bf6cc180-52e1-42fe-b3fb-b47d238ed7ce')

# encode the uuid to 6 integers with fields
my_uuid_fields = my_uuid.fields
my_uuid_fields
# (3211575680, 21217, 17150, 179, 251, 198449560475598)

# recreate the uuid, this works
uuid.UUID(fields=my_uuid_fields)
# UUID('bf6cc180-52e1-42fe-b3fb-b47d238ed7ce')

# create an array with the fields 
my_uuid_fields_arr = np.array(my_uuid_fields)
my_uuid_fields_arr
# array([3211575680,21217,17150,179,251,198449560475598])

# convert array back to tuple and check if it's the same as the initial fields tuple
assert tuple(my_uuid_fields_arr) == my_uuid_fields

# this fails however
uuid.UUID(fields = tuple(my_uuid_fields_arr))

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/scratch/6084717/ipykernel_43199/1198120499.py in <module>
----> 1 uuid.UUID(fields = tuple(my_uuid_fields_arr))

/hpc/compgen/users/mpages/software/miniconda3/envs/babe/lib/python3.7/uuid.py in __init__(self, hex, bytes, bytes_le, fields, int, version, is_safe)
    192         if int is not None:
    193             if not 0 <= int < 1<<128:
--> 194                 raise ValueError('int is out of range (need a 128-bit value)')
    195         if version is not None:
    196             if not 1 <= version <= 5:

ValueError: int is out of range (need a 128-bit value)

Any ideas on how to solve this?

Marc P

1 Answer

tuple(my_uuid_fields_arr) is a tuple of np.int64 values, while my_uuid_fields is a tuple of Python int. The values compare equal, which is why the assert passes, but uuid.UUID does not handle the fixed-width NumPy integers correctly.

Simply convert the NumPy integers to Python integers:

uuid.UUID(fields = tuple([int(i) for i in my_uuid_fields_arr]))

You can verify that this is the problem by checking the type of the first item in each tuple.
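As a minimal sketch of that check, the snippet below rebuilds the UUID from the question, compares the element types, and then does the round trip with plain Python integers. The names `restored_from_ints` and `restored_from_list` are only illustrative, and the explicit `dtype=np.int64` is an assumption made so that the element type is the same on every platform.

```python
import uuid

import numpy as np

# The UUID from the question, so the field values match the example there.
my_uuid = uuid.UUID("bf6cc180-52e1-42fe-b3fb-b47d238ed7ce")
my_uuid_fields = my_uuid.fields

# Assumption: dtype is set explicitly so the element type is platform independent.
my_uuid_fields_arr = np.array(my_uuid_fields, dtype=np.int64)

# The values compare equal, but the element types differ.
print(type(my_uuid_fields[0]))      # Python int
print(type(my_uuid_fields_arr[0]))  # NumPy integer scalar

# Both conversions yield plain Python ints, so the UUID can be rebuilt.
restored_from_ints = uuid.UUID(fields=tuple(int(i) for i in my_uuid_fields_arr))
restored_from_list = uuid.UUID(fields=tuple(my_uuid_fields_arr.tolist()))

print(restored_from_ints == my_uuid)
print(restored_from_list == my_uuid)
```

Either conversion should work; `tolist()` is probably the more convenient one when the array holds many rows of fields, since it converts nested arrays as well.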

sarema
The actual problem is the bit shifting `uuid.UUID` performs on the `int` or `numpy.int64`. `int(3211575680) << 96` evaluates to `254447239901898999706735475910225428480` whereas `np.int64(3211575680) << 96` evaluates to `0`. [This](https://stackoverflow.com/a/30514159/12479639) is a good answer as to why that happens. – Axe319 Jan 13 '22 at 11:28

Thanks a lot! Just found that `uuid.UUID(fields = my_uuid_fields_arr.tolist())` also works, I guess this method converts to `int` automatically. – Marc P Jan 13 '22 at 11:28
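To illustrate the bit-shifting point from the comment, here is a rough sketch that combines the six fields by hand with Python integers. It follows the standard UUID bit layout and is not copied from the `uuid` module's source; the variable names are illustrative only.

```python
import uuid

my_uuid = uuid.UUID("bf6cc180-52e1-42fe-b3fb-b47d238ed7ce")
time_low, time_mid, time_hi_version, clock_seq_hi, clock_seq_low, node = my_uuid.fields

# Standard UUID layout: 32 + 16 + 16 + 8 + 8 + 48 bits = 128 bits.
combined = (
    (time_low << 96)
    | (time_mid << 80)
    | (time_hi_version << 64)
    | (clock_seq_hi << 56)
    | (clock_seq_low << 48)
    | node
)

# Python ints grow as needed, so the shifted value keeps all of its bits.
print(combined == my_uuid.int)

# The first shifted field alone needs far more than the 64 bits of an np.int64.
print((time_low << 96).bit_length())
```

With fixed-width `np.int64` operands the same shifts cannot hold the result, which is consistent with the out-of-range error in the question.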