I have a time series of string-formatted UUIDs, and I would like Polars to translate them into u128 numbers for better storage and querying.

Similar to what we do with dates:

....str.strptime(pl.Datetime, fmt="%Y-%m-%dT%H:%M:%S.%fZ", strict=False)

Is this supported, or do I need to handle it on the Python side?

Also, I don't see a u128 type, but there's a Decimal that seems to be an i128. If I were to do my own translation, which type should I use?

P.S. I noticed a GitHub ticket in the Polars repository about supporting the Rust crate Uuid, but this conversion could arguably be implemented without that crate, so I am not sure whether it is already supported.

Jeremy Chone

1 Answer

Polars doesn't have a u128 dtype. If you can accept the loss of information (a u64 holds only half of a UUID's 128 bits), you can store them as u64; otherwise, keep them in a Utf8 column.

We don't have support for this yet, but we will also get FixedSizeBinary in the future, which could also fit this.
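
In the meantime, handling it on the Python side could look roughly like the sketch below. It uses only the standard-library `uuid` module; the helper names (`uuid_to_u64_pair`, `u64_pair_to_uuid`, `uuid_to_bytes`) are hypothetical and not part of Polars. Splitting each UUID into two 64-bit halves is an assumption of this sketch, one way to avoid the loss of a single u64, and the 16-byte form is what a binary column would hold.

```python
import uuid

MASK64 = (1 << 64) - 1  # selects the low 64 bits of a 128-bit integer


def uuid_to_u64_pair(text: str) -> tuple[int, int]:
    """Split a UUID string into (high, low) unsigned 64-bit halves."""
    value = uuid.UUID(text).int  # the UUID as a 128-bit Python int
    return value >> 64, value & MASK64


def u64_pair_to_uuid(hi: int, lo: int) -> str:
    """Rebuild the canonical UUID string from its two 64-bit halves."""
    return str(uuid.UUID(int=(hi << 64) | lo))


def uuid_to_bytes(text: str) -> bytes:
    """Return the 16-byte big-endian representation of a UUID string."""
    return uuid.UUID(text).bytes


# Illustrative value only, not taken from real request logs.
sample = "123e4567-e89b-12d3-a456-426614174000"
hi, lo = uuid_to_u64_pair(sample)
raw = uuid_to_bytes(sample)
```

The two lists of halves could then be loaded as two `pl.UInt64` columns, or the 16-byte values as a `pl.Binary` column. Equality lookups for correlation work on either form, at the cost of converting back with `u64_pair_to_uuid` when the string is needed.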

ritchie46
  • Thanks for the detailed response and insights into the roadmap. Having `u128` support with `str::strpuuid` would be fantastic. However, if that's not feasible, `FixedSizeBinary` would be a welcome alternative. Our use case involves server request logs with a request UUID on each line, which is needed for correlation. – Jeremy Chone Aug 13 '23 at 18:49