Python C-API int128 support

Question

In python one can handle very large integers (for instance uuid.uuid4().int.bit_length() gives 128), but the largest int datastructure the C-API documentation offers is long long, and it is a 64-bit int.

I would love to be able to get a C int128 from a PyLong, but it seems there is no tooling for this. PyLong_AsLongLong for instance cannot handle python integers bigger than 2**64.

Is there some documentation I missed, and it is actually possible?
Is there currently not possible, but some workaround exist? (I would love to use the tooling available in the python C-API for long long with int128, for instance a PyLong_AsInt128AndOverflow function).
Is it a planed feature in a forthcoming python release?

I think you're confusing python's arbitrary length integers with c's integer types. Assuming that `uuid.uuid4().int.bit_length() == 128` (which is not always true), you could do something like `print((uuid.uuid4().int * 4).bit_length())` and you should get a number <= 130. That shouldn't suggest that Python is using "int130" -- only that Python supports arbitrary length integers. — jedwards, Jan 20 '19 at 15:03
I edited my post so it is more accurate. The only thing I would like to do is to get a C `int128` from a python `PyLong`. — azmeuk, Jan 20 '19 at 16:33

Mad Physicist · Answer 1 · 2019-01-20T18:30:01.463

There are a couple of different ways you can access the level of precision you want.

Systems with 64-bit longs often have 128-bit long longs. Notice that the article you link says "at least 64 bits". It's worth checking sizeof(long long) in case there's nothing further to do.

Assuming that is not what you are working with, you'll have to look closer at the raw PyLongObject, which is actually a typedef of the private _longobject structure.

The raw bits are accessible through the ob_digit field, with the length given by ob_size. The data type of the digits, and the actual number of boots they hold is given by the typedef digit and the macro PYLONG_BITS_IN_DIGIT. The latter must be smaller than 8 * sizeof(digit), larger than 8, and a multiple of 5 (so 30 or 15, depending on how your build was done).

Luckily for you, there is an "undocumented" method in the C API that will copy the bytes of the number for you: _PyLong_AsByteArray. The comment in longobject.h reads:

/* _PyLong_AsByteArray: Convert the least-significant 8*n bits of long
   v to a base-256 integer, stored in array bytes.  Normally return 0,
   return -1 on error.
   If little_endian is 1/true, store the MSB at bytes[n-1] and the LSB at
   bytes[0]; else (little_endian is 0/false) store the MSB at bytes[0] and
   the LSB at bytes[n-1].
   If is_signed is 0/false, it's an error if v < 0; else (v >= 0) n bytes
   are filled and there's nothing special about bit 0x80 of the MSB.
   If is_signed is 1/true, bytes is filled with the 2's-complement
   representation of v's value.  Bit 0x80 of the MSB is the sign bit.
   Error returns (-1):
   + is_signed is 0 and v < 0.  TypeError is set in this case, and bytes
     isn't altered.
   + n isn't big enough to hold the full mathematical value of v.  For
     example, if is_signed is 0 and there are more digits in the v than
     fit in n; or if is_signed is 1, v < 0, and n is just 1 bit shy of
     being large enough to hold a sign bit.  OverflowError is set in this
     case, but bytes holds the least-significant n bytes of the true value.
*/

You should be able to get a UUID with something like

PyLongObject *mylong;
unsigned char myuuid[16];

_PyLong_AsByteArray(mylong, myuuid, sizeof(myuuid), 1, 0);

Python C-API int128 support

1 Answers1