
Calculating the modulo with numpy for large integers (>63 bits) sometimes gives incorrect results.

For example:

```python
import numpy

numpy.mod(12345678912345679000, 3)
numpy.mod(12345678912345679001, 3)
numpy.mod(12345678912345679002, 3)
```

all give the result of 1.0. Note that there is no 8 between the second 7 and 9.

This could be due to the int being larger than 63 bits.
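If the operands are being converted to float64 somewhere along the way, the precision loss is easy to demonstrate with plain Python (a quick sanity check of my own, independent of numpy):

```python
# float64 has a 53-bit mantissa, so integers of this magnitude (~2**63)
# cannot all be represented exactly: consecutive values collapse onto
# the same float, which would explain identical 1.0 results.
x = 12345678912345679000
print(float(x) == float(x + 1) == float(x + 2))  # True: all three round to the same float64
print(x % 3, (x + 1) % 3, (x + 2) % 3)          # 1 2 0 with exact Python ints
```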


However, sometimes the correct results are output.

For example,

```python
numpy.mod(123456789123456789100, 3)
numpy.mod(123456789123456789101, 3)
numpy.mod(123456789123456789102, 3)
```

give the correct results of 1, 2, and 0, respectively. Note that these new ints are about 3 bits longer than before, with the addition of an 8 between the second 7 and the 9.

Any idea why `numpy.mod` behaves this way, and how I can work with large ints (>63 bits) consistently in numpy?

Thanks in advance!


I'm running Python 3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019, 22:22:05) [MSC v.1916 64 bit (AMD64)] on 64-bit windows with numpy v1.18.1.

Tim
  • `int64` is the normal `numpy` integer `dtype` – hpaulj Feb 04 '20 at 17:50
  • Try: `12345678912345679000%3`. Python integers are not limited to 64 bits. – hpaulj Feb 04 '20 at 17:58
  • If that is the reason, try changing `dtype` to `uint64` – Shubham Shaswat Feb 04 '20 at 18:00
  • `uint64` doesn't seem to help. Even if it did it would only gain one bit. – hpaulj Feb 04 '20 at 18:01
  • 1
    What complicates things is that `np.array(...)` creates `np.int64` numbers, then `np.uint64`, and the `object` dtype as they get bigger. With object dtype the `mod` uses the Python `%`, which is fine. There's a small region around 64 bits where the `mod` is producing an erroneous float. Elsewhere it is producing a valid integer. – hpaulj Feb 04 '20 at 18:09
  • @hpaulj thanks! That makes a lot of sense. The reason I want to use numpy is for the faster array operations since I have to take a bunch of mods. But I guess I can do normal python `%` for these troublemakers around 64 bits. – Tim Feb 04 '20 at 18:19
  • For single values or a small list, direct use of `%` will be faster than `np.mod` (even in a list comprehension). If it's already an array `np.mod` should be faster, especially for large arrays. But if you have to revert to object dtype arrays, the speed drops back to the list comprehension range. – hpaulj Feb 04 '20 at 19:29
  • Got it. I do have large arrays. Will set dtype to 'object' when necessary then. – Tim Feb 04 '20 at 19:46
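Based on the comments above, a minimal sketch of the object-dtype workaround (values taken from the question; the speed trade-off is as hpaulj describes):

```python
import numpy as np

# With dtype=object each element stays a Python int, so the modulo
# falls back to Python's exact arbitrary-precision %, regardless of
# how many bits the values need.
big = np.array([12345678912345679000,
                12345678912345679001,
                12345678912345679002], dtype=object)
print(big % 3)  # [1 2 0]
```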

0 Answers