I'm running into incorrect numpy results when the input to a calculation is a numpy array with a 32-bit integer dtype but the output contains larger numbers that require 64-bit representation.
Here's a minimal working example:
import numpy as np

arr = np.ones(5, dtype=int) * (2**24 + 300) # arr.dtype is 'int32' here (what dtype=int maps to is platform-dependent)
# Following comment from @hpaulj I changed the first line, which was originally:
# arr = np.zeros(5, dtype=int)
# arr[:] = 2**24 + 300
single_value_calc = 2**8 * (2**24 + 300)
numpy_calc = 2**8 * arr
print(single_value_calc)
print(numpy_calc[0])
# RESULTS
# 4295044096
# 76800
The desired output is for the numpy array to contain the correct value of 4295044096, which requires 64 bits to represent. That is, I would have expected numpy arrays to upcast automatically from int32 to int64 when the output requires it, rather than maintaining a 32-bit output and wrapping around once the value exceeds 2^32 (note that 76800 == 4295044096 mod 2^32).
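For what it's worth, the platform default for dtype=int varies, so on a machine where it is already 64-bit the same wraparound can be reproduced by forcing int32 explicitly, e.g.:

import numpy as np

# Force int32 so the overflow reproduces regardless of the platform's default int.
arr32 = np.full(5, 2**24 + 300, dtype=np.int32)

print((2**8 * arr32)[0])                    # 76800 (result stays int32 and wraps)
print((2**8 * arr32.astype(np.int64))[0])   # 4295044096 (correct after upcasting)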
Of course, I can fix the problem manually by forcing int64 representation:
numpy_calc2 = 2**8 * arr.astype('int64')
but this is undesirable in general code, since the output will only need 64-bit representation (i.e. to hold large numbers) in some cases, not all. In my use case performance is critical, so unconditionally upcasting every time would be costly.
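The best workaround I've come up with so far is to check the input range first and upcast only when the result could actually overflow, along these lines (just a sketch; scale_safely is an illustrative name, and it assumes a positive integer scale):

import numpy as np

def scale_safely(arr, scale):
    # Upcast to int64 only if scale * (largest magnitude in arr) could
    # overflow the array's current integer dtype.
    limit = np.iinfo(arr.dtype).max
    peak = max(int(arr.max()), -int(arr.min()))  # one extra pass over the data
    if peak > limit // scale:
        arr = arr.astype(np.int64)
    return scale * arr

but the extra min/max pass has its own cost, which is why I'm hoping there's a cleaner built-in approach.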
Is this the intended behaviour of numpy arrays? And if so, is there a clean, performant solution please?