Factorializing Medium Numpy Ints Creates Runtime Warning

Question

I am checking run times on factorials (have to use the user-defined function), but I receive an odd error. Code I'm working with is as follows:

import numpy as np
import time
np.random.seed(14)
nums = list(np.random.randint(low=100, high=500, size=10))
# nums returns as [207, 444, 368, 427, 349, 458, 334, 256, 238, 308]
def fact(x):
    # Recursive factorial; x <= 1 also stops the recursion for x == 0.
    if x <= 1:
        return 1
    else:
        return x * fact(x-1)

recursion_times = []
recursion_factorials = []

for i in nums:
    t1 = time.perf_counter()
    factorial = fact(i)
    t2 = time.perf_counter()
    execution = t2-t1
    recursion_factorials.append(factorial)
    recursion_times.append(execution)
    print(execution)

When I run the above, I get the following warning: RuntimeWarning: overflow encountered in long_scalars

But when I run it as below, I get no warnings.

recursion_times = []
recursion_factorials = []
for i in [207, 444, 368, 427, 349, 458, 334, 256, 238, 308]:
    t1 = time.perf_counter()
    factorial = fact(i)
    t2 = time.perf_counter()
    execution = t2-t1
    recursion_factorials.append(factorial)
    recursion_times.append(execution)
    print(execution)

I know it's a bit of extra overhead to call the list nums, but why would it trigger a runtime warning? I've tried digging around but I only get dynamically-named variable threads and warning suppression libraries - I'm looking for why this might happen.

For what it's worth, I'm running Python3 in a jupyter notebook. Glad to answer any other questions if it will help.

Thanks in advance for the help!

I don't think it has anything to do with your variable names. You end up with numbers too big to store in the data type of your choice. Are you using numpy? — Selcuk, Jan 22 '20 at 04:48
@Selcuk I am, I created the list using np.random.randint(low=100, high=500, size=10). I've also tried putting in dtype=np.int64 as an optional argument (I believe the default is int32). — TJ15, Jan 22 '20 at 04:52
In the future, please post something that we can actually run, that reproduces the error when run. Particularly, without the definition of `fact`, we can't run this. Also, unless `fact` itself uses NumPy directly, the overflow is probably being triggered by your use of a NumPy array as input, in which case passing a list (as the posted version of the code does) would not trigger it. — user2357112, Jan 22 '20 at 04:55
@user2357112supportsMonica Good point, have edited the original post to define fact(). Sorry about that — TJ15, Jan 22 '20 at 04:57
Numpy's default integer size is `int32`. You can only compute up to `12!` using an `int32`. Note that even `100!` is a **huge** number (it has 158 digits). — Selcuk, Jan 22 '20 at 05:04
Am I correct in thinking that you want to remove the second definition of `nums` in your 2nd code block, 3rd line, as you've already defined it in the first code block? — PeptideWitch, Jan 22 '20 at 05:06
@PeptideWitch yes, I was using that for quick reference before I showed work on how I used `np.random.randint()`, I've edited that out. — TJ15, Jan 22 '20 at 05:07
@Selcuk I've tried using the optional `dtype=np.int64` argument in `np.random.randint()` but it doesn't appear to help (I recognize how large the numbers are here, though). I understand why an overflow warning would come through in general, but I'm confused about the inconsistent behavior when calling the list directly versus referencing `nums` — TJ15, Jan 22 '20 at 05:09

user2357112 · Accepted Answer · 2020-01-22T05:19:04.283

3

If (as in the current version of your post) you created nums by calling list on a NumPy array, but wrote an explicit list literal with no NumPy for the second test, then the second test gives no warning because it's not using NumPy. nums is a list of NumPy fixed-width integers, while the other list is a list of ordinary Python ints. Ordinary Python ints don't overflow.
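A minimal sketch of that difference, assuming a recursive fact like the one in the question. The input 25 and the names np_value and py_value are chosen only for illustration; np.errstate is used here just to keep the demo quiet, it is not part of the original code.

```python
import math
import numpy as np

def fact(x):
    # Same recursive shape as in the question.
    return 1 if x <= 1 else x * fact(x - 1)

np_value = list(np.array([25]))[0]   # NumPy fixed-width integer
py_value = [25][0]                   # ordinary Python int

print(type(np_value))  # a NumPy integer type; the exact width is platform-dependent
print(type(py_value))  # <class 'int'>

with np.errstate(over="ignore"):     # silence the overflow warning for this demo
    np_result = fact(np_value)       # fixed-width arithmetic overflows and wraps
py_result = fact(py_value)           # arbitrary precision, so the result is exact

print(py_result == math.factorial(25))        # True
print(int(np_result) == math.factorial(25))   # False: 25! does not fit in 64 bits
```

Every multiplication in the recursion keeps the NumPy type of x, which is why the overflow shows up even though fact itself never imports NumPy.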

(If you want to create a list of ordinary Python scalars from a NumPy array, the way to do that is with array.tolist(). This is usually undesirable due to performance implications, but it is occasionally necessary to interoperate with code that chokes on NumPy types.)
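As a rough illustration of the two conversions (the variable names here are made up for the example):

```python
import numpy as np

arr = np.random.randint(low=100, high=500, size=10)

nums_numpy = list(arr)       # elements stay NumPy fixed-width integers
nums_python = arr.tolist()   # elements become ordinary Python ints

print(type(nums_numpy[0]))   # a NumPy integer type (width depends on platform)
print(type(nums_python[0]))  # <class 'int'>
```

Passing nums_python to a pure-Python factorial would then use arbitrary-precision arithmetic throughout.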

There would usually be an additional effect due to the default Python warning handling. By default, Python only emits a warning once per code location per Python process. In the original version of your question, it looked like this was causing the difference.
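The once-per-location behavior can be sketched with the standard warnings module alone. The function noisy and its message are hypothetical stand-ins for the NumPy overflow; a notebook environment may install different filters, so treat this as an illustration of the "default" action rather than of what Jupyter does.

```python
import warnings

def noisy():
    # One fixed code location that always issues the same warning.
    warnings.warn("overflow-like warning", RuntimeWarning)

# "default" action: report the first occurrence per location only.
with warnings.catch_warnings(record=True) as caught_default:
    warnings.simplefilter("default")
    for _ in range(3):
        noisy()

# "always" action: report every occurrence.
with warnings.catch_warnings(record=True) as caught_always:
    warnings.simplefilter("always")
    for _ in range(3):
        noisy()

print(len(caught_default))  # 1
print(len(caught_always))   # 3
```

So a warning that has already been shown once from a given line may stay silent on later calls, which can make two otherwise identical runs look different.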

Using a variable or not using a variable has no effect on this warning.

edited Jan 22 '20 at 05:19

answered Jan 22 '20 at 04:48

user2357112


I wouldn't think so, but if I run the same cell again (or another cell with the same code), I retain the warning. I've also tried changing the cell in place and commenting out the line that calls the list `nums`. It still holds that whenever I call `nums` instead of the list directly, I receive the warning. Do you know why that would be? – TJ15 Jan 22 '20 at 04:54
@TJ15: Probably some sort of ordering effect or weird non-default warning handling, or some additional change you made that you didn't realize was significant. It's unlikely to have anything to do with whether you use a variable for the input. – user2357112 Jan 22 '20 at 05:08
For example, if you wrote `for i in [207, 444, 368, 427, 349, 458, 334, 256, 238, 308]` instead of `for i in list(np.random.randint(low=100, high=500, size=10))`, then you would be using ordinary Python ints instead of NumPy fixed-width integers, and you wouldn't get this warning. Coupled with different warning handling in the notebook that shows warnings more often when you are using NumPy, that could explain things. – user2357112 Jan 22 '20 at 05:11
Excellent point. Answers to [this question](https://stackoverflow.com/questions/38155039/what-is-the-difference-between-native-int-type-and-the-numpy-int-types) go into a bit more detail on the Python built-in int type versus np.int64 or np.int32. Thanks for your help! – TJ15 Jan 22 '20 at 05:14
