The SciPy documentation says:

The function zeros creates an array full of zeros, the function ones creates an array full of ones, and the function empty creates an array whose initial content is random and depends on the state of the memory. By default, the dtype of the created array is float64.

So I ran this code:

import numpy as np
np.empty((1,2))

And it returned:

array([[  6.92892901e-310,   8.42664136e-317]])

So it returns random numbers, and everything is fine.

But when I ran that code a second time (in the same shell), it returned an array of zeros!

np.empty((1,2))
array([[ 0.,  0.]])

So here is the question: why does it return an array of zeros the second time (instead of random numbers)?

2 Answers

It's not random; it depends on whatever was left in the bytes of memory that your computer gave NumPy when it requested space for the array. If there is something other than zeros in there, those bytes are interpreted with the requested dtype (seemingly random, but a better word would be unpredictable).

In your example you didn't save the first array, so the memory for the first array was immediately reused.

>>> import numpy as np
>>> print(id(np.empty((20))))
2545385324992
>>> print(id(np.empty((20))))
2545385324992

Now comes the amazing part: It seems Python (or NumPy or your OS) zeros that memory before it gives it to NumPy again.

If you create a bigger array, then it won't be "zero" because it's taken from somewhere else:

>>> print(np.empty((1, 2)))
[[  1.25757479e-311   1.25757479e-311]]
>>> print(np.empty((1, 3)))
[[  4.94065646e-324   9.88131292e-324   1.25757705e-311]]
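
Note that `id()` reports the identity of the temporary Python object, not the address of the data buffer itself. A minimal sketch for looking at the buffer address directly is shown here; the helper name `data_address` is made up for illustration, and whether the two addresses match depends on the allocator, so neither outcome is guaranteed:

```python
import numpy as np


def data_address(arr):
    """Return the address of the first byte of the array's data buffer.

    Hypothetical helper for illustration; it reads the pointer that
    NumPy exposes through the array interface.
    """
    return arr.__array_interface__["data"][0]


first = np.empty(20)
addr_first = data_address(first)
del first  # drop the only reference so the buffer can be freed

second = np.empty(20)
addr_second = data_address(second)

# Reuse is an implementation detail: this may print True or False.
print("same buffer reused:", addr_first == addr_second)
```

If the two addresses are equal, the second array is sitting on the memory that the first one just released, which is the reuse described above.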
MSeifert
  • Interesting! I wouldn't have expected it to be the exact same memory. But is that behaviour guaranteed? – Paul Panzer Jan 31 '17 at 02:38
  • @PaulPanzer I'm almost certain both the zero'ing and the reuse is an implementation detail (so it's not guaranteed). But the latter makes sense because it's probably more efficient to reuse recently free'd memory if exactly the same "size" is requested again so it's not unlikely. – MSeifert Jan 31 '17 at 02:43
  • That is really puzzling. If there are any good reasons for zeroing the memory the second time. Why the heck don't they apply the first time? – Paul Panzer Jan 31 '17 at 03:07
  • I tried running this code in IDLE, and the memory was not zeroed the second time, or the third time. It did get zeroed eventually, but I didn't do anything differently. More weirdly, the id changed from 206546944 to 206546848 after I printed it, but the numbers in the ndarray didn't change. It changed back to the original 206546944 at some unknown point, still without changing the numbers in the ndarray. I was calling np.empty(20) every time, without storing the resulting ndarray. – Allison B Jun 07 '22 at 22:11
  • @AllisonB I think it's not too important to know when something is reused or not. If you want a random array you should use `np.random.*` and if you want a filled array you can use `np.zeros`, `np.ones` or `np.full` (I guessed the last one - maybe it's named differently). The reusing of objects is an implementation detail inside Python/NumPy and it's not too surprising that something changed in the last 5 years. If you want deterministic reuse: one needs to reuse the array themself. – MSeifert Jun 08 '22 at 15:45
  • @MSeifert The fact that it changed isn't surprising at all! There is always something interesting to learn from the implementation, so you're welcome for doing the exploration. – Allison B Jun 08 '22 at 17:04

The wording of the docs seems a bit unfortunate in this case. They do not mean random in the sense of a proper random number generator. If the latter is what you need you can use one of the functions in numpy.random or scipy.stats.

A better word for describing numpy.empty would be "undefined", meaning that you, the user, can't make any assumptions about the values initially in the returned array. empty is the cheapest way of creating an array if you know you will overwrite its content anyway. The computer will just grab some memory for you. If that memory was not yet used in that session, chances are it will appear random. But your computer also recycles memory.
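
As a rough sketch of how this plays out in practice (the variable names and the seed below are chosen only for illustration): use empty when every element gets written before it is read, and pick an explicit constructor or a random generator when the initial values matter.

```python
import numpy as np

# np.empty is fine here because every element is overwritten before use.
out = np.empty(5)
for i in range(out.size):
    out[i] = i * i

# When the initial contents matter, ask for them explicitly.
zeros = np.zeros((1, 2))        # all elements are 0.0
sevens = np.full((1, 2), 7.0)   # all elements are 7.0

# For actual random numbers, use a random number generator.
rng = np.random.default_rng(seed=0)  # seed is an arbitrary choice
samples = rng.random((1, 2))         # uniform floats in [0, 1)
```

Reading from `out` before the loop has filled it would give the same kind of undefined values as in the question.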

I have to admit I don't really know what recycled memory looks like, but two plausible possibilities would be:

  • it contains what the program that used it before happened to write
  • or perhaps the OS has overwritten it with zeros for security reasons

Either possibility would explain what you are seeing.

Paul Panzer