
Because a numpy array is stored contiguously, appending to or extending it does not happen in place; instead, a new copy of the array is created with enough room for the additional elements to be stored contiguously (see https://stackoverflow.com/a/13215559/3286832).

To avoid that, and assuming we are lucky enough to know the specific number of elements we expect the array to have, we can create a numpy array with a fixed size filled with zeros:

import numpy as np

a = np.zeros(shape=(100,))  # [0. 0. 0. ... 0. 0. 0.]

Say that we want to populate this array one element at a time, each with a new value (e.g. provided by the user), by editing the array in place:

pos = 0
a[pos] = 0.002              # [0.002 0. 0. ... 0. 0. 0.]

pos = pos + 1
a[pos] = 0.101              # [0.002 0.101 0. ... 0. 0. 0.]

# etc.

pos = -1
a[pos] = 42.00              # [0.002 0.101 ... ... ... 42.]

Question:

Is there a way to keep track of the next available position pos (i.e. the first position not yet populated with a new input value) instead of manually incrementing pos each time?

Is there an efficient way to achieve this, preferably in numpy? Or is there a way to achieve this with another Python library (e.g. scipy or Pandas)?

(Edited the question in response to the comments and initial answers, which pointed out that my initial phrasing was unclear. I hope it is clearer now.)

Yannis
  • Your question is a bit confusing. In your example code, you are not "appending" anything to the array. To "append" would mean to add values beyond the boundaries of the existing array. Instead, you are changing values in the array that already have space allocated for them. Thus, reallocation does not need to occur. – Jussi Nurminen Feb 18 '20 at 12:59
  • @jussinurminen You are right, I misused the terms 'append' and 'extend'. I will edit my question to remove those and make it clear that I just want to insert new values in this fixed-size-instantiated-with-zeros array one by one. – Yannis Feb 18 '20 at 13:06
  • I'm still not sure what you want to accomplish. Your example code is fine; it changes the existing values in the array. Do you want to do something else? – Jussi Nurminen Feb 18 '20 at 13:12
  • @jussinurminen I just edited my question to be clearer based on your comments and initial answers - hope that makes it clearer. – Yannis Feb 18 '20 at 14:20
  • If you want to use `numpy` with a preallocated array, you need to keep track of `pos`. Otherwise, there's no way to know which positions were previously written. (Of course you could try tricks like preallocating the array with a special value such as `np.nan` to keep track of unwritten positions, but that's even more clumsy than keeping track of the index). If you want simpler code, an alternative technique might be to use a Python list to collect the values (using the `.append` method) and convert it to a numpy array afterwards. – Jussi Nurminen Feb 18 '20 at 14:28

2 Answers


If I understand you correctly, you need some kind of circular buffer. Python has collections.deque for this purpose.
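
As a rough sketch of the deque idea (the capacity of 5 and the sample values are arbitrary choices for illustration, not taken from the question), values can be collected without any position bookkeeping and converted to a numpy array at the end:

```python
from collections import deque

import numpy as np

# Fixed-capacity buffer; the size 5 is an arbitrary illustrative choice.
buf = deque(maxlen=5)

# Values arrive one at a time (e.g. from the user); no index is tracked.
for value in [0.002, 0.101, 42.0]:
    buf.append(value)

# Convert once all values have been collected.
arr = np.array(buf)
```

Note that a deque created with maxlen silently drops the oldest item once it is full, so this only matches the question's intent if overwriting old values is acceptable.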

Here is my custom implementation of a circular buffer using h5py, but you can adapt it to numpy.

Update: As already mentioned in the comments, it is impossible to track changes to an np.array out of the box. Instead, you can implement your own class and track all the necessary changes there (see my implementation as an example, i.e. concatenate arrays to extend the size). I'd suggest using a Python list if you need appending, or a deque if you need a fixed size. Both can then be converted to an np.array.
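
A minimal sketch of such a class, assuming a fixed size known in advance as in the question. The name FillArray and its methods push and filled are hypothetical, invented here for illustration; this is not the h5py implementation mentioned above:

```python
import numpy as np


class FillArray:
    """Hypothetical helper: a preallocated array plus a write cursor."""

    def __init__(self, size):
        self.data = np.zeros(size)
        self.pos = 0  # index of the next unwritten slot

    def push(self, value):
        # Refuse to write past the preallocated space.
        if self.pos >= self.data.size:
            raise IndexError("array is full")
        self.data[self.pos] = value  # in-place write, no reallocation
        self.pos += 1

    def filled(self):
        # Basic slicing returns a view of the slots written so far.
        return self.data[:self.pos]


a = FillArray(100)
a.push(0.002)
a.push(0.101)
```

The position is still tracked, but only in one place, so the calling code never increments pos itself.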

Aray Karjauv
  • I just edited my question to be clearer based on your comments and initial answers - hope that makes it clearer. – Yannis Feb 18 '20 at 14:21

Actually, your question is still confusing to me. How do you define the new value you want to insert at the new position? Does it come from outside your code? Do you have all the new values for your array, or only some of them?

You can probably use numpy slices, which are well suited to fast in-place updates of an array; however, I'm not sure that this is what you want to do.

Some samples for you:

>>> import numpy as np
>>> a = np.zeros(shape=(10,))
>>> a
array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
>>> a[3:6] += 1
>>> a
array([0., 0., 0., 1., 1., 1., 0., 0., 0., 0.])
>>> a[:4] += .001
>>> a
array([1.000e-03, 1.000e-03, 1.000e-03, 1.001e+00, 1.000e+00, 1.000e+00,
       0.000e+00, 0.000e+00, 0.000e+00, 0.000e+00])
>>> a[3:5] = [2, 1]
>>> a
array([1.e-03, 1.e-03, 1.e-03, 2.e+00, 1.e+00, 1.e+00, 0.e+00, 0.e+00,
       0.e+00, 0.e+00])
>>>
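
If several new values arrive together, the slice approach can be combined with the pos counter from the question. This is only a sketch under that assumption; the batch values are made up for illustration:

```python
import numpy as np

a = np.zeros(shape=(10,))
pos = 0  # index of the next unwritten slot

batch = [0.002, 0.101, 0.5]  # illustrative values arriving together
end = pos + len(batch)

# Guard explicitly: slices are clipped to the array bounds, so an
# oversized batch should not be left to the assignment itself to catch.
if end > a.size:
    raise ValueError("batch does not fit in the remaining space")

a[pos:end] = batch  # one in-place slice assignment for the whole batch
pos = end           # advance the cursor once per batch
```

This way pos is updated once per batch rather than once per element.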
VMAtm
  • I just edited my question to be clearer based on your comments and initial answers - hope that makes it clearer. – Yannis Feb 18 '20 at 14:21
  • As I said, in this case it looks like you can save time by updating the array in batches via slices. – VMAtm Feb 18 '20 at 14:35
  • Yes, that is useful when I know (at least some of) the new inputted values in advance. – Yannis Feb 18 '20 at 14:42