How do you return a changed Python numpy array size reference parameter

Question

This is a question about passing parameters to Python functions. I want a Python function to adjust the size of a numpy array that is one of the function's reference parameters.

The content of the passed array appears to change, inside and outside the function. Somehow the updated size/shape of the array object is not being exported from the function, even though I thought Python passed parameters by reference. I am new to Python programming and would have expected all aspects of the object to be updated by reference. Do I need to explicitly "export" the change?

#!/opt/local/bin/python2.7

# Function Test returning changed array
import numpy

def adjust( a1, a2 ) :
  " Adjust passed arrays (my final function will choose which one to adjust from content) "
  print str(a1.shape) + " At start inside function"
  a1[-1,0] = 99
  a1 = numpy.delete(a1, -1, 0)
  print str(a1.shape) + " After delete inside function"
  return None


d1 = numpy.array( [ [ 1,  2,  3],
                    [11, 12, 13],
                    [21, 22, 23],
                    [31, 32, 33]  ] )
d2 = numpy.array( [ [ 9,  8,  7],
                    [19, 18, 17]  ] )

print str(d1.shape) + " At start"
# Let us delete the last row
d1 = numpy.delete(d1, -1, 0)
print str(d1.shape) + " After delete"
# Worked as expected

# So far so good, now do it by object reference parameters in a function......
adjust( d1, d2 )
print d1
print str(d1.shape) + " After function delete return"
# Reference fails to update object properties

Somehow the referenced array object is not getting its size/shape attributes updated. There should only be 2 rows in the returned array.

(4, 3) At start
(3, 3) After delete
(3, 3) At start inside function
(2, 3) After delete inside function
[[ 1  2  3]
 [11 12 13]
 [99 22 23]]
(3, 3) After function delete return

So the mainline/global code works as expected. The function fails to adjust the size, yet the row that should have been deleted shows the updated data (the 99). Remembering that the final function will select which one of several parameters to adjust, how do I fully export the changed shape/size of the parameter from the function?

score 0 · Answer 1 · answered Dec 23 '18 at 01:15

https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.delete.html

Returns: "A copy of arr with the elements specified by obj removed. Note that delete does not occur in-place. If axis is None, out is a flattened array."

What this means is that delete does not modify the original array -- it just makes a copy. Contrast this with the element assignment a1[-1,0] = 99, which does modify the array in place. I believe numpy avoids dynamic array resizing for performance reasons.

As to your confusion, the parameter is passed by reference (so the array at a1 is initially the same as the one at d1). However, the assignment a1 = numpy.delete(a1, -1, 0) is rebinding the name a1, not modifying the array it points to, so d1 is unchanged.

If this doesn't make sense, you should read more about how variables work in Python (https://mathieularose.com/python-variables/). Basically, the names (e.g. a1, d1) are dictionary keys, and assigning to a name changes the value associated with that key, which doesn't affect any other keys associated with that value.
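A minimal sketch of the difference between rebinding a name and modifying an array in place. The function names `rebind` and `mutate` are made up for illustration and are not part of the question's code; the snippet is written to run under Python 3 as well as 2.7.

```python
import numpy

def rebind(a):
    # Assignment gives the local name `a` a new object (the copy made by
    # numpy.delete); the caller's array is not touched.
    a = numpy.delete(a, -1, 0)
    return a.shape

def mutate(a):
    # Element assignment writes into the array object the caller also holds.
    a[-1, 0] = 99

d = numpy.array([[1, 2, 3],
                 [11, 12, 13],
                 [21, 22, 23]])

inner_shape = rebind(d)   # shape seen inside the function: (2, 3)
outer_shape = d.shape     # shape seen by the caller: still (3, 3)

mutate(d)
changed_value = d[-1, 0]  # 99, visible to the caller
```

This mirrors the output in the question: the 99 shows up outside the function, while the shape change does not.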

answered Dec 23 '18 at 01:15

JoshuaF

Thank you for your excellent explanation. I had made some progress to understanding what was going on, but thinking of these things as "names" has helped. By changing a1 after the assignment, I can see that a1 is now referring to a new object. I have not been able to work out the syntax to change d1 from a function. I guess my question was:- "How do I update whatever d1 is, to the new copy?". Do I have to pass a reference to d1 and d2 names somehow? or Do I have to mutate a1 and hence d1 as part of the "numpy.delete(a1, -1, 0)" call? However I am not sure of the syntax required. – mjp Dec 23 '18 at 02:44
Just like `np.delete(a1, ...)` returns a new array, your `adjust` function can return the `a1` variable, which you then assign to whatever variable you like. Even if the function does modify the argument in place, so the passed-by-reference id remains valid, there's no harm in also returning that object. – hpaulj Dec 23 '18 at 04:11
As I mentioned, I am new to Python programming, and so I do not quite understand the syntax you are suggesting here. Remember, in the final code, the function has 2 array parameter objects and my goal is for the function to choose which one to modify, and that to be applied to the appropriate parameter. – mjp Dec 23 '18 at 10:00
The most pythonic way is probably to return both arrays, the changed and the unchanged. You can assign d1 and d2 to the output (`return a1, a2 ... d1, d2 = adjust(d1, d2) `). – JoshuaF Dec 24 '18 at 17:38
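The suggestion in the last comment could be sketched roughly as follows. The rule used to pick which array to trim (whichever has more rows) is purely hypothetical, since the question does not say how the final function will choose.

```python
import numpy

def adjust(a1, a2):
    """Return both arrays, with the last row removed from one of them."""
    # Hypothetical selection rule: trim whichever array has more rows.
    if a1.shape[0] >= a2.shape[0]:
        a1 = numpy.delete(a1, -1, 0)
    else:
        a2 = numpy.delete(a2, -1, 0)
    # Hand both names back so the caller can rebind its own variables.
    return a1, a2

d1 = numpy.array([[1, 2, 3],
                  [11, 12, 13],
                  [21, 22, 23]])
d2 = numpy.array([[9, 8, 7],
                  [19, 18, 17]])

# The caller decides which names receive the results.
d1, d2 = adjust(d1, d2)
```

After the call, d1 refers to the new two-row copy and d2 still refers to its original array, so the caller does not need to know which one the function chose.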

mjp · Answer 2 · 2019-01-14T20:49:42.103

Another solution (less efficient than a numpy call).....

I am sure I came across some "inplace delete" examples while searching for an answer, not fully appreciating this may have been what I needed at the time. The trick is to make sure the local names (formal parameters) are not re-assigned to new references by an "=" in the function, so in this way all changes are made to the same references (names) as passed arguments (actual parameters) to the function.

Python's call by reference is perhaps more like call by value pointer given the way names can behave on assignment (=).

# Warning Unsafe - array parameter resized in place
# Ensure array is not in use on calling
# Requires array to be in C order
def inplace_row_delete( a, r, c=1 ) :
  " Dangerously and inefficiently delete a row in-place on a numpy array (no range checking) "
  # a  reference to the numpy array to have a row deleted from
  #    This routine takes care not to re-assign this in effect "local" name
  #    so as to change the called array in place.
  # r  row to delete in numpy array a
  # c  Nr of rows from r (inclusive) to delete (default 1)

  rows = a.shape[0]
  if (r < 0):
    # support -ve indexing back from end
    r = rows + r
  # Move all the elements after row r, forward one (or c) place(s)
  for index in range(r, rows - c):
    a[index] = a[index + c]

  # Now make the array smaller, disposing of the last (now repeated) element(s)
  # tuples are immutable, but we need to reduce the first element (for the rows in
  # the array) by 1 (c), but keep however many others there are, the same.
  sh    =  list(a.shape)
  sh[0] -= c
  sh    =  tuple(sh)
  # This numpy re-size happens inplace
  # REQUIRES C order
  a.resize(sh, refcheck=False)
  return None

Using `numpy.ndarray.resize` this way is highly unsafe, leading to memory corruption or silently wrong results in common cases. For example, if `a` is in Fortran order instead of C order, the `resize` will retain the wrong elements. If any views of `a` are in use, they will be viewing freed memory, which is likely to cause memory corruption. – user2357112 Jan 14 '19 at 19:50
So something like `tmp=numpy.delete(a); resize; numpy.copy(a,tmp)` might be a little safer. I guess I should at least rework this answer, @user2357112? Many thanks. – mjp Jan 14 '19 at 21:49
I agree this answer is unhelpful in its current form, due to the Fortran incompatibility. I guess it should therefore be deleted. – mjp Jan 30 '19 at 05:33
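The same principle (change an object the caller also holds, rather than rebinding a local name) can be followed without `resize`. This is a sketch only, assuming the caller is willing to keep the arrays in a list; the names `delete_last_row` and `arrays` are hypothetical.

```python
import numpy

def delete_last_row(arrays, which):
    # `arrays` is a list shared with the caller. Replacing one of its items
    # mutates the list in place; no local name is rebound, and the array
    # buffer itself is never resized.
    arrays[which] = numpy.delete(arrays[which], -1, 0)

arrays = [numpy.array([[1, 2, 3],
                       [11, 12, 13],
                       [21, 22, 23]]),
          numpy.array([[9, 8, 7],
                       [19, 18, 17]])]

delete_last_row(arrays, 0)
```

Any other name that still refers to the original three-row array keeps seeing that array, so nothing ends up viewing freed memory; the cost is the copy that `numpy.delete` makes.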
