3

Imagine I have a class A that is outside of my control (numpy.ndarray is the specific application), and class B within my control (scipy.sparse.coo_matrix). I want to implement in-place addition of B to A without touching class A.

Is this at all possible? Is it a bad idea? If this isn't possible in general, is it possible with numpy arrays specifically?

For a specific example consider:

class A:
   foo = 0

   def __iadd__(self, other):
       print("Avoid calling this function.")
       return self

class B:
   def __add__(self, other):
       if isinstance(other, A):
           other.foo += 1
           return other

   __radd__ = __add__

   # Modify this class to make assertion below pass

a1, a2 = A(), A()
a1 += B()
a2 = a2 + B()
assert a1.foo == a2.foo == 1, "How to make this work"

EDIT: Actual application.

In-place addition of a sparse coordinate matrix to a dense numpy array has an efficient implementation:

from time import time

import numpy as np
from scipy import sparse

N = 1000
a = np.zeros((N, N))
b = sparse.identity(N, format="coo")

t_slow = -time()
a += b  # Want to override, converts b to array—slow!
t_slow += time()

t_fast = -time()
a[b.row, b.col] += b.data  # Desired implementation
t_fast += time()
print(f"{t_slow=:.2}s, {t_fast=:.2}s")
# t_slow=0.0017s, t_fast=0.00024s
  • In reality are you only adding a scalar "1" to the left-hand side? If not, what is the actual addition being performed? – Reinderien Jul 31 '21 at 12:59
  • Added the actual application to the post body. – Anton Akhmerov Jul 31 '21 at 13:06
  • 1
    I would consider it a *very* bad idea for `B.__add__` to mutate the *other* argument. If I'm defining `A`, I don't expect `c = a + b` to modify `a` in-place. – chepner Jul 31 '21 at 18:24
  • Also, you aren't correctly defining your dunder methods. For example, your `A. __add__` will return `None` if the condition is false. It should return `NotImplemented` in that case – juanpa.arrivillaga Jul 31 '21 at 18:46
  • The minimal example is for illustration purposes only, it only demonstrates what needs to be overridden/implemented. I am aware of the problems you bring up. – Anton Akhmerov Jul 31 '21 at 19:24
  • 1
    @chepner, the sparse implementation of `a+=b` (`_add_dense`) does not work in-place. `b` ends up in control, changing `a` to `np.matrix`, and even changing its `dtype`. It isn't a very well behaved method! – hpaulj Aug 01 '21 at 01:47

2 Answers2

0

The delegation from iadd to radd and on can be complicated

Look at a simple example.

In [237]: M=sparse.coo_matrix(np.eye(3))
In [238]: M
Out[238]: 
<3x3 sparse matrix of type '<class 'numpy.float64'>'
    with 3 stored elements in COOrdinate format>
In [239]: A = np.zeros((3,3))
In [240]: A
Out[240]: 
array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])
In [241]: A + M         # a conventional add - makes a np.matrix
Out[241]: 
matrix([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]])
In [242]: A += M
In [243]: A             # this too is np.matrix!
Out[243]: 
matrix([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]])

But if I convert sparse to add, the result is array:

In [244]: A = np.zeros((3,3))
In [246]: A += M.A
In [247]: A
Out[247]: 
array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

I suspect (but don't quote me on this)

A.__iadd__(M) =>
M.__radd__(A) =>
M.__add__(A) =>
M._add_dense(A) =>

def _add_dense(self, other):
    if other.shape != self.shape:
        raise ValueError('Incompatible shapes ({} and {})'
                         .format(self.shape, other.shape))
    dtype = upcast_char(self.dtype.char, other.dtype.char)
    result = np.array(other, dtype=dtype, copy=True)
    fortran = int(result.flags.f_contiguous)
    M, N = self.shape
    coo_todense(M, N, self.nnz, self.row, self.col, self.data,
                result.ravel('A'), fortran)
    return matrix(result, copy=False)

That would explain the np.matrix return. Normally I don't expect A+=... to change A shape, class, or dtype. But it looks like sparse.coo takes control. I haven't looked at that coo_todense.

There some other evidence. A+=M can change the dtype, while A+=M.A cannot. A+=M.A is a view, A+=M is not.

So you might be able to modify the sparse.coo_matrix methods to do things as you want. In fact, I could imagine submitting a bug issue for this sparse behavior. Not only is it slower, but it changes A in non-numpy standard ways.

But why not just write a simple utility function, and not bother with the += overwrite?

In [249]:     def add_to_dense(a,b):
     ...:         a[b.row, b.col] += b.data
     ...: 
In [250]: add_to_dense(A,M)
In [251]: A
Out[251]: 
array([[2., 0., 0.],
       [0., 2., 0.],
       [0., 0., 2.]])
hpaulj
  • 221,503
  • 14
  • 230
  • 353
  • My goal is overriding the implementation of in-place addition from the right hand side class. A helper function/method is an obvious alternative, but it doesn't achieve the same goal. – Anton Akhmerov Jul 31 '21 at 19:27
  • My function does work in-place. I tested that with `__array_interface__`. But as I showed the current sparse version does not operate in-place. – hpaulj Jul 31 '21 at 19:50
  • I don't follow: the code you provided seems to allow to call `a[b.row, b.col] += b.data` when `a += b` is invoked (which is what I aim to achieve). – Anton Akhmerov Jul 31 '21 at 20:01
0

If you want a += b to work, define a way to explicitly convert b to a value that does work with A.__iadd__. Otherwise, you are opening the door for something like c = a + b to modify a in-place, which is probably going to lead to some hard-to-find bugs.

If there are types D for which for which a += d (d being an instance of D), then let B provide a way to turn an instance of B into an instance of D.

class B:
    def to_d(self):
        return ...

a += b.to_d()
chepner
  • 497,756
  • 71
  • 530
  • 681