
I am trying to use the Modin package to import a sparse matrix created with SciPy (specifically, a scipy.sparse.csr_matrix).

Invoking the method:

from modin import pandas as pd
pd.DataFrame.sparse.from_spmatrix(mat)

I am getting the following AttributeError:

AttributeError                            Traceback (most recent call last)
C:\Users\BERGAM~1\AppData\Local\Temp/ipykernel_37436/3032405809.py in <module>
----> 1 pd.DataFrame.sparse.from_spmatrix(mat)

C:\Miniconda3\envs\persolite_v0\lib\site-packages\modin\pandas\accessor.py in from_spmatrix(cls, data, index, columns)
    109     @classmethod
    110     def from_spmatrix(cls, data, index=None, columns=None):
--> 111         return cls._default_to_pandas(
    112             pandas.DataFrame.sparse.from_spmatrix, data, index=index, columns=columns
    113         )

C:\Miniconda3\envs\persolite_v0\lib\site-packages\modin\pandas\accessor.py in _default_to_pandas(self, op, *args, **kwargs)
     78             Result of operation.
     79         """
---> 80         return self._parent._default_to_pandas(
     81             lambda parent: op(parent.sparse, *args, **kwargs)
     82         )

AttributeError: 'function' object has no attribute '_parent'

The same call works with the original pandas API.

Has anyone run into a similar problem? Thanks for the support.

  • That looks like it might be a bug in `modin`. You could create a new issue in the [`modin` github repository](https://github.com/modin-project/modin), and see if the `modin` developers can help. – Warren Weckesser Dec 21 '21 at 02:45
  • +1: Looks like there's a recent github issue (https://github.com/modin-project/modin/issues/3890) that was created based on this – ROBOTPWNS Jan 03 '22 at 18:16

1 Answer

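This is a bug in Modin. from_spmatrix is a classmethod, but it calls _default_to_pandas, which is an instance method, through the class. Called that way, self is not bound to an instance; it is simply the first positional argument, which here is the function pandas.DataFrame.sparse.from_spmatrix. That is why the error reports that a 'function' object has no attribute '_parent'.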

This is the code that fails:

class BaseSparseAccessor:
    
    def _default_to_pandas(self, op, *args, **kwargs):
        return self._parent._default_to_pandas(
            lambda parent: op(parent.sparse, *args, **kwargs)
        )

class SparseFrameAccessor(BaseSparseAccessor):

    @classmethod
    def from_spmatrix(cls, data, index=None, columns=None):
        return cls._default_to_pandas(
            pandas.DataFrame.sparse.from_spmatrix, data, index=index, columns=columns
        )

A minimal example that reproduces the same failure:

class A:
    
    _parent = 0
    
    def a_method(self, op, **args):
        self._parent = op(self._parent, **args)

class B(A):
    
    @classmethod
    def b_method(cls, data, **args):
        return cls.a_method(sum, data, **args)

When you call b_method (whether on the class B or on an instance of it), it fails because self in a_method is the function sum rather than a class or instance reference.

>>> B.b_method(20)

AttributeError                            Traceback (most recent call last)
<ipython-input-17-3914ce57d001> in <module>
----> 1 B.b_method(20)

<ipython-input-11-a25ce2c0614c> in b_method(cls, data, **args)
     12     @classmethod
     13     def b_method(cls, data, **args):
---> 14         return cls.a_method(sum, data, **args)

<ipython-input-11-a25ce2c0614c> in a_method(self, op, **args)
      6 
      7     def a_method(self, op, **args):
----> 8         self._parent = op(self._parent, **args)
      9 
     10 class B(A):

AttributeError: 'builtin_function_or_method' object has no attribute '_parent'
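
For comparison, here is a sketch of one way the toy example above could be made to work: declare the helper as a classmethod too, so that cls is bound automatically and op really is the first explicit argument. This only illustrates the binding rule on the classes A and B from this answer; it is not Modin's actual fix, and operator.add is swapped in for sum purely as an assumption so that the call has a well-defined result.

```python
import operator


class A:

    _parent = 0

    @classmethod
    def a_method(cls, op, *args):
        # cls is bound by the classmethod descriptor, so `op` is the
        # first explicit argument rather than being mistaken for `self`.
        cls._parent = op(cls._parent, *args)
        return cls._parent


class B(A):

    @classmethod
    def b_method(cls, data):
        # Calling a classmethod through cls binds correctly.
        return cls.a_method(operator.add, data)


result = B.b_method(20)
```

Because cls is B here, the assignment sets _parent on B and leaves A._parent untouched. The alternative design is to keep a_method as an instance method and turn b_method into an instance method as well, which only makes sense when an instance (and its _parent) actually exists at call time.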
CJR