I'd like to get the names from named R vectors (or matrices, etc.) back into Python. In rpy2 < 3.0.0 this was possible, e.g.,

>>> stats.quantile(numpy.array([1,2,3,4]))
R object with classes: ('numeric',) mapped to:
<FloatVector - Python:0x7f3e664d6d88 / R:0x55c939a540c8>
[1.000000, 1.750000, 2.500000, 3.250000, 4.000000]
>>> stats.quantile(numpy.array([1,2,3,4])).names
R object with classes: ('character',) mapped to:
<StrVector - Python:0x7f3e66510788 / R:0x55c939a53648>
['0%', '25%', '50%', '75%', '100%']
>>> stats.quantile(numpy.array([1,2,3,4])).rx('25%')
R object with classes: ('numeric',) mapped to:
<FloatVector - Python:0x7f3e68770bc8 / R:0x55c938f23ba8>
[1.750000]

But in rpy2 >= 3.0.0 the output is converted to a numpy array, which has no .names or .rx, so the names appear to be lost:

>>> stats.quantile(numpy.array([1,2,3,4]))
array([1.  , 1.75, 2.5 , 3.25, 4.  ])
lgautier

1 Answer

rpy2 3.0.0 tries to simplify its conversion system, which should make its imperfections easier to both anticipate and mitigate.

Here, when the numpy conversion layer is active:

  • numpy arrays are converted to R arrays whenever needed by R
  • R arrays are converted to numpy arrays when returning from R
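The round trip in the two bullets above can be modelled in plain Python, without R or rpy2 installed. This is only a toy sketch: `RVector`, `ToyConverter` and `toy_quantile` are hypothetical stand-ins invented for illustration and are not rpy2 classes or functions. It shows why a symmetric layer drops the names and how changing only the return direction keeps them.

```python
class RVector:
    """Stand-in for an R vector: values plus optional names."""

    def __init__(self, values, names=None):
        self.values = list(values)
        self.names = list(names) if names is not None else None


class ToyConverter:
    """Hypothetical two-way conversion layer (not rpy2's Converter)."""

    def __init__(self, py2r, r2py):
        self.py2r = py2r  # applied to arguments going into "R"
        self.r2py = r2py  # applied to results coming back from "R"

    def call(self, r_function, arg):
        return self.r2py(r_function(self.py2r(arg)))


def list_to_rvector(obj):
    return RVector(obj)


def rvector_to_list(rvec):
    # A plain list has nowhere to store names, so they are dropped here.
    return list(rvec.values)


def keep_as_is(rvec):
    return rvec


def toy_quantile(rvec):
    """Stand-in for an R function that returns a named vector."""
    return RVector([min(rvec.values), max(rvec.values)], names=["0%", "100%"])


# Symmetric layer: converts in both directions, like the default numpy layer.
symmetric = ToyConverter(list_to_rvector, rvector_to_list)
# Asymmetric layer: converts on the way in, leaves results untouched.
asymmetric = ToyConverter(list_to_rvector, keep_as_is)

sym_result = symmetric.call(toy_quantile, [1, 2, 3, 4])
asym_result = asymmetric.call(toy_quantile, [1, 2, 3, 4])
```

Here `sym_result` is a bare list, while `asym_result` is still an `RVector` whose `names` attribute survives.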

That symmetry is not a requirement; it is just how the default numpy conversion layer behaves. An asymmetrical conversion layer, which here converts numpy arrays to R arrays but leaves R arrays as they are when returning from R, can be set up relatively quickly and easily:

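import numpy
from rpy2.rinterface_lib import sexp
from rpy2 import robjects
from rpy2.robjects import conversion
from rpy2.robjects import numpy2ri
from rpy2.robjects.packages import importr

# R's "stats" package, which provides quantile().
stats = importr('stats')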

# We are going to build our custom converter by subtraction, that is,
# starting from the numpy converter and reverting only the part that
# converts R objects into numpy arrays back to the default conversion.
# We could also have built it by addition.
myconverter = conversion.Converter('assym. numpy',
                                   template=numpy2ri.converter)
myconverter.rpy2py.register(sexp.Sexp,
                            robjects.default_converter.rpy2py)

That custom conversion can then be used when we need it:

with conversion.localconverter(myconverter):
    res = stats.quantile(numpy.array([1, 2, 3, 4]))

The outcome is:

>>> print(res.names)
[1] "0%"   "25%"  "50%"  "75%"  "100%"

If this looks like too much effort, you can also skip the numpy converter altogether, only use the default converter, and manually cast your numpy arrays to suitable R arrays whenever you judge it necessary:

>>> stats.quantile(robjects.vectors.IntVector(numpy.array([1, 2, 3, 4]))).names
R object with classes: ('character',) mapped to:
['0%', '25%', '50%', '75%', '100%']
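With either approach, once the result is still an R vector, its names and values can be paired on the Python side. A minimal sketch, assuming the object is iterable and exposes a `.names` attribute as in the outputs above. `named_values` is a hypothetical helper, not part of rpy2, and `FakeNamedVector` exists only so the snippet runs without R.

```python
def named_values(vec):
    """Pair each name with its value; `vec` must be iterable and expose `.names`."""
    names = list(vec.names)
    values = list(vec)
    if len(names) != len(values):
        raise ValueError("names and values differ in length")
    return dict(zip(names, values))


class FakeNamedVector(list):
    """Stand-in for a named R vector, used only to exercise the helper."""

    def __init__(self, values, names):
        super().__init__(values)
        self.names = names


quantiles = named_values(
    FakeNamedVector([1.0, 1.75, 2.5, 3.25, 4.0],
                    ["0%", "25%", "50%", "75%", "100%"])
)
```

A lookup such as `quantiles["25%"]` then plays the role that `.rx('25%')` did in the question, but it returns a plain Python value, not an R vector.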
lgautier
  • The asymmetric custom conversion has been working perfectly for me. Thanks! However, as of the changes in rpy2 3.3.0 that added `NameClassMap`, I needed to add this additional hack: `myconverter._rpy2py_nc_map.update(rpy2.robjects.default_converter._rpy2py_nc_map._map.copy(), default=rpy2.robjects.default_converter._rpy2py_nc_map._default)`, because the default numpy converter doesn't have the necessary information in `_map` (`_map` is just an empty dictionary). This seems to work, but I was wondering if there is a better approach. – Chris Paciorek Jul 29 '20 at 18:42