I'm trying to convert UniProt accession numbers to Entrez IDs using the Bioconductor package org.Hs.eg.db (which is an S4 object). I'm doing this as part of a Python script with rpy2, and calling the select function gives me errors. Here's the code (the program is 400 lines, so I'm excerpting the relevant parts):

from rpy2.robjects.packages import importr
from rpy2.robjects import StrVector, DataFrame, r

# get UniProt accession numbers from first two columns of data
uniprotA = []
uniprotB = []
for row in interactions:
    uniprotA.append(row[0][10:])
    uniprotB.append(row[1][10:])
# convert to R string vectors
uniprotA = StrVector(uniprotA)
uniprotB = StrVector(uniprotB)

homosap = importr('org.Hs.eg.db')

geneidA = r.select(homosap, keys = uniprotA, columns = "ENTREZID", keytype="UNIPROT")

And here are the error messages:

Traceback (most recent call last):
  File "mitab_preprocess.py", line 356, in <module>
    reformat_data(interactions)
  File "mitab_preprocess.py", line 140, in reformat_data
    geneidA = r.select(homosap, keys = uniprotA, columns = "ENTREZID", keytype="UNIPROT")
  File "//anaconda/lib/python2.7/site-packages/rpy2/robjects/functions.py", line 178, in __call__
    return super(SignatureTranslatedFunction, self).__call__(*args, **kwargs)
  File "//anaconda/lib/python2.7/site-packages/rpy2/robjects/functions.py", line 102, in __call__
    new_args = [conversion.py2ri(a) for a in args]
  File "//anaconda/lib/python2.7/site-packages/singledispatch.py", line 210, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
  File "//anaconda/lib/python2.7/site-packages/rpy2/robjects/conversion.py", line 60, in _py2ri
    raise NotImplementedError("Conversion 'py2ri' not defined for objects of type '%s'" % str(type(obj)))
NotImplementedError: Conversion 'py2ri' not defined for objects of type '<class 'rpy2.robjects.packages.InstalledSTPackage'>'
hannah

1 Answer

homosap is an R package exposed as a Python namespace.

I think that you want to use an object in that namespace as a parameter, not the namespace.

Here it should be homosap.org_Hs_eg_db (I am guessing, I have not tried).

There are two things at play here:

  • . is not valid inside a Python variable name, so rpy2 translates it to _ when exposing R symbols as attributes.
  • When an R package is loaded in R, all of its symbols are added to the search path. Coming from Python, this is a bit like from <package> import *. rpy2's importr instead returns a namespace in which the package's symbols are exposed as attributes.
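
To make the renaming concrete, here is a minimal pure-Python sketch. It does not use rpy2 at all: `r_name_to_python` and `expose_as_namespace` are hypothetical helpers written for illustration, and they only mimic the basic `.` to `_` substitution (rpy2's real translation may handle more cases, such as name conflicts). The placeholder string stands in for the actual R object.

```python
from types import SimpleNamespace


def r_name_to_python(r_name):
    """Hypothetical helper: mimic the basic '.' -> '_' renaming."""
    return r_name.replace(".", "_")


def expose_as_namespace(r_symbols):
    """Build a namespace whose attributes are the translated R symbol names."""
    return SimpleNamespace(
        **{r_name_to_python(name): value for name, value in r_symbols.items()}
    )


# Stand-in for the symbols an R package exports (the value is a placeholder,
# not a real annotation database).
fake_package = expose_as_namespace({
    "org.Hs.eg.db": "<annotation database object>",
})

translated_name = r_name_to_python("org.Hs.eg.db")
print(translated_name)
print(hasattr(fake_package, translated_name))
```

Under that assumption, the object to pass to select would be reached as an attribute of the namespace (homosap.org_Hs_eg_db in the question's code) rather than by passing the namespace itself.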
lgautier
  • I think you're right. After changing that line of code to
        geneidA = r.select(org.Hs.eg.db, keys = uniprotA, columns = "ENTREZID", keytype="UNIPROT")
    I'm now getting the error:
        File "mitab_preprocess.py", line 346, in reformat_data(interactions)
        File "mitab_preprocess.py", line 138, in reformat_data
          geneidA = r.select(org.Hs.eg.db, keys = uniprotA, columns = "ENTREZID", keytype="UNIPROT")
        NameError: global name 'org' is not defined
    – hannah Oct 16 '15 at 18:23
  • No no no. Your Python namespace is called `homosap`. You likely need to pass a parameter like `homosap.`. – lgautier Oct 16 '15 at 22:28