
I'm currently working on accessing HBase from Python 3. My approach is to use py4j to call Java APIs that I'm writing to access HBase.

I have a question about creating a Put object, which takes a qualifier and a value.

I want to pass a dictionary to a Java class that expects a HashMap. Is this possible through py4j?

I don't want to call Put iteratively for every column qualifier. I want to pass the dict through py4j and have it received as a HashMap on the Java side.

Could you please give me some hints or pointers on how this can be done?

zakinster
Mayank

1 Answer


There are two ways to do what you want:

  1. You can create a java.util.HashMap() and use it like a dict on the Python side. This is good if you pass the dictionary to Java often but rarely modify it on the Python side. It is also good if the dictionary is modified on the Java side and you want to see those modifications on the Python side.
  2. Py4J can automatically convert a Python dict to a HashMap when calling a Java method. Note that the dictionary is copied, so any change made on the Java side won't be reflected on the Python side.

The easiest solution would be #1 I believe:

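>>> from py4j.java_gateway import JavaGateway
>>> gateway = JavaGateway()  # assumes a Java GatewayServer is already running
>>> m = gateway.jvm.java.util.HashMap()
>>> m["a"] = 0
>>> m.put("b", 1)
>>> m
{u'a': 0, u'b': 1}
>>> u"b" in m
True
>>> del m["a"]
>>> m
{u'b': 1}
>>> m["c"] = 2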
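For comparison, option #2 might look roughly like the sketch below. It assumes a recent Py4J release where `GatewayParameters(auto_convert=True)` is available (older releases accepted `auto_convert=True` directly on `JavaGateway`), a Java `GatewayServer` already listening on the default port, and a hypothetical entry-point method `putRow(String, Map)` that you would write yourself on the Java side. The names `build_columns`, `connect`, and `put_row` are illustrative only.

```python
def build_columns(n):
    """Build a plain Python dict of n qualifier -> value pairs (no socket traffic)."""
    return {"q%d" % i: "v%d" % i for i in range(n)}


def connect():
    """Open a gateway that converts Python collections automatically.

    The import is local so this module can be loaded without a running JVM.
    Assumes a Java GatewayServer is listening on the default port.
    """
    from py4j.java_gateway import JavaGateway, GatewayParameters
    return JavaGateway(gateway_parameters=GatewayParameters(auto_convert=True))


def put_row(gateway, row_key, columns):
    """Send the whole dict in a single call.

    putRow is a HYPOTHETICAL method on your own Java entry point with a
    signature like putRow(String rowKey, Map<String, String> columns).
    With auto_convert enabled, Py4J copies the dict into a Java map.
    """
    gateway.entry_point.putRow(row_key, columns)


# Usage (requires the Java side to be running):
#   gateway = connect()
#   put_row(gateway, "row-1", build_columns(100))
```

Because the dict is copied when the call is made, changes made to the map on the Java side will not show up in the Python dict afterwards.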
Barthelemy
  • Does every statement like `m["a"] = 0` send data through the socket? I believe it does, so 100 modifications would send 100 socket requests. Or is my understanding wrong? I want to create a dictionary of 100 items in Python and send it to Java. I think the second method is more efficient in that case. Please let me know if this is incorrect or if I am missing something. – Mayank Apr 04 '13 at 13:39
  • Indeed, the second solution is the best for the usage you describe in your comment. You are right that with the first solution, 100 assignments would result in 100 exchanges through the socket. – Barthelemy Apr 04 '13 at 17:32
  • One last question: is it possible to get a dictionary from a Java function that returns a NavigableMap? The bytes can be a HashMap recursively... I think I should go for simplejson. What do you suggest? :) – Mayank Apr 04 '13 at 19:24
  • Not sure about that one, but Py4J provides a dict interface based on the Map interface (it is not tied to HashMap). If you are using byte arrays, you should use the latest version from git, because many bugs in byte array conversion have been fixed since the last release. – Barthelemy Apr 05 '13 at 09:24
  • @Barthelemy thanks for Py4J. I have a problem: I am calling a Python function from the Java side with a list of maps. If I try to set a new value (array, narray) on one of the Java map parameters (py4j.java_collections.JavaMap), I get an error: 'numpy.ndarray' object has no attribute '_get_object_id'. Any idea? Also, is there something like a cast to a Python dict? With a normal dict I don't have any problems. – white91wolf Jan 20 '20 at 09:28
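
On the last two questions (getting a nested Java map back as a dictionary, and "casting" a JavaMap to a Python dict), one possible approach is to copy the structure into plain Python containers. The sketch below is not a Py4J API. It is a hypothetical helper, `to_python`, that relies only on the assumption that Py4J's map and list wrappers implement Python's standard mapping and sequence interfaces, as the comment above about the dict interface suggests.

```python
from collections.abc import Mapping, Sequence


def to_python(obj):
    """Recursively copy map-like and list-like objects into dicts and lists.

    Strings and byte strings are sequences too, so they are returned as-is
    instead of being split into individual elements.
    """
    if isinstance(obj, Mapping):
        # Keys are kept unchanged; only the values are converted recursively.
        return {key: to_python(value) for key, value in obj.items()}
    if isinstance(obj, Sequence) and not isinstance(obj, (str, bytes, bytearray)):
        return [to_python(item) for item in obj]
    return obj


# Works on ordinary Python containers as well:
nested = {"row": {"family": ("a", "b")}, "raw": b"xy"}
plain = to_python(nested)
```

When used on Java-backed objects, each element access is likely to cost a socket round trip, so this is best done once per result rather than repeatedly. For the numpy error, passing `array.tolist()` instead of the `ndarray` itself may help, since a plain list is a type Py4J knows how to convert when auto-conversion is enabled.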