
I am new to Apache Storm. I wrote a topology with 1 spout and 2 bolts. When all three components run in one worker, the output is generated correctly, but when I run them in three workers (one worker executes the spout, another runs bolt 1, and the last runs bolt 2), no output is generated. Specifically: when I put bolt 1 and bolt 2 in the same worker, the output is generated!

I should note that the emit calls succeed, and there is no problem with the emitted values.

In detail: in bolt 1 I build a tree in a HashMap structure, and I want to mine this tree in bolt 2. The IDs of the objects inserted into the tree in bolt 1 look like "MyTreeNode@e70014d5", but when I receive the tuple (the HashMap) in bolt 2, the IDs have changed to something like "MyTreeNode@z5542r12".

What is the main problem?

Is the problem caused by the changing object IDs? If so, could you please tell me how to solve it?

Adrien

1 Answer


Let's look at an example topology.

Let's say your topology goes spout -> bolt1, and you're emitting MyObject instances from the spout.

Let's say you've set up the topology to run in 1 worker.

When a tuple (e.g. MyObject@1234) is emitted from the spout, Storm checks whether the tuple needs to go to another worker. If not, it just passes the object reference along to bolt1: the bolt receives the very same MyObject@1234 reference. This is what you are seeing when you have only 1 worker.

Now let's say you tell the topology to use 2 workers, and Storm decides to put the spout in worker 1 and the bolt in worker 2. Recall that each worker is a separate JVM process, so passing the object reference from worker 1 to worker 2 won't work.

When the tuple is emitted from the spout, Storm will see that it is going to another worker, and serialize it using either Kryo or Java serialization depending on your configuration. This means that MyObject@1234 gets serialized. Storm hands the serialized form to worker 2, which deserializes it. When it is deserialized, it is very reasonably given a new memory address (e.g. MyObject@6789).
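This round trip is easy to reproduce outside Storm with plain Java serialization. The sketch below (class names are illustrative, not from the question's code) serializes an object to bytes and deserializes it, producing an equal but distinct object, which is exactly why the `MyTreeNode@...` identity string changes between workers:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// A stand-in for the MyObject tuple value discussed above.
class MyObject implements Serializable {
    private static final long serialVersionUID = 1L;
    final int value;
    MyObject(int value) { this.value = value; }
}

public class SerializationDemo {
    // Serialize and immediately deserialize, mimicking a tuple
    // crossing the worker boundary.
    static MyObject roundTrip(MyObject original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (MyObject) in.readObject();
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        MyObject original = new MyObject(42);
        MyObject copy = roundTrip(original);
        // Same data, but a different object: the default toString()
        // (ClassName@hash) differs, just like MyTreeNode@... in the question.
        System.out.println(original == copy);             // false
        System.out.println(copy.value == original.value); // true
    }
}
```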

This is not an issue if you design your bolts to assume that they are not running in the same JVM, which you should absolutely do. For example, if you want to transfer a MyObject from worker 1 to worker 2, you might make it Serializable, or you might register it with Kryo (see how at https://storm.apache.org/releases/2.0.0-SNAPSHOT/Serialization.html). You need to do this so Storm can put your spouts and bolts in separate JVMs without breaking your topology.

When you are testing your topology, you should enable https://storm.apache.org/releases/1.2.2/javadocs/org/apache/storm/Config.html#TOPOLOGY_TESTING_ALWAYS_TRY_SERIALIZE. This will cause Storm to serialize your tuples always, even if the tuple isn't being transferred between workers. This can help you catch issues with serialization before they make it into production.
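Both steps, registering your tuple classes with Kryo and forcing serialization during testing, are done on the topology's `Config`. A minimal config fragment (assuming a hypothetical `MyObject` tuple class; not runnable without Storm on the classpath) might look like:

```java
import org.apache.storm.Config;

Config conf = new Config();
// Register the tuple class with Kryo so it can cross worker boundaries.
conf.registerSerialization(MyObject.class);
// During testing only: serialize every tuple, even within one worker,
// so serialization bugs surface before deployment.
conf.put(Config.TOPOLOGY_TESTING_ALWAYS_TRY_SERIALIZE, true);
```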

As an aside, you should always prefer Kryo serialization to Java serialization. Kryo serialization is much faster.

Stig Rohde Døssing
  • Thank you very much, Stig Rohde Døssing. Actually, I am emitting a TreeMap and some integer values. It seems a TreeMap won't serialize like a HashMap, and maybe this is my problem. How can I serialize a TreeMap? – Sadegh Rahmani Jan 17 '19 at 09:04
  • TreeMap is serializable. More than likely, the problem is that either your keys or values are not serializable. Make sure all the keys/values implement Serializable, or register the key/value types (plus maybe TreeMap) with Kryo. You can see how to register with Kryo at the link I posted to the Storm docs. – Stig Rohde Døssing Jan 17 '19 at 12:48
  • Thank you for the link you posted. I serialized the TreeMap values, but I receive this error: java.lang.StackOverflowError at java.util.HashMap.hash(HashMap.java:339) at java.util.HashMap.get(HashMap.java:557) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:61) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at ... and so on, OR this error: java.lang.StackOverflowError at com.esotericsoftware.kryo.Generics.getConcreteClass ... – Sadegh Rahmani Jan 17 '19 at 15:58
  • I googled your error, and got https://github.com/EsotericSoftware/kryo/issues/462. This might be a bug in Kryo. I have raised https://issues.apache.org/jira/browse/STORM-3315. In the meantime, you can try the following things: Use a HashMap instead, use Java serialization instead, use some other serialization mechanism (e.g. serialize to JSON with something like Jackson). – Stig Rohde Døssing Jan 17 '19 at 17:34
  • Thank you again. I read the Kryo release notes at https://github.com/EsotericSoftware/kryo/releases, and it seems Kryo versions 4 and 5 have been released. As you mentioned "Upgrade to Kryo 4" in https://issues.apache.org/jira/browse/STORM-3315, is it possible to upgrade the Kryo version? If yes, how? – Sadegh Rahmani Jan 17 '19 at 19:43
  • It requires Storm to be patched. I've raised https://github.com/apache/storm/pull/2939. You are welcome to check out the branch and build the modified Storm. You can see how at https://github.com/apache/storm/blob/master/DEVELOPER.md#create-a-storm-distribution-packaging (add -Dgpg.skip to the command) – Stig Rohde Døssing Jan 17 '19 at 22:18
  • Thank you for all your answers and comments. I finally solved this problem: the grouping was the issue! I had been using shuffle grouping, and I changed it to global grouping. I also had to remove (comment out) "TOPOLOGY_TESTING_ALWAYS_TRY_SERIALIZE" in cluster mode, and I set "TOPOLOGY_FALL_BACK_ON_JAVA_SERIALIZATION" to false. Finally the jar ran and generated output successfully. Thanks again for all your advice and help. – Sadegh Rahmani Jan 21 '19 at 07:10
  • I'm afraid that just hides your issue. Using global grouping makes all your tuples go to a single bolt task, which likely happens to be in the same JVM as the sending bolt. Disabling TOPOLOGY_TESTING_ALWAYS_TRY_SERIALIZE then causes Storm to skip trying to serialize the tuple if it's going into the same worker. I think the issue is still there, and will happen in cluster mode if your bolts happen to be in different workers. – Stig Rohde Døssing Jan 21 '19 at 13:09
  • You are right; unfortunately, the issue still exists. I tried several scenarios and found that in my case direct grouping is the solution. I ran the code with direct grouping and my problem was solved, but another issue appeared: the runtime increases as the number of workers increases, because the nodes wait for each other! Is there any solution for the runtime issue? – Sadegh Rahmani Jan 26 '19 at 22:28