
I am shifting from Storm's traditional topology to a Trident topology, which maintains batches of tuples before pushing them to the database. We process each XML as a single tuple. In the traditional topology, which processes one XML at a time, this worked fine. But the Trident topology keeps a lot of tuples in memory before committing them to the database, which leads to an out-of-memory exception. It is also not clear how Storm decides the batch size, and it changes on each iteration. The following is the error we receive:

    java.lang.OutOfMemoryError: GC overhead limit exceeded
        at java.util.Arrays.copyOf(Arrays.java:2367)
        at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:130)
        at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:114)
        at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:415)
        at java.lang.StringBuilder.append(StringBuilder.java:132)
        at clojure.core$str$fn__3896.invoke(core.clj:517)
        at clojure.core$str.doInvoke(core.clj:519)
        at clojure.lang.RestFn.invoke(RestFn.java:423)
        at backtype.storm.daemon.executor$mk_task_receiver$fn__5564.invoke(executor.clj:397)
        at backtype.storm.disruptor$clojure_handler$reify__745.onEvent(disruptor.clj:58)
        at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:125)
        at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:99)
        at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:80)
        at backtype.storm.daemon.executor$fn__5641$fn__5653$fn__5700.invoke(executor.clj:746)
        at backtype.storm.util$async_loop$fn__457.invoke(util.clj:431)
        at clojure.lang.AFn.run(AFn.java:24)
        at java.lang.Thread.run(Thread.java:745)

Further info:

In the processing bolts we use a DOM parser to parse the XMLs. We tried to reduce the size of individual tuples by taking a single element of the XML as one tuple, but it didn't help either.
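For reference, a rough sketch of that splitting approach (the class name and emitted field are hypothetical, not our actual code):

    import backtype.storm.tuple.Values;
    import org.w3c.dom.Document;
    import org.w3c.dom.Element;
    import org.w3c.dom.NodeList;
    import org.xml.sax.InputSource;
    import storm.trident.operation.BaseFunction;
    import storm.trident.operation.TridentCollector;
    import storm.trident.tuple.TridentTuple;

    import javax.xml.parsers.DocumentBuilderFactory;
    import java.io.StringReader;

    // Hypothetical Trident function: parse the incoming XML with a DOM parser
    // and emit one tuple per top-level element instead of one tuple per document.
    public class SplitXmlElements extends BaseFunction {
        @Override
        public void execute(TridentTuple tuple, TridentCollector collector) {
            String xml = tuple.getString(0);
            try {
                Document doc = DocumentBuilderFactory.newInstance()
                        .newDocumentBuilder()
                        .parse(new InputSource(new StringReader(xml)));
                NodeList children = doc.getDocumentElement().getChildNodes();
                for (int i = 0; i < children.getLength(); i++) {
                    if (children.item(i) instanceof Element) {
                        collector.emit(new Values(children.item(i).getTextContent()));
                    }
                }
            } catch (Exception e) {
                throw new RuntimeException("Failed to parse XML", e);
            }
        }
    }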

Possible solutions might include limiting the size of the batches kept in memory or tuning the garbage collector.

Kshitij Dixit
  • Problems with the garbage collector are tough. See [this reference](http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html). What memory options are you using? You might be able to tune some of the values a bit, but unless there's something really wrong in your memory options, the reason for the error must be related to creating loads of objects that cannot be collected... Which part of the code you are using is yours? – lrnzcig Feb 14 '15 at 10:52
  • You should be able to control the batch size using the kafkaConfig settings bufferSizeBytes and fetchSizeBytes. See http://czcodezone.blogspot.com/2014/12/trident-what-is-batch-size-for.html – Joshua Martell Feb 16 '15 at 03:46
  • consider reading [tuning Storm+Trident](https://gist.github.com/mrflip/5958028) – user2720864 Feb 16 '15 at 07:25

2 Answers


java.lang.OutOfMemoryError: GC overhead limit exceeded

The following is the cause of the exception:

The detail message "GC overhead limit exceeded" indicates that the garbage collector is running all the time and the Java program is making very slow progress. After a garbage collection, if the Java process is spending more than approximately 98% of its time doing garbage collection, is recovering less than 2% of the heap, and has been doing so for the last 5 (compile-time constant) consecutive garbage collections, then a java.lang.OutOfMemoryError is thrown. This exception is typically thrown because the amount of live data barely fits into the Java heap, leaving little free space for new allocations.

The details can be found here:

http://docs.oracle.com/javase/8/docs/technotes/guides/troubleshoot/memleaks.html

The real cause of this issue is that the application's memory usage grows and the GC cannot reclaim enough memory for the application to keep working, so before throwing the plain OOME (java.lang.OutOfMemoryError: Java heap space) the JVM emits this message instead. I have done plenty of JVM tuning but had never seen this message, probably because I was tuning older JVM versions that did not emit it.

Logically there are two possible reasons for seeing this message:

- The application is leaking memory.
- The application genuinely needs more memory.

In the former case you need to fix the memory leak: take a heap dump, check what is consuming the memory, and make sure it is not a leak. You can capture a dump with the -XX:+HeapDumpOnOutOfMemoryError flag or with jmap, and analyse it with the Eclipse Memory Analyzer (MAT), which I have used personally.

In the latter case you will have to bump the heap size, which is also explained in the link pasted above; you need to do the following:

Action: Increase the heap size. The java.lang.OutOfMemoryError exception for GC Overhead limit exceeded can be turned off with the command line flag -XX:-UseGCOverheadLimit.
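In Storm, the worker JVMs take their heap settings from the worker child options, so a hedged way to apply this advice to your topology (the -Xmx value below is only a placeholder to adjust for your machines) would be:

    import backtype.storm.Config;

    Config conf = new Config();
    // Placeholder heap size; raise it until the live data plus one Trident batch fit comfortably.
    // -XX:+HeapDumpOnOutOfMemoryError writes a heap dump you can open in Eclipse MAT.
    conf.put(Config.TOPOLOGY_WORKER_CHILDOPTS,
            "-Xmx2048m -XX:+HeapDumpOnOutOfMemoryError");
    // Pass conf when submitting the topology, e.g.
    // StormSubmitter.submitTopology("xml-topology", conf, topology.build());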


I was able to control the batch size in each iteration by setting the Kafka fetch size and buffer size as follows:

    spoutConf.fetchSizeBytes = 5*1024*1024;
    spoutConf.bufferSizeBytes = 5*1024*1024;

This limits the amount of data kept in memory. You will have to tune this limit for your use case, so that the in-memory data is not too large for your system while still getting the maximum throughput the system can deliver.
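For completeness, a minimal sketch of where that spout configuration might sit, assuming the storm-kafka Trident spout classes (the ZooKeeper address and topic name are placeholders):

    import storm.kafka.ZkHosts;
    import storm.kafka.trident.OpaqueTridentKafkaSpout;
    import storm.kafka.trident.TridentKafkaConfig;
    import storm.trident.TridentTopology;

    TridentKafkaConfig spoutConf = new TridentKafkaConfig(
            new ZkHosts("zkhost:2181"),   // placeholder ZooKeeper connect string
            "xml-topic");                 // placeholder Kafka topic
    spoutConf.fetchSizeBytes = 5 * 1024 * 1024;   // max bytes fetched per partition per request
    spoutConf.bufferSizeBytes = 5 * 1024 * 1024;  // consumer socket receive buffer

    TridentTopology topology = new TridentTopology();
    topology.newStream("xml-stream", new OpaqueTridentKafkaSpout(spoutConf));

With this spout the fetch size effectively bounds how much data a single batch can pull from each partition, which is what keeps the in-memory footprint in check.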

Kshitij Dixit
  • Excuse me, you wrote that you used Trident, so how can you set a spout for Trident? Doesn't Trident have its own spouts and bolts? – Apr 14 '16 at 06:50