
I have a Kafka topic that started at about 100 GB, which I tried to read into an IMap with Hazelcast Jet. The machine has plenty of memory and I gave it 300 GB of heap. The topic was partitioned into 147 partitions, but when I run the code telling the Pipeline to read from the topic at "earliest" with local parallelism set to 84, the process doesn't seem to use many cores, and after running for a while it doesn't have anywhere near the number of entries that should be in the map (compared to the data ingested into Elasticsearch at the same time). Now that the topic has grown beyond 500 GB I would expect the process to eventually run out of memory, but it still doesn't seem to use many cores and loads only a fraction of the data.
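For reference, this is roughly the shape of the pipeline in question. It is only a minimal sketch: the broker address, topic name, map name, group id, and deserializers are placeholders, and it assumes the Jet 3.x pipeline API (drawFrom/drainTo) with the Kafka connector on the classpath.

```java
import com.hazelcast.jet.Jet;
import com.hazelcast.jet.JetInstance;
import com.hazelcast.jet.kafka.KafkaSources;
import com.hazelcast.jet.pipeline.Pipeline;
import com.hazelcast.jet.pipeline.Sinks;
import com.hazelcast.jet.pipeline.StreamStage;

import java.util.Map;
import java.util.Properties;

public class KafkaToIMap {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "kafka:9092");     // placeholder
        props.setProperty("group.id", "jet-loader");               // placeholder
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("auto.offset.reset", "earliest");        // read from the beginning

        Pipeline p = Pipeline.create();
        StreamStage<Map.Entry<String, String>> source =
                p.drawFrom(KafkaSources.<String, String>kafka(props, "my-topic"))
                 .withoutTimestamps();
        // The question uses 84 here; the comments below suggest keeping this
        // at or below the physical core count rather than tying it to the
        // number of Kafka partitions.
        source.setLocalParallelism(8);
        source.drainTo(Sinks.map("my-map"));

        JetInstance jet = Jet.newJetInstance();
        jet.newJob(p).join();
    }
}
```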

Does anyone have any ideas why this might be?

Marc Mason
  • So you say the load is proceeding, but slowly? Just guessing: CPU might not be the bottleneck; it could be network or disk. Also, 300 GB in Kafka might be a lot more in an IMap: the per-entry overhead is about 220 bytes, so larger entries have relatively lower overhead. – Oliv Mar 20 '19 at 07:17
  • Can you give some more details? How many cores do you have? Java will perform very poorly with a 300 GB heap and almost all the time will be spent on GC. The parallelism should be set according to the number of cores on your machine, not according to the number of partitions; too large a parallelism creates contention. – Can Gencer Mar 20 '19 at 07:50
  • A reasonable heap size for each Jet node is 5 to 10 GB; for anything significantly bigger than that it is recommended to use High-Density Memory. It's also worthwhile to enable GC logs to see what's happening and how long your pauses are (see the example JVM flags after these comments). – Can Gencer Mar 20 '19 at 08:09
  • I am using nodes with 88 cores and 512 GB of RAM. I know that I need to start another Jet instance now that the total amount of data is getting over 500 GB. I thought a reason for using Hazelcast was to get all the data in memory to make things fast. Are people here suggesting that with 512 GB of RAM I should run fifty 10 GB instances per machine rather than one large one? It seems like some extra time would be spent on the network, although we do have a fast network. – Marc Mason Mar 20 '19 at 13:25
  • @Marc Mason Curious if you got to test with more nodes and it improved the situation? – Can Gencer Apr 14 '19 at 06:11
  • As I recall it didn't improve the situation. – Marc Mason Apr 15 '19 at 16:21
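As suggested in the comments above, enabling GC logging is the quickest way to confirm whether the very large heap is the problem. The flags below are illustrative, not Jet-specific; the heap sizes and log path are placeholders, the first set applies to Java 8 and the second to Java 9 and later.

```
# Java 8 GC logging flags (sizes and paths are placeholders)
-Xms10g -Xmx10g -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:gc.log

# Java 9+ equivalent using unified logging
-Xms10g -Xmx10g -Xlog:gc*:file=gc.log
```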

0 Answers