
I want to understand how we should plan the capacity of a NiFi instance.

We have a NiFi instance with around 500 flows, so the total number of processors enabled on the NiFi canvas is around 4000. We run 2-5 flows simultaneously, and each run takes no more than half an hour, i.e. we process data in the MB range.

It worked fine until now, but we are seeing OutOfMemory errors very often. We increased the Xms and Xmx parameters from 4g to 8g, which has resolved the problem for now. But going forward we will have more flows, and we may face the OutOfMemory issue again.

So, can anyone help with a capacity-planning matrix, or suggest how to avoid such issues before they happen? E.g. if we have 3000 processors enabled, with or without any processing, then X GB of memory is required.

Any input on NiFi capacity planning would be appreciated.

Thanks in Advance.

Ankit Tripathi
  • It really depends on the processors you are using, the size of the flow files, and the total number of running threads you defined in NiFi. Some processors load the full flow file into memory. Show which processors gave you the out-of-memory error, the size of the flow files, and the number of threads. – daggett Nov 17 '19 at 07:07
  • Thanks daggett for the thoughts! Yes, I agree it also depends on the size of the flow files and the running threads. But only 2-3 flows are running simultaneously, with 100 MB of data in each. So I think enabled processors on the canvas that have no thread are also consuming memory. – Ankit Tripathi Nov 17 '19 at 07:24
  • For example, the UpdateAttribute processor consumes so little memory that you could have thousands of threads without an OOM with 4g of memory. So the question remains: which processors are you using, and which of them throws the OOM? – daggett Nov 17 '19 at 08:04
  • Let me add one more thing here. When we had 400 flows, things were good. All our flows are more or less the same, and at that time we were also running 2-5 flows simultaneously with around 100 MB. I completely understand the point you are raising, but my question is: if we create thousands of flows, does that also consume memory? How can that be calculated per processor? Note: there is no active thread in any of the flows. – Ankit Tripathi Nov 17 '19 at 08:19
  • The number of processors in your NiFi flow has almost no effect on memory. But the number of threads in `sandwich menu -> controller settings -> general`, the processor types, and the flow file size do affect memory. – daggett Nov 17 '19 at 09:10

1 Answer


OOM errors can be caused by specific memory-hungry processors. For example, SplitXml loads the whole document into memory, so it could end up loading a 1 GiB file, for instance.

Each processor can document which resource considerations should be taken into account. All of the Apache processors (as far as I can tell) are documented in that respect, so you can rely on that documentation.

In our example, by the way, SplitXml can be replaced with SplitRecord, which doesn't load the whole content into memory.

So even if you use 1000 processors simultaneously, they might not consume as much memory as one processor that loads a whole FlowFile's content into memory.
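To make that comparison concrete, here is a back-of-the-envelope sketch in Python. It is not how NiFi accounts for memory. The function name, the 512 MB baseline, and the 1 MB streaming buffer are all illustrative assumptions; the only point is that the heap estimate scales with concurrent tasks and with how much content each task holds in memory, not with the number of processors sitting on the canvas.

```python
def estimate_peak_heap_mb(concurrent_tasks, flowfile_mb, loads_content_into_heap,
                          baseline_mb=512.0, streaming_buffer_mb=1.0):
    """Rough upper-bound guess for heap use, in MB.

    baseline_mb and streaming_buffer_mb are placeholder assumptions,
    not measured NiFi values. Replace them with numbers from your own
    heap monitoring before drawing conclusions.
    """
    # A processor that reads the whole content holds roughly the FlowFile
    # size per running task; a streaming processor only holds a small buffer.
    per_task_mb = flowfile_mb if loads_content_into_heap else streaming_buffer_mb
    return baseline_mb + concurrent_tasks * per_task_mb


# 1000 streaming tasks on 100 MB FlowFiles
streaming = estimate_peak_heap_mb(1000, 100, loads_content_into_heap=False)
# a single task that loads a 1 GiB FlowFile entirely into memory
whole_file = estimate_peak_heap_mb(1, 1024, loads_content_into_heap=True)
print(streaming, whole_file)
```

Under these placeholder numbers the single whole-file task already needs more heap than the thousand streaming tasks. In practice a parsed document can take more memory than its size on disk, so treat the result as a lower bound for that case.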

Check which processors you are using and make sure none of them behaves like that (there are others besides this one that load the whole document into memory).
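With around 4000 processors, checking by hand is tedious, so a small script can help. The sketch below assumes the flow definition is XML in which each `<processor>` element has a `<class>` child holding the fully qualified class name; verify that against your own flow file first. The watch list contains only SplitXml, taken from this answer, and should be extended after reading each processor's documentation. The function names and the sample XML are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Only SplitXml comes from the answer above; add more after checking the docs.
WATCH_LIST = {"SplitXml"}


def count_processor_types(flow_xml):
    """Count processors per simple class name in a flow definition string."""
    root = ET.fromstring(flow_xml)
    counts = Counter()
    for proc in root.iter("processor"):
        cls = proc.findtext("class")  # assumed layout: <processor><class>...</class>
        if cls:
            counts[cls.strip().rsplit(".", 1)[-1]] += 1
    return counts


def flag_memory_heavy(counts, watch_list=WATCH_LIST):
    """Return only the processor types that appear on the watch list."""
    return {name: n for name, n in counts.items() if name in watch_list}


# Hypothetical, heavily trimmed flow definition used only to exercise the code.
SAMPLE_FLOW = """
<flowController>
  <rootGroup>
    <processor><class>org.apache.nifi.processors.standard.SplitXml</class></processor>
    <processor><class>org.apache.nifi.processors.attributes.UpdateAttribute</class></processor>
    <processor><class>org.apache.nifi.processors.attributes.UpdateAttribute</class></processor>
  </rootGroup>
</flowController>
"""

sample_counts = count_processor_types(SAMPLE_FLOW)
print(flag_memory_heavy(sample_counts))
```

If your flow definition is stored gzip-compressed, reading it with `gzip.open(path, "rt").read()` and passing the result to `count_processor_types` should work, again assuming the layout described above.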

Ben Yaakobi
  • Thank you! But only 2-3 flows are running simultaneously, with 100 MB of data in each. So I think enabled processors on the canvas that have no thread are also consuming memory. – Ankit Tripathi Nov 17 '19 at 07:26