I have been doing some reading on real-time processing using Hadoop and stumbled upon this: http://www.scaleoutsoftware.com/hserver/
From what the documentation says, it looks like they have implemented an in-memory data grid on the Hadoop worker/slave nodes. I have a couple of questions:
From my understanding, if I have 100 GB of data, I would need at least 100 GB of RAM across all nodes in my cluster just to hold the data, plus additional RAM for the TaskTracker and DataNode daemons, plus additional RAM for the hServer service that would run on each of these nodes. Is my understanding correct?
The software claims to do real-time data processing by reducing Hadoop's latency. Is that because it lets us write data to the in-memory grid instead of to HDFS?
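To make the memory estimate in my first question concrete, here is the back-of-the-envelope arithmetic I have in mind. The node count and the per-node overhead figures are placeholders I made up for illustration, not numbers from the ScaleOut or Hadoop documentation, and the function name is hypothetical.

```python
def estimate_cluster_ram_gb(data_gb, nodes, daemon_gb_per_node, grid_gb_per_node):
    """Rough total RAM (GB) needed across the cluster.

    Assumes the whole data set is held in memory exactly once, and that
    every node pays a fixed overhead for the Hadoop daemons (TaskTracker,
    DataNode) and for the in-memory grid service.
    """
    per_node_overhead = daemon_gb_per_node + grid_gb_per_node
    return data_gb + nodes * per_node_overhead


# Assumed example: 100 GB of data, 10 nodes, 2 GB per node for the Hadoop
# daemons and 1 GB per node for the grid service (all made-up figures).
total = estimate_cluster_ram_gb(100, 10, 2, 1)
print(total)  # 130
```

So under these assumed numbers the cluster would need roughly 130 GB of RAM in total, or about 13 GB per node, and that is before any extra copies of the data the grid might keep.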
I am new to Big Data technologies, so I apologize if some of these questions are naive.