This is a beginner question.
I have a Hadoop setup and am thinking of using Giraph or Hama for graph-based computation. I have a large file in the form
3 4
3 7
3 8
5 6
where each column holds a vertex and each row denotes an edge. In ordinary programs I read the whole file into a form like
3: [4,7,8]
5: [6]
which means vertex 3 has edges to 4, 7 and 8, and vertex 5 has an edge to 6.
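To make that concrete, the in-memory approach looks roughly like the sketch below. It is a minimal Python illustration, not the actual program: the name build_adjacency is hypothetical, and it assumes one whitespace-separated edge per line with integer vertex ids.

```python
from collections import defaultdict
import io


def build_adjacency(lines):
    """Group "src dst" edge lines into {src: [dst, ...]}, held entirely in memory."""
    adjacency = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        src, dst = int(parts[0]), int(parts[1])
        adjacency[src].append(dst)
    return dict(adjacency)


# The four sample edges from the file, supplied as an in-memory stand-in for the real file.
sample = io.StringIO("3 4\n3 7\n3 8\n5 6\n")
adjacency = build_adjacency(sample)
print(adjacency)  # {3: [4, 7, 8], 5: [6]}
```

Every edge ends up in a single dictionary, so memory use grows with the size of the file.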
How should I handle this for a large file in Hadoop? Does reading it like this mean loading the whole contents into RAM? What is the best way to do it in Hadoop?