
I'm using the SNAP Facebook dataset for social network analysis. SNAP uses a simple edge list as its data format: "node1 node2". How can I read a SNAP dataset in Apache Giraph? I am reading the file line by line with a BufferedReader, but I don't know how to store it in the BSP model with adjacency lists. Can someone help me with a code example in Java? I would also like to add information about the nodes (the characteristics each user/node has). How can I do that in Giraph?

anu

1 Answer


You can use the SNAP Facebook dataset directly. In your command, instead of -vif ..., use -eif org.apache.giraph.io.formats.IntNullTextEdgeInputFormat. This format reads each line as (source_vertex destination_vertex), just like the SNAP dataset.
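
To illustrate what happens to each "source destination" line, here is a minimal plain-Java sketch of turning an edge list into adjacency lists. It is only an illustration: the class EdgeListSketch and its parse method are hypothetical names, not Giraph APIs, and the handling of blank lines and "#" comment lines is an assumption about the input file. In an actual Giraph job the edge input format does this work for you, so you should not need your own BufferedReader loop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Giraph: shows how "source target" lines
// map onto per-vertex adjacency lists.
final class EdgeListSketch {

    // Parses edge-list lines into a map from source id to its out-neighbors.
    // Edges are stored only in the direction in which they are listed.
    static Map<Integer, List<Integer>> parse(List<String> lines) {
        Map<Integer, List<Integer>> adjacency = new TreeMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            // Assumption: blank lines and "#" comment lines carry no edges.
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            // Split on any run of spaces or tabs.
            String[] parts = trimmed.split("\\s+");
            if (parts.length != 2) {
                throw new IllegalArgumentException("Expected 'source target': " + line);
            }
            int source = Integer.parseInt(parts[0]);
            int target = Integer.parseInt(parts[1]);
            adjacency.computeIfAbsent(source, k -> new ArrayList<>()).add(target);
        }
        return adjacency;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList("0 1", "0 2", "1 2");
        System.out.println(parse(sample));
    }
}
```

Because each edge is stored only as listed, an undirected dataset such as the Facebook one would also need the reverse edge added if your computation relies on both directions.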

  • For the command ./hadoop jar /usr/local/giraph-1.1.0/giraph-examples/target/giraph-examples-1.1.0-for-hadoop-2.5.1-jar-with-dependencies.jar org.apache.giraph.GiraphRunner org.apache.giraph.examples.SimpleShortestPathsComputation -eif org.apache.giraph.io.formats.IntNullTextEdgeInputFormat -vip /user/hduser/input/facebook/0.edges -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat -op /user/hduser/output/shortestpaths -w 1 I get an IllegalArgument error – anu May 28 '15 at 09:28
  • Use -eip (edge input path) instead of -vip – Masoud Sagharichian May 28 '15 at 09:34
  • ./hadoop jar /usr/local/giraph-1.1.0/giraph-examples/target/giraph-examples-1.1.0-for-hadoop-2.5.1-jar-with-dependencies.jar org.apache.giraph.GiraphRunner org.apache.giraph.examples.SimpleShortestPathsComputation -eif org.apache.giraph.io.formats.IntNullTextEdgeInputFormat -eip /user/hduser/input/facebook/0.edges -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat -op /user/hduser/output/shortestpaths -w 1 – Masoud Sagharichian May 28 '15 at 09:36
  • No vertex input specified. No vertex output specified – anu May 28 '15 at 11:25
  • Since this command uses -eif (edge input format) and -eip (edge input path), the "No vertex input specified" warning is not important. – Masoud Sagharichian May 30 '15 at 04:34