Questions tagged [giraph]

Apache Giraph is an iterative graph processing system built for high scalability.

For example, it is currently used at Facebook to analyze the social graph formed by users and their connections.

Giraph originated as the open-source counterpart to Pregel, the graph processing architecture developed at Google and described in a 2010 paper.

Both systems are inspired by the Bulk Synchronous Parallel model of distributed computation introduced by Leslie Valiant.

The Bulk Synchronous Parallel (BSP) abstract computer is a bridging model for designing parallel algorithms. It differs from the Parallel Random Access Machine (PRAM) by not taking communication and synchronization for granted. An important part of analyzing a BSP algorithm rests in quantifying the synchronization and communication it needs.
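
As a rough illustration of the BSP idea, the sketch below simulates Pregel-style single-source shortest paths in a single Python process. It does not use the Giraph API; the function name `bsp_shortest_paths` and the sample graph are made up for this example. Each loop iteration stands for one superstep: vertices that received messages compute locally, send messages to their neighbors, and those messages only become visible after the barrier. The function also counts supersteps and messages, which are the synchronization and communication costs a BSP analysis would quantify.

```python
from collections import defaultdict


def bsp_shortest_paths(edges, source):
    """Pregel-style single-source shortest paths, simulated in one process.

    edges: dict mapping vertex -> list of (neighbor, weight) pairs.
    Returns (distances, supersteps, messages_sent).
    """
    # Collect every vertex, including ones that only appear as edge targets.
    vertices = set(edges)
    for targets in edges.values():
        vertices.update(t for t, _ in targets)

    dist = {v: float("inf") for v in vertices}
    inbox = {source: [0]}   # seed message; not counted as communication
    supersteps = 0
    messages_sent = 0

    while inbox:            # halt once no messages are in flight
        outbox = defaultdict(list)
        # Local computation: each vertex with mail runs independently.
        for v, msgs in inbox.items():
            best = min(msgs)
            if best < dist[v]:
                dist[v] = best
                for neighbor, weight in edges.get(v, []):
                    outbox[neighbor].append(best + weight)
                    messages_sent += 1
        # Barrier: messages are delivered only in the next superstep.
        inbox = dict(outbox)
        supersteps += 1

    return dist, supersteps, messages_sent


# Hypothetical three-vertex graph used only for illustration.
sample = {"a": [("b", 1), ("c", 4)], "b": [("c", 2)], "c": []}
distances, steps, sent = bsp_shortest_paths(sample, "a")
print(distances, steps, sent)
```

In a real Giraph job the per-vertex logic would live in a computation class and the framework would handle message delivery and the barrier across workers; the sketch only mirrors the control flow.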

Giraph adds several features beyond the basic Pregel model, including master computation, sharded aggregators, edge-oriented input, out-of-core computation, and more.

With a steady development cycle and a growing community of users worldwide, Giraph is a natural choice for unleashing the potential of structured datasets at a massive scale.

156 questions
1
vote
2 answers

What is equivalent to hadoop-core-xxx.jar in hadoop 2.7.1

I am working on Stanford's GPS (graph processing system) framework for distributed processing of graphs. The framework uses Hadoop. As per the GPS documentation, installing GPS requires the hadoop-core-xxx.jar file to be copied in its libs…
satya
  • 469
  • 1
  • 6
  • 14
1
vote
1 answer

OutOfMemory error while reading bytes from edges in yarn

I'm running a BFS algorithm on YARN, and I made a custom value for the data on my vertex (Vertex Data). But after I did this, something went wrong while reading edges. I traced the error to the following lines of code: In ByteArrayEdges,…
chomp
  • 1,352
  • 13
  • 31
1
vote
2 answers

Giraph's best Vertex Input format for an input file with ids of type String

I have a multinode Giraph cluster working properly on my PC. I executed the SimpleShortestPathExample from Giraph and it ran fine. This algorithm was run with this file…
chomp
  • 1,352
  • 13
  • 31
1
vote
1 answer

EdgeValue isn't the same after calling Vertex.getEdgeValue() twice

I'm trying to implement the Spinner graph partitioning algorithm in Giraph. In the first steps, my program adds edges to a given input graph so that it becomes an undirected graph, and every vertex chooses a random partition. (This partition-integer…
Gaze
  • 149
  • 1
  • 9
1
vote
1 answer

Large Scale Social Network Analysis for Hypergraphs

I have been trying to implement large scale social network analysis for hypergraphs. But Apache Giraph allows only simple graphs and multigraphs. I couldn't find any suitable method to implement large scale SNA in hypergraphs. Please suggest me…
Alen Jacob
  • 11
  • 1
1
vote
1 answer

How to give simple edge list format for Apache Giraph

I am trying to run Stanford Network Analysis Program (SNAP) graphs on Apache Giraph using Hadoop. The link is provided below: http://snap.stanford.edu/snap/ Currently I am trying to run the Facebook graph, which is in the simple edge list format…
Aditya
  • 1,172
  • 11
  • 32
1
vote
1 answer

java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected installation example

I am running the example from http://giraph.apache.org/quick_start.html#qs_section_2 After successfully installing Giraph, I created the file /tmp/tiny_graph.txt and ran $HADOOP_HOME/bin/hadoop jar…
anu
  • 39
  • 5
1
vote
1 answer

Is it possible to send message to predecessor in Apache Giraph?

As in the title: is it possible to send a message to a predecessor in Apache Giraph? And, more importantly, is it recommended? (I can think of some applications where it might be useful.)
iku
  • 1,007
  • 2
  • 10
  • 23
1
vote
1 answer

How do we know the value of messages used in Giraph

How do we know the value of message.get() in SimpleShortestPathsComputation? If we have Vertex vertex instead of Vertex vertex, how do we know that Messages…
user349
  • 11
  • 4
1
vote
1 answer

Giraph: Using Text as VertexId

I am trying to test Giraph. VertexId type: Text. Input: edge-based. If I use Text as VertexId, I get an error. If LongWritable, everything is OK. Questions: 1. Is it OK to use Text as VertexId? 2. If yes, what am I doing wrong? Error: 14/10/15 14:59:28 INFO…
pavel
  • 29
  • 6
1
vote
0 answers

How do I control which tasks run on which hosts?

I'm running Giraph, which executes on our small CDH4 Hadoop cluster of five hosts (four compute nodes and a head node - call them 0-3 and 'w') - see versions below. All five hosts are running mapreduce tasktracker services, and 'w' is also running…
Matthew Cornell
  • 4,114
  • 3
  • 27
  • 40
1
vote
1 answer

How do I look up a Vertex using its ID?

I have a graph computation that passes 'visited' Vertex IDs around, and I need to output information from those in the output phase. How do I look up a Vertex from its ID? I found Partition.getVertex(), but IIUC there is no guarantee that an…
Matthew Cornell
  • 4,114
  • 3
  • 27
  • 40
1
vote
1 answer

How do I output only a subset of a graph?

I have a graph computation that starts with a subset of vertices of a certain type and propagates information through the graph to a set of target vertices, which are also subset of the graph. I want to output only information from those particular…
Matthew Cornell
  • 4,114
  • 3
  • 27
  • 40
1
vote
1 answer

giraph/ hadoop reading manifest file

I am trying to run the RandomWalkWithRestart example https://github.com/apache/giraph/blob/release-1.0/giraph-examples/src/main/java/org/apache/giraph/examples/RandomWalkWithRestartVertex.java My input data is 12 34 56 34 78 56 34 78 78 …
frazman
  • 32,081
  • 75
  • 184
  • 269
1
vote
2 answers

ClassNotFoundException running GiraphRunner on a modified SimpleShortestPathsVertex

I'm relatively new to Giraph and I'm trying to get my Giraph edit-compile-deploy loop working for our code. I am able to run various examples inspired by http://blog.cloudera.com/blog/2014/02/how-to-write-and-run-giraph-jobs-on-hadoop/ , but I'm…
Matthew Cornell
  • 4,114
  • 3
  • 27
  • 40