Questions tagged [giraph]

Apache Giraph is an iterative graph processing system built for high scalability.

For example, it is currently used at Facebook to analyze the social graph formed by users and their connections.

Giraph originated as the open-source counterpart to Pregel, the graph processing architecture developed at Google and described in this paper.

Both systems are inspired by the Bulk Synchronous Parallel model of distributed computation introduced by Leslie Valiant.

The Bulk Synchronous Parallel (BSP) abstract computer is a bridging model for designing parallel algorithms. It differs from the parallel random-access machine (PRAM) by not taking communication and synchronization for granted. An important part of analyzing a BSP algorithm rests in quantifying the synchronization and communication it needs.
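As a rough illustration of the superstep structure described above, here is a minimal single-process sketch of a Pregel-style shortest-path computation. It is not Giraph code: the function name `bsp_shortest_paths` and the adjacency-dict input format are assumptions made for this example. It counts supersteps and messages, which are the synchronization and communication costs a BSP analysis would quantify.

```python
import math
from collections import defaultdict


def bsp_shortest_paths(edges, source):
    """Simulate BSP supersteps for single-source shortest paths.

    edges: dict mapping each vertex to a list of (neighbor, weight) pairs
           (hypothetical input format chosen for this sketch).
    Returns (distances, number_of_supersteps, number_of_messages_sent).
    """
    # Collect every vertex, including ones that only appear as targets.
    vertices = set(edges)
    for targets in edges.values():
        for neighbor, _ in targets:
            vertices.add(neighbor)

    dist = {v: math.inf for v in vertices}
    inbox = {source: [0]}  # messages delivered at the start of a superstep
    supersteps = 0
    messages_sent = 0

    while inbox:
        outbox = defaultdict(list)
        # Compute phase: each vertex with mail works on its local state only.
        for vertex, messages in inbox.items():
            best = min(messages)
            if best < dist[vertex]:
                dist[vertex] = best
                # Communication phase: propose new distances to neighbors.
                for neighbor, weight in edges.get(vertex, []):
                    outbox[neighbor].append(best + weight)
                    messages_sent += 1
        # Barrier: messages become visible only in the next superstep.
        inbox = outbox
        supersteps += 1

    return dist, supersteps, messages_sent


if __name__ == "__main__":
    graph = {"a": [("b", 1), ("c", 4)], "b": [("c", 2)], "c": []}
    print(bsp_shortest_paths(graph, "a"))
```

The computation halts when a superstep produces no messages, which mirrors the "vote to halt" idea in Pregel-like systems. In a real cluster, the loop body would run in parallel across workers and the barrier would be a distributed synchronization point.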

Giraph adds several features beyond the basic Pregel model, including master computation, sharded aggregators, edge-oriented input, out-of-core computation, and more.

With a steady development cycle and a growing community of users worldwide, Giraph is a natural choice for unleashing the potential of structured datasets at a massive scale.

156 questions
3
votes
1 answer

Giraph: Class not Found Exception on custom Job

I am developing an algorithm using Giraph. I am working with version 1.0.0 on Hadoop 1.2.1. I am pretty new to developing Giraph, so please be gentle ;) My custom job is split into three packages: io: contains the input and output format…
Alessio Arleo
  • 71
  • 1
  • 1
  • 6
3
votes
1 answer

Apache Giraph 1.0.0 - How is memory allocated for vertices?

Recently, I was successfully able to create a custom vertex class in which each vertex has a LongWritable id, and this id is also its own value. My Giraph program runs successfully on a small vertex set (100,000 vertices) and the program completes…
PortilR
  • 31
  • 1
3
votes
1 answer

Can't run a giraph SimpleInDegreeCountComputation

I'm trying to run the SimpleInDegreeCountComputation example included with Giraph. My approach is as follows: SimpleInDegreeCountComputation.java: public class SimpleInDegreeCountComputation extends BasicComputation
2
votes
1 answer

apache giraph build error

I got the following error when compiling Giraph. I'm using Ubuntu 16.04 with Java 1.8 and Maven 3.3.9. Here is the output of the mvn -version command: Apache Maven 3.3.9 Maven home: /usr/share/maven Java version: 1.8.0_171, vendor: Oracle Corporation Java home:…
2
votes
0 answers

Giraph job never ends with more than one worker

I am new to Giraph and Hadoop. I am trying to run the shortest path algorithm on a multi-node cluster (1 master and two slaves). I used the following command to run the algorithm: bin/hadoop jar…
imen
  • 35
  • 8
2
votes
1 answer

Graph DB For Network Representation In A Web Application

I'm not sure if this question is too broad, but here we go... I'm interested in designing a web application (side project) that queries a DB for information and represents it in a network structure. Pretty broad, right?! Let's narrow it down a bit. DB can be…
Simply_me
  • 2,840
  • 4
  • 19
  • 27
2
votes
0 answers

Giraph build for Hadoop Yarn profile fails with org.apache.maven.wagon.TransferFailedException

I am trying to build Giraph for Hadoop 2.7.1 using the Yarn profile. I am getting a TransferFailedException for resource at: http://repo.maven.apache.org/maven2/org/apache/maven/doxia/doxia-core/1.0-alpha-8/doxia-core-1.0-alpha-8.jar. The resource…
AxxE
  • 813
  • 2
  • 9
  • 18
2
votes
1 answer

Link Nodes Together

I have a graph-based database like Neo4j or Giraph with, say, 50 existing vertices and some edges linking them together. Now I want to introduce a new vertex X into the graph. However, the vertex needs to run a similarity algorithm against all of the…
myloginid
  • 1,463
  • 2
  • 22
  • 37
2
votes
1 answer

How to run Giraph on YARN (Hadoop 2.6) ('Worker failed during input split')

I'm trying to set up a pseudo-distributed Hadoop 2.6 cluster for running Giraph jobs. As I couldn't find a comprehensive guide for that, I've been relying on Giraph QuickStart (http://giraph.apache.org/quick_start.html), which is unfortunately for…
Wojciech Ptak
  • 683
  • 4
  • 14
2
votes
2 answers

Giraph tutorial ShortestPath example job failing

I'm going through the Apache Giraph quick start tutorial: http://giraph.apache.org/quick_start.html and have successfully set up a pseudo-distributed Hadoop cluster and run the example MapReduce jobs. However, when moving to the…
WillJones
  • 907
  • 1
  • 9
  • 19
2
votes
1 answer

Why doesn't speculative execution make sense for Giraph?

Recently I have been running some benchmarks to learn about the failover mechanism in Giraph. I'm curious: when a worker in a job gets slower, the other workers just wait for it. Later I found something like this in GiraphJob.java: // Speculative…
Algorithman
  • 1,309
  • 1
  • 16
  • 39
2
votes
0 answers

Use Apache Giraph as Neo4j with Big Amount of Data

I was running some tests on Neo4j, calculating the shortest path between 2 nodes. With 100k nodes and 10 million edges (100 edges per node), the shortest path algorithm ran in 0.4-3s. With 200k nodes and 40 million edges (200 edges per node), it takes…
M4rk
  • 2,172
  • 5
  • 36
  • 70
2
votes
1 answer

Sending messages to incoming edges in giraph

Is there any way to send messages to incoming edges in Giraph? Or is there any way to send messages through a particular edge (by type, label, etc.) instead of sending messages to all outgoing edges?
Ashok Krishnamoorthy
  • 853
  • 2
  • 14
  • 24
2
votes
1 answer

Apache Giraph on EMR

Has anyone tried Apache Giraph on EMR? It seems to me the only requirement for running on EMR is to add the proper bootstrap scripts to the Job Flow configuration. Then I should just need to use a standard Custom JAR launch step to launch the Giraph Runner…
rusho1234
  • 241
  • 2
  • 12
2
votes
2 answers

Compilation Error when building Giraph

I am trying to build Giraph. I have the following: java version "1.7.0_25", Apache Maven 3.0.4, Hadoop 1.0.4. I am following the instructions on this page: https://cwiki.apache.org/confluence/display/GIRAPH/Quick+Start+Guide When I run: mvn compile ,…