Questions tagged [giraph]

Apache Giraph is an iterative graph processing system built for high scalability.

For example, it is currently used at Facebook to analyze the social graph formed by users and their connections.

Giraph originated as the open-source counterpart to Pregel, the graph processing architecture developed at Google and described in this paper.

Both systems are inspired by the Bulk Synchronous Parallel model of distributed computation introduced by Leslie Valiant.

The Bulk Synchronous Parallel (BSP) abstract computer is a bridging model for designing parallel algorithms. It differs from the parallel random access machine (PRAM) in that it does not take communication and synchronization for granted. An important part of analyzing a BSP algorithm is quantifying the synchronization and communication it needs.
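
To make the superstep idea concrete, here is a minimal single-process sketch of a BSP-style computation in plain Java. It does not use the Giraph API; the class name `BspSketch` and the method `propagateMax` are hypothetical names chosen for illustration. Each loop iteration plays the role of one superstep: vertices read the messages sent in the previous superstep, update their value, and send new messages, and swapping the inbox for the outbox stands in for the synchronization barrier. The sketch assumes every edge target also appears in the initial value map.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class BspSketch {

    /**
     * Propagates the largest vertex value along directed edges, one superstep
     * at a time. Assumes every edge target is a key of {@code initial}.
     */
    static Map<Integer, Integer> propagateMax(Map<Integer, List<Integer>> edges,
                                              Map<Integer, Integer> initial) {
        Map<Integer, Integer> values = new HashMap<>(initial);

        // Superstep 0: every vertex sends its own value to its out-neighbors.
        Map<Integer, List<Integer>> inbox = new HashMap<>();
        for (Map.Entry<Integer, Integer> entry : values.entrySet()) {
            send(edges, inbox, entry.getKey(), entry.getValue());
        }

        // Later supersteps: only vertices that received messages do any work.
        while (!inbox.isEmpty()) {
            Map<Integer, List<Integer>> outbox = new HashMap<>();
            for (Map.Entry<Integer, List<Integer>> entry : inbox.entrySet()) {
                int vertex = entry.getKey();
                int best = values.get(vertex);
                for (int message : entry.getValue()) {
                    best = Math.max(best, message);
                }
                // A vertex sends again only if its value changed; otherwise it
                // stays silent, which is how the computation eventually halts.
                if (best > values.get(vertex)) {
                    values.put(vertex, best);
                    send(edges, outbox, vertex, best);
                }
            }
            // Barrier: messages sent in this superstep become visible in the next.
            inbox = outbox;
        }
        return values;
    }

    /** Queues {@code value} for delivery to every out-neighbor of {@code from}. */
    static void send(Map<Integer, List<Integer>> edges,
                     Map<Integer, List<Integer>> outbox,
                     int from, int value) {
        for (int to : edges.getOrDefault(from, Collections.<Integer>emptyList())) {
            outbox.computeIfAbsent(to, key -> new ArrayList<>()).add(value);
        }
    }

    public static void main(String[] args) {
        // A three-vertex directed cycle: 1 -> 2 -> 3 -> 1.
        Map<Integer, List<Integer>> edges = new HashMap<>();
        edges.put(1, Arrays.asList(2));
        edges.put(2, Arrays.asList(3));
        edges.put(3, Arrays.asList(1));

        Map<Integer, Integer> initial = new HashMap<>();
        initial.put(1, 3);
        initial.put(2, 6);
        initial.put(3, 2);

        System.out.println(propagateMax(edges, initial));
    }
}
```

In a real distributed system the vertices are partitioned across workers, so the cost of each superstep is dominated by the messages exchanged and the barrier itself, which is exactly what a BSP analysis tries to quantify.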

Giraph adds several features beyond the basic Pregel model, including master computation, sharded aggregators, edge-oriented input, out-of-core computation, and more.

With a steady development cycle and a growing community of users worldwide, Giraph is a natural choice for unleashing the potential of structured datasets at a massive scale.

156 questions
1
vote
3 answers

Reading an edge list data set in Apache Giraph?

I'm using a SNAP dataset for social network analysis. SNAP uses a simple edge list as its data format. How do I read a SNAP dataset in Apache Giraph?
user
1
vote
1 answer

Issues while deploying Giraph

I am trying to deploy Giraph in order to run some examples. I followed the QuickStart guide, skipping the step Deploying Hadoop, because I have already set up Hadoop on my machine as a single node. However, I get the following error: [ERROR] Failed…
salvador
1
vote
1 answer

Vertices with complex values in Apache Giraph

I am trying to read a text file containing relevant vertex information into Giraph: each line is vertex_id attribute_1 attribute_2 .....attribute_n where each attribute is a string. The goal would be to create a vertex where all these…
1
vote
2 answers

How to print debug statements in Giraph

I find the statement below before the log print statements: if (LOG.isDebugEnabled()). How can we enable or disable debug statements when running a Giraph program? And where can one find the logs of these statements?
nittoor
1
vote
1 answer

Executing an example on Giraph 1.1.0 on Hadoop 2.3.0-cdh-5.0 shows the following error

root@pseudo-hadoop:/usr/lib/hadoop# bin/hadoop jar $GIRAPH_HOME/giraph-examples/target/giraph-examples-1.1.0-SNAPSHOT-for-hadoop-1.2.1-jar-with-dependencies.jar org.apache.giraph.GiraphRunner org.apache.giraph.examples.SimpleShortestPathsComputation…
Debashisenator
1
vote
1 answer

Giraph aggregators in InputFormat

I am running some basic examples with Giraph and I want to verify the data being read by my EdgeInputFormat. In a classic MapReduce job I could do that using Counters, and Giraph uses aggregators for this.…
Sorin
1
vote
0 answers

Out-of-core messages config in Giraph

I was working with Pregel and I noticed that when the jobs finish, they don't deallocate memory. So I searched and learned that the latest version, "Version: 1.1.0-SNAPSHOT", doesn't have this problem, so I downloaded the latest version of Giraph from…
ali abdoli
1
vote
1 answer

Memory is not deallocated after a Giraph job is finished

I am using Apache Giraph version 1.0 on Hadoop version 0.20.203. It runs the ConnectedComponentsVertex and SimpleShortestPathsVertex example jobs from Apache Giraph successfully, but there is a problem. After a job is finished, memory is not…
1
vote
2 answers

Hortonworks HDP 2.0 + Giraph

I have Hortonworks HDP 2.0 running in a sandbox (recently installed) on a Windows 8.1 platform. I need to learn how to get Giraph working with HDP 2.0. I think Giraph is not currently installed with HDP 2.0 by default. Can someone help me with installing…
Varun Gupta
1
vote
1 answer

Is there any Spark or Giraph implementation of the Louvain method?

This is the Louvain method for finding communities in a social graph: https://sites.google.com/site/findcommunities/ I want to run it on a big graph using a BSP system such as Spark or Giraph.
Billy Ren
1
vote
0 answers

What is Apache Giraph's consistency model? Is it ACID compliant?

The title pretty much sums it up. I am trying to figure out what kind of consistency Giraph provides. Is it ACID compliant? Does it leave that up to the Hadoop framework?
benjaminjsanders
1
vote
1 answer

Apache Giraph cannot run on CDH4.4.0

I am trying to run the latest version of the Apache Giraph examples, as described on the quickstart page (http://giraph.apache.org/quick_start.html). I use CDH 4.4.0 (Cloudera's distribution of Hadoop). I have built Giraph with the dependencies updated to CDH 4.4.0.…
Hightower
0
votes
1 answer

Set JVM flags in an Apache Giraph job

I am running an Apache Giraph job which ultimately runs a Hadoop MapReduce job. The job is run by calling a hadoop jar lib/giraph_2.12.jar org.apache.giraph.GiraphRunner command. I'm trying to set a few JVM flags/System properties using the -ca flag…
0
votes
0 answers

Giraph SimpleShortestPathComputation examples FAILED

I have installed and deployed giraph-1.4.0 using the Hadoop YARN profile with the following command: mvn -Phadoop_yarn -Dhadoop.version=2.7.0 -DskipTests package. I tried running the SimpleShortestPathsComputation example: $HADOOP_HOME/bin/hadoop jar…
0
votes
1 answer

Will YARN working on NUMA respect node memory locality?

I'm working with a Giraph-based application that makes heavy use of memory on a NUMA system. It frequently reads from and writes to memory and has multiple threads. Assuming I schedule 4 workers with as many cores as there are cores per chip would…
kboom