Questions tagged [giraph]

Apache Giraph is an iterative graph processing system built for high scalability.

For example, it is currently used at Facebook to analyze the social graph formed by users and their connections.

Giraph originated as the open-source counterpart to Pregel, the graph processing architecture developed at Google and described in this paper.

Both systems are inspired by the Bulk Synchronous Parallel model of distributed computation introduced by Leslie Valiant.

The Bulk Synchronous Parallel (BSP) abstract computer is a bridging model for designing parallel algorithms. It differs from the parallel random-access machine (PRAM) model by not taking communication and synchronization for granted. An important part of analyzing a BSP algorithm is quantifying the synchronization and communication it needs.
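
The superstep structure behind this model can be illustrated with a minimal single-process sketch. It is not Giraph code and does not use the Giraph API. The function name `bsp_max_value` and the toy graph are hypothetical, chosen only for illustration. Each loop iteration plays the role of one superstep: vertices process the messages sent in the previous superstep, update their value, and send new messages that become visible only after the barrier.

```python
from collections import defaultdict


def bsp_max_value(graph, values):
    """Propagate the largest vertex value to every vertex, superstep by superstep.

    graph:  dict mapping vertex id -> list of neighbor ids (hypothetical toy input)
    values: dict mapping vertex id -> initial numeric value
    Returns the final values and the number of supersteps executed.
    """
    values = dict(values)  # work on a copy so the caller's dict is untouched

    # Superstep 0: every vertex sends its own value to all of its neighbors.
    inbox = defaultdict(list)
    for vertex, neighbors in graph.items():
        for neighbor in neighbors:
            inbox[neighbor].append(values[vertex])
    supersteps = 1

    # Keep running supersteps while any message is in flight.
    while inbox:
        outbox = defaultdict(list)
        # Local computation: a vertex only sees messages from the previous superstep.
        for vertex, messages in inbox.items():
            best = max(messages)
            if best > values[vertex]:
                values[vertex] = best
                # Communication: notify neighbors only when the value changed.
                for neighbor in graph[vertex]:
                    outbox[neighbor].append(best)
        # Barrier synchronization: new messages are delivered in the next superstep.
        inbox = outbox
        supersteps += 1

    return values, supersteps


if __name__ == "__main__":
    toy_graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
    toy_values = {"a": 3, "b": 6, "c": 2}
    print(bsp_max_value(toy_graph, toy_values))
```

Counting the supersteps and the messages exchanged per superstep is the kind of accounting the BSP analysis above refers to. A real system distributes the vertices across workers and replaces the loop boundary with an actual barrier.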

Giraph adds several features beyond the basic Pregel model, including master computation, sharded aggregators, edge-oriented input, out-of-core computation, and more.

With a steady development cycle and a growing community of users worldwide, Giraph is a natural choice for unleashing the potential of structured datasets at a massive scale.

156 questions
2
votes
2 answers

Apache Giraph SendMessage

I am writing a distributed clustering algorithm using Apache Giraph. In the compute() method I need to access the value that each neighbor sent, plus the weight of the edge between the current vertex and the neighbor that sent the message. However,…
1
vote
0 answers

Apache Giraph: Read in postgres rows as vertices?

Is it possible to read in rows from a SQL database as vertices in Apache Giraph? If so, could someone provide a small code example? Thanks.
Megan
  • 1,000
  • 1
  • 14
  • 44
1
vote
0 answers

4-profiles calculus of big graph with apache giraph

For my master's thesis in computer science I succeeded in implementing the 4-profiles calculus (https://arxiv.org/abs/1510.02215) using giraph-1.3.0-snapshot (compiled with the -Phadoop_yarn profile) and hadoop-2.8.4. I configured a cluster on Amazon EC2…
1
vote
0 answers

Applying a patch at runtime in Maven

I'm trying to install Giraph 1.1 but ran into an issue. According to this thread I should apply a patch to my installation. Unfortunately, my problem stems from that. I downloaded and copied the .patch file linked in there to the source folder and…
1
vote
1 answer

How to Update "Practical Graph Analytics with Apache Giraph" examples to run on current Cloudera Quickstart VM

I am new to Hadoop/Giraph and Java. As part of a task, I downloaded Cloudera Quickstart VM and Giraph on top of it. I am using this book named "Practical Graph Analytics with Apache Giraph; Authors: Shaposhnik, Roman, Martella, Claudio, Logothetis,…
user9068137
1
vote
1 answer

Apache Giraph on Cloudera VM - POM for org.apache.hadoop:hadoop-core:jar:2.6.0 missing, no dependency info

I am new to Hadoop/Giraph and Java. As part of a task, I downloaded Cloudera Quickstart VM and Giraph on top of it. I am using this book named "Practical Graph Analytics with Apache Giraph; Authors: Shaposhnik, Roman, Martella, Claudio, Logothetis,…
user9068137
1
vote
1 answer

Hadoop 1.2.1 is running in local mode despite mapred.job.tracker being set

I am trying to submit a giraph job to a hadoop 1.2.1 cluster. The cluster has a name node master, a map reduce master, and four slaves. The job is failing with the following exception: java.util.concurrent.ExecutionException:…
cscan
  • 3,684
  • 9
  • 45
  • 83
1
vote
1 answer

Giraph, Hadoop, Spark and Cassandra

Is it possible for me to use Giraph if I have Spark clusters and Cassandra but no Hadoop clusters? Currently, I am using GraphX and would like to use Giraph instead. Is this possible considering that I have Spark clusters and am using Cassandra?
BigBug
  • 6,202
  • 23
  • 87
  • 138
1
vote
1 answer

Convert csv data to graph data

I am experimenting with Apache Giraph. I need to create a simple graph from my CSV file residing in HDFS, which shows a relationship between 2 columns (victim related to store name). My data is over 1 GB in CSV format. Initially I tried to use neo4j using…
USB
  • 6,019
  • 15
  • 62
  • 93
1
vote
0 answers

Error in building Apache Giraph Core

I'm trying to install Giraph on a single node following this link: http://lab.hypotheses.org/1207, using Java JDK 1.8 and hadoop 2.4.0. When I run: mvn -Phadoop_yarn -Dhadoop.version=2.4.0 -DskipTests package, I get the following error: [INFO]…
Sira
  • 11
  • 3
1
vote
1 answer

How to give multiple input paths to gora avrostore in giraph (or) how to make giraph read multiple input files

How do I make Giraph read data from multiple input paths? I am using this in gora.properties: gora.datastore.default = org.apache.gora.avro.store.Avrostore gora.avrostore.input.path=file:///path/to/file1.avro,file:///path/to/file2.avro But it gives…
1
vote
0 answers

Method for getting Vertex with given id

Is there a method for getting a Vertex with a given id? I want to get a reference to the Vertex with a given id during the compute() method in the BasicComputation class. I cannot find anything in the Javadoc.
Marcin Majewski
  • 1,007
  • 1
  • 15
  • 30
1
vote
1 answer

giraph.numInputThreads: execution time for the "input superstep" is the same using 1 or 8 threads. How is this possible?

I'm doing a BFS search through the Wikipedia (Spanish edition) site. I converted the dump into a file that could be read with Giraph. Using 1 worker, a file of 1 GB took 452 seconds. I executed Giraph with this command: /home/hadoop/bin/yarn jar…
chomp
  • 1,352
  • 13
  • 31
1
vote
1 answer

Shortest path - Giraph example - Not working on AWS

I'm having problems running the shortest path example on AWS. I downloaded the Giraph jar through S3 (compiled inside the same AMI that I'm using and uploaded there for that purpose), configured ZooKeeper correctly on both master and slave, and I did the…
chomp
  • 1,352
  • 13
  • 31
1
vote
0 answers

What is a partition class in Apache Giraph? E.g. -pc in a Giraph job

What is partitionClass (-pc) in a Giraph job submitted through the command line? What are the arguments, and how do I give them? Could you please give an example? I saw the API, which mentions hash partitioning etc., but couldn't find an example to see…
drk
  • 153
  • 1
  • 17