Questions tagged [giraph]

Apache Giraph is an iterative graph processing system built for high scalability.

For example, it is currently used at Facebook to analyze the social graph formed by users and their connections.

Giraph originated as the open-source counterpart to Pregel, the graph processing architecture developed at Google and described in this paper.

Both systems are inspired by the Bulk Synchronous Parallel model of distributed computation introduced by Leslie Valiant.

The Bulk Synchronous Parallel (BSP) abstract computer is a bridging model for designing parallel algorithms. It differs from the parallel random-access machine (PRAM) model by not taking communication and synchronization for granted. An important part of analyzing a BSP algorithm rests on quantifying the synchronization and communication it needs.
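
To make the superstep structure concrete, here is a minimal single-process sketch of a BSP-style computation (single-source shortest paths). It is a toy illustration only and does not use Giraph's API: the function name `bsp_shortest_paths` and the adjacency-dictionary input format are assumptions made for this example. Each loop iteration plays the role of one superstep: vertices compute locally, outgoing messages are buffered, and they become visible only after the barrier at the end of the iteration.

```python
from collections import defaultdict


def bsp_shortest_paths(edges, source):
    """Toy BSP loop (hypothetical helper, not Giraph's API).

    edges maps each vertex to a list of (neighbor, weight) pairs.
    Returns the distance map and the number of supersteps executed.
    """
    vertices = set(edges)
    for targets in edges.values():
        vertices.update(neighbor for neighbor, _ in targets)
    dist = {v: float("inf") for v in vertices}

    inbox = {source: [0]}  # messages delivered at the start of a superstep
    supersteps = 0
    while inbox:  # halt once no messages are in flight
        outbox = defaultdict(list)
        # Local computation: every vertex with mail runs independently.
        for vertex, messages in inbox.items():
            best = min(messages)
            if best < dist[vertex]:
                dist[vertex] = best
                # Communication: messages are buffered, not delivered yet.
                for neighbor, weight in edges.get(vertex, []):
                    outbox[neighbor].append(best + weight)
        # Barrier: buffered messages become visible in the next superstep.
        inbox = dict(outbox)
        supersteps += 1
    return dist, supersteps


graph = {"a": [("b", 1), ("c", 4)], "b": [("c", 2)], "c": []}
distances, steps = bsp_shortest_paths(graph, "a")
print(distances, steps)
```

Counting the supersteps and the messages exchanged in each one is the kind of cost accounting the BSP model asks for; in a real distributed run, the barrier and the message delivery are where the synchronization and communication costs appear.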

Giraph adds several features beyond the basic Pregel model, including master computation, sharded aggregators, edge-oriented input, out-of-core computation, and more.

With a steady development cycle and a growing community of users worldwide, Giraph is a natural choice for unleashing the potential of structured datasets at a massive scale.

156 questions
0
votes
0 answers

Where are Apache Giraph logs (with log4j) located?

I am currently experimenting with Apache Giraph using the configuration described in the Quick Start tutorial: Ubuntu 12, Hadoop 0.20, Giraph release 1.2. I am running all computation on a single-node local cluster, as described in the…
NyuB
  • 93
  • 7
0
votes
0 answers

Job failed as tasks failed. failedMaps:1 failedReduces:0 exception while using hadoop and giraph

I'm using VMs and have a cluster consisting of 1 master node and 3 other nodes. I installed Hadoop and copied it to all nodes, and deployed Giraph on the master, but I don't know whether I should also copy the Giraph folder to all nodes. I'm…
Soad Ahmed
  • 13
  • 6
0
votes
0 answers

giraph building error at "Apache Giraph Parent"

Hi, I'm trying to build Giraph on Ubuntu in VirtualBox. I followed these two links: http://giraph.apache.org/quick_start.html https://lab.hypotheses.org/1207 Hadoop worked well, but Giraph was not installed in either case... mvn package -DskipTests [INFO]…
0
votes
2 answers

YARN Giraph application on Google Cloud - fat jar not found

I'm trying to run my Giraph-based application on a Hadoop cluster through YARN. The command I use is yarn jar solver-1.0-SNAPSHOT.jar edu.agh.iga.adi.giraph.IgaSolverTool First I need to copy that JAR to one of the directories that are reported when…
kboom
  • 2,279
  • 3
  • 28
  • 43
0
votes
1 answer

giraph fails only on large graphs after warn "likely client has closed socket"

I'm using giraph-1.3.0.-SNAPSHOT and hadoop-2.8.4 in an EC2 cluster composed of 5 nodes (each has 32 CPUs and 60 GB of RAM). If I give a small input to my algorithm implemented in Giraph, it works properly. When I give a large input (like…
Francesco Sclano
  • 145
  • 1
  • 12
0
votes
1 answer

Is there a way to activate Giraph Stats in giraph built for yarn?

It seems that Giraph Stats are written to the log only when using MapReduce (giraph-1.3.0-snapshot built with the -Phadoop2 mvn profile). Is there a way to activate Giraph Stats in the log using YARN too (giraph-1.3.0-snapshot built with the -Phadoop_yarn mvn profile)…
0
votes
1 answer

How to determine the number of Giraph workers to set in the -w argument?

I'm using an ec2 hadoop cluster that is comprised of 20 c3.8xlarge machines, each having 60 GB RAM and 32 virtual CPUs. In every machine I set up yarn and mapreduce settings as documented here…
Francesco Sclano
  • 145
  • 1
  • 12
0
votes
2 answers

Is it correct that master runs on a datanode?

I'm using giraph-1.3 built with the yarn profile. To start, I configured 1 namenode and 2 datanodes on an EC2 cluster. My application works properly because I see the expected output in the logs (and in the output directory). I launched Giraph with "-w 2"…
0
votes
0 answers

What changes in Hadoop usage between Giraph built with -Phadoop_2 and Giraph built with -Phadoop_yarn?

I understand that the giraph-dist-1.2.0-hadoop2-bin.tar.gz binary distribution is built with the following Maven command and is officially supported with hadoop-2.5.1: "mvn -Phadoop_2 clean install". I successfully used…
0
votes
0 answers

What is the difference between the two downloadable versions of Giraph 1.2: giraph-dist-1.2.0-hadoop2-bin.tar.gz and giraph-dist-1.2.0-bin.tar.gz?

What is the difference between giraph-dist-1.2.0-hadoop2-bin.tar.gz and giraph-dist-1.2.0-bin.tar.gz? Is there any documentation about that? The only documentation that I found is the following: Apache Hadoop 2 (latest version: 2.5.1) This is…
Francesco Sclano
  • 145
  • 1
  • 12
0
votes
1 answer

Apache Nutch 2.3.1 give more preference to seed domains at selection point

I have configured Apache Nutch 2.3.1 with a complete Hadoop/HBase ecosystem. I want my crawler to give more preference, in each iteration, to the domains given in the seed. According to my testing, it can go completely in either direction…
Hafiz Muhammad Shafiq
  • 8,168
  • 12
  • 63
  • 121
0
votes
1 answer

Running my own job on Giraph

So, I've successfully executed the SimpleShortestPathComputation on my computer via the script shown…
0
votes
1 answer

FATAL [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster java.lang.NoClassDefFoundError

I am new to Hadoop. I am trying to set up Giraph to run on hadoop-2.6.5 with YARN. When I submit the Giraph job, it is submitted successfully but then fails, and I get the log below in the container syslog: 2018-01-30 12:09:01,190 INFO [main] …
enator
  • 2,431
  • 2
  • 28
  • 46
0
votes
1 answer

Installing Giraph on HDInsight using script actions

I'm trying to install Giraph on an HDInsight cluster with Hadoop, using script actions. About 30 minutes into deploying the cluster, an error shows up. Deployment failed Deployment to resource group 'graphs' failed. Additional details from the…
Robert
  • 1
0
votes
1 answer

Calling a Giraph job from a simple java program

I am new to Giraph and Hadoop YARN. Following Giraph's quick start, I was able to run the example job from the command line, using a jar built from source. I want to run the job from a simple Java program. The question is inspired by a previous similar MapReduce…
enator
  • 2,431
  • 2
  • 28
  • 46