Questions tagged [hama]

Apache Hama is a pure BSP (Bulk Synchronous Parallel) computing framework on top of HDFS (Hadoop Distributed File System) for massive scientific computations such as matrix, graph and network algorithms.

Why Hama and BSP?

Today, many practical data processing applications require a more flexible programming abstraction that can run on highly scalable, massive data systems (e.g., HDFS, HBase). A message-passing paradigm beyond the MapReduce framework would make communication more flexible. The Bulk Synchronous Parallel (BSP) model fits the bill. Some of its significant advantages over MapReduce and MPI are:

  • Supports a message-passing style of application development
  • Provides a small, flexible, simple, and easy-to-use API
  • Can perform better than MPI for communication-intensive applications
  • Guarantees that deadlocks or collisions cannot occur in the communication mechanisms

Source: The Apache Hama Project
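
To make the BSP model described above more concrete, here is a minimal sketch of its superstep cycle (local computation, message sending, barrier synchronization) as a single-process Python simulation. It does not use the Hama API: `run_bsp`, its parameters, and the "find the global maximum" task are assumptions chosen purely for illustration.

```python
from collections import defaultdict


def run_bsp(local_values, max_supersteps=10):
    """Simulate BSP supersteps in one process (illustration only, not Hama).

    Every peer starts with one number; after the run, each peer holds the
    global maximum.
    """
    values = list(local_values)
    num_peers = len(values)
    inboxes = defaultdict(list)  # messages delivered at the last barrier

    for step in range(max_supersteps):
        outboxes = defaultdict(list)  # messages queued during this superstep
        changed = False

        for peer in range(num_peers):
            # 1. Local computation: combine own value with received messages.
            new_value = max([values[peer]] + inboxes[peer])

            # 2. Communication: send only on the first step or after a change.
            if step == 0 or new_value != values[peer]:
                values[peer] = new_value
                changed = True
                for other in range(num_peers):
                    if other != peer:
                        outboxes[other].append(new_value)

        # 3. Barrier: queued messages become visible only in the next step.
        inboxes = outboxes
        if not changed:
            break

    return values


print(run_bsp([4, 9, 2]))
```

Because messages are only delivered at the barrier, no peer ever blocks waiting on another peer in the middle of a superstep, which is the intuition behind the deadlock-freedom claim in the list above.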

22 questions
1 vote, 0 answers

hadoop and graph theory

Has anyone ever implemented centrality algorithms (betweenness centrality, closeness centrality, etc.) in Hadoop Giraph or Hama? Or rather, has anyone ever parallelized the calculation of metrics on directed weighted graphs? I found a thesis of…
0 votes, 1 answer

How to process a large file in Hadoop?

This is a newbie question. I have a Hadoop setup and am thinking of using Giraph or Hama for graph-based computation. I have a large file in the form 3 4 3 7 3 8 5 6 where each column denotes vertices and each row denotes an edge. For normal programs I…
user567879
0 votes, 1 answer

How to run my Java source for Apache Hama from the terminal

I have Apache Hama installed, and I can invoke it from Eclipse, where it works fine. How can I run the same thing from a Unix terminal? When I run hama SSSP.java I get the error Exception in thread "main" java.lang.NoClassDefFoundError:…
user567879
0 votes, 1 answer

Is Apache Hama suitable for implementing the AdaBoost algorithm?

I'm interested in implementing the AdaBoost algorithm in a Hadoop environment. From my research, MapReduce could be slow because it lacks native support for iteration. Apache Hama is an interesting alternative, but is there any feature of Apache Hama which…
caruso
0 votes, 0 answers

Big matrix multiplication using Apache Hama

I am trying to multiply a dense matrix A by its transpose A'. The matrix has about 2 million rows and 400 columns. I implemented the multiplication in Hadoop MapReduce, but it runs too slowly because of the non-locality of the job (every…
giulatona
0 votes, 0 answers

How to get the result stored in the counter from a Hama BSPJob?

Like Hadoop MapReduce, Hama also has Counters, as explained in this link. In Hadoop MapReduce, retrieving the value of a Counter is as simple as the following, using the getCounters() function: long value =…
keelar
0 votes, 1 answer

DryadOpt (a library for parallel branch and bound): availability

I am trying to implement a parallel branch and bound BFS. I am interested in using DryadOpt, which runs on top of DryadLINQ. Has anyone obtained DryadOpt? I know we can get an academic version of DryadLINQ, and it is also present on Azure, but is there…
LGG