Is it possible for me to use Giraph if I have Spark clusters and Cassandra but no Hadoop clusters?

Currently, I am using GraphX and would like to use Giraph instead. Is this possible considering that I have Spark clusters and am using Cassandra?

Jonathan Leffler
BigBug

1 Answer

I have only limited experience with Giraph from years ago, and I never tried using it outside of a Hadoop cluster. But it looks like what you want is at least technically possible if not necessarily easy.

This code is the companion to Practical Graph Analytics with Apache Giraph. As you can see, it needs Hadoop on the classpath for types such as DoubleWritable and Text, but it never touches a Hadoop cluster; it works with in-memory arrays instead. So it looks like all you need to do is override compute in a subclass of BasicComputation to do whatever you need with Cassandra, while keeping Hadoop around as a dependency to satisfy the type bounds on BasicComputation.
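To give a feel for the vertex-centric, in-memory style described above, here is a minimal sketch in plain Java. It deliberately uses no Giraph or Hadoop classes, so it is not the Giraph API: the class name VertexCentricSketch and the method propagateMax are made up for illustration, and the graph is just an adjacency array. In a real Giraph job the body of the inner per-vertex loop is roughly what would go into compute.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: a Pregel-style superstep loop over
// in-memory arrays. No Giraph or Hadoop types are involved.
public class VertexCentricSketch {

    // neighbors[v] lists the vertices that v sends messages to.
    // Returns, for each vertex, the largest value that can reach it.
    static int[] propagateMax(int[][] neighbors, int[] initialValues) {
        int n = initialValues.length;
        int[] values = Arrays.copyOf(initialValues, n);

        // First superstep: every vertex announces its value to its neighbors.
        List<List<Integer>> inbox = emptyInboxes(n);
        for (int v = 0; v < n; v++) {
            for (int target : neighbors[v]) {
                inbox.get(target).add(values[v]);
            }
        }

        // Later supersteps: keep going until no vertex sends a message.
        boolean messagesInFlight = true;
        while (messagesInFlight) {
            List<List<Integer>> nextInbox = emptyInboxes(n);
            messagesInFlight = false;
            for (int v = 0; v < n; v++) {
                // Per-vertex logic: the part a compute method would hold.
                int best = values[v];
                for (int message : inbox.get(v)) {
                    best = Math.max(best, message);
                }
                if (best > values[v]) {
                    values[v] = best;
                    for (int target : neighbors[v]) {
                        nextInbox.get(target).add(best);
                        messagesInFlight = true;
                    }
                }
            }
            inbox = nextInbox;
        }
        return values;
    }

    // Builds one empty message list per vertex.
    private static List<List<Integer>> emptyInboxes(int n) {
        List<List<Integer>> inboxes = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            inboxes.add(new ArrayList<>());
        }
        return inboxes;
    }

    public static void main(String[] args) {
        // Undirected path 0 - 1 - 2, stored as directed edges both ways.
        int[][] neighbors = { {1}, {0, 2}, {1} };
        int[] values = { 3, 6, 2 };
        System.out.println(Arrays.toString(propagateMax(neighbors, values)));
        // prints [6, 6, 6]
    }
}
```

The initial values here are hard-coded; in your setup they would presumably be loaded from Cassandra first, which is an assumption on my part rather than something the companion code does.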

I never found Giraph terribly intuitive, but hopefully you can make this unconventional setup work.

Vidya
  • Would the downvoter care to provide a reason? Responsible users realize downvotes are for "[extreme cases](http://stackoverflow.com/help/privileges/vote-down)," and if something is incorrect, comments and edits are better. An alternative answer the OP accepts would be best. So what's the problem here? We all look forward to your contribution. – Vidya Apr 04 '17 at 13:18
  • Glad to help. Good luck with your project! – Vidya Apr 06 '17 at 14:29