I have been looking at HPX (https://github.com/STEllAR-GROUP/hpx) as a potential mechanism for making applications more scalable.

I believe HPX is primarily targeted at (and therefore optimised for) the HPC community, who typically have clusters of nodes with many cores and fast interconnects between them. The ParalleX model doesn't require this, but of course performance will degrade due to the higher cost of passing data between nodes.

At the other end of the spectrum we have a suite of Java frameworks including Hadoop, Spark and Flink. These come out of the commercial community and address different sorts of workload.

So what are the considerations if you were choosing between them (ignoring C++ vs Java flamewars)?

Considered purely on performance grounds, how do they compare in terms of overheads?

Granted it depends heavily on the kind of problem you are trying to solve. I'd like to understand the trade-offs better.

Bruce Adams
  • It is really not a SO type of question... – zero323 Feb 12 '16 at 14:22
  • I think it is as I'm looking for considerations as to whether to develop an app using HPX or Spark. Where else would you ask that? – Bruce Adams Feb 12 '16 at 14:28
  • I guess [Quora](http://quora.com/) could be a good place. You could try [Software Recommendations](https://softwarerecs.stackexchange.com/) but I am pretty sure it doesn't meet the criteria. – zero323 Feb 12 '16 at 14:30
  • Quora? Are you stark raving mad! I think Joel would climb into a coffin just to turn over! :) – Bruce Adams Feb 12 '16 at 15:03
  • That was my idea :) Seriously though, StackExchange sites are highly opinionated when it comes to content, but Spark has a user list and Ste||ar has an IRC channel. Given there are 6 questions, including yours, tagged with hpx, I kind of doubt you'll get your answer here even if it won't be downvoted and deleted. – zero323 Feb 12 '16 at 15:09
  • Granted HPX is quite new. Still perhaps my question is ahead of the curve if it takes off? – Bruce Adams Feb 12 '16 at 15:36
  • Actually Ste||ar recommend asking questions *here* on S/O :) https://github.com/STEllAR-GROUP/hpx – Bruce Adams Feb 12 '16 at 15:46
  • Well, it's been public longer than Spark. But fundamentally it looks like a much more specialized tool, and it is focused (correct me if I'm wrong) on task parallelization, not data parallelism. – zero323 Feb 12 '16 at 15:48
  • It doesn't make this particular question more on-topic :) – zero323 Feb 12 '16 at 15:48
  • Why is this question about HPX vs. other frameworks any different from similar questions (which are not closed as off topic) for Flink vs Spark or Flink vs Storm? E.g. http://stackoverflow.com/questions/30699119/what-is-are-the-main-differences-between-flink-and-storm or http://stackoverflow.com/questions/29780747/apache-flink-vs-apache-spark-as-platforms-for-large-scale-machine-learning?rq=1 – Bruce Adams Feb 12 '16 at 16:00
  • It is not. And same as before it doesn't make any of these more on-topic IMHO. – zero323 Feb 12 '16 at 17:12

1 Answer

HPX has not been used or adapted to cloud-type scenarios at this point. We have thought about adapting it, but have not implemented anything. It would be possible (in principle, as you noted as well), though.

hkaiser
  • I see. So I guess no attempt has been made to compare things yet. What kind of adaptations does your team have in mind? Is it primarily to optimise for slow interconnects between nodes, or is there something more fundamental involved? – Bruce Adams Feb 15 '16 at 15:44
  • First of all we would need to create a special networking layer (for instance on top of websockets). Some additional security considerations would apply as well. Everything else would be up to the application, I guess. – hkaiser Feb 17 '16 at 20:16
  • I'm not clear why you would need a layer based on websockets. What is wrong with using Unix sockets and TCP as normal? Is it to avoid the need to open ports in the firewall manually? I can see that would make sense. – Bruce Adams Feb 18 '16 at 12:48
  • Yes, firewall issues, mostly. Security is another concern. – hkaiser Feb 18 '16 at 14:01