
This is really a bioinformatics question, but I'll make it as general as I can. Here's the semi-hypothetical situation:

Let's say I have access to a cluster or even a cloud. I want to run some very specific programs on this cluster/cloud (genomic/transcriptomic assembly programs to be exact). The problem is that I expect these programs (Velvet/Oases, Trinity, whatever else) to require large amounts of RAM, conservatively 100GB+, and the largest node on my cluster/cloud is only 32GB.

Now besides switching to an MPI/Hadoop based program (ABySS or whatnot), writing my own, or buying a new computer, what are my viable options? Has anyone tried using a distributed operating system (MOSIX, Kerrighed, ...) with shared memory on multiple nodes of a cluster/cloud? What about a virtual SMP? What else?

Thanks for any help!

Edit for clarification: Let's also say that the programs mentioned above (Velvet/Oases and Trinity) require a single system with a large pool of RAM. In a nutshell, I'm looking for a viable way to "paste" a bunch of nodes together into one virtual super-node, where a single process could access all of the RAM from all of the nodes as if it were a single system. I know that anything like this would probably come with a substantial performance hit, but I'm looking for something that's possible, not necessarily efficient.

p.s. Sorry if my terminology is making things confusing. I'm somewhat new to a lot of this.

Pete
  • If you could add some notes about the actual processing to be done, that would be very helpful. Describing the data a bit more would also help. – Iterator Sep 25 '11 at 23:22
  • What about [AWS](http://aws.amazon.com/)? Maybe a combination of the services they provide could be a good option. – ppareja Sep 03 '11 at 18:07
  • AWS has the same problem as far as I know: lots of small (medium-small, anyway) cluster/cloud nodes and no easy way to paste a few of them together into a single large environment for running a single thread that needs 1TB of RAM. – Pete Sep 06 '11 at 15:59

3 Answers


It depends entirely on the nature of your application. Switching to Hadoop, MPI, MOSIX, or vSMP may not solve your problem, because these technologies only help when you can partition your application into concurrently executing blocks.

Now, if your application can be partitioned into concurrent blocks, choose the software technology that best fits your needs. Otherwise, I recommend upgrading your hardware. To choose the software technology, consider whether your application:

  1. Is data intensive: try Hadoop, Dryad, or something similar.
  2. Is process intensive and passes many messages between its blocks: try MPI (see the sketch after this list).
  3. Contains many light-weight threads: use GPGPUs for your app.
  4. ....
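
For illustration, here is a minimal mpi4py sketch of the message-passing style in point 2. The ranks, tags, and payloads are made up for the example; it only shows the programming model, not any real assembly workload:

```python
# Minimal message-passing sketch (assumes mpi4py is installed).
# Run with e.g.: mpirun -n 4 python demo.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    # Rank 0 hands a block of work to every other rank...
    for dest in range(1, comm.Get_size()):
        comm.send(list(range(dest * 10, dest * 10 + 10)), dest=dest, tag=1)
    # ...and collects the partial results back.
    partials = [comm.recv(source=src, tag=2) for src in range(1, comm.Get_size())]
    print("combined result:", sum(partials))
else:
    work = comm.recv(source=0, tag=1)      # receive this block's work
    comm.send(sum(work), dest=0, tag=2)    # send a partial result back
```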

Take a look at the RAMCloud project at Stanford University; it is somewhat relevant.

TonySalimi
  • Thanks for your post. But what about when the application is *not* "partitionable into concurrent blocks"? Imagine that the applications in question are single-threaded programs that require 100GB+ of RAM on a single system. That's why I mentioned shared-memory distributed operating systems. I'm certainly no expert on distributed OSes or SMP, but my understanding is that they both offer the ability to basically paste multiple smaller systems into a single larger pool of shared resources that a single process could then potentially access? I'll edit my post to reflect this specificity. – Pete Aug 25 '11 at 19:48
  • Well, as you know, DSM (distributed shared memory) systems suffer from poor performance due to transferring pages between nodes: each memory page request is trapped to find the page's location in the pool and may require a network transfer of that page. But they benefit from a simpler programming model, i.e. the shared-memory paradigm, which is handier than the distributed one. So, if your app is not partitionable into different parts, my advice is to just upgrade your hardware and forget about a distributed solution like the ones you mentioned. – TonySalimi Aug 25 '11 at 20:03
  • Forgot to say: if you run your app on a pool of resources, you will use only the memory of the other nodes, not their processors or disks. So it is more logical to have those extra memory chips in your main machine instead of distributing them across your pool. – TonySalimi Aug 25 '11 at 20:08
  • Again, thanks for your posts; you're confirming a lot of what I wasn't sure about. But the issue still remains that many of us have free access to hundreds or even thousands of cluster/cloud nodes and no access to a single system with a large amount of RAM. I can imagine these programs requiring 1TB+ of RAM on a single system in the next year or two. Completely disregarding the performance hit caused by using something distributed or virtual like this, are you suggesting this isn't possible or hasn't been done before? – Pete Aug 25 '11 at 21:50
  • Of course using DSMs or something like that is possible; there is no doubt about it. You can set up your own DSM system and it will work. But let's clarify my point by comparing the two alternatives: 1) using a DSM that shares lots of memory between nodes, or 2) using a single system with 64GB of RAM. In the first case you use the network to compensate for the lack of memory; in the second case you use the disk (virtual memory) to cover the shortfall. As you see, in both cases you are relying on I/O, either network or disk. So, do you think the network is faster than the disk? I don't think so. – TonySalimi Aug 26 '11 at 08:07
  • And my conclusion: using a DSM is feasible for your case, but I don't think it will lead to a high degree of performance compared to the case where your application runs on a single machine with 64GB of memory. – TonySalimi Aug 26 '11 at 08:13

Your question omits the nature of the processing to be done, which is particularly important. For instance, is each object really 100GB, or is the 100GB a collection of many objects that are much smaller in size?

Nonetheless, addressing the general question: I routinely work with 100GB+ datasets in memory-mapped files. If you learn how to do memory mapping, you will likely find it a very easy route to take. What's more, if the data is in one place, then an easy kludge is to use NFS so that multiple systems can access the same data at the same time. In any case, memory mapping is often very easily woven into existing programs, especially compared to managing the movement of blocks of data around your grid.
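
As a rough illustration of the idea (the file name and the record scan are made up for the example), mapping a large file with Python's standard mmap module lets a process address far more data than fits in RAM, with the OS paging in only the parts that are actually touched:

```python
# Sketch: scan a large file through a memory map instead of reading it into RAM.
# "reads.fasta" is a placeholder path; only the pages actually touched are loaded.
import mmap

with open("reads.fasta", "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)  # map the whole file, read-only
    try:
        count = 0
        pos = mm.find(b">")              # FASTA headers start with '>'
        while pos != -1:
            count += 1
            pos = mm.find(b">", pos + 1)
        print("records:", count)
    finally:
        mm.close()
```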

As you note, there are options like MOSIX or MPI, or you could look at memcached or memcacheDB, though I think those won't work out very well in the long run. In terms of an ordering for your system, I'd recommend memory mapping first, then MPI, MOSIX, and memcached.

Iterator

In any case, do not use MOSIX to solve this problem. MOSIX is a system for distributing CPU-intensive processes, and it generally does not perform very well when those processes need to collaborate heavily. You will have to use MPI anyway to work with a dataset this large.
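
To make that concrete, here is a hedged sketch (the file name and per-rank work are illustrative only, and this is not how any real assembler is structured) of the usual MPI pattern for a dataset that is too big for one node: each rank reads only its own byte range, so no single machine ever holds the full 100GB.

```python
# Sketch: split one large input across MPI ranks so each node holds only a slice.
# Assumes mpi4py; "reads.fasta" is a placeholder. Run with e.g.: mpirun -n 8 python split.py
import os
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

path = "reads.fasta"
total = os.path.getsize(path)
start = rank * total // size            # byte range owned by this rank
end = (rank + 1) * total // size

with open(path, "rb") as f:
    f.seek(start)
    chunk = f.read(end - start)         # each rank keeps only ~1/size of the data in RAM

# Toy per-rank statistic (counting '>' headers); a real pipeline would do its own
# work here and exchange records that straddle chunk boundaries with its neighbours.
local_count = chunk.count(b">")
total_count = comm.reduce(local_count, op=MPI.SUM, root=0)
if rank == 0:
    print("approximate record count:", total_count)
```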

parasietje