14

Is it possible to span one huge Virtual Machine across several physical commodity servers?

Here is our use case:

  • We need to implement a 32-processor db server with 64 GB of RAM
  • We don't have a physical server of such capacity
  • We do have a lot of servers with smaller resources.

Is there a technology or (better) a product that lets us use these servers to create a VM with the required capacity? For example, could we combine 8 physical 4-processor machines with 8 GB of RAM each into one 32-processor "logical unit" with 64 GB of RAM, and set up an Oracle server that uses all this capacity?

Before posting this question, we read similar questions but didn't find an answer.

Maybe someone could give us a hint now?

user54614
  • 2
    This is not an answer to your question, but it feels strange that nobody has advised looking at the software's limitations. If your company builds apps for mid-range businesses, it seems obvious to me that the problem is a software limitation: the software architects and designers probably didn't plan for a database with billions of records, huge temp tables, or heavy procedures. Think about that, and build some self-tests and error reporting for the slow queries; for me, that's the way to solve the problem... and think about the 3.3 GB limit on x86 –  Aug 08 '12 at 21:33

4 Answers

9

There is no way to get the exact same functionality as a single 32-processor machine... with several separate servers. Your best bet is to look at clustering or grid computing. Done right, you can end up with comparable performance... and better availability. A lot of your question also depends on your "db" type. Microsoft SQL Server works significantly differently from MySQL or Oracle... and each of them scales out in a completely different way.

Alternatively... you may want to consider letting someone run the database for you... like using Amazon RDS...

Sadly, there is no way to combine several physical servers together & slap vmware on them and end up with a singular uber-powerful virtual server.

TheCompWiz
  • TheCompWiz, thanks for the answer. OK, if the answer depends on my db type, let it be Oracle or Microsoft SQL Server. With these corrections, is it still impossible? Yes, we know about EC2 but we need exactly Oracle or Microsoft SQL Server to test issues with a software product we deliver for a customer... – user54614 Sep 17 '10 at 20:01
  • Also, why take only VMware into account? We don't mind any other hypervisor... – user54614 Sep 17 '10 at 20:03
  • The ability to traverse multiple servers is a HUGE logistics nightmare... not to mention the lack of available bandwidth between devices. Think about how fast a CPU is... then all of the things you would have to do that would slow down the process... i.e. CPU -> bus -> PCI-bridge -> network card -> ethernet cable -> network stack -> ... even before it's reached the other server? You wouldn't want to wait 1 second to be able to add 1+1. Clusters typically are able to do this because tasks are assigned in "Jobs" and a job is issued to a compute node which does all the tasks in that job... – TheCompWiz Sep 17 '10 at 20:20
  • ... and then sends the answers back to the management node. Windows doesn't work that way. There is no way to set up a virtual x86 environment (or x86_64) that would even attempt doing this. – TheCompWiz Sep 17 '10 at 20:23
  • There are also huge differences between Oracle & Microsoft. In all honesty, you would really need to reflect on *Why* you would need such a HUGE singular instance of a database-server... – TheCompWiz Sep 17 '10 at 20:25
  • Once you can answer the why... you can also begin to design your clustered framework to work with it. – TheCompWiz Sep 17 '10 at 20:26
  • OK. Here is the answer. We develop and sell a product. It works fine for most customers but fails to work properly in the environment of a huge customer that runs our application on a 32-proc Oracle server with 64 GB of RAM. We want to reproduce this failure in our environment. – user54614 Sep 17 '10 at 20:34
  • 2
    @user54614 - You're absolutely not going to be able to replicate their scenario by strapping machines together. I would suggest talking to both your client and Oracle support to pinpoint and identify the problems. – Chris Thorpe Sep 18 '10 at 01:38
  • What about ScaleMP (http://www.scalemp.com/)? – user54614 Sep 18 '10 at 13:53
  • @user54614 How would you know you've reproduced the failure? With such a bizarre setup how could you trust that any result isn't just a function of running a cobbled-together set of instances? – ceejayoz Aug 08 '12 at 21:45
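The job model TheCompWiz describes in the comments (a management node hands a whole job to one compute node, which does all of that job's tasks locally and only sends back the answer) can be sketched roughly as follows. This is a minimal illustration, not any particular cluster product: `run_job` and `management_node` are hypothetical names, the threads merely stand in for separate machines, and the sum-of-squares workload is made up.

```python
from concurrent.futures import ThreadPoolExecutor


def run_job(job):
    """Stand-in for one compute node: does every task in the job locally."""
    return sum(x * x for x in job)


def management_node(data, nodes=8, job_size=1000):
    """Split the work into coarse jobs, hand each job to a node, merge results."""
    # Coarse jobs keep communication rare: one message out, one answer back.
    jobs = [data[i:i + job_size] for i in range(0, len(data), job_size)]
    with ThreadPoolExecutor(max_workers=nodes) as pool:
        partial_results = list(pool.map(run_job, jobs))
    return sum(partial_results)


total = management_node(list(range(10_000)))
print(total)
```

The point of the design is that nodes never touch each other's memory while a job runs, which is exactly what a single OS image spanning several machines cannot guarantee.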
8

There is a commercial product from ScaleMP called vSMP. It allows you to aggregate multiple x86 systems into a single virtual instance. I've never personally tried it, but I've been through a presentation from them. If I remember correctly, there are specific requirements for this to work, and you'll need to get some additional hardware (InfiniBand for fast, low-latency interconnects). It might cost a pretty penny too!

ryanlim
  • 1
    ScaleMP doesn't emulate an x86 environment. You'll never get Windows or any other standard x86 OS to run in the virtual environment. The only flavors supported are various versions of Linux built on SMP-type architecture. And of that type of architecture... there are SEVERAL flavors. Even free ones. – TheCompWiz Sep 20 '10 at 13:02
  • OP was not specific about the other requirements. I only answered what I could gather from his/her post. – ryanlim Sep 20 '10 at 15:26
  • 1
    This looks bloody cool. I suspect that a 32 core box (possible with 2x 16 Core AMD chips) might be cheaper than a cluster with Infiniband, but there we go. This solution earns more bragging rights. – Tom O'Connor Aug 08 '12 at 21:53
-1

"TheCompWiz" answered your question usefully.

I'd still like to say that, yes, you could build a hypervisor which allowed a single VM to span several physical hosts and it could run that VM "correctly" where everything functioned.

But, even with really good, high-speed networks between the physical hosts, the performance of such a thing would be truly awful, running much more slowly than a smaller VM that fit within a single one of those hosts. You'd have to simulate the cache coherency properties of a single VM by intercepting every single memory read or write that the guest OS and application did, which would multiply the cost of memory access by thousands, if not millions.
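As a rough back-of-envelope check on that multiplier, the sketch below compares a local memory access with one that has to cross the network. All latency figures are assumed order-of-magnitude values chosen for illustration, not measurements of any real hardware, and `slowdown` is a hypothetical helper.

```python
# Assumed order-of-magnitude latencies in nanoseconds (illustrative only).
CACHE_HIT_NS = 1             # access served from an on-chip cache
LOCAL_DRAM_NS = 100          # access served from local main memory
INFINIBAND_RTT_NS = 2_000    # round trip over a low-latency interconnect
ETHERNET_RTT_NS = 100_000    # round trip over commodity Ethernet


def slowdown(remote_ns, local_ns=LOCAL_DRAM_NS, remote_fraction=1.0):
    """Average access cost relative to local access when a given
    fraction of accesses must be fetched from another host."""
    average_ns = remote_fraction * remote_ns + (1 - remote_fraction) * local_ns
    return average_ns / local_ns


for name, rtt in [("InfiniBand", INFINIBAND_RTT_NS), ("Ethernet", ETHERNET_RTT_NS)]:
    print(f"{name}: {slowdown(rtt):,.0f}x vs DRAM, "
          f"{slowdown(rtt, local_ns=CACHE_HIT_NS):,.0f}x vs cache hit")
```

Under these assumptions, even a workload where only a small share of accesses goes remote pays a large penalty; `slowdown(ETHERNET_RTT_NS, remote_fraction=0.01)` is still roughly an order of magnitude.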

So no commercial hypervisor vendor enables such a thing. It's been tried in the lab. Nobody has bothered to make a product out of it.

To underscore the point one more time, look toward clustering for a solution.

Jake Oshins
  • But what if a software product we deliver works fine for most customers but fails to work properly in the environment of a huge customer that runs our application on a 32-proc Oracle server with 64 GB of RAM? We want to reproduce this failure in our environment. – user54614 Sep 17 '10 at 20:37
  • 2
    I know nothing of your software, but what's happening at 32-processors and 64GB RAM that's not happening at 2-processors and 8 GB RAM? If there's truly something repeatably wrong at that level, then it's an Oracle/OS/driver/IO/hardware issue. – gravyface Sep 17 '10 at 20:46
  • You will never get a hypervisor to traverse physical machines. They're still confined within the physical core of the machine. That being said... I bet you could build a mainframe type architecture similar to those of archaic behemoths from long ago... but you'd never get anything x86 running on it. – TheCompWiz Sep 17 '10 at 21:00
  • 1
    Your huge customer *should* have a second QA instance of that monster database server. If they don't have that available, it truly is their problem. In 15 years of IT work, I have never seen anybody expect a software vendor to duplicate their infrastructure (unless it is part of a service contract specifying exactly that, and the customer pays for it). Especially when that infrastructure is esoteric (although a 32-core 64-GB server can be had for about $22K from Dell these days). – rmalayter Sep 17 '10 at 22:03
  • What about ScaleMP (http://www.scalemp.com/)? – user54614 Sep 18 '10 at 13:54
  • ScaleMP doesn't emulate an x86 platform. In other words... you'll never get Windows to run on it. The only OSes that work are ones designed around SMP architecture (Linux), and if you want to do that... there are several options. – TheCompWiz Sep 20 '10 at 12:57
  • Ah, it does not? Learn to read: "The innovative Versatile SMP™ (vSMP) architecture aggregates multiple x86 systems into a single virtual x86 system", from http://www.scalemp.com/ – TomTom Mar 27 '13 at 11:16
  • @TomTom It doesn't create a singular fully functional x86 system. It does create a great x86 SMP system. You cannot run Windows on it. Learn to read all the information... not just 1 line from their website. – TheCompWiz Mar 29 '13 at 15:30
-2

VMware does. It is called DRS, or Distributed Resource Scheduler. It allows you to combine the resources of 16 servers. You can then distribute that total to one or more virtual environments.

  • No, this is not at all what DRS does. DRS automatically vMotions machines around the cluster to even out the load among the host nodes. It does not in any way give a single VM access to multiple hosts. – EEAA Jul 25 '17 at 02:40
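The distinction EEAA draws can be illustrated with a toy placement routine. This is emphatically not VMware's DRS algorithm, just a hypothetical greedy least-loaded scheduler (`place_vms` is an invented name) showing that balancing whole VMs across hosts never lets one VM exceed what a single host offers. The host sizes are taken from the question (8 hosts with 4 processors each).

```python
def place_vms(vm_demands, host_capacity, n_hosts):
    """Greedy least-loaded placement: every VM lands on exactly one host."""
    load = [0] * n_hosts
    placement = {}
    # Place the largest VMs first so they get the emptiest hosts.
    for vm, cpus in sorted(vm_demands.items(), key=lambda kv: -kv[1]):
        if cpus > host_capacity:
            # A VM cannot be split across hosts, however many hosts exist.
            raise ValueError(
                f"{vm} needs {cpus} CPUs but a single host has only {host_capacity}"
            )
        host = min(range(n_hosts), key=lambda h: load[h])
        if load[host] + cpus > host_capacity:
            raise ValueError(f"no host has room left for {vm}")
        load[host] += cpus
        placement[vm] = host
    return placement


# Small VMs spread out across the cluster without trouble.
print(place_vms({"web": 2, "app": 2, "db": 4}, host_capacity=4, n_hosts=8))

# The 32-processor database VM from the question fits on no single host.
try:
    place_vms({"oracle": 32}, host_capacity=4, n_hosts=8)
except ValueError as err:
    print(err)
```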