Sure (I really should do a real write-up on this...):
So the main purposes are simple horizontal scaling and efficient use of hardware. Virtualization makes horizontal scaling simple because it's just a matter of cloning a particular machine (or machines). It makes efficient use of hardware because there's no need to have a dozen physical machines for a dozen different purposes, unless they all use a full machine's worth of resources.
Let's say I want to add a new (non-static) Web server. Well, Apache is on its own VM; I can clone it and migrate it to a new physical box (or just have two on the same hardware). I could also simply add more resources dynamically. If I need to scale the database, same deal. The biggest win here is that when I scale one of those, nothing else comes with it. There's no DNS server on the Web box. There is no NFS server on the database box.
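For concreteness, here's a rough sketch of what "add more resources dynamically" and "migrate it to a new physical box" look like, assuming a Xen host driven by the classic xm toolstack (the domain name, host name, and sizes are made up for illustration):

    import subprocess

    def xm(*args):
        """Run a Xen 'xm' toolstack command, raising if it fails."""
        subprocess.check_call(["xm"] + list(args))

    # Grow the Apache domU in place (hypothetical domain name "web01").
    xm("mem-set", "web01", "4096")   # bump it to 4 GB of RAM
    xm("vcpu-set", "web01", "4")     # give it 4 virtual CPUs

    # ...or live-migrate it to another physical box without shutting it down.
    xm("migrate", "--live", "web01", "xen-host-2.example.org")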
Before virtualization, you basically had two choices: Throw a whole bunch of packages on a single box or spread it out over different physical machines. The first completely ruins encapsulation, thus adding unnecessary complexity, while the second is really uneconomical unless you're using all those resources out of the gate.
Then, what happens when you need to scale? All that crap needs to be set up again! Hopefully we were smart about it and made it as simple as possible, but I have never made a system as scalable as my current one-click-cloning mechanism.
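To give a concrete idea of what I mean by one-click cloning -- this isn't our exact tooling, just a sketch assuming file-backed Xen domUs with made-up paths and names:

    import shutil
    import subprocess

    # Assumed layout: file-backed domU disks under /srv/xen and plain-text
    # Xen config files under /etc/xen. All names and sizes are illustrative.
    TEMPLATE_IMG = "/srv/xen/web-template.img"
    NEW_NAME = "web02"
    NEW_IMG = "/srv/xen/%s.img" % NEW_NAME

    XEN_CFG = """\
    name       = "%(name)s"
    memory     = 2048
    vcpus      = 2
    disk       = ['file:%(img)s,xvda,w']
    vif        = ['bridge=xenbr0']
    bootloader = 'pygrub'
    """

    # 1. Copy the template's disk image.
    shutil.copyfile(TEMPLATE_IMG, NEW_IMG)

    # 2. Write a Xen config for the clone.
    cfg_path = "/etc/xen/%s.cfg" % NEW_NAME
    with open(cfg_path, "w") as f:
        f.write(XEN_CFG % {"name": NEW_NAME, "img": NEW_IMG})

    # 3. Boot the new domU (in practice you'd also fix up its hostname/IP first).
    subprocess.check_call(["xm", "create", cfg_path])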
Right now we only have a single physical machine (16 cores, 32 GB RAM, iSCSI); using your recommendation of "whatever packages on whatever servers", I would end up with a clusterfuck of a server that does a dozen different things at once. What I have now is exactly that, except encapsulated into VMs, each with its own resources and its own purpose.
Yes, virtualization has a slight performance cost (though bare-metal hypervisors like Xen have a pretty marginal one), but I'll gladly accept it for the massively easier scaling and more efficient use of hardware. And, yes, if one VM happens to go insane for some reason, it doesn't affect anything else. For instance, on one of our older servers, the MySQL VM's "drive" had a tendency to become corrupt at random. I never did figure out why (I imagine it's because I make fun of MySQL all the time), but I digress -- the point is, it never affected the rest of the machine, and since MySQL was only used for things of little importance (WordPress), it didn't even take down the website when it happened.
That was a pretty rambling explanation, but hopefully it covers your questions. If not, let me know.
Thanks! That's a pretty thorough reply, and I have to agree that it's hard to imagine an easier way to deploy new server instances than copying a VM. I think you can get pretty darn close with a package system like DEB or RPM, but in terms of being able to bring up and test an environment on a development machine and then be confident the same environment will behave exactly the same way in production, this is a reassuring approach. I think it makes a lot of sense if you expect to need rapid scaling at some point in the future.
I'm a little unsure why you wouldn't just use EC2 in that case, although it might make sense to stay on a dedicated box for as long as you can and supplement it with EC2 instances when appropriate. Obviously you can get a lot of EC2-instance-equivalents out of a 16-core, 32 GB box, and your approach would likely be easy to integrate with EC2 when the time comes.
I'm not sure what your load trends look like, but if you're not likely to surpass what your dedicated server can handle within the next few months, this seems like overengineering. I know you said it "ruins encapsulation" and makes a "clusterfuck of a server", but I don't quite see it. If this is a web app running on a modern GNU/Linux distro, with maybe some DNS and cron jobs and email and blogs, we're talking about a very common setup that's running successfully on zillions of boxes. On the other hand, two or three times in five years I have had buggy Apache modules bring down a machine by leaking processes, and it would have been nice if that hadn't taken, e.g., email down with it.
I think my biggest reservation about your approach is that it's weird. It's basically hand-made custom EC2, right? Is there an open source project somewhere to package up the tools to do this on one's own servers? (If not, maybe you should start one... :)
It's basically hand-made custom EC2, right? Is there an open source project somewhere to package up the tools to do this on one's own servers?
There are two that I know of that support the EC2 wire protocol: Nimbus and Eucalyptus.
I am the primary developer of Nimbus. We had an EC2-like system released before EC2 existed, and only later adapted it to their protocol because our users wanted to use the EC2 client tools.
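In practical terms, "supports the EC2 wire protocol" means you can point the stock EC2 client tools and libraries at your own cloud instead of Amazon's. A rough sketch with boto -- the endpoint, port, path, and credentials are placeholders, and the real values depend on the particular Nimbus or Eucalyptus install:

    import boto
    from boto.ec2.regioninfo import RegionInfo

    # Placeholder endpoint and credentials -- substitute whatever your
    # private cloud's EC2 query interface actually listens on.
    region = RegionInfo(name="private", endpoint="cloud.example.org")
    conn = boto.connect_ec2(
        aws_access_key_id="YOUR_ACCESS_KEY",
        aws_secret_access_key="YOUR_SECRET_KEY",
        is_secure=False,
        port=8773,
        path="/services/Eucalyptus",   # port and path vary by installation
        region=region,
    )

    # From here on it's the ordinary EC2 API: list images, launch instances, etc.
    for image in conn.get_all_images():
        print(image.id)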
Yeah, we do expect to surpass it. As you said, though, it's also very much about simplification (once you're past the initial learning curve, anyway). EC2 might have simplified things a little more eventually, but it was also more expensive and harder for me to learn up front.
As for a custom EC2, it's really the other way around: EC2 is a custom Xen setup. I learned about virtualization before AWS and similar cloud services existed, so EC2 is the strange one to me. I'd only ever known "DIY EC2", so to speak.
"EC2 is the strange one to me. I'd only ever known "DYI EC2", so to speak."
Those EC2-like systems do have their place. There is a lot more to do and think about when you're letting others run VMs on your infrastructure. That often isn't the case, though, and it sounds like it isn't your situation either.