From buzz to reality
In 2003, Intel announced that it was working on a technology called "Vanderpool," aimed at providing hardware-level support for something called "virtualization." With that announcement, the decades-old concept of virtualization officially arrived on the technology press radar. In spite of its long history in computing, however, "virtualization" as a new buzzword at first smelled ominously similar to terms like "trusted computing" and "convergence." In other words, many folks had a vague notion of what virtualization was, and from what they could tell it sounded like a decent enough idea, but you got the impression that nobody outside of a few vendors and CIO types was really too excited.
Fast-forward to 2008, and virtualization has gone from a solution in search of a problem, to an explosive market with an array of real implementations on offer, to a word that's often mentioned in the same sentence with terms like "shakeout" and "consolidation." But whatever the state of "virtualization" as a buzzword, virtualization as a technology is definitely here to stay.
Virtualization implementations are so widespread that some are even popular in the consumer market, and some (the really popular ones) even involve gaming. Anyone who uses an emulator like MAME uses virtualization, as does anyone who uses either the Xbox 360 or the PlayStation 3. From the server closet to the living room, virtualization is subtly but radically changing the relationship between software applications and hardware.
In this article, I'll take a close look at virtualization: what it is, what it does, and how it does what it does.
Abstraction, and the big shifts in computing
Most of the biggest tectonic shifts in computing have been fundamentally about remixing the relationship between hardware and software by inserting a new abstraction layer in between programmers and the processor. The first of these shifts was the instruction set architecture (ISA) revolution, which was kicked off by IBM's invention of the microcode engine. By putting a stable interface—the programming model and the instruction set—in between the programmer and the hardware, IBM and its imitators were able to cut down on software development costs by letting programmers reuse binary code from previous generations of a product, an idea that was novel at the time.
Another major shift in computing came with the introduction of the reduced instruction set computing (RISC) concept, a concept that put compilers and high-level languages in between programmers and the ISA, leading to better performance.
Virtualization is the latest in this progression of moving software further away from hardware, and this time, the benefits have less to do with reducing development costs and increasing raw performance than they do with reducing infrastructure costs by allowing software to take better advantage of existing hardware.
Right now, there are two different technologies being pushed by vendors under the name of "virtualization": OS virtualization, and application virtualization. This article will cover only OS virtualization, but application virtualization is definitely important and deserves its own article.
The hardware/software stack
Figure 1 below shows a typical hardware/software stack. In a typical stack, the operating system runs directly on top of the hardware, while application software runs on top of the operating system. The operating system, then, is accustomed to having exclusive, privileged control of the underlying hardware, hardware that it exposes selectively to applications. To use client/server terminology, the operating system is a server that provides its client applications with access to a multitude of hardware and software services, while hiding from those clients the complexity of the underlying hardware/software stack.
Figure 1: Hardware/OS stack
Because of its special, intermediary position in the hardware/software stack, two of the operating system's most important jobs are isolating the various running applications from one another so that they don't overwrite each other's data, and arbitrating among the applications for the use of shared resources (memory, storage, networking, etc.). In order to carry out these isolation and arbitration duties, the OS must have free and uninterrupted rein to manage every corner of the machine as it sees fit... or, rather, it must think that it has such exclusive latitude. There are a number of situations (described below) where it's helpful to limit the OS's access to the underlying hardware, and that's where virtualization comes in.
Virtualization basics
The basic idea behind virtualization is to slip a relatively thin layer of software, called a virtual machine monitor (VMM), directly underneath the OS, and then to let this new software layer run multiple copies of the OS, or multiple different OSes, or both. There are two main ways to accomplish this: 1) run a VMM on top of a host OS and let it host multiple virtual machines, or 2) wedge the VMM directly between the hardware and the guest OSes, in which case the VMM is called a hypervisor. Let's look at the second, hypervisor-based method first.
The hypervisor
In a virtualized system like the one shown in Figure 2, each operating system that runs on top of the hypervisor is typically called a guest operating system. These guest operating systems don't "know" that they're running on top of another software layer. Each one believes that it has the kind of exclusive and privileged access to the hardware that it needs in order to carry out its isolation and arbitration duties. Much of the challenge of virtualization on an x86 platform lies in maintaining this illusion of supreme privilege for each guest OS. The x86 ISA is particularly uncooperative in this regard, which is why Intel's virtualization technology (VT-x, formerly known as Vanderpool) is so important. But more on VT-x later.
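One classic way a VMM maintains this illusion is "trap-and-emulate": the guest runs deprivileged, and any privileged instruction it attempts traps into the hypervisor, which emulates the instruction against the guest's *virtual* hardware state rather than the real machine's. The following toy sketch illustrates the idea; the instruction names, the `Guest` class, and its fields are all invented for illustration, not real x86 mechanics:

```python
# Toy trap-and-emulate sketch: the guest runs deprivileged, and any
# "privileged" instruction traps to the hypervisor, which emulates it
# against the guest's virtual hardware state instead of the real CPU.

PRIVILEGED = {"cli", "sti", "mov_cr3"}  # invented instruction names

class Guest:
    def __init__(self, name):
        self.name = name
        # Virtual CPU state the guest believes is the real hardware's state
        self.vstate = {"interrupts_enabled": True, "cr3": 0}

def emulate(guest, insn, operand=None):
    """Hypervisor-side handler: update the guest's virtual state only."""
    if insn == "cli":
        guest.vstate["interrupts_enabled"] = False
    elif insn == "sti":
        guest.vstate["interrupts_enabled"] = True
    elif insn == "mov_cr3":
        guest.vstate["cr3"] = operand   # guest page-table base, virtualized

def run_guest(guest, instruction_stream):
    for insn, operand in instruction_stream:
        if insn in PRIVILEGED:
            emulate(guest, insn, operand)  # trap into the hypervisor
        else:
            pass  # unprivileged work runs directly on the real CPU

guest = Guest("guest-os")
run_guest(guest, [("add", None), ("cli", None), ("mov_cr3", 0x1000)])
# The guest now "thinks" it disabled interrupts and switched page tables,
# but only its private virtual state changed.
```

Part of what makes the stock x86 ISA so uncooperative is that some of its sensitive instructions don't trap cleanly when executed without privilege, which is exactly the gap that hardware support like VT-x was designed to close.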
Figure 2: Hardware/software stack with virtualization
In order to create the illusion that each OS has exclusive access to the hardware, the hypervisor (also called the virtual machine monitor, or VMM) presents to each guest OS a software-created image or simulation of an idealized computer—processor, peripherals, the works. These software-created images are called virtual machines (VMs), and the VM is what the OS runs on top of and interacts with.
In the end, the virtualized software stack is arranged as follows: at the lowest level, the hypervisor runs multiple VMs; each VM hosts an OS; and each OS runs multiple applications. The hypervisor thus swaps virtual machines on and off the actual system hardware, in a very coarse-grained form of time-sharing.
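That coarse-grained time-sharing can be pictured as a simple round-robin loop over per-VM snapshots of processor state. Everything in this sketch (the `VM` class, its fields, the tick counts) is invented for illustration:

```python
from collections import deque

# Toy round-robin scheduler: the hypervisor swaps whole virtual machines
# on and off the real CPU, saving and restoring each VM's processor state.

class VM:
    def __init__(self, name):
        self.name = name
        self.saved_state = {"pc": 0}   # program-counter snapshot
        self.ticks_run = 0

def run_on_cpu(vm, slice_ticks):
    """Restore the VM's state, run it for a time slice, save state again."""
    pc = vm.saved_state["pc"]          # "restore" virtual CPU state
    pc += slice_ticks                  # stand-in for actually executing
    vm.ticks_run += slice_ticks
    vm.saved_state["pc"] = pc          # "save" state at the next swap

def hypervisor_loop(vms, total_ticks, slice_ticks=10):
    ready = deque(vms)
    elapsed = 0
    while elapsed < total_ticks:
        vm = ready.popleft()
        run_on_cpu(vm, slice_ticks)
        ready.append(vm)               # back of the queue: round robin
        elapsed += slice_ticks

vms = [VM("guest-a"), VM("guest-b")]
hypervisor_loop(vms, total_ticks=100)
# Over 100 ticks in 10-tick slices, each of the two VMs runs for 50 ticks.
```

Real hypervisors schedule far more cleverly than this, of course, but the shape is the same: entire OS/application stacks, not individual processes, are the unit being multiplexed onto the hardware.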
I'll go into much more technical detail on exactly how the hypervisor does its thing in a bit, but now that we've got the basics out of the way let's move the discussion back out to the practical level for a moment.
The host/guest model
Another, very popular method for implementing virtualization is to run virtual machines as part of a user-level process on a regular OS. This model is depicted in Figure 3, where an application like VMware runs on top of a host OS, just like any other user-level app, but it contains a VMM that hosts one or more virtual machines. Each of these VMs, in turn, hosts a guest operating system.
Figure 3: Virtualization using a host and guest OS.
As you might imagine, this virtualization method is typically slower than the hypervisor-based approach, since there's much more software sitting between the guest OS and the actual hardware. But virtualization packages that are based on this approach are relatively painless to deploy, since you can install them and run them like any other application, without requiring a reboot.
Why virtualization?
Virtualization is finding a growing number of uses, in both the enterprise and the home. Here are a few places where you'll see virtualization at work.
Server consolidation
A common enterprise use of virtualization is server consolidation: replacing multiple real but underutilized machines with virtual machines running on a single system. Taking those underutilized servers offline and consolidating them onto one machine saves on space, power, cooling, and maintenance costs.
Live migration for load balancing and fault tolerance
Load balancing and fault tolerance are closely related enterprise uses of virtualization. Both of these uses involve a technique called live migration, in which an entire virtual machine that's running an OS and application stack is seamlessly moved from one physical server to another, all without any apparent interruption in the OS/application stack's execution. So a server farm can load-balance by moving a VM from an over-utilized system to an under-utilized system; and if the hardware in a particular server starts to fail, then that server's VMs can be live migrated to other servers on the network and the original server shut down for maintenance, all without a service interruption.
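A common way to implement live migration is iterative "pre-copy": the source host copies the VM's memory pages to the destination while the VM keeps running, then re-sends any pages the guest dirtied in the meantime, repeating until the remaining set is small enough to move during one brief final pause. The sketch below is a toy model of that convergence loop; the page counts, threshold, and dirty rate are all invented for illustration:

```python
import random

# Toy pre-copy live migration: copy all pages, then keep re-copying the
# pages the running guest dirtied, until few enough remain to pause the
# VM, send the last pages plus CPU state, and resume on the destination.

random.seed(1)
TOTAL_PAGES = 1000
STOP_THRESHOLD = 20          # pause the VM when this few pages remain

def dirty_pages_while_copying(n_copied):
    """Pretend the running guest dirties a fraction of pages per round."""
    return {random.randrange(TOTAL_PAGES) for _ in range(n_copied // 10)}

to_send = set(range(TOTAL_PAGES))    # round 1: send everything
rounds = 0
while len(to_send) > STOP_THRESHOLD:
    sent = len(to_send)
    to_send = dirty_pages_while_copying(sent)  # dirtied during this round
    rounds += 1

# Final "stop-and-copy" round: the VM pauses briefly while the last few
# pages and the CPU state move over, then resumes on the new host.
print(f"converged after {rounds} rounds; {len(to_send)} pages in final pause")
```

The reason the pause is nearly invisible to users is that each round shrinks the set of pages left to send, so by the time the VM actually stops, only a tiny remainder has to cross the wire.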
Performance isolation and security
Sometimes, multi-user OSes don't do a good enough job of isolating users from one another; this is especially true when a user or program is a resource hog or is actively hostile, as is the case with an intruder or a virus. By implementing a more robust and coarse-grained form of hardware sharing that swaps entire OS/application stacks on and off the hardware, a VMM can more effectively isolate users and applications from one another for both performance and security reasons.
Note that security is more than an enterprise use of virtualization. Both the Xbox 360 and the PlayStation 3 use virtual machines to limit the kinds of software that can run on the console hardware and to control users' access to protected content.
Software development and legacy system support
For individual users, virtualization provides a number of work- and entertainment-related benefits. On the work side, software developers make extensive use of virtualization to write and debug programs. A program with a bug that crashes an entire OS can be a huge pain to debug if you have to reboot every time you run it; with virtualization, you can do your test runs in a virtual machine and just reboot the VM whenever it goes down.
Developers also use virtualization to write programs for one OS or ISA on another. So a Windows user who wants to write software for Linux using Windows-based development tools can easily do test runs by running Linux in a VM on the Windows machine.
A popular entertainment use for virtualization is the emulation of obsolete hardware, especially older game consoles. Users of popular game system emulators like MAME can enjoy games written for hardware that's no longer in production.