VM Escape & VMware Critical vmkernel Updates

The 9/21/2007 SANS NewsBites newsletter has some good commentary on the VMware updates that have shipped in the last two months. In short, if you are running any VMware product you need to be at the latest version in order to be secure against potential VM escapes.

Normally, virtual machines are encapsulated, isolated environments. The operating systems running inside the virtual machine shouldn’t know that they are virtualized, and there should be no way to break out of the virtual machine and alter the parent hypervisor. The process of breaking out and interacting with the hypervisor is called a “VM escape,” and it is bad news. If an attacker can gain access to the hypervisor, they effectively have unlimited control over every other virtual machine running on the host. Not good.

The SANS editors make the point that VMware needs to be more forthcoming about problems with their hypervisors. I agree. VMware has published a number of “critical” patches for VMkernel, but the patch descriptions never mention any security issues. Compare the disclosure with the description of problems fixed in ESX-8258730. Do you see any mention of a security problem in 3.0.1? No. Furthermore, there is nothing indicating the urgency of upgrading to 3.0.2. There was nearly a month between the release of 3.0.2 and the disclosure of the problem, meaning anyone who did not upgrade right away was exposed for nearly a month without knowing it, and attackers had that long to exploit this in the wild. Despite firewalls and other security precautions, my environments are a subset of “the wild.”

Certainly all software has bugs, and some of those are bound to be security problems. Fixing those bugs takes time, and with multiple shipping products it will take some time to fix everything. However, downplaying the issue by not mentioning it at all is a credibility problem for VMware. Somebody knows about the security holes, right? And if someone knows about them, someone is exploiting them. Virtualization security is a hot topic right now, and appearing to hide security problems doesn’t seem like the best course of action to me.

What has this all taught me? Mainly that I cannot trust VMware to tell me when I’m vulnerable. This means that any time I see a new version of ESX I have to assume it fixes security vulnerabilities, and get it deployed as soon as I can. It also means that whenever I see an update to the vmkernel marked “critical” I have to apply that, too, despite what it might list as the resolved problems.
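
As a rough sketch, that rule could be written down as a small triage function. Everything here is hypothetical and for illustration only: the `Update` record, its field names, and the sample entries are my own invention, not anything VMware publishes in this form.

```python
from dataclasses import dataclass


@dataclass
class Update:
    """Hypothetical record describing one vendor update."""
    name: str                # label for the update (made up for illustration)
    kind: str                # "release" for a new ESX version, "patch" otherwise
    component: str           # component the update touches, e.g. "vmkernel"
    severity: str            # vendor's label, e.g. "critical" or "general"
    mentions_security: bool  # whether the release notes mention security at all


def deploy_promptly(update: Update) -> bool:
    """Return True if the update should be deployed as soon as possible."""
    # An explicit security note is the easy case.
    if update.mentions_security:
        return True
    # Assume every new ESX version fixes security problems, stated or not.
    if update.kind == "release":
        return True
    # A "critical" vmkernel patch gets applied regardless of what it lists as fixed.
    return (update.component.lower() == "vmkernel"
            and update.severity.lower() == "critical")


if __name__ == "__main__":
    samples = [
        Update("new ESX point release", "release", "esx", "general", False),
        Update("vmkernel SCSI driver fix", "patch", "vmkernel", "critical", False),
        Update("management UI fix", "patch", "ui", "general", False),
    ]
    for u in samples:
        print(u.name, "->", "deploy now" if deploy_promptly(u) else "normal schedule")
```

The point of spelling it out is that the stated list of resolved problems never enters the decision; only the update type and the “critical” label do.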

Patching more frequently will add to the labor needed to maintain my virtual environments. It also means I will have to work harder to justify the extra labor to my customers and employers. Unplanned labor to remediate a critical security problem is easy to justify. Unplanned labor to remediate what looks like a SCSI driver problem seems like a waste. But now it has to be done, every time a point release comes out, and every time a vmkernel patch is issued. What they aren’t saying is what I’m worried about: that they’ve fixed a big security problem, and mum’s the word.