There’s a lot of discussion going on lately about memory overcommitment in virtual environments. All I have to say is that memory overcommitment is great when you have customers who insist on treating a VM like a physical server they’re buying.
When I buy a physical server (or twenty) I often look for the “sweet spots” in pricing. I might know that the app that will run on a server will need 4 GB of RAM, but for $100 more I can get 8 GB and not have to worry about being short on RAM when something changes two years later. Worth it? Yes, and my customers think so, too, because the cost of me adding RAM later is much more than $100.
When I provision a VM, though, there isn’t a sweet spot of price vs. capacity. The customer is just consuming my capacity. So when someone tells me they want 8 GB of RAM, but historical data shows they’ll need less than 2 GB, I have two options.
I could give them the RAM they ask for and rely on overcommitment. I don’t like this for a few reasons:
- I like it when people think about resource consumption and why they’re asking for 8 GB of RAM. Often it’s just because their last server had that much and it worked fine, or because a vendor’s sizing guide says so. Vendors usually err on the side of “huge” when it comes to recommendations, but if we can think critically about it for two minutes maybe 8 GB isn’t the best estimate.
- Each byte of RAM allocated wastes my disk in the form of swap space. Get 250 VMs that are each overallocated by 4 GB and you’ve wasted nearly a terabyte of disk.
- VMotion operations take longer and temporarily consume more resources (namely that whole 8 GB of RAM). If you’re tight on resources you get even tighter, and you run a higher risk of swapping.
- It’s much harder to do capacity planning, because you have to dig out the memory graphs for a host and read each of them to find out how much RAM a VM is actually using. If you know the amount allocated is close to the right amount then you can just use those figures, saving you hours of data exports, spreadsheets, and misery.
- What happens when your VMs start using that extra RAM, like for their filesystem cache? Remember the guest OS doesn’t really know what’s happening behind the scenes, and most OSes use extra RAM to help buffer disk operations. If you’re overcommitted and your VMs start using RAM you don’t have you’ll have a performance problem when your virtualization environment starts swapping.
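The swap-space point above is simple arithmetic, but it's worth seeing written out. This is a back-of-the-envelope sketch using the numbers from the bullet (250 VMs, 4 GB of overallocation each), not measurements from any real environment; it assumes, as the bullet does, that each VM's overallocated memory gets backed by a swap file on disk.

```python
# Back-of-the-envelope: disk consumed by swap files when each VM's
# allocation exceeds what it actually needs. Numbers are the ones from
# the bullet above, not measurements.
vms = 250
over_gb = 4  # each VM allocated 4 GB more RAM than it really uses

wasted_gb = vms * over_gb
print(f"{wasted_gb} GB of disk tied up backing RAM nobody uses")
```

Two hundred fifty VMs at 4 GB apiece is 1000 GB, so "nearly a terabyte" of disk spent backing memory nobody is using.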
The other choice is to size their VM appropriately. This requires a little more work up-front for me because my customers often don’t really know how much RAM they actually need. Since it takes so little time to increase the amount of memory allocated to a VM (at least in VMware) I can often negotiate a deal to give them a lot less, and we monitor the usage through their project’s test phases. If they run short we fix the problem and try it again. I can usually instantly talk them down 50% by showing them their historical memory usage graphs, especially the graphs from their physical servers. Often my customers have already looked at those graphs and have confused filesystem cache with their app’s memory utilization. “OMFG we’re at 100% of our RAM!” gets replaced by “Oh, that line on the bottom is my app?”
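That "which line is my app?" conversation comes down to subtracting filesystem cache and buffers from "used" memory. Here's a quick sketch of that math for a Linux guest, using the field names and kB units from Linux's `/proc/meminfo`; the sample values below are hypothetical, chosen to mirror the 8-GB-asked, under-2-GB-needed situation above.

```python
# Sketch: separate the RAM the kernel uses for filesystem cache from what
# applications actually hold, /proc/meminfo style. SAMPLE is a made-up
# guest; on a real Linux box you'd read /proc/meminfo instead.

SAMPLE = """\
MemTotal:        8388608 kB
MemFree:         2097152 kB
Buffers:          524288 kB
Cached:          4194304 kB
"""

def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of kB values."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if rest.strip():
            info[key.strip()] = int(rest.split()[0])
    return info

def app_usage_kb(info):
    """RAM actually held by applications: total minus free, buffers, cache."""
    return info["MemTotal"] - info["MemFree"] - info["Buffers"] - info["Cached"]

info = parse_meminfo(SAMPLE)
print(f"apps hold {app_usage_kb(info) // 1024} MB "
      f"of {info['MemTotal'] // 1024} MB total")
```

With these sample numbers the guest looks "almost out of RAM" at first glance, but the apps themselves hold only about 1.5 GB of the 8 GB; the rest is free or cache the kernel will give back under pressure.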
In short, my advice is: if you have to overcommit, you have the technology to do so, but be careful that you aren’t laying a minefield for yourself later. If you don’t absolutely have to overcommit, try not to.
Update: A caveat I thought of after posting: I should mention here that I’m talking strictly server-type VMs here. If you’re doing VDI, like the examples in Mike DiPetrillo’s posts, overcommitment seems like it might be worth considering, given the way most people use their desktops: not heavily. 🙂
Related to this is seeing how _low_ memory allocation can actually be set. I’ve got bind running in a virtual machine with 32 MB of memory 🙂 Of course, stuff like Yum uses a lot of memory, so for the purpose of updating software I had to increase the machine to 64 MB.