I’ve been thinking a lot lately about upgrading to vSphere 5, mainly the questions of when and how I’d like to get it done.
During the launch on July 12th there was a lot of talk about how many QA hours went into vSphere 5 (2 million+). That’s good news. We had some serious problems with vSphere 4 when we deployed it, bugs all over the place, vCenter crashing every couple of days, etc. VMware support wasn’t super helpful in fixing the problems because they didn’t have much experience, and they were unwilling or unable to get Engineering involved. As a result I took a lot of crap from my coworkers about my decision to upgrade things so quickly. In my defense some of those guys are the types that won’t upgrade unless they’re absolutely forced to (and then sometimes not, I mean, we did finally get rid of Token Ring from the data center last year…), but regardless I really hope that vSphere 5 has truly had much better testing. We have 8 times more VMs than the last time, and the stakes are higher. Heck, my C-level executives know the name “VMware” now; last time they didn’t.
Thinking back on my vSphere 4 experience, it went something like this:
- Install vSphere 4 from scratch in my test environment. I spent about a month messing with it there. Looked great, and I was excited about the new features. I did two rounds of testing, one where I installed from scratch and one where I did a VI 3.5 to vSphere 4 upgrade. Looking back on it now, my testing strategy’s main fault was that it was mostly based on clean copies of everything, from ESX to the VMs themselves, which made for an idealized test situation. Nowadays I have a pretty decent & formal list of things to test, and I use cloned & fenced copies of real VMs to do some of it.
- Build a new server with Windows 2008 Standard, 64-bit, and SQL Server 2008 Standard, 64-bit. Since I needed a new server anyhow I decided we’d do the latest stuff, and vCenter 4 supported 64-bit OSes. Turned out to be a mixed blessing, as it made 4.1 an easy upgrade later. There were also some terrible bugs with 64-bit environments, and some serious kludges to make 32-bit stuff work. If I did it again I’d probably do it the same way, though, mainly because my VC 3.5 server was so decrepit and old.
- Install vCenter 4.0, detach the ESX 3.5 hosts from the old VirtualCenter and import them into 4.0. I was aiming for a clean start here, with a new, fresh database. This actually worked okay, at least until vCenter started locking up every couple of days. We fixed that with a scheduled job to restart it every night. Knowing what I know now I probably should have stopped here for a while, but then again each update to ESX introduced new problems, too, despite fixing a bunch. If I wanted the vSphere 4 features, and I did, I had to put up with it.
- Start a rolling upgrade to ESX 4.0. This part went very smoothly, and I did it over the course of two days, hammering it out on my 10 hosts or so. I didn’t upgrade, but instead I rebuilt from scratch so things were clean. At this point I was probably two months past the release.
- Start upgrading VMs to hardware version 7. This was a large effort that was basically about standardizing the configurations of virtual machines to specific VMXNet & SCSI drivers, removing unnecessary virtual hardware (from P2Vs and mistakes), and getting VMware Tools updated. I’m glad we did this. The only thing I’d do differently is test VMware Tools more thoroughly, because we ended up having some big problems with them, especially on Windows guests and especially with the autoupgrade functions enabled. This process was long, though, and took roughly six months to get everything done. I tried to piggyback on normal patching processes, and wrote documentation that every sysadmin followed to do the upgrades.
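If I were doing that hardware version 7 standardization pass again, I’d want a worklist generated from inventory rather than kept by hand. The sketch below is not what we ran at the time. The `VM` record, its field names, and the values in `STANDARD` are all hypothetical, and in practice the inventory would come from an export out of vCenter rather than being typed in.

```python
from dataclasses import dataclass

# Hypothetical site standard, for illustration only.
STANDARD = {"hw_version": 7, "nic_type": "vmxnet3", "scsi_type": "lsilogic"}


@dataclass
class VM:
    name: str
    hw_version: int
    nic_type: str
    scsi_type: str
    extra_devices: tuple = ()   # leftovers such as floppies or serial ports from P2Vs
    tools_current: bool = True


def audit(vm):
    """Return a list of deviations from STANDARD for a single VM."""
    problems = []
    if vm.hw_version < STANDARD["hw_version"]:
        problems.append(f"hardware version {vm.hw_version}, want {STANDARD['hw_version']}")
    if vm.nic_type != STANDARD["nic_type"]:
        problems.append(f"NIC is {vm.nic_type}, want {STANDARD['nic_type']}")
    if vm.scsi_type != STANDARD["scsi_type"]:
        problems.append(f"SCSI controller is {vm.scsi_type}, want {STANDARD['scsi_type']}")
    for dev in vm.extra_devices:
        problems.append(f"remove unneeded device: {dev}")
    if not vm.tools_current:
        problems.append("VMware Tools out of date")
    return problems


def worklist(vms):
    """Map VM name to its list of problems, skipping VMs that already meet the standard."""
    result = {}
    for vm in vms:
        problems = audit(vm)
        if problems:
            result[vm.name] = problems
    return result


# Made-up inventory: one compliant VM and one neglected P2V.
inventory = [
    VM("web01", 7, "vmxnet3", "lsilogic"),
    VM("old-p2v", 4, "e1000", "buslogic", ("floppy",), False),
]
for name, problems in worklist(inventory).items():
    print(name, problems)
```

Something like this could be handed to whoever is patching a given VM that month, which fits the piggyback-on-patching approach.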
So how does this inform my vSphere 5 upgrade thoughts?
- Test like a maniac, using what I have learned from this journey. A former boss of mine, an ex-naval aviator, used to say that in the Navy lessons are written in blood. Yeah, pretty much the same. Minus the blood, of course (he also used to tell us, when we were all stressed out, that at least our bad days didn’t involve people shooting at us).
- Chill out. I will probably upgrade vCenter to 5.0 sometime a couple months after the GA release, if it tests out okay. I have a couple of new cluster builds coming up and I’d like to run ESXi 5 on them, thus necessitating vCenter 5. But I may not actually update my main clusters for a while, at least until I know 5.0 is solid (or, rather, predictable).
- When I do upgrade to ESXi 5.0, I might see if I can just upgrade one host in a cluster and leave it as the only 5.0 host for a couple of weeks. Note that this idea might be the dumbest thing ever, and might not be supported or a good idea. But all my vSphere 5 experience is based on clean installs, in test environments, and we all know production is different. So if the documentation, when it’s released, doesn’t completely kill this idea it might be a good way to dip a toe in the production ESXi 5 water without converting completely and irreversibly.
- I won’t push for upgrades to hardware version 8. We don’t need the features in version 8 as much as we did in 7. I’m willing to have two standards for virtual hardware, though, and I’ll upgrade the template VMs to version 8, with all the requisite driver updates and such. I’ll also write documentation for sysadmins to do the upgrade if they want to, and arbitrarily tie inclusion in things like upcoming SRM deployments to the upgrade. Over the next year I expect half of our VMs will get upgraded, quietly and without me having to be the bad guy prodding people to get it done.
- Stick with vCenter 5 installed on my Windows 2008 physical host for now, and once my server is end-of-life I’ll move to a Windows 2008 R2 VM. vCenter 5 as a Linux-based appliance is really a 1.0 product. It has some decent-sized stated limitations, and probably a few complications that aren’t stated. I also suspect that VMware Support won’t know a darn thing about it for a while, so if I do run into a problem I’m going to be on my own. That said, I’ll probably run it in my lab.
- Take it easy on the new features, like Auto Deploy. I will likely install ESXi 5 from scratch, but still to the local disks. I think Auto Deploy is going to be great, and I do plan on moving to it, but as we don’t use DHCP or PXE in our data center I’ll need to have some changes made. I’ll also need to consider what happens to our DR & COOP plans with Auto Deploy, because there will be new dependencies involved. Likewise with Storage DRS, and policy-driven storage. Great ideas to start using in 2012, and it’ll give me time to get the storage and provisioning people warmed up to the ideas. Realistically, my initial goal will be solely to replicate vSphere 4 functionality with vSphere 5 and be stable. Perhaps I’ll use some of the time to convert datastores to VMFS 5, via migrating & reformatting.
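To make the “one host first, then wait” idea from the list above a bit more concrete, here is a minimal scheduling sketch. It only produces dates; it does not talk to vCenter or say anything about whether a mixed cluster is supported. The function name `plan_rollout`, the host names, the start date, the two-week soak, and the two-hosts-per-day pace are all assumptions of mine for illustration.

```python
from datetime import date, timedelta


def plan_rollout(hosts, start, soak_days=14, per_day=2):
    """Return (date, [hosts]) steps: one canary host, a soak period, then daily batches."""
    if not hosts:
        return []
    canary, rest = hosts[0], hosts[1:]
    steps = [(start, [canary])]
    # Nothing else gets touched until the canary has soaked in production.
    day = start + timedelta(days=soak_days)
    for i in range(0, len(rest), per_day):
        steps.append((day, rest[i:i + per_day]))
        day += timedelta(days=1)
    return steps


# Hypothetical cluster and start date.
for when, batch in plan_rollout(["esx01", "esx02", "esx03", "esx04"], date(2011, 10, 3)):
    print(when.isoformat(), ", ".join(batch))
```

The point of writing it down this way is that the soak period becomes an explicit number I have to defend, instead of something that quietly shrinks when I get impatient.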
These are just my thoughts. I encourage you to have your own, and share them with me in the comments! Doubly so if I’ve said something really dumb or factually incorrect. :)