When Should I Upgrade to VMware vSphere 6?

Jenga champion by Jessica Gardner, on Flickr.

I’ve been asked a few times about when I’m planning to upgrade to VMware vSphere 6.

Truth is, I don’t know. A Magic 8 Ball would say “reply hazy, try again.”

Some people say that you should wait until the first major update, like the first update pack or first service pack. I’ve always thought that approach is crap. Software is a rolling collection of bugs. Some are old, some are new, and while vendors try to make the number of bugs go down, that doesn’t always happen, especially with large releases like service packs. The real bug-fixing gains are, to borrow a baseball term, in the “small ball” between the big plays. The way I see it, the most stable product is the version right before the big service pack.

Some people say that because 6.0 ends in .0 they’ll never run that code. “Dot-oh code is always horrible,” they volunteer. My best theory is that these people have some sort of PTSD from a .0, coupled with some form of cult-like shared delusion. A delusion like “nobody gets fired for buying IBM” or “we’ll be whisked away on the approaching comet.” Personally, I should get twitchy when I think about versions ending in .1. The upgrade to vSphere 5.1 was one of the most horrific I had. Actually, speaking of IBM, it seems to me that I filed a fair number of bugs against AIX 5.1, too, back in the day. Somehow I still can sleep at night.

Thing is, a version number is just a name, often chosen more for its marketing value than its basis in software development reality. It could have been vSphere 5.8, but some products were already 5.8. It could have been vSphere 5.9, but that’s real close to 6.0. Six is a nice round number, easy to rebase a ton of products to and call them all Six Dot Oh. Hell, AIX never had a real 5.0, either, except internally in IBM as an Itanium prototype. To the masses they went from 4.3.3 to 5.1. Oh, and IBM’s version number was 5.1.0. OH MY GOD A DOT-ONE AND A DOT-OH. Microsoft is skipping Windows 9, reportedly not because Windows 10 is so epically awesome, but because string comparisons on “Windows 9*” will match “Windows 95,” too. And a lot of version numbers get picked just because they’re bigger than the competitors’ versions.
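The Windows 9 story is easy to picture in code. Here is a minimal Python sketch, assuming a hypothetical legacy check that was meant to catch Windows 95 and 98 by prefix. The function name `looks_like_windows_9x` is made up for illustration and is not taken from any real product.

```python
def looks_like_windows_9x(os_name: str) -> bool:
    """Hypothetical legacy check: treat any name starting with 'Windows 9'
    as Windows 95 or 98. This is an assumption for illustration only."""
    return os_name.startswith("Windows 9")


if __name__ == "__main__":
    # A release literally named "Windows 9" would be caught by the same
    # prefix test as 95 and 98, while "Windows 10" would not.
    for name in ["Windows 95", "Windows 98", "Windows 9", "Windows 10"]:
        print(name, looks_like_windows_9x(name))
```

The name says nothing about what is inside the release. It only has to avoid colliding with code that took older names too literally.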

Given all this it seems pretty stupid to put much stock in a version number. To me, it’s there to tell us where this release fits in the sequence of time. 6.0 was before 7.0 and after 5.5.

Oh, but what about build numbers? I’ve had people suggest that. Sounds good, until you realize that the build numbers started back years ago when the codebase was forked for the new version. And, like the version number, it means almost nothing. It doesn’t tell you what bugs are fixed. It doesn’t tell you if there are regressions where 6.0 still has a bug that was fixed in 5.5, or the other way around where 5.5 still has a bug that 6.0 doesn’t because of the rework done to fix something else. Build numbers tell you where you are in time for a particular version, and roughly how many times someone (or something) recompiled the software. That’s it.

Some people say “don’t upgrade, what features does it have that you really need?” Heard this today on Twitter, and it’ll likely end up as a harsh comment on this post. Sure, maybe vSphere 6 doesn’t have any features I really need. But sometimes new versions have features that I want, like a much-improved version of that goddamned web client. Or automated certificate management (the manual process makes me think suicidal thoughts). Or cross-vCenter vMotion, oh baby where have you been all my life. Truth is, every time I hear this sort of upgrade “advice” and ask the person what version of vSphere they’re running, it’s something ancient, like 4.1. I suspect their idea of job security is all the busy work it takes to run an environment like that, not to mention flouting end-of-support deadlines. Count me out. I like meaningful work and taking advantage of improvements that make things better.

Some people say “upgrade your smallest environments first, so if you have problems it doesn’t impact very much.” Isn’t that the role of a test environment, though? Test is never like production, mostly because there’s never the same amount of load on it. And if you do manage to get the same load on it, it’s not variable & weird like real user load. Just never the same. And while I agree in principle that we should choose the first upgrades wisely, I always rephrase it to say “the least critical environments.” My smallest environments hold some of the most critical workloads I have. One of them is “things die & police are dispatched if there are problems” critical. I don’t think I’ll start there.

So where do I start? And how long will it take?

Green fields of grain on Public Domain Images

First, I’m doing a completely fresh install of vSphere 6.0 GA code in a test environment. I’m setting it up like I’d want it to be in production. Load-balanced Platform Services Controllers (PSCs). Fresh vCenters, the new linked mode (old linked mode was a hack, new linked mode isn’t even really quite linked mode, just a shared perception of the PSCs). A few nested ESXi hosts for now. I just want to check out the new features and test compatibility, and gauge if it’s worth it.

Second, I’m going to wait for the hardware and software vendors in my ecosystem to catch up. Dell has certified the servers I’m running with ESXi 6.0. Dell, HDS, and NetApp have certified my storage arrays. But Veeam hasn’t released a version of Backup & Replication that supports 6.0 yet (soon, says Rick). Backups are important, after all, and I like Veeam because they actually do meaningful QA (I got a laugh from them once because I said I adore their radical & non-standard coding practices, like actually checking return codes). Beyond that, I’m going to need to test some of my own code: scripts written to do billing that use the Perl SDK, PowerCLI scripts to manage forgotten snapshots, etc. I’m also going to need to test the redundancy. What happens when a patch comes along? What happens if we lose a PSC, or a vCenter, or something? Does HA work for vRealize Automation? Does AD authentication work? Can I restore a backup?

Third, I’m going to test actual upgrades. I’ll do this with a fresh 5.5 install, running against demo ESXi hosts with demo VMs, with the goal of having the upgraded environment look exactly like my fresh install. Load-balanced PSCs, linked mode, vRealize Operations, Replication, Veeam, Converter, Perl SDK, PowerCLI, everything. I’ll write it all down so I can repeat it.

Last, I’ll test it against a clone of my 5.5 VCSA, fenced off from the production networks. I’ll use the playbook I wrote in the previous step, and change it as I run into issues.
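The four steps above amount to a gated playbook: nothing moves to production until every earlier stage is done. Here is a minimal Python sketch of that idea. The `Step` class, the function names, and the step labels are all assumptions made up for illustration, not a real tool or anything from VMware.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One stage of the (hypothetical) upgrade playbook."""
    name: str
    done: bool = False


def next_step(playbook):
    """Return the first unfinished step, or None when everything is done."""
    for step in playbook:
        if not step.done:
            return step
    return None


def ready_for_production(playbook):
    """Production is allowed only once no unfinished step remains."""
    return next_step(playbook) is None


# Step labels paraphrase the four stages described in the post.
playbook = [
    Step("Fresh 6.0 install in test"),
    Step("Vendor support and script testing"),
    Step("Upgrade of a fresh 5.5 install"),
    Step("Upgrade of a fenced-off clone of the 5.5 VCSA"),
]

if __name__ == "__main__":
    playbook[0].done = True
    print(next_step(playbook).name)
    print(ready_for_production(playbook))
```

Keeping the steps in an ordered list is a deliberate choice: the order is the point, and the written record is what makes the run repeatable.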

Truth is, I’ll probably get through steps 1 and 2 by mid-May. But then it’ll drag out a bit. I expect upgrade problems, based on experience. I also know I’ve got some big high-priority projects coming, so my time will be limited for something like this. And it’ll be summer, so I’ll want to be in a canoe or on my motorcycle and not upgrading vSphere.

The one thing I do know, though, is that when I get to the production upgrade my path will be laid out by facts and experience, and not folk wisdom and the wives’ tales of IT.

12 thoughts on “When Should I Upgrade to VMware vSphere 6?”

  1. You forgot Linus Torvalds’ take on versioning with the recent bump to 4.0 🙂
    “But nobody should notice. Because moving to 4.0 does *not* mean that
    we somehow changed what people see. It’s all just more of the same,
    just with smaller numbers so that I can do releases without having to
    take off my socks again.”

    • Great point. I like semantic versioning but it does get pretty long sometimes. It’s nice to increment it, start again.

      Changing topics for a second, I really wish more distros were able to put 4.0 into their latest releases. Especially Red Hat.

  2. Bob,
    Great discussion. It highlights a huge IT challenge: the gap between vendor QA and your full stack deployment. Virtualization helps a little in isolating your application from the rest of the stack, but doesn’t eliminate the interoperability problem. Not to oversimplify, but if I look at a SaaS or public cloud environment, the “version control” is mostly out of your hands. If I use Salesforce or AWS, I am on the latest version, not on whatever version I’ve managed to test myself. It would be nice not to have to ask if the new version is really needed, and to lower the pain in upgrades. It’s not something that we can just blindly go into, but there are more solutions and platforms that are striving to make this part of IT easier.
    What do you think?

    • My reply here will need to be my next post, too much and too important a discussion for the comments. The big thing I’m thinking about is that nothing is infallible. Cloud vendors are down once a year for some reason. And automated updates anywhere have big potential for ugliness. I also think that ITaaS/SaaS locks us into evolutionary updates, with little room for the big revolutionary stuff that really changes how we work. I’m more of a punctuated equilibrium guy myself (periods of stability with short periods of huge, meaningful change). 🙂

  3. Happy programmers make happy software. Happy programmers come from California 🙂. Not Wisconsin. 🙂

    1. Always look at the length of a beta for new major releases. Having been on this end, short beta releases indicate extreme pressure to get a release out.

    2. The devil is in the details. Read about the major infrastructure changes; if they replace a major component, be very wary of the .0 release. iOS going from 32-bit to 64-bit, for example.

    3. Read the book “Blink”: your best assessment is what you think in the first 1/2 second. If you think it’s a bad release, it probably is. Your brain has sophisticated analysis capabilities that your conscious mind doesn’t control. Even with excellent additional data, research shows that the 1/2-second analysis is still more accurate. 🙂

    Happy cows come from California! 🙂

    • Bring it, Nielsen. 🙂

      1. True — one of the reasons I think vSphere 6 is going to be better than most is the long & public beta cycle.

      2. I didn’t mention VSAN here because I don’t use it, but that’d be a big reason to test hard. VSAN 6.0 is really 2.0 and substantially different. PSCs are different, but really just an evolution of what was in 5.5, split out for better manageability. And so on.

      3. Haven’t read that, I should read it so I can game that process. 🙂 I don’t disagree.

      You might have more cows but we still rock your world when it comes to cheese. 🙂

    • On #3: Yes, the last 3 chapters of the book look at how marketers try to manipulate that 1/2 second response.

      On a road trip, with my son’s girlfriend (from Milwaukee), we saw a giant beef corral with 5000+ cows. They were not happy cows.

      • To be clear, we were on a road trip in California, headed to LA on the 5. The California cows were not at all happy in their giant pens.

        • 🙁 All because dairy subsidies are based on the distance from Eau Claire, WI. It’ll be interesting to see how that plays against rising prices of water.

  4. I look forward to reading about your 6.0 adventures. Last week I installed ESXi 6.0 and then vCSA 6.0 on top of it. After way too much searching I found the 6.0 x86_64 Linux client plugins for Firefox & Chrome, installed them, and then tried to upload a template, which failed miserably (supposedly the plugins are not installed, though they are). So I’ll wait with exploring 6.0 until there are some Linux plugins available that actually work.
