Author Archive

VI 3.5 Update 2 Hardware Status »

I had seen this in the release notes for Virtual Infrastructure 3.5 Update 2:

Display of System Health Information – More system health information is displayed in the VI Client for both ESX Server 3.5 and VMware ESX Server 3i.

but only today noticed that my Dell PowerEdge 1950s now have health information listed (and that I lost a drive this morning in one of my test machines… DOH). My PowerEdge 2950s do not, though. Hopefully they’ll make the cut next time.

I like the trend of integrating all the elements of server management back into the VirtualCenter server. Now if I could just have Update Manager update the BIOS, RAID, management controller, and HBA firmware when it updates ESX I’d really be happy. :-)

Update: In the comments Sean suggests disconnecting and reconnecting the ESX hosts, which did the trick for me. Thanks Sean!

Why This VMware Time Bomb Issue is a Big Deal »

Why is this VMware time bomb issue such a big deal?

  1. You can’t work around it without breaking some of your environment: you have to set the physical hosts’ clocks back to get VMs to power on, and then the VMs pick up the time change.
  2. You can’t uncheck the “Synchronize guest time with host” option from VirtualCenter while a VM is running, basically condemning you to going to each host to uncheck that option, or letting the time get unsynchronized briefly.
  3. [kb,kb2].vmware.com had been mostly unavailable all morning, preventing people from actually getting to see the articles on the problem.
  4. In my environment, Windows VMs with Tuesday/Wednesday maintenance windows to pick up Microsoft Patch Tuesday updates had problems where their post-reboot VMware Tools upgrade (“Check and upgrade Tools before each power on”) didn’t complete. Now, as we fix the licensing issue, those VMs are rebooting themselves outside of their maintenance windows to complete their Tools updates.
  5. People who actually have test environments for their Virtual Infrastructure, and actually have a test regimen for new code, have no way to test for problems like this. Setting the clock forward on machines is tenuous at best.
  6. Waiting longer to roll out patches like this isn’t a solution, because the time bomb could just as easily be three months from now.
  7. Virtual Infrastructure isn’t stable or bug-free enough to wait months to update; each update release like this fixes big problems people are having with their environments.

It all comes down to trust, and there are a lot of us out here who just got hung out to dry. It doesn’t matter whether Paul Maritz is sorry. We’re sorry, too.

Update: John Troyer reports that the problems with the Knowledge Base are fixed. Thanks guys.

links for 2008-08-13 [delicious.com] »

  • "A failed system or application has known, documented consequences. It is not a game of probability or chance. An unpatched security vulnerability is a game of chance where in most cases the odds against you are not known." Amen.

Bad Day For People Who Actually Patch »

Let’s just say that if you’re running VMware Virtual Infrastructure 3.5 Update 2 you probably can’t power your VMs on anymore. DOH. Unfortunately, that’s me. I updated everything on Sunday after testing for two weeks, and I can’t even imagine how I’d test for this.

The whole idea of patching sucks. There are always bugs, and you always trade one set of bugs for another when you upgrade. Of course, you use testing to try to figure out whether there are more bugs or fewer, but things like this always show up. I’ve been meaning to write a longer post about patching, especially in the wake of this DNS debacle, but Michael Janke’s post “Patch Now - What Does It Mean?” over at Last In, First Out covers most of what I wanted to say, especially about security researchers calling for immediate action:

When security researchers/bloggers announce to the world ‘patch now’, are they are implying that the world should ‘patch now without consideration for testing, QA, performance or availability’? Or are they advising an accelerated patch schedule, but in a change managed, tested, QA’d rollout of a patch that considers security and availability? And when they complain about others not patching fast enough, are they assuming that the foot draggers are incompetent? Or are they ignoring the operational realities of making untested changes to critical infrastructure?

Amen. Overall a nice, thoughtful way to present it, and worth the couple minutes to read.

links for 2008-08-12 [delicious.com] »

links for 2008-08-08 [delicious.com] »

links for 2008-08-07 [delicious.com] »

Bandwidth of the USPS »

Matt’s post over at Standalone Sysadmin about flash drives as archival media made me remember conversations I used to have with coworkers about the bandwidth of the U.S. Postal Service, a colleague’s pickup truck loaded with tapes, etc. Sometimes the fastest way to get data to a location is to mail it, even now.

The late Jim Gray had a fantastic interview in ACM Queue back in 2003 where he talked about disk access times vs. capacity vs. Moore’s Law, and especially how he was mailing computers and disks to people. His price comparisons are a little dated now, but the rest is a good use of ten minutes, if you ask me. (And you didn’t, I know.) :-)
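
To put rough numbers on the idea of mailing data, here is a back-of-the-envelope sketch in Python. Every figure in it (a 10 TB box of disks, two-day delivery, a 100 Mbit/s link) is a made-up assumption for illustration, not a measurement and not something from Jim Gray's interview. It also ignores the time spent copying data onto and off the media, which can matter a lot in practice.

```python
def effective_bandwidth_mbps(payload_bytes: float, transit_hours: float) -> float:
    """Average throughput in megabits per second of a shipment.

    This is simply total bits divided by total transit time.
    """
    if transit_hours <= 0:
        raise ValueError("transit_hours must be positive")
    return (payload_bytes * 8) / (transit_hours * 3600) / 1e6


def network_transfer_hours(payload_bytes: float, link_mbps: float) -> float:
    """Hours needed to push the same payload over a link at a sustained rate."""
    if link_mbps <= 0:
        raise ValueError("link_mbps must be positive")
    return (payload_bytes * 8) / (link_mbps * 1e6) / 3600


if __name__ == "__main__":
    TB = 1e12  # decimal terabyte, in bytes

    payload = 10 * TB    # hypothetical: a box of disks
    shipping_hours = 48  # hypothetical: two-day delivery
    link_mbps = 100      # hypothetical: sustained WAN throughput

    print(f"Shipping: {effective_bandwidth_mbps(payload, shipping_hours):.0f} Mbit/s effective")
    print(f"Network:  {network_transfer_hours(payload, link_mbps):.0f} hours at {link_mbps} Mbit/s")
```

Under those assumed numbers the box in the mail averages several hundred megabits per second, while the network copy takes more than a week. Change the payload size or the link speed and the answer can easily flip, which is the whole point of doing the arithmetic first.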
