How to Troubleshoot Unreliable or Malfunctioning Hardware

My post on Intel X710 NICs being awful has triggered a lot of emotion and commentary from my readers. One of the common questions has been: so I have X710 NICs, what do I do? How do I troubleshoot hardware that isn’t working right?

1. Document how to reproduce the problem and its severity. Is it a management annoyance or does it cause outages & downtime? Is there a reasonable expectation that what you’re trying to do should work the way you expect? That might seem like an odd question, but sometimes other people do the procurement for (and without) us and there are gotchas they didn’t think to ask about. In my case with the X710s I felt I …

Why Use SD Cards For VMware ESXi?

I’ve had four interactions now regarding my post on replacing a failed SD card in one of my servers. They’ve ranged from inquisitive:

@plankers why would you use an SD card in a server. I’m not a sys admin, but just curious. — Allan Çelik (@Allan_Celik) January 22, 2015

to downright rude:

“SD cards are NOT reliable and you are putting youre [sic^2] infrastructure at risk. Id [sic] think a person like you would know to use autodeploy.”

Aside from that fellow’s malfunctioning apostrophe, he has a good, if blunt, point. SD cards aren’t all that reliable, and there are other technologies to get a hypervisor like ESXi on a host. So why use SD cards?

1. Cost. Looking at …

Critical Dell BMC Firmware Update

If you’re running a Dell PowerEdge 1900, 1950, 2900, 2950, 2970, 6950, R300, T300, R605, R805, or R905, there are urgent & critical security updates that Dell released on October 15, 2012. Similarly, there’s an urgent update to the Dell-supplied ESXi 4.0 U4 software. Dell describes the fixes as “Critical Security Update –Urgent BMC Release.” To me that says Dell fixed something that’s remotely exploitable and doesn’t want to say what it was for fear of tipping off troublemakers. I always like to know what the problem is, figuring that the bad guys probably already know, and knowing helps me determine my priority for the fix. Moral of the story is that if your older Dell server …
