How to Troubleshoot Unreliable or Malfunctioning Hardware

My post on Intel X710 NICs being awful has triggered a lot of emotion and commentary from my readers. One of the common questions has been: so I have X710 NICs, what do I do? How do I troubleshoot hardware that isn’t working right?

1. Document how to reproduce the problem and its severity. Is it a management annoyance or does it cause outages & downtime? Is there a reasonable expectation that what you’re trying to do should work the way you expect? That might seem like an odd question, but sometimes other people do the procurement for (and without) us and there are gotchas they didn’t think to ask about. In my case with the X710s I felt I …

Critical Dell BMC Firmware Update

If you’re running a Dell PowerEdge 1900, 1950, 2900, 2950, 2970, 6950, R300, T300, R605, R805, or R905, Dell released urgent & critical security updates on October 15, 2012. Similarly, there’s an urgent update to the Dell-supplied ESXi 4.0 U4 software. Dell describes the fixes as “Critical Security Update –Urgent BMC Release.” To me that says Dell fixed something that’s remotely exploitable and doesn’t want to say what it was, out of fear of tipping off troublemakers. I always like to know what the problem is, figuring that the bad guys probably already know, and knowing helps me determine my priority for the fix. Moral of the story is that if your older Dell server …
