We use the Dell Internal Dual SD Module (IDSDM) for our VMware ESXi hosts. It works great, and saves us a bunch of money per server because we don’t need RAID controllers, spinning disks, etc. Ours are populated with two 2 GB SD cards from the factory, and set to Mirror Mode in the BIOS.
The other day we received an alarm:
Failure detected on Internal Dual SD Module SD2
We’d never seen a failure like this, so we had no idea how to fix it, and the Internet was only slightly helpful (hence this writeup). Here’s what we did to replace the failed card.
Note: I’m certified to work on Dell servers, and have been messing with hardware for 25 years. To me this is a real easy fix, but you should do what you’re comfortable with. I always suggest you manage possible static electricity by ensuring you’re at the same electrical potential as the server. Touching the metal portions of the case works well for this, which you clearly have to do to get inside a PowerEdge. Just remember to do it again if you wander off and come back. This is also good advice for fueling your car in dry climates. :)
1. First, the SD cards themselves are not covered under Dell warranty, so we bought new 4 GB cards (the smallest we could find) from the nearest place that sold them (try your nearest Walgreens or CVS). The SD card from Dell was a Kingston, so we chose that brand. Total expenditure was $10 for two cards. Why two cards? Because the staff time and fuel cost more than the part, so stocking a spare makes sense. Plus, if one has failed I suspect I’ll see another failure. After all, I do have a couple hundred of these things.
2. Second, we shut the host down, unplugged it, and found the SD card module using the map under the Dell server cover. On our PowerEdge R720 it was below the PCIe riser closest to the power supplies. On blades it’s out the back, labeled “SD,” and you just have to pull the blade out to get to it.
The Dell IDSDM whitepaper indicates that, because of the way the module is powered, you should always do this work with the AC disconnected.
3. We took the expansion cards out of that PCIe riser and noted which one was in what slot (top vs. bottom). Then we gently removed the PCIe riser itself. Last, the IDSDM has a little blue strap to help you pull it straight up and out of the socket.
4. The error in the system event log indicated that SD2 was faulty. But now that we’ve got the thing in our hands, which card is SD2? It turns out that on 12G PowerEdge IDSDMs there’s an activity light on each side; one is labeled SD1_LED and the other SD2_LED. Your mileage here will vary: 11G servers had the slots labeled, and I haven’t looked at a 13G IDSDM yet. Use your head.
5. The SD card locks in, so you need to push it in to eject it. We took the 2 GB card out, put our new 4 GB card in, and put everything back together.
6. When the server boots it’ll ask you what you want to do about rebuilding the mirror. If you have F1/F2 prompts disabled in the BIOS you’ll have 10 seconds to answer before the boot continues without a rebuild.
For us it took about 5 minutes to resilver the mirror, then the boot process continued into ESXi. In keeping with good security techniques I put the old SD card through a shredder.
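With a couple hundred of these modules in service, it can help to scan exported event logs for the alarm rather than wait to notice it. The following is only a minimal Python sketch, assuming the log is available as plain text lines and that the message is worded exactly like the alarm quoted earlier. The function name `failed_slots` and the sample lines are hypothetical, made up for illustration, and not part of any Dell or VMware tool.

```python
import re

# Pattern built from the alarm text quoted in this post. Assumption: other
# firmware or management tools may word the message differently.
ALARM_RE = re.compile(r"Failure detected on Internal Dual SD Module (SD[12])")


def failed_slots(log_lines):
    """Return the sorted, de-duplicated slot names (e.g. ['SD2']) found in log_lines."""
    slots = set()
    for line in log_lines:
        match = ALARM_RE.search(line)
        if match:
            slots.add(match.group(1))
    return sorted(slots)


if __name__ == "__main__":
    # Hypothetical sample input: one matching alarm and one unrelated line.
    sample = [
        "Failure detected on Internal Dual SD Module SD2",
        "some unrelated event",
    ]
    print(failed_slots(sample))
```

Knowing the slot name ahead of time tells you which LED label (SD1_LED or SD2_LED) to look for once the module is out of the chassis.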