I don’t know about other hardware, but on Dell PowerEdge servers the best way to fix a dead drive is just to pull it out and put a new one in while everything is up and running.
It’s blissfully simple. Walk up to the box, pull the drive, put a new one in, and wait until the status light turns green. I walk away after the whole array starts blinking as it rebuilds the missing disk.
Every time, and I really mean every time, I’ve seen someone try to use the Windows- or Linux-based RAID controller software to help them replace a disk, they’ve ended up either needing to power cycle the whole machine or doing something dumb. Dumb, like mirroring the new, blank drive back over the good drive. Every time I try to use the software, it can’t be installed, doesn’t work right, or has such a horrible interface (GUI or CLI) that I fear I will do something dumb.
Maybe this is all a real bad idea (leave me a comment if you think so), but I figure it’s called “hot swappable” for a reason. Why risk human error when the RAID controller knows the score, and can take care of itself so much better than I can take care of it?
Don’t forget to try to just pull out the drive, and stick it back in!
I’ve seen the controller rebuild the same drive that way, and a year later they’re still running without a problem.
Had to do this on a pe2850 with a RAID-5 array and a couple of pe1850s with RAID-1 arrays, for a grand total of 4 drives ;-(
I think they sometimes arrive with an ‘almost unplugged’ drive and the vibrations get to them after a while…
Yeah, good point. I had a machine in Chicago lose a drive, and the day before I was going to go down there to service it, the machine lost power. It came back up fine and remirrored the disks. Hmmm. It is an 1850, too.
I was looking for an excuse to head to Chicago for a day, though. 🙂
I’ve supported Compaq/HP Proliants since ’98 and Dells since the 4th generation PowerEdges. I’ve had confidence in the Proliant Smart Array controllers’ auto-rebuild abilities for a much longer time than the Dells. Dell’s penchant for changing RAID controller OEMs with almost every generation (and subsequently alienating toolsets and cross-generational array compatibility) hasn’t sat well with me. However, the PERC/4’s and up seem to auto-rebuild OK.