Over the last week there have been a number of posts about the new Dell PowerEdge models, the 12th generation (12G) of their server line. I was briefed by both Dell technical staff and Dell executive staff on the Rx20 lineup, and I took a few notes. I was mainly briefed about the Dell PowerEdge R620, R720, and R720xd, which will be in the first wave of refreshes. The higher-end models, like the R820 and R920, and the cloud & HPC focused C-series, will be part of another release soon after, and reach into the higher-end E7 CPU models (8 way, 10 cores) from Intel.
The new mid-range hosts are built around the Intel Xeon E5 CPUs, also known as “Sandy Bridge.” These CPUs have 4, 6, or 8 cores, and the first digit of the CPU model number is the number of sockets they will support. An Intel Xeon E5-4600 is a four-way CPU, an E5-2600 is a two-way CPU, and an E5-1600 is a one-way (uniprocessor) CPU. Anecdotally, these CPUs are reported to turn in a 70% general workload performance improvement over the last generation of CPUs, and a look through spec.org at the benchmark numbers reported on E3/E7 hosts reinforces that.
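The naming rule is simple enough to write down. Here is a minimal Python sketch that reads the socket count out of an E5 model number. The function name and the regular expression are my own, and it only assumes the "first digit equals sockets" convention described above, plus an optional trailing letter on the model.

```python
import re

def max_sockets(model: str) -> int:
    """Return the socket count encoded in an E5 model number like 'E5-2600'.

    Assumption: the first digit after 'E5-' is the supported socket count,
    as described in the paragraph above. Other Xeon families are rejected.
    """
    match = re.fullmatch(r"E5-(\d)\d{3}[A-Z]?", model.strip())
    if match is None:
        raise ValueError(f"not an E5 model number: {model!r}")
    return int(match.group(1))

for name in ("E5-4600", "E5-2600", "E5-1600"):
    print(name, "->", max_sockets(name), "socket(s)")
```

Anything that doesn’t look like an E5 part raises a ValueError rather than guessing.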
Memory is quad-channel now, up from triple-channel in the Intel Xeon 5600 (“Nehalem”) line. This lets them run RAM at up to 1600 MHz, and it falls back, like the Nehalems did, to 1333, 1066, and 800 MHz. As before, the speed depends on the memory configuration & type of DIMMs. There are now 16 total DIMM sockets, up from 12, due to the extra channel. Peter Bailey of Dell points out that you can now feasibly (i.e. not at 800 MHz) & economically (16 GB DIMMs only carry a 10-15% premium now) put two 16 GB DIMMs on each channel of each CPU in a dual-socket R720, which makes 256 GB of RAM per host the sweet spot.
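The 256 GB figure is just multiplication, but it’s handy to have when comparing configurations. A minimal Python sketch follows; the function and parameter names are mine, and it assumes every channel is populated identically.

```python
def host_memory_gb(sockets: int, channels_per_socket: int,
                   dimms_per_channel: int, dimm_gb: int) -> int:
    """Total RAM in GB for a host with identically populated channels."""
    return sockets * channels_per_socket * dimms_per_channel * dimm_gb

# Dual-socket R720: quad-channel CPUs, two 16 GB DIMMs per channel.
r720_gb = host_memory_gb(sockets=2, channels_per_socket=4,
                         dimms_per_channel=2, dimm_gb=16)

# Same DIMMs on a triple-channel, dual-socket host for comparison.
triple_channel_gb = host_memory_gb(sockets=2, channels_per_socket=3,
                                   dimms_per_channel=2, dimm_gb=16)

print(r720_gb)            # 256
print(triple_channel_gb)  # 192
```

The extra channel alone is worth 64 GB per host at the same DIMM size and the same two-DIMMs-per-channel layout.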
The R820, and presumably the R920, will have similar architectures, but increase the number of DIMM sockets to 48 from 32. As a current R810 owner I have no idea where they plan to put all those sockets, but it’ll be awesome to be able to get 1.5 TB of RAM in a 2U host. Perhaps they are using technology from the TARDIS, or they have Hermione Granger casting undetectable extension charms on the systems. Regardless, the R620 also magically gains a PCIe expansion slot, bringing it to a total of 3. For many that means that an R620 is an effective replacement for an R710, saving 1U of space but giving us as many card slots.
The 12G hosts are all PCI Express 3.0, which means that both bandwidth and the number of PCIe lanes have gone up. PCIe 3.0 bumps the transfer rate to 8 GT/s, from 2.0’s 5 GT/s. It also changes the encoding from the 2.0 spec’s 8b/10b to 128b/130b, which reduces overhead. Between the two improvements, PCIe 3.0 “effectively delivers double PCIe 2.0 bandwidth” according to the Wikipedia article. From a practical point of view that means that PCIe 3.0 devices can all move down a slot size, so if you needed an x8 slot for your dual 10 Gbps PCIe 2.0 NIC, the 3.0 version of that NIC will be x4. Good news for GPUs!
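To put rough numbers on that, here is a back-of-the-envelope Python sketch of per-lane bandwidth, using only the transfer rates and encodings mentioned above. The function name is my own, and it ignores any protocol overhead beyond line encoding, so treat the results as upper bounds rather than measured throughput.

```python
def lane_bandwidth_mb_s(gigatransfers_per_s: float,
                        payload_bits: int, encoded_bits: int) -> float:
    """Usable one-direction bandwidth of one PCIe lane in MB/s (decimal).

    Each transfer moves one bit on the wire; the encoding ratio is the
    share of those bits that carry data.
    """
    usable_gbit_s = gigatransfers_per_s * payload_bits / encoded_bits
    return usable_gbit_s * 1000 / 8

pcie2_lane = lane_bandwidth_mb_s(5.0, 8, 10)     # 8b/10b encoding
pcie3_lane = lane_bandwidth_mb_s(8.0, 128, 130)  # 128b/130b encoding

print(f"PCIe 2.0 lane: {pcie2_lane:.1f} MB/s")
print(f"PCIe 3.0 lane: {pcie3_lane:.1f} MB/s")
print(f"Speedup per lane: {pcie3_lane / pcie2_lane:.2f}x")

# A dual-port 10 Gbps NIC needs 20 Gbps, i.e. 2500 MB/s, in each direction.
nic_need_mb_s = 2 * 10 * 1000 / 8
print(f"x8 at 2.0: {8 * pcie2_lane:.0f} MB/s, x4 at 3.0: {4 * pcie3_lane:.0f} MB/s, "
      f"NIC needs {nic_need_mb_s:.0f} MB/s")
```

The per-lane speedup comes out just under 2x, which is why an x4 slot at 3.0 lands within a couple of percent of an x8 slot at 2.0 and still clears what a dual 10 Gbps NIC needs.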
On the storage front there are a few improvements. First, the RAID controllers have all been refreshed. Marketing didn’t get the memo, though; the 11G models were the H700/H800, and the 12G models are the H710/H810. Perhaps with the 13th generation they’ll stop working to confuse everybody and name them the H730/H830… The controllers continue to support CacheCade, the LSI software feature that lets you use an onboard SSD as a cache in front of slower traditional disks attached locally. It’s essentially a direct-attach storage version of the NetApp Flash Cache cards, and a feature that’s finding its way into most disk controllers, including Intel’s desktop disk controllers. I’d expect that the CacheCade restriction of only working with Dell-supplied SSDs will continue to be enforced, if only for reliability & support reasons.
Drive carriers are the same between the 11G and 12G lines, which is good because the Dell PowerEdge R620 will be able to hold 10 2.5″ drives. The R720 can hold 16, and the R720xd, whose extra letters stand for “extra drives,” can hold up to 24 in front and 2 in back. Sure, you don’t have a front LCD panel or DVD-ROM anymore, but nobody cares in 2012. I suspect that this will be increasingly useful in the face of technologies like the VMware Virtual Storage Appliance (VSA), where you could potentially have a two or three host cluster that has no external storage whatsoever, especially when coupled with the 10 Gbps networking & CacheCade options. At the very least I can recycle a bunch of the orphaned 146 GB 10K disks I’ve abandoned with my vSphere 5 switch to USB sticks for booting.
Beyond traditional SSD, though, is the idea of the PCIe-based flash card, like what Fusion-io sells. You’ve been able to order Fusion-io cards with Dell servers for a while now, and they’re incredibly fast because they don’t have all the overhead of RAID. However, those cards suffer some serious drawbacks in that they’re internal and not as serviceable as a traditional drive. In response, Dell has a build-time option that puts two or four PCIe hot-plug slots in the front of the chassis, so that you can have flash modules installed on a host that are more easily serviceable. Because the option consumes space on the front of the host, you need to order the server with it initially if you think you’ll want these modules later. You can’t retrofit it.
The main complaint about SSDs is always the limited number of write cycles before failure. Dell is now saying that the SSDs they ship will withstand 12 PB of writes over their life. Figuring a 5-year lifespan for a server, that’s 6.73 TB of writes a day per device (counting a petabyte as 1,024 TB). Though I don’t know what the PCIe flash pricing is yet, prices on SAS SSDs have fallen 30% or more over the last year, making all of this more affordable against traditional drives.
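For anyone who wants to plug in their own lifespan, here is the arithmetic as a small Python sketch. The function name and the 365-day year are my assumptions; the 6.73 figure falls out when a petabyte is counted as 1,024 TB, and it drops slightly with decimal units.

```python
def writes_per_day_tb(endurance_pb: float, years: float,
                      tb_per_pb: int = 1024) -> float:
    """Average daily write budget in TB over the given lifespan.

    Assumes 365-day years and an even spread of writes over the lifespan.
    """
    days = years * 365
    return endurance_pb * tb_per_pb / days

print(round(writes_per_day_tb(12, 5), 2))                  # 6.73 (1 PB = 1,024 TB)
print(round(writes_per_day_tb(12, 5, tb_per_pb=1000), 2))  # 6.58 (1 PB = 1,000 TB)
print(round(writes_per_day_tb(12, 3), 2))                  # 11.22 over a 3-year refresh
```

Either way, that is far more than most hosts will write to local disk in a day.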
One big use for local SSD is local I/O caching, especially as a read cache for enterprise disk arrays. EMC has been talking about this with some of their recent releases, and other vendors, like Dell, are saying that they’ll release similar offerings later this year. Local SSD caching is a newer feature even in VMware products based on vSphere 5, and if the trend continues it promises to seriously curtail the amount of read I/O our shared storage arrays have to do. That ought to make storage administrators very, very happy, especially since virtual environments are all random I/O. It’ll be interesting to see the economics of this. Like most new features that stand to cannibalize enterprise sales I expect this to be an expensive add-on, and possibly a Dell Storage-only feature, but we’ll have to wait until the release to know for sure. I’m hoping it’s a more generic caching solution that’s reasonably priced, offsetting my organization’s spending on SAN and array improvements.
The 12G servers have all new network options as well. There is no longer the idea of an “onboard” set of NICs, as everything has been moved to a daughtercard. The upside to that is customers now have choices, Broadcom or Intel, all copper or some SFP+, and FCoE support like that of the X520 cards. This makes me very happy, because I’m an Intel NIC fan, and I’m moving in an all-10 Gbps direction, though this does mean my beloved FCoTR is falling behind again.
The minimum OS versions you can install on these servers are Microsoft Windows Server 2008, Red Hat Enterprise Linux 5.7, and Red Hat Enterprise Linux 6.1, though 6.2 is strongly recommended. Newer releases of Fedora, Ubuntu, and all the RHEL clones will work, too. This is merely due to driver support for the Sandy Bridge hardware. Fans of this blog and people who have attended any of my presentations know that I am a big proponent of keeping your OS somewhat recent. This is a big reason why. Dell merges their driver support with the mainline Linux kernel, so as long as you can run a newer kernel you should be in good shape. Add to that their yum repositories for OpenManage and BIOS updates and their excellent history of Linux support, and they continue to have a great platform for open source computing. In the briefings I had, Dell’s Linux support was described as “boring” only because it just works, without fanfare or hassle. I like things that just work, perhaps because it’s so rare in IT.
On the management front Dell is releasing OpenManage Essentials, which is a streamlined hardware management and monitoring solution that is free to customers. Classically, OpenManage has been a real bear to install and maintain, and hopefully this makes it more useful as a tool. The 12G servers can be monitored & updated completely out-of-band. This is a giant win, as agents make everybody’s lives miserable. Of course, you still need an agent for all your older Dells, but it’s a promise of good things to come.
Similarly, they’re releasing the free & standalone OpenManage Power Center, where you can manage and monitor power consumption on all the 11G and 12G hardware you have. Details are a little scarce, but this is probably similar to HP’s Insight Power Manager, where you can administer power caps and monitor usage.
Dell also continues to support & expand their Fresh Air initiative, which certifies certain models and configurations of Dell hardware to work continuously at temperatures up to 95° F (35° C), 900 hours a year at 104° F (40° C), and 90 hours a year at 113° F (45° C). This has serious effects on data center design & operations, because they estimate it saves $100K per megawatt of OpEx and $3000K ($3M) per megawatt of CapEx for new data centers. For us it means we can turn up the temperature a bit as we retire all the older server hardware. It also means that some types of HVAC maintenance can now be done with the systems online.
That’s all I had for notes. If I’m missing something, or I’ve got an error here, please leave a comment. I owe a big thanks to Dell’s Peter Bailey, who gives an excellent, high-bandwidth, pragmatic, non-marketing presentation on all of this, and while he steals some of our jokes I’m stealing some of his numbers. I also owe a continual thanks to Andrew Waxenberg and Pat Meyers, my local Dell guys, a solid team. Likewise, I thank David Gibbs from Blanc & Otus for setting me up with the Dell Global Channel executives to get some of my 12G questions answered. Kevin Houston, a Dell engineer, has great content about the Intel CPUs and M-series of blades at bladesmadesimple.com. In the spirit of disclosure: I first met him at a Gestalt IT Tech Field Day, an event that Dell has sponsored and that I’ve attended to great enjoyment and learning. Dell has also given me access to seed & evaluation units in the past as part of doing business with them. This doesn’t generally skew my opinion, as most people who know me can attest. Good technology gets a good review from me; bad ideas get skewered indiscriminately. :)
 Or a Bag of Holding, for AD&D fans.