Alex over at Virtual Infrastructure 411 linked to an interesting study done by CA which indicates that 44% of server virtualization deployments were failures. This is interesting to me, and timely, too. Today I spoke to another organization with a failing virtualization project, the third so far.
The first group that I spoke with was just overwhelmed. They’d gone in head-first, buying blades, storage, everything. They had no experience with any of the gear and were drowning in it, while their managers saw only expenses, no savings, and were freaking out about it. The three IT guys were already far too busy to make time to figure out any of the new gear, much less become experts in it. My suggestion was to slow down and figure out one technology at a time, or divide and conquer, with each of the three becoming the expert on one of the new pieces. Or hire a consultant to set it up for them, though in the long run it would be better for them to learn the technologies themselves.
Another group I spoke to related how complex virtualization is, and how their company had decided to de-virtualize because of all the outages they were having. De-virtualize? Are you kidding me? Not knowing much about storage or virtualization, they relied heavily on vendors to suggest technologies which didn’t work well in their environment. For instance, they ended up with iSCSI storage that didn’t mesh well with their networking group’s service levels, and the resulting finger-pointing between the two groups got the project canned. “Why didn’t you just buy your own, separate network switch?” I asked…
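One thing that might have headed off the finger-pointing is measuring the path to the iSCSI target yourself before going into production. Here’s a minimal sketch, with a made-up target address, that just times TCP connects to the standard iSCSI port. A connect time is only a crude stand-in for real iSCSI latency under load, but it at least shows whether the shared network is in the right ballpark:

```python
#!/usr/bin/env python3
"""Rough round-trip check against an iSCSI target's TCP port.

The target address below is hypothetical, and a TCP connect time is
only a crude stand-in for real iSCSI latency under load.
"""
import socket
import time

TARGET = "192.168.50.10"   # hypothetical iSCSI target address
PORT = 3260                # standard iSCSI TCP port
SAMPLES = 20

times = []
for _ in range(SAMPLES):
    start = time.perf_counter()
    with socket.create_connection((TARGET, PORT), timeout=2):
        pass                                      # connect, then close
    times.append((time.perf_counter() - start) * 1000)
    time.sleep(0.1)

print(f"connect latency ms: min={min(times):.2f} "
      f"avg={sum(times)/len(times):.2f} max={max(times):.2f}")
```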
The group today was on the right track. They had purchased nice big quad-CPU, dual-core servers to run VMware Virtual Infrastructure, with lots of RAM and lots of CPU. Their fatal mistake was using a set of older NFS servers they already had for the back-end storage. Performance was horrible as soon as they got more than a couple of VMs running. They had a carefully laid-out budget and implementation plan, but it hinged on the NFS servers, which weren’t cutting it. That isn’t surprising, to be honest: you need to feed those beefy ESX Servers with more than a garden hose of storage bandwidth. Now the whole project was in danger. My main suggestion was to use local storage until they could get some fibre channel storage hardware.
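If I were in their shoes, I’d have put a number on that garden hose before betting the project on it. Here’s a rough sketch (the mount point and file size are just assumptions) that times a single sequential write to a candidate datastore. It’s nowhere near a real benchmark, and a single stream says nothing about the random I/O a dozen VMs generate, but it would have flagged those old NFS servers before the VMs did:

```python
#!/usr/bin/env python3
"""Crude sequential-write test for a candidate datastore mount.

A minimal sketch, not a real benchmark: the mount point and size are
assumptions, and real VM workloads are mostly random I/O.
"""
import os
import time

MOUNT = "/mnt/nfs_datastore"   # hypothetical NFS mount under test
SIZE_MB = 512
BLOCK = b"\0" * (1024 * 1024)  # 1 MiB per write

path = os.path.join(MOUNT, "throughput_test.tmp")
start = time.perf_counter()
with open(path, "wb") as f:
    for _ in range(SIZE_MB):
        f.write(BLOCK)
    f.flush()
    os.fsync(f.fileno())       # make sure the data actually hit the server
elapsed = time.perf_counter() - start
os.remove(path)

print(f"wrote {SIZE_MB} MiB in {elapsed:.1f}s "
      f"({SIZE_MB / elapsed:.1f} MiB/s)")
```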
All three groups I’ve talked to should have gone more slowly. Start with two servers instead of ten. Start with dual-CPU boxes instead of quad-CPU. Start with normal servers instead of blades. Virtualize your less important machines first. Get some iSCSI or fibre channel storage and learn how to use it on a normal server before you attach it to your ESX Servers and add that complexity. Use the free VMware Server first, and if you like it, do a free evaluation of ESX Server.
Get it working well on a small scale, and then you’ll know what you need to be successful on a large one.
I almost made the same mistake myself. Told my boss that we were going to virtualize everything, including him!! We already had some HP blades and some fibre channel SAN storage. I was mostly looking at virtualization for DR, with server consolidation second. Then I P2V’d my first server and found out that I did not have enough RAM, was running into storage space issues, and needed extra equipment to make the network more redundant, etc… So, long story short, the problem was not the virtualization, it was my lack of planning. This is not something you grab off the shelf, install, and walk away from; for some it has a steep learning curve. But once you create a good plan, virtualization becomes sweet as honey butter!!!
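For what it’s worth, the RAM part of that planning is just arithmetic you can do up front. Here’s a back-of-the-envelope sketch with made-up inventory numbers: add up what the guests actually use, pad for per-VM and hypervisor overhead, and compare against the host you’re planning to buy:

```python
#!/usr/bin/env python3
"""Back-of-the-envelope RAM check before a P2V wave.

All numbers here are made-up examples; the point is just to total what
the guests need (plus overhead) before buying hosts, not after the
first P2V falls over.
"""
# hypothetical inventory: physical server name -> RAM it actually uses (GB)
candidates = {
    "mail01": 4, "web01": 2, "web02": 2, "db01": 8, "file01": 4,
}

PER_VM_OVERHEAD_GB = 0.25   # rough virtualization overhead per guest (assumed)
HYPERVISOR_GB = 1.0         # reserved for the host itself (assumed)
HOST_RAM_GB = 16            # the host we were planning to buy (assumed)

needed = (sum(candidates.values())
          + len(candidates) * PER_VM_OVERHEAD_GB
          + HYPERVISOR_GB)

print(f"{len(candidates)} guests need ~{needed:.1f} GB; "
      f"host has {HOST_RAM_GB} GB -> "
      f"{'OK' if needed <= HOST_RAM_GB else 'not enough RAM'}")
```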
Sage advice. It seems like most of the successful virtualization projects I’ve heard about have followed that same measured path. Even the IT managers who say their goal is to run 80–90% of their environment in VMs will tell you it will take them years to get there. If you want to read a write-up of one such cautious (but successful) virtualization project, check out: Capacity planner limits VMware consolidation ratio. Thanks for the post.
The devil is in the details:
– Demand line-item pricing.
– Get everything in writing.
– Insist on a validated, supported turn-key configuration from one vendor.
– Buy enough support.
– Get training and certifications for multiple staff members.
It took us eight months to design, negotiate, and implement a two-node ESX 2.5.x production farm with redundant storage fabrics and a SAN. We avoided blades because they don’t have enough NICs, though we did use a blade for VirtualCenter.
Don’t just buy stuff and expect it to work.