Three Thoughts on the Nutanix & StorageReview Situation

Photo courtesy of National Nuclear Security Administration / Nevada Field Office.

I’ve watched the recent dustup between VMware and Nutanix carefully. It’s very instructive to watch how companies war with each other in public, and as a potential customer in the hyperconverged market it’s nice to see companies go through a public opinion shakedown. Certainly both VMware and Nutanix tell stories that seem too good to be true about their technology.

On the VMware side VSAN is new-ish, and VMware doesn’t have the greatest track record for stability in new tech, though vSphere 6 seems to be a major improvement. On the Nutanix side I have always had a guarded opinion of technologies that introduce complexity and dependency loops, especially where storage systems are competing with workloads for resources. I’ve argued the point with Nutanix on several occasions, and their answer has been essentially “well, we sell a lot of them.” I had no real data either way, so it was hard to argue.

As such, you can imagine that I found the StorageReview post on why they cannot review a Nutanix cluster very interesting (link below). I have a lot of respect for Brian and Kevin at StorageReview. Not only are they nice guys, they do a lot of good work supplying useful performance data to customers. They use testing methods designed to reflect real world situations. Not all of us have data centers full of idyllic cloud-ready apps that do 100% read I/O on 512 byte blocks. In fact, most of us in the real world have apps that are haphazardly smashed together by companies like Oracle or Infor, sold to CIOs with lies, kickbacks, and hookers. These abominations are often performance nightmares to start with, and if they’re designed at all it’s for copious professional services and collusion with hardware vendors. I need infrastructure that can run them well (or at least less poorly), and I appreciate a good review with good testing methodologies.

There are a lot of opinions about this article. Here are three of mine.

It Should Have Never Gone This Far

Some industry & vendor folks think that it’s irresponsible to have posted this. I empathize with them. Nobody likes the idea of someone publishing an article like this on their watch, especially during the middle of a nasty war with a huge competitor. StorageReview just armed all the competitors with fresh dirt to throw, and it’s bad.

However, it should have never gone this far. Six months is ample time to fix the situation or work something out in good faith. There are lots of ways to explain performance issues. All systems have tradeoffs, and perhaps Nutanix’s NOS trades performance for OpEx savings. Perhaps most customers don’t need that level of performance, and the system wasn’t designed for it. Whatever. Anything sounds better than what seems to have happened.

If there are problems, and it seems like there are some big ones, own them and fix them. If you need to know how to do this, call someone at Jeep. Between the 2012 “Moose Test” failures (links below) and the recent hacks, they’ve had a lot of experience acknowledging a problem, owning it, and fixing it.

Covering Something Up Makes People More Curious About It

Have you ever watched or read Tom Clancy’s “Clear and Present Danger?” In it, the main character, Jack Ryan, advises the US President to not dodge a question about a friend who was revealed as a drug smuggler:

“If a reporter asked if you and Hardin were friends, I’d say, ‘No, we’re good friends.’ If they asked if you were good friends, I’d say, ‘No, no, we’re lifelong friends.’ I would give them no place to go… There’s no sense defusing a bomb after it’s already gone off.”

Why can’t I run a standard benchmark like VMmark on a Nutanix cluster? Why can’t people share performance results? If I bought one of these would I be able to talk about my performance? Why is Nutanix uncomfortable with performance results? Why do they ship underpowered VSAN configurations for comparison to Nutanix clusters? Why do they insist on synthetic workloads? If I buy one of these systems and it doesn’t perform can I return it? What happens if I have performance problems after an upgrade? Can I downgrade? What will it cost to buy a reasonable test system so I can vet all changes on these systems?

We all have a lot of questions now, and that isn’t particularly good for Nutanix or their partner Dell. Great for VMware, great for Simplivity, great for Scale Computing, though.

This Isn’t About Performance, It’s About Support

For me, this whole issue isn’t about performance. It’s about support. It’s about knowing that when I have a problem someone will help me fix it. If a reviewer who was intentionally shipped a system for review cannot get support for that system when they have issues, what are the chances I will be able to when I have issues? I already anticipate that, given the fighting, VMware won’t support me well or at all on a Nutanix system. Now I have doubts that Nutanix will be able to make up the difference. Doubly so if I bought an XC unit from Dell.

If you’re in the market for a hyperconverged system you have a lot of new questions to ask. Remember that vendors will tell you anything to get you to buy their goods and services. Insist on a try & buy with specific performance goals. Insist on a bake-off between your top two choices. Ask for industry-standard benchmark numbers. Stick to your guns.

Leave your comments below — I’m interested in what people think.



  • Hi Bob, are you able to elaborate on the comment “On the Nutanix side I have always had a guarded opinion of technologies that introduce complexity and dependency loops, especially where storage systems are competing with workloads for resources”.

    Can you please explain?



    • Sure. When storage is delivered via a virtual storage appliance, or VSA, it sits inside a virtual environment, often logically adjacent to the workloads that reside on that storage. Basically the ESXi host mounts a datastore provided by a VM running on that ESXi host. Hence the dependency loop. Storage doesn’t just require disk, it requires CPU and RAM, too, and where do those things come from? The ESXi host, which would otherwise use those resources for the workloads. So you’ve got a dependency loop and a CPU & RAM commitment.

      In the case of Nutanix they do a lot of computationally expensive things, like erasure coding, deduplication, etc. so the CPU & RAM commitment isn’t trivial. Compare that to VSAN, which uses a kernel module to deliver the storage, thereby avoiding the dependency loop. It also needs CPU & RAM, but they have decided, as a design point, to not do computationally expensive things like deduplication, favoring performance instead. This is one of the major points of the argument between VMware and Nutanix.

      It is worth noting that Nutanix is not the only vendor that does this. Other products like the HP StorVirtual VSA, some Nexenta configurations, etc. also have designs like this. It can work just fine if things are managed carefully.
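      To put rough numbers on the CPU & RAM commitment described above, here is a minimal sketch. Every figure in it is made up for illustration; these are not measurements from Nutanix, VSAN, or any other product, and the `Host` and `StorageReservation` names are hypothetical. It also treats reserved vCPUs as whole cores, which is a simplification.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Host:
    """Physical resources of one ESXi host (illustrative numbers only)."""
    cpu_cores: int
    ram_gb: int


@dataclass
class StorageReservation:
    """CPU and RAM set aside for the storage layer on each host (hypothetical)."""
    cpu_cores: int
    ram_gb: int


def workload_capacity(host: Host, storage: StorageReservation) -> Host:
    """Return what is left for workload VMs after storage takes its share."""
    if storage.cpu_cores > host.cpu_cores or storage.ram_gb > host.ram_gb:
        raise ValueError("storage reservation exceeds host resources")
    return Host(host.cpu_cores - storage.cpu_cores, host.ram_gb - storage.ram_gb)


def overhead_fraction(host: Host, storage: StorageReservation) -> Tuple[float, float]:
    """Fraction of host CPU and RAM consumed by the storage layer."""
    return storage.cpu_cores / host.cpu_cores, storage.ram_gb / host.ram_gb


# Made-up example: a 16-core, 256 GB host running a 4-core, 32 GB storage VM.
host = Host(cpu_cores=16, ram_gb=256)
vsa = StorageReservation(cpu_cores=4, ram_gb=32)

left = workload_capacity(host, vsa)
cpu_share, ram_share = overhead_fraction(host, vsa)
print(f"Left for workloads: {left.cpu_cores} cores, {left.ram_gb} GB RAM")
print(f"Storage overhead: {cpu_share:.0%} CPU, {ram_share:.1%} RAM")
```

      The same arithmetic applies whether the storage layer is a VM or a kernel module; the sketch only covers the resource commitment, not the dependency loop itself. Plug in the reservation your vendor actually recommends before drawing conclusions.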

      • I am glad that someone feels the same way I do, and has articulated it better than I ever could. I also had an issue with the fact that I am buying hardware and a nice chunk of it goes to a VM that is basically a VSA on the box….

  • Great timing on your article. I read that StorageReview article yesterday. I also read Nutanix’s response to Chuck Hollis’s series of articles on VSAN vs. Nutanix performance. It was very interesting because the Nutanix response blog accused VMware of using unrealistic synthetic tests. Now we see Nutanix demanding that StorageReview use synthetic tests! I was also surprised to read that Nutanix would only allow their own test case and wanted to compete against a lower-spec VSAN.
    I did leave a comment on that Nutanix blog asking them to comment on the storagereview article. My comment is still awaiting moderation, and I don’t suspect it’ll see the light of day.

  • “This Isn’t About Performance, It’s About Support”
    That’s absolutely the way I look at it.

  • “Remember that vendors will tell you anything to get you to buy their goods and services. Insist on a try & buy with specific performance goals.”

    Disclosure here: I work for Atlantis Computing, and we provide software that allows HC storage on generic h/w with a VSA solution. It’d be interesting to have a wider conversation about “feedback loops”; we can do that later.

    I stand firmly behind your main points. There is a key part imo, where you call out “with specific performance goals” – that’s a great suggestion and not done as often as you’d expect. I’d slip in support requirements, training as other things to ask.

    That said, I’d tend against “industry standard benchmarks”, as I’d much rather propose “industry standard testing suites using your environment (i.e. h/w, VMs)”. Your experience/insight on that would be useful.

  • For me, I have a slightly different view: “This Isn’t About Performance, It’s About Transparency”.
    Let’s use a different example: a manufacturer says their car will attain 50 MPG under ‘normal’ use, and I buy it and get 20 MPG. Am I going to be dissatisfied with that 60% shortfall? Yes. Their support, whether it’s free oil changes or car washes, isn’t going to erase the violation of trust. In the data center, it’s often our technical reputation that is at stake, tied to the technology recommendations we make. These impact people’s careers.

    Organizations like StorageReview are important for emerging technologies and trends. We hope they are fair and unbiased, tell the whole story, and are careful with the not insignificant power they wield. I’d guess there were many internal discussions around whether to post that article. After all, Nutanix does sell a lot of boxes. In the end, I think SR made the right choice. There’s nothing to stop Nutanix from completing a test run in the future.

    As for your other point: I do agree that VMware, Simplivity, Scale Computing and Pivot3 are all licking their chops.
