Over the last year or so I’ve been fortunate to work with a bunch of great folks over at Coho Data, who are coming out of stealth mode with the debut of their storage product, the DataStream. I’ve got a write-up over at The Virtualization Practice on the device, but I’ve also got a prerelease unit running in my lab, and I’ve been liking it a lot. Neither Coho nor EMC nor Nutanix will like this comment, but if an Isilon got frisky with a Nutanix cluster the DataStream might well be the love child. Scale-out architecture and great software smarts on value commodity hardware.
For both personal and technical reasons I’ve always liked the Isilon products, and they’ve done some great work contributing back to the FreeBSD codebase & community as well as creating a product that scales both performance and capacity. People often seem to forget that storage is measured in two ways: how much you can store and how fast you can get to your data. Adding capacity was always easy, adding performance was harder, and in the face of flash memory the monolithic array model just cannot keep up at scale, on either the performance front or the management front. Adding more arrays just doesn’t scale after a while, either — management is a nightmare. Coho Data takes a page from the Isilon playbook, providing one management interface for a ton of IOPS and capacity.
Regular readers of my stuff also know I like Nutanix, too. They’re doing scale-out, they’ve got a great UI, tons of features, and a real concern for how work gets done with their products. Most of their competitive advantage in the converged infrastructure space is extensive use of commodity hardware with proprietary software, especially surrounding storage. My affection is tempered by the idea that most of their storage smarts are done as a virtual appliance, though. I’m not the biggest fan of virtual storage appliances, mostly because I don’t like the idea that my workloads are competing for resources with the storage they depend on. Nutanix probably does it the best of anybody, but I still think that dependency maps shouldn’t have loops in them, and in my search for simpler infrastructure I tend towards dedicated storage arrays, connected to dedicated compute. Coho Data is in many ways like Nutanix with their emphasis on great, under-the-hood software on absolutely commodity hardware.
Anyhow, it’s nice to see solid competition emerging in the scale-out, commodity-hardware, dedicated storage array space. Like the examples above, Coho Data scales the DataStream with “MicroArrays,” each of which has a pair of Intel E5-2620 CPUs (holy crap, 12 2.0 GHz cores), two 800 GB Intel 910 PCIe SSDs, and six 3 TB SATA disks. The MicroArrays are completely commodity hardware, two per 2U chassis. They network with one or two Arista Networks switches. Aside from dominating the 10 Gbps switching market, Arista has been doing some really interesting things with their switches, like implementing OpenFlow and allowing customers to run virtual machines on the switch hardware itself. Coho Data builds on both these features in order to network their MicroArrays with the clients, and it’s what allows the DataStream to be expanded in 15 minutes or less. Pop in a new MicroArray, it network boots from the VM infrastructure in the switch, and it joins the cluster. Once it’s in, the cluster rebalances itself to take advantage of all that fresh CPU, SSD, and disk.
Another big thing I like about the DataStream is the thought that is going into solving non-technical business problems. One of those problems is usability. Judging by their UIs, most IT vendors seem to hate us admins. From vast sets of arcane options in unresponsive, non-standard interfaces (EMC, HDS, IBM) to endless pointless click-through warnings that train admins to just hit ‘OK’ until the day the warning is actually serious (NetApp), I’m really tired of the human error that terrible UIs cause. Especially in the storage space, given its criticality. Coho deserves praise for turning the heavy-duty high technology that’s under the hood into something straightforward and easy to use.
Another problem they’re solving is integrated, predictive performance management. “Predictive” isn’t really something most other vendors believe in. They integrate chargeback & performance management, too. Per-VM performance management is usually not possible, but their choice of NFS as a protocol coupled with VMware vSphere integration allows them to do some things that we’ve only seen before from folks like Tintri. Chargeback is usually a clunky afterthought that’s little more than a CSV export into some graphing library some intern built, but it’s essential in the cloud models of today’s IT world, and they’ve done a nice job integrating it with vSphere’s Storage Profiles for silo-bashing goodness.
Last, I think this will be a great way for organizations that are all Fibre Channel now to dip into IP-based storage. The array is turnkey, and you can connect your clients directly to the Arista switches for an IP-based SAN based on increasingly low-cost 10GBASE-T. At their stated $2.50 per GB list that’s roughly $100K to get started (I’m computing this based on two MicroArrays providing 39 TB; contact Coho for the real numbers), but that’s nothing for a scale-out array with a great cloud-like feature set. I could see an organization getting one for a tactical deployment and liking it so much that it becomes their new midrange storage.
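For anyone who wants to redo that napkin math, here is a minimal sketch in Python. It assumes decimal units (1 TB = 1,000 GB), that the $2.50 per GB list price applies to raw capacity (SSD plus SATA), and that nothing else such as switches or support is included; those are my assumptions, not Coho’s pricing model, and the function names are just made up for illustration.

```python
# Back-of-the-envelope estimate only. Hardware figures and the $2.50/GB
# list price come from the post; applying the price to raw capacity is
# an assumption, and the function names are hypothetical.

SSD_COUNT, SSD_GB = 2, 800        # two 800 GB PCIe SSDs per MicroArray
DISK_COUNT, DISK_GB = 6, 3000     # six 3 TB SATA disks per MicroArray
LIST_PRICE_PER_GB = 2.50          # stated list price, USD


def microarray_raw_gb():
    """Raw capacity of one MicroArray in decimal GB (flash plus disk)."""
    return SSD_COUNT * SSD_GB + DISK_COUNT * DISK_GB


def estimate_list_price(microarrays):
    """Estimated list price in USD for a given number of MicroArrays."""
    return microarrays * microarray_raw_gb() * LIST_PRICE_PER_GB


if __name__ == "__main__":
    count = 2  # one 2U chassis holds two MicroArrays
    raw_tb = count * microarray_raw_gb() / 1000
    print(f"{count} MicroArrays: {raw_tb:.1f} TB raw, "
          f"about ${estimate_list_price(count):,.0f} at list")
```

With two MicroArrays this works out to 39.2 TB raw and a little under $100K, which is the same ballpark as my estimate above. Change `count` to see how the list price grows as you add chassis, keeping in mind that real quotes will differ.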
It’ll be really interesting to see whether the market thinks as highly of these guys as I do. I’ve been loving on them a lot here, but this is a 1.0 product, and there are some rough edges. Snapshot scheduling is evolving, for instance. So are notifications, alarms, etc. These are just software issues, though, fixable with a code update. The product won’t be for everybody, either. One of the big selling points is the price vs. performance that is gained from the mix of flash and SATA. But SATA is still SATA, and the bet is that much of the data you put on this array will be at rest. If that’s true for you, this product will be a real winner: serious performance where you need it, and awesome value everywhere else, with a great set of software features icing the cake.
They sent me the press kit. If you’re interested, here are a bunch of shots of the array, the UI, and some graphs illustrating their performance:
 Coho is a type of salmon. Salmon swim upstream. DataStream… get it? :)
 And “how fast you can get to your data” is also measured in multiple ways, of course.
 To be fair there are others doing good work in the UI space, too, like Tintri and Dell. But it’s still uncommon. Do me a favor and complain to your sales people the next time they try to sell you something with a terrible UI. Ask them why you should buy their product when it’s clear they hate you.