Category: Featured

VMworld 2010: Hands-On Labs »

This morning John Troyer coordinated a bunch of bloggers for a session over at the VMworld 2010 Hands-On Lab facilities in Moscone West. Adam Zimman, Dan Anderson, and Curtis Pope took turns explaining and demoing the lab to us. The lab itself was built as a cloud-oriented system, using software-on-demand and service-on-demand principles, and relying heavily on remotely-hosted equipment in data centers in Miami, FL (Terremark) and Ashburn, VA (Verizon).

The Lab team is really building on what they’ve learned from other years. There are many more labs this year than last, and they’re all self-paced, though there are options for instructor interaction as well if you have questions or want more one-on-one guidance. Self-paced labs means they can do almost unlimited content, and it’s easier to get lots of people through the labs. Last year they had, all totalled, about 7000 lab seat hours. This year they have almost 20,000, with 480 View stations in eight rooms. Dan Anderson, the lab’s lead architect, had some proud things to say about what they’ve done. “The content is killer, the best content I’ve seen yet. If someone sits for four days, eight hours a day, they might be able to get through all of them. But nobody can complain about not having enough stick time,” said Dan.

Perhaps he’s never met some of the curmudgeonly people that attend VMworld. :) But I really appreciate the iterative approach they’ve taken this year to making the labs better. For instance, they learned that pre-registration for the labs didn’t work very well in other years, so it’s all first-come, first-served (FIFO). There’s a check-in station that works with your badge number, and a waiting room with couches and whiteboards and Subject Matter Experts while you wait. The labs will be open from 8 AM until 10 PM every day, too, and they will be offering a prize to “dedicated individuals” (they thought speed and quantity might be the factors, but it isn’t set in stone). They did say the prizes would be something like a pass to VMworld 2011, though, which is very cool.

The hardware and software powering the lab is pretty amazing, with a number of sponsors contributing staff, equipment, and software to make it run. Sometimes on very short notice, too. And in some cases this lab is the largest deployment yet of these technologies. They’re pre-populating lab environments with instances of each lab setup, to avoid the on-demand 5 to 7 minute wait from last year, which is great. They’re worried that they’ll have the prepopulation levels off a little on the first day, but even if you do get caught waiting you can still read the manuals. They estimate that the labs are using roughly 36 TB of RAM (yes, TB) and there’s about 200 TB of storage, between EMC and NetApp, in each data center powering the labs, all connected via NFS. The storage itself is everything from enterprise flash (EFD) to SATA, with the EFD often being used as FastCache to front-end the slower storage.

The stations themselves are Wyse thin clients, with dual monitors and even dual chairs, even though it’s geared for one-on-one learning. It’s all about flexibility and options, which extends to the content itself: the vSphere Sandbox lab is just a deployment of all of their products, for freeform messing around. The Lab team even has redundant wiring to the lab stations, just in case they need it (“We even have redundant chairs!” said Adam). They’re flexible, they’re ready, and they’re hoping that they can set records for the number of happy people in the labs this year. And if you’re not happy, there are 150 staff floating around to help you out, as well as two lab captains per room.

I’m looking forward to it — labs have always been a highlight of VMworld for me, and these guys are making it even better. I know it’s a lot of work to build, in two months, what usually would be done in a year or two (and then, as Dan said, “throw it on a truck.”). On behalf of all of us, thank you Adam, Dan, and Curtis (and all the others that we didn’t meet). I hope all your hard work is a giant success!

VMworld 2010: Saturday »

A quick walk past Moscone on Saturday yielded a real life “401: Not Authorized,” most likely because it’s “501: Not Implemented” yet. So we continued on down to AT&T Park to watch the Giants lose to the Diamondbacks, 11-3 (it was 6-1 after the first inning, Barry Zito was having a real bad night). At least we got Joe DiMaggio bobbleheads. Not much else to do, it looks like a bunch of folks led by the inimitable Veeam guys went to the Chieftain. My coworker Steve and I opted for an early turn-in, to hedge against the upcoming late ones. Below are some photos.

On tap for Sunday: a sneak preview of the Labs this morning, brunch with a bunch of the Communities guys, registration and speaker setup at 2 PM, and then nothing. The vmunderground.com WUPaaS is tonight, but I missed the invites, so I might try sneaking in. Otherwise I’m sure other things will be going on. I do need to work on my presentation a bit, and perhaps a trip over into the Sunset district for some excellent Vietnamese at Pho Phu Quoc is in order, followed by a walk down Irving to the beach and the N Judah home.

I hope you’re all traveling well. Stay safe, and remember that if you’ve got a question or are looking for something to do #vmworld is a good start on Twitter. You can also message me, @plankers.

Rebel Alliances »

Stephen Foskett has a great post today on The Enterprise IT Acquisition Game, wherein he talks about how it’s open season on data storage companies, with a lesser emphasis on networking:

So this is the game: Four full-line enterprise superpowers battling each other for datacenter dominance and coveting the extra profits of a few verticals. HP clearly believes they can chip away at EMC and Cisco in storage and networking; Dell and IBM have so far focused mainly on storage; and Oracle hasn’t made a move in either direction, instead challenging the other three in the core server and software space.

Right on, especially with the “coveting the extra profits” part. For years, Dell, IBM, and HP have been busy commoditizing the compute node side of things. They’ve been driving the prices down on CPU and RAM for so long that the margins aren’t there anymore. And now, with the onset of widespread virtualization, the volume is no longer there to make up for it. However, storage is still a high-margin endeavor, and probably the single most expensive thing in my data center; I have $400K in CPU & RAM connected to $2M in storage. It is not surprising that there is a huge bidding war for 3PAR, and an open season on data storage companies. It’s one of the best ways for big companies to continue bleeding IT budgets, at least until SSD capacities rise and prices fall, and SSDs do to storage what virtualization is doing to servers.

The thing is, virtualization is all about driving costs down in IT, and I just don’t see these big consolidation efforts as an actual way to do that in the long term. EMC has been downright nasty with my organization because they thought we were locked in. Cisco has done the same, and Oracle is generally regarded as a bunch of jerks. Why would anybody choose UCS, Vblocks, or an all-in-one option from Oracle knowing that they’re locking themselves in? Even if the pricing is competitive right now, it won’t be in the fourth or fifth year of support, when they think they’ve got you trapped.

Stephen’s last paragraph speaks of rebel alliances:

Although I would love to see a rebel alliance rise (imagine Juniper, NetApp, and Symantec joining forces!) this is not a likely scenario.

Like Stephen, I would enjoy seeing a rebel alliance, but I doubt it is likely, for two reasons. First, companies like buying from a single source, because they can get support guarantees and the purchasing process is a lot easier. Second, consolidated efforts like UCS and Vblocks are about software as much as they are about hardware. The promise of a true single pane of glass is very appealing.

If you look at some of the rebels in the software world, like Automattic and Flickr, you’ll notice something: their openness. In both of these cases they allow customers to do whatever they want with their own data. Want to pick it up and leave? That’s fine by them. Contrary to the thinking behind the sales models of EMC and Cisco, though, those companies have great customer retention rates. Because the fundamental idea that their customers are free to choose colors everything they do, they spend their time looking to make customers happy, not lock them in. And as a result you get products people want to use, and not products that are just a little less evil than the other guy.

Seeing that a true rebel alliance is unlikely, I’d love to see Dell become the open, rebel alternative to companies like EMC, Cisco, and Oracle, even going so far as to develop a “Dblock” based on open technology, sold in increments, delivered in 19″ racks, managed through a single pane of glass, and marketed as the antithesis to vendor lock-in. A Dblock could have Dell servers in it, EqualLogic/3PAR storage, and Xsigo interconnects (imagine an array full of SSD and quad-rate InfiniBand connections), and be very compelling. Especially if their answer to a question like “Can I attach my NetApp to it?” was “the ports are on the top left.” I’d be happy to do business with a company like that.

It’d just need one little twist: a change in attitude.

Three Organizational Decisions That Help Me Virtualize »

Over the last ten years my organization has come a long way with its IT policies and processes. We’ve gone from the wild, wild west of IT where personal heroism ruled the day, to a place where there’s just enough process to make sure that communication happens correctly and things like our Configuration Management Database (CMDB) stay up to date. It’s been a lot of work, but I am actually really proud of where we’re at.

There are three fundamental decisions we made a long time ago that, had they not been made, would have drastically changed how virtualization has proceeded here.

1. Clearly defined maintenance windows.

Knowing exactly when someone can do maintenance on a server has been crucial to getting things done in our virtualization environment. There are many adjustments you can & should make in virtual environments, but if you can’t ever take the VMs down to make the changes you’re stuck. We’ve been able to do physical to virtual migrations, performance tuning, VMware Tools upgrades, vSphere upgrades, and a whole slew of other things in relatively short timeframes because we have this all worked out already. This also lets us “right-size” our VMs: rather than deploying huge VMs just in case they need the CPU or RAM, we deploy smaller ones and then can take an outage to add CPUs and RAM if we need to. The maintenance windows for a server are negotiated between the application/service admins and the system administrators when a machine is put into production. We track them in our CMDB, and any member of the whole team supporting the service can take the maintenance window, as long as they follow some rules about notifications for the change (timeframes, etc.).
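To make the idea of a tracked window concrete, here is a minimal Python sketch of how a weekly window might be checked before taking a server down. The `MaintenanceWindow` record, its fields, and the Tuesday-morning example are all hypothetical; they are not taken from our actual CMDB.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class MaintenanceWindow:
    """Hypothetical CMDB record: one weekly window for one server."""
    weekday: int   # 0 = Monday ... 6 = Sunday, as in datetime.weekday()
    start: time    # local start time of the window
    end: time      # local end time (same day, later than start)

    def contains(self, moment: datetime) -> bool:
        """Return True if `moment` falls inside this weekly window."""
        return (moment.weekday() == self.weekday
                and self.start <= moment.time() < self.end)


# Made-up example: a window on Tuesdays from 06:00 to 08:00.
window = MaintenanceWindow(weekday=1, start=time(6, 0), end=time(8, 0))
```

Calling `window.contains(datetime.now())` before starting work is the whole check. The notification rules would sit on top of something like this.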

2. Use of load-balancing technologies.

We use application load balancers (layer 4 of the OSI model) to decouple services from individual servers. Not only does this allow us to take a host down without affecting a service, but it also lets us spread the load out more among the physical hosts we have in our virtual infrastructure. In a lot of cases having more, smaller VMs results in better workload scheduling by ESX and DRS, especially on smaller ESX hosts.

Of course, this also plays nicely into the other points, because it’s very liberating to be able to do what we call “rolling maintenance” on a service, just taking one machine down at a time so that customers are not impacted. It also means that system administrator quality of life goes up, because now we can do maintenance tasks during the day instead of on weekends and off-hours. Doing maintenance during business hours has a few benefits. First, it means that the maintenance will actually get done. If you try to use someone’s personal time to do work they tend to opt out of that work. Servers go unpatched, tuning doesn’t happen, lots of things that should get done don’t because people will choose their personal time over work. Second, it means that if something goes wrong there are others around to help out. Doing work at 5 AM on a Sunday is fun, but if things go sideways you have to wake someone up or try fixing it yourself. Doing work during the day means you have the rest of the team around to lend a hand.

Third, it gives you a way to make incremental changes and then watch the effects. This has been particularly awesome for performance tuning of applications and our virtual environments themselves. Testing tuning changes is often hard, because test suites and test load generators are synthetic and often don’t compare to real load. But because the load is spread out we can make a change to one VM, or one ESX host servicing one VM, and keep an eye on it. I’m not advocating being a complete cowboy — you still have to do testing — but the risks to your production environment are a lot lower if you can catch problems on one VM first.
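The “rolling maintenance” pattern described above can be sketched roughly as follows. This is only an illustration in Python: the `in_service` list stands in for a real load balancer pool, and `do_maintenance` and `is_healthy` are hypothetical callbacks, not any vendor’s API.

```python
from typing import Callable, List


def rolling_maintenance(pool: List[str],
                        do_maintenance: Callable[[str], None],
                        is_healthy: Callable[[str], bool]) -> List[str]:
    """Take one member out of the pool at a time, work on it, and put it back.

    Stops at the first member that fails its health check, so a bad change
    is caught on one machine instead of all of them.
    Returns the members that were maintained and returned to service.
    """
    in_service = list(pool)      # members currently receiving traffic
    done = []
    for member in pool:
        if len(in_service) <= 1:
            raise RuntimeError("refusing to drain the last in-service member")
        in_service.remove(member)        # drain: stop sending it traffic
        do_maintenance(member)           # patch, tune, reboot, ...
        if not is_healthy(member):
            break                        # leave it out and go investigate
        in_service.append(member)        # back into rotation
        done.append(member)
    return done
```

The early `break` is the point: the change only continues to the next machine once the previous one has been checked, which is what keeps the risk to production low.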

There are usually some other benefits to load balancers, too, that make them virtualization-friendly. Many will offload SSL processing, so your VMs have less work to do. Others have features, like iRules in F5’s products, that let you rewrite network traffic on the fly, which has some really neat implications for security, monitoring, and service delivery. And if you don’t want to buy a piece of hardware you can often get a virtual appliance from these vendors, though the physical appliances are usually a lot faster.

3. Commitment to operating system and application patching.

It is a fundamental belief of mine that one of the best ways to stay secure is to keep up on your patching. My organization agrees, and by using load balancers and defining maintenance windows we’ve made it easy for ourselves to keep our hosts up to date with regular patching cycles. Because we can take servers down without taking services down, and because sysadmins know exactly when a server can come down, we can schedule maintenance cycles easily, whether it’s six months out or two weeks. We can also respond very rapidly to emergency situations, like recent remote execution vulnerabilities in Microsoft Windows, by rolling patches out to development & test hosts, then QA & production, over the course of just two days if needed.

Keeping up to date with patches not only keeps you secure, it also lets you take advantage of new features that are added to operating systems. For example, Red Hat keeps adding new virtualization-friendly features, like kernel interrupt clock dividers. Because the divider is a kernel parameter, you can’t just change it on the fly. And if you have to reboot, but can’t get a time to do it, you won’t do it. For us, we just rolled the change into one of our patching cycles and reduced the load on our infrastructure dramatically. That means more VMs per physical host, and a quantifiable amount of savings from just a small change on each machine.
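To see why a clock divider matters at scale, here is a back-of-the-envelope Python sketch. The numbers are assumptions for illustration only: a guest kernel that ticks at 1000 Hz, a divider of 10, and 40 vCPUs on one host. It also assumes the divider simply divides the tick rate; none of these figures come from our environment.

```python
def timer_interrupts_per_second(vcpus: int, tick_hz: int = 1000, divider: int = 1) -> int:
    """Timer interrupts per second that must be delivered across all vCPUs.

    tick_hz and divider are illustrative assumptions, not measured values.
    """
    return vcpus * (tick_hz // divider)


# Hypothetical host running guests with 40 vCPUs in total.
before = timer_interrupts_per_second(vcpus=40)              # no divider
after = timer_interrupts_per_second(vcpus=40, divider=10)   # divider of 10
```

Under those assumptions the host goes from 40,000 timer interrupts per second to 4,000, work it was otherwise doing even when the guests were idle.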

Furthermore, our commitment to patching also extends to the virtual infrastructure itself, and we have a rule that we will not implement anything that breaks vMotion or Storage vMotion. Why? Because then it becomes very difficult to cope with ESX updates, or hardware failures, or any situation where vMotion could be used to prevent an outage. Sure, this means that we still need physical hardware for some applications, but it’s still just a fraction of the hardware we were buying years ago. This also makes virtual infrastructure easy to upgrade when the time comes, for new versions of vSphere, new storage arrays, and new physical hosts. Instead of planning outages on hundreds of VMs we just vMotion them, and nobody is the wiser.

Disclosure: F5 is a sponsor of Gestalt IT Tech Field Day, of which I have been a participant. I am not a customer of F5 at this time, though.

Youth »

“Hey, do any of you guys have an old, full-height hard disk lying around?”

This was a relatively new person from another group in our organization. People occasionally come looking for random old equipment to use for training & examples, because they know we have things like original IBM PCs, Cisco AGS+ routers, token ring MAUs, and 1200 baud Multitech Multimodems on hand.

“Sure, I’ve got a full height drive, one second.” I produce a full-height 600 MB Imprimis SCSI disk. Made in the USA, so it’s pretty old. It’s a bookend on my bookshelf.

600 MB Imprimis SCSI drive, full height

“What in the heck is that?” he asks.

“Um, a full-height drive?” I reply, really wondering what he thinks he’s asking for.

“No, man, I don’t know what that’s out of but it’s wicked. Full-height is like a couple inches tall, though.”

“And 3.5″ form factor, right?”

“Yeah.”

“Dude, that’s half height.”

“Nah, that’s full height, at least that’s what I’ve been told. So what is that?” he asks, as he points to my impressive specimen of early 1990s drive technology.

“What you were looking for is half height. This is a full height 600 MB SCSI fixed disk. Final answer.” I hope he didn’t learn full height vs. half height from someone he paid.

“I’ve never seen one of those before. Can I borrow it? The other guys will flip out when they see this thing.”

I wonder what they’d think if they saw 8″ floppy disks. Freakin’ kids.

Get Away to VMworld Contest »

If you were thinking about going to VMworld but can’t make it due to finances, Gestalt IT is running a contest where the winner gets airfare, hotel, and a conference pass. This is a very compelling contest, and to win:

Entrants must explain how they plan to “pay it forward” if they get to go to VMworld. Will you start a blog? Write some tutorials? Contribute to a forum or online community? Present to your local VMUG? Get creative and spread the wealth of knowledge you get from the event!

Our panel of judges is made up of none other than the most-excellent roster of past Tech Field Day delegates! They’ve proven themselves to be independent-minded and knowledgeable, and we’re sure that they will pick the best entry!

I’ll be one of the judges — I’m looking forward to seeing some really great proposals! Go over to Gestalt IT for all the details.

Oregon Trail: The Movie »

One would think this is off-topic, but for techies it really isn’t. I present to you the official trailer for The Oregon Trail. Very well done!

Makes me proud to have grown up in Minnesota (MECC == “Minnesota Educational Computing Consortium”). If you want to play it, Classic Gaming has it and a copy of AppleWin.

Remembering Swap »

VMware vSphere users should always remember that when you allocate a VM, the amount of space it will consume on disk includes a swap file equal to the size of the VM’s allocated RAM (less any memory reservation you have set).

So if you have running VMs with 96 GB of RAM allocated between them, you will use 96 GB of your disk as swap, even if those VMs are only actively using 2 GB of RAM.

Yet another argument against overcommitment, in my opinion. If you right-size your VMs you not only save RAM, but storage as well.
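The arithmetic is simple enough to sketch in a few lines of Python. The VM sizes below are made up to add up to the 96 GB in the example. The `reservation_gb` parameter reflects that vSphere sizes the swap file as allocated memory minus any memory reservation; with no reservation it matches the rule above.

```python
def swap_file_gb(allocated_ram_gb: float, reservation_gb: float = 0.0) -> float:
    """Size of a VM's swap file: allocated RAM minus any memory reservation."""
    return max(allocated_ram_gb - reservation_gb, 0.0)


# Hypothetical set of running VMs, 96 GB of allocated RAM in total.
oversized = [8, 8, 16, 32, 32]
# The same VMs right-sized to half the RAM (also made up).
right_sized = [4, 4, 8, 16, 16]

swap_before = sum(swap_file_gb(ram) for ram in oversized)
swap_after = sum(swap_file_gb(ram) for ram in right_sized)
```

In this made-up case, right-sizing frees 48 GB of datastore space on top of the RAM savings.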