Consistency Is the Hobgoblin of Little Minds

Ever heard someone tell you “consistency is the hobgoblin of little minds?”

They’re misquoting Ralph Waldo Emerson by leaving out an important word: foolish.

That’s like leaving out the word “not” in a statement. The whole meaning changes because of the omission. We can all agree that “I am on fire” and “I am not on fire” are two very different statements. The same is true here. Let’s examine the actual quote:

A foolish consistency is the hobgoblin of little minds, adored by little statesmen and philosophers and divines. With consistency a great soul has simply nothing to do. He may as well concern himself with his shadow on the wall.

As with most things, context matters, which is what makes this quote inappropriate almost everywhere I see it used. In the context of IT, with consistency, a great soul can trade meaningless & soul-crushing work for important & strategic tasks, moving their organization forward rather than struggling just to keep up. In IT, consistency is generally a good thing, and when it is delivered via standards and automation it forms a stable & solid & predictable foundation on which we can build towering pinnacles of applications and services. Stability and predictability are important things to app developers, end users, and those of us that want to take a vacation from time to time.

However, there are foolish consistencies. When your automation becomes handcuffs and not an enabler it’s foolish. When your standards are held so tightly that you cannot enable new business ventures because of them it’s foolish. When efficiencies aren’t taken advantage of, new technology and methods eschewed, and/or positive changes avoided with the excuse of “standards” or “that isn’t the way we do things” it’s foolish.

Neither standards nor our tools are an end unto themselves. They exist to enable greater things, and when that stops being true we need to change them so they are helpful to us again. After all, having two standards is still better than having 1500 one-offs.

 

Advice On Downgrading Adobe Flash

VMware has a KB article out (linked below) about the Adobe Flash crashes that happen if you’re running the latest version of Flash (27.0.0.170). A lot of us were caught off guard recently when our PCs updated themselves and we couldn’t get into our VMware vSphere environments.

The VMware KB article suggests downgrading your Flash client. Left by itself this is completely irresponsible advice.

1. The Adobe Flash update addresses a critical security vulnerability that is being exploited in the wild. The security advisory (linked below) states:

Adobe has released a security update for Adobe Flash Player for Windows, Macintosh, Linux and Chrome OS. This update addresses a critical type confusion vulnerability that could lead to code execution.

Adobe is aware of a report that an exploit for CVE-2017-11292 exists in the wild, and is being used in limited, targeted attacks against users running Windows.

(as an aside, Adobe acknowledges Kaspersky Labs staff, which makes me think that they’re making good on their promises to figure out how Russian hackers used their software to exfiltrate NSA data).

2. If you downgrade your Flash installations you will need to disable the auto-updaters, which is what got us all into these situations. I don’t know about you but I always forget to re-enable the updaters, and that’s bad.

3. There are workarounds. The HTML5 client, though incomplete, gets many people back in business. Microsoft Edge and Internet Explorer seem to work with Flash on Windows 10 1703, too, at least for all my team’s environments.

So what’s my advice?

  • Limp along with Microsoft Edge and the HTML5 client until VMware updates their clients. I think it’s safe to assume they’re working on it. Start making plans to patch your vCenter in the next few weeks.
  • If you don’t have the HTML5 client you can get it as a VMware Fling (link below).
  • If you absolutely have to downgrade Flash don’t run the vulnerable Flash on a PC you use for anything else. It’s annoying but you can survive a few weeks of this, provided you’re running a supported version of vSphere.
    1. Use network- & host-based firewalls to prevent all traffic that isn’t destined for your vSphere implementations. You’ll probably need to allow DNS, as well, but I’d keep it really locked down. I would even think twice about joining it to Active Directory.
    2. You should already be running antivirus and antimalware on your systems but it’s especially important for systems that are intentionally out of date.
    3. Use a virtual machine running in VMware Workstation for the insecure client. Make it non-persistent and use it for nothing else. Or a Windows Server installation with Terminal Services enabled.
    4. Put a calendar reminder in for your team to clean this whole thing up in a month.
    5. If you have dedicated IT security personnel (CISOs and such) reach out to them proactively. Make a business case around this — you need to do this to be able to support the environments, but you’re being responsible about the risk.
  • If you’re running an unsupported version of vSphere you need to upgrade ASAP. This is a great business driver for it. Never let a crisis go to waste! vSphere 5.5 goes end-of-support on 09/19/2018 so I’d even consider using this as a driver to get to 6.5…

Good luck & stay safe.

———————-

Stop Chrome Autoplay

If you didn’t catch this on Twitter:

In short, go to chrome://flags/#autoplay-policy and set it to “Document user activation required.”

It’s funny how simple things can be so virally popular.

While Chrome can sync settings between browsers where I am logged in, I have got to figure out if there’s an API to set Chrome configuration options automatically…

Software is Always Broken

I’m sitting here watching my iPhone update to iOS 11.0.1. Apple says that there are just a couple of fixes: some security updates and a fix for the Exchange email problems.

The update is sure taking a while, though. That’s consistent with my knowledge of how software development works. Color me skeptical that the first point release of a new iOS only has a couple of changes. My bet is that there are hundreds of fixes for all sorts of problems that were reported during the beta but weren’t large enough to stop the release.

Development of software like Apple’s iOS or VMware’s vCenter never stops. At a certain point someone takes a snapshot of the way it looks and decides that it acts correctly enough to ship. Out the door it goes, called 11.0.0, or 6.5.0, or whatever.

Behind the scenes development never stops, though. People keep reporting bugs and the development groups keep fixing them. Most of the problems they fix will never be known to the end users. If you had a bug open with Apple or VMware you might find out that it’s fixed in a particular release, but without that particular inside knowledge you’d never have known.

What you do see is build numbers and version numbers, though, assigned to the snapshots that became releases. Read the notes for a point release of VMware vCenter and there will be a handful of problems resolved and one or two new features added. The build number of vCenter will jump by 387,000, though. Nearly 400K builds for a few changes? Yeah, right. In this particular case (vCenter 6.5.0 build 5318154 to 5705665) that’s around 6500 builds a day. Something else is clearly going on.
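The arithmetic behind that estimate can be sketched in a few lines of Python. The two build numbers are the ones quoted above; the 60-day gap between the releases is an assumption, backed out of the "around 6500 a day" figure rather than taken from actual release dates.

```python
# Build numbers quoted above for the two vCenter 6.5.0 releases.
OLD_BUILD = 5318154
NEW_BUILD = 5705665

# Assumption: roughly 60 days between the two releases. This is not from
# the release notes; it is the gap implied by ~6500 builds a day.
ASSUMED_DAYS_BETWEEN_RELEASES = 60


def build_jump(old: int, new: int) -> int:
    """Return how far the build counter advanced between two releases."""
    return new - old


def builds_per_day(old: int, new: int, days: int) -> float:
    """Average build-counter increments per day over the given interval."""
    return build_jump(old, new) / days


if __name__ == "__main__":
    jump = build_jump(OLD_BUILD, NEW_BUILD)
    rate = builds_per_day(OLD_BUILD, NEW_BUILD, ASSUMED_DAYS_BETWEEN_RELEASES)
    print(f"Build counter advanced by {jump:,}")
    print(f"Roughly {rate:,.0f} builds per day over "
          f"{ASSUMED_DAYS_BETWEEN_RELEASES} days")
```

The jump works out to 387,511 builds. Plug in a different number of days if you know the real release dates, and the per-day rate still lands in the thousands.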

To me, these are the ripples on the surface that betray the chaos underneath. It’s nice to think of products like iOS or vSphere as a glassy lake our IT boats can glide across, but it is more productive for us to remember that under the surface are the software equivalents of fallen trees, sandbars, litter, and dead fish. The people with the nets pulling the dead fish out sometimes fall in, too. Sometimes we get new software bugs in the fixes for other problems.

That’s why I always laugh at the people who say “if it ain’t broke don’t fix it.” Truth is that it’s all broken and we’ll never know just how bad the situation really is. The best thing we can do is move forward. Keep current on our patching, stay responsibly close to the current major version. Build and use test environments. Grow into new initiatives and deployments. Test our backups and DR.

And wear personal flotation devices for when our boat hits a rock.

 

 

Calibrate Your Monitor

When I build a new computer one of the things I do as part of the setup is calibrate the color of the monitors. It’s actually pretty amazing how much better things look after just a few minutes of adjustments. It’s also nice to have the monitors synchronized, so if I move a window between them it doesn’t change color.

I use Microsoft Windows 10 (1703 as of this writing) on all my desktops now, and here’s my process. Apple appears to have a similar calibration tool built in to MacOS, and all my Linux boxes are headless, so you are on your own. Sorry.

1. Reset the monitor(s) to factory settings. On my Dell LCD panels there’s a “Reset Display Settings” and a “Factory Reset.” Go for Factory Reset.

2. Set the monitor energy mode to the one that doesn’t have automatic dimming. On the Dell panels it’s called “Standard.” Automatic dimming is nice on a laptop, tablet, or phone but not on the desktop. If you’re worried about power consumption use the system power settings to put the display to sleep when it’s idle.

3. In Device Manager, make sure that your monitor has the best driver. Right-click the monitor, choose “Update Driver” and tell it to search automatically. Alternately you can download the driver from the manufacturer, but manufacturers don’t really focus on keeping monitor drivers updated. For example, Dell has a Windows Vista-era driver for the monitors I’m looking at now, and it comes with an ICC color correction profile. Your mileage may vary.

[Screenshot: Windows Device Manager]

4. Do the same for your video cards. I generally prefer the MS drivers because they’re automatically updated with the OS, and don’t need the heaps of extra spyware crap NVIDIA and AMD install on your PC. I also don’t game a lot, either, so YMMV. Anyhow, a driver change tends to screw up the calibrations, so now’s the right time for a change if you’re going to make one.

5. Disable Night Light (Settings -> System -> Display -> Color -> Night light). A huge shout out to the f.lux folks that started this all.

6. Go to “Display adapter properties” at the bottom of the Display settings. Click the Color Management tab, and click the “Color Management” button.

[Screenshot: Display Adapter Properties]

7. Click the “Advanced” tab, then the “Calibrate Display” button about 3/4ths of the way down. This is where you will get to actually calibrate your monitor. If you installed a monitor driver that came with an ICC profile you might want to try setting it in the “Device Profile” dropdown, at least to see if it helps.

[Screenshot: Color Management, Advanced tab]

8. Follow the instructions precisely, and repeat for every monitor you have. It might offer to start the ClearType tuner. It’s a good idea.

9. When you’re done turn Night Light back on, or install f.lux. In either case (and on my tablets and phones) I adjust the color temperature back up a little from the deeper reddish hues it sets by default.

10. If your monitor supports a “Menu Lock,” set it. This will reduce wear & tear on your tasers and stun guns, which I consider an appropriate response for anyone who tampers with one’s monitor settings, or touches someone’s screens. You might want to look up how to unlock it first (usually just holding the menu button for 15 seconds or so).

As always, good luck.

Let’s Prosecute Unlicensed Engineering in IT

Have you been watching this whole dustup with the Equifax CISO, and how people are saying that she is unqualified because, instead of a Computer Science degree, she had an MFA in music composition? Not surprisingly, there’s a massive backlash from the IT community, much of which doesn’t have a computer science degree, either. That’s part of the appeal of technology for many — on the Internet nobody knows you’re a dog. I’m a mutt, too. I’ve always found computer science programs intentionally inaccessible, with the faculty actively eschewing any form of practical curricula because they’re not a technical college. Snobbish? Yeah. Not my style.

What I find very interesting in all of this is the ignorance of some of the folks throwing rocks in this debate (surprising, I know). I’m particularly interested in a word that many have in their titles: engineer.

I live in the United States, in the state of Wisconsin, and as with many other states we actually have law that sets a legal definition of engineering:

WI 443.01 (6) “Practice of professional engineering” includes any professional service requiring the application of engineering principles and data, in which the public welfare or the safeguarding of life, health or property is concerned and involved, such as consultation, investigation, evaluation, planning, design, or responsible supervision of construction, alteration, or operation, in connection with any public or private utilities, structures, projects, bridges, plants and buildings, machines, equipment, processes and works. A person offers to practice professional engineering if the person by verbal claim, sign, advertisement, letterhead, card or in any other way represents himself or herself to be a professional engineer; or who through the use of some other title implies that he or she is a professional engineer; or who holds himself or herself out as able to practice professional engineering.

Does the work of an information technology professional affect the public welfare? I think the Equifax case is example enough: yes. Do IT professionals safeguard life, health, or property? Absolutely. In fact, information technology touches most of what is defined in this statute.

It gets more interesting in 443.02 (02):

WI 443.02 (02) No person may practice architecture, landscape architecture, or professional engineering in this state unless the person has been duly registered.

443.04 and 443.09 cover how you become registered. In short, you need an engineering degree from a school accredited in such topics, and you need to pass a written or written & oral examination.

So what happens if you don’t do this, and call yourself a professional engineer anyhow?

443.18 Penalties; law enforcement.
(1) Unauthorized practice; penalty.
(a) Any person who practices or offers to practice architecture, landscape architecture, or professional engineering in this state, or who uses the term “architect,” “landscape architect,” or “professional engineer” as part of the person’s business name or title… may be fined not less than $100 nor more than $500 or imprisoned for not more than 3 months or both.

A man in Oregon was recently fined $500 for doing just this: claiming to be an engineer in a public forum despite not being licensed to practice in Oregon. Specious? In that case, maybe. But if we’re going to throw rocks over CS degrees let’s get serious and enforce the actual laws around engineering, too. Especially if you are designing, building, and/or operating systems that affect others. If one type of engineer isn’t allowed to build an unsafe bridge, skyscraper, or canal, has a mandatory code of ethics, and has to go to hell and back to be licensed why do others with none of this training or effort get to use the title while they build & operate unsafe and unethical systems?

Frankly, it’d be easy to find offending folks. Start by dredging job boards and looking for postings that have names like “Site Reliability Engineer” or “Network Engineer” or “Systems Engineer” or “Software Engineer.” I bet even Equifax has some “engineers.” From there, it’s a subpoena or two and a $500-a-pop form letter. Maybe even some injunctions against organizations, too, for their use and tolerance of unlicensed engineers.

A small price to pay to ensure unqualified people aren’t endangering the public welfare. Right?

Help Me Name a Team, Win $50

I know all you nerds have a few minutes to help me crowd source the solution to a problem. I’m also willing to offer $50 in the form of an Amazon gift card to someone with the right solution.

We want to rename our group at work. We are currently “Systems Engineering” (abbreviated “SE”). There are a number of decent reasons why we want a different name, and I’m not going to get into that. Suffice it to say that the change is definitely going to happen in the next week or so.

My group designs, builds, and operates compute & storage infrastructure, for all sizes of customer, and all manner of security level (basic all the way to NIST 800-53 “High”). We provide traditional system administration services. We do VMware, Linux, Windows, Solaris, AIX, Mac OS, everything. We also install & maintain applications for customers, and specialize in automation technologies (Puppet, etc.). We often serve as the people who bridge gaps between other groups, and between vendors and the organization, and are often the glue that holds a solution together.

We don’t do anything but the basics of databases or networking; other groups hold the deep expertise there. Nor do we manage facilities. There is also a separate CISO-led security group so they provide a lot of the policy that drives our designs and operations. And other, large, enterprise-y applications (ERPs, email & calendaring, identity management, etc.) have their own support teams.

Another group formed where I work and calls themselves “CAOS” (pronounced as chaos) and we’re a little jealous of that. We’d like a clever name, too, especially one that sounds like supervillains. So far the best name put forth is simply just “Systems & Application Engineering” or “SAE.” It isn’t bad but I think there’s better out there. At least with SAE we can make viscosity jokes about slow projects for the next decade.

My question to you is: can you come up with a clever, HR-safe acronym & name for this team that reflects what we do?

Put your suggestions in the comments. Quality is good but volume works, too — others might riff on your ideas.

If we pick your suggestion I’ll personally send you a $50 gift card to Amazon (this is me doing this, not my employer). If you aren’t in the USA I’ll see if I can do it in the currency of your choice. If a suggestion is made twice the first one wins it. If no suggestions are chosen I will put all the suggestions in a pool and draw one randomly for the prize, as a thank you for participating. Please make sure your email is correct in the comments!

Ready… go!

Fix WinRM Client Issues

My team manages a lot of Dell hardware. Over the years we’ve run into situations where we have to replace the system board on a host. The system board’s management interface, iDRAC, has a license key on it, and when you replace the system board it’s helpful if you can export the license key ahead of time. That way you can reimport it again easily without getting your sales team involved to reissue a key.

Unfortunately sometimes that’s not possible, such as when the iDRAC management interface is what died (my case today). Turns out that Dell has the “Dell EMC License Manager” (get it from support.dell.com under the Systems Management downloads for your hardware), which you can use to proactively take a copy of your licenses. Seems like a good idea, except I ran into arcane WinRM client issues due to security settings others had applied, and the Internet wasn’t very helpful. Maybe this will help others.

You get an error “The WinRM client cannot process the request. Basic authentication is currently disabled in the client configuration. Change the client configuration and try the request again.”

Open a PowerShell prompt as Administrator and run:

winrm set winrm/config/client/auth '@{Basic="true"}'
winrm set winrm/config/client '@{AllowUnencrypted="true"}'

That’ll either work or…

You get an error “The config setting Basic cannot be changed because is controlled by policies. The policy would need to be set to ‘Not Configured’ in order to change the config setting.”

Edit your Group Policy (run gpedit.msc as an Administrator). Local Computer Policy, then Computer Configuration, then Administrative Templates, then Windows Components, then Windows Remote Management (WinRM), then WinRM Client.

Check to make sure “Allow Basic authentication” and “Allow unencrypted traffic” are set to “Not Configured.”

Repeat with the WinRM Service GPO if you’re having issues with incoming connections (see below).

Run “gpupdate /force” from a command or PowerShell prompt once you’re done editing.

It is also possible that the GPO settings are coming from an Active Directory. Fixing that is left as an exercise for the reader.

If you’re trying to configure incoming WinRM I found this helpful post and suggestions which led me to my fixes above:

winrm quickconfig -q
winrm set winrm/config/winrs '@{MaxMemoryPerShellMB="512"}'
winrm set winrm/config '@{MaxTimeoutms="1800000"}'
winrm set winrm/config/service '@{AllowUnencrypted="true"}'
winrm set winrm/config/service/auth '@{Basic="true"}'
Start-Service WinRM
Set-Service WinRM -StartupType Automatic

I can’t vouch for the security of this, and I’d definitely wrap it in a firewall on the host and the network, but I mention it in case you find yourself here in a search. The GPO solution works for the service, too.

Good luck.

The Dangers of Experts Writing Documentation: A Real Life Example

There are some real, tangible dangers to having experts write documentation. Experts have the perfect tools, skip steps, know where things are based on experience, use jargon, have spare parts so mistakes aren’t a big deal, and as a result make terrible time & work estimates. This leads to confused, and subsequently angry, people, which is probably not what you wanted.

I was thinking about all this as I entered my fourth hour of installing a trailer wiring harness on my Mazda CX-9 today. It’s a unit from Curt Manufacturing, kit #56016. When my CX-9 was in the shop for an alignment a few weeks back I had them put a hitch on it. They got squirrelly & weird when I mentioned installing the wiring harness, though, and I decided that I could just do it myself some afternoon.

The documentation in the harness kit is horrible, the written equivalent of “just wire it up, duh.” Luckily they offer a YouTube video. Since I don’t do a lot of trailer wiring it seemed prudent to take five minutes and watch it.

One hour? That’s cool. I’m also wondering what “proper safety equipment and precautions” are. I guessed that their lawyers made them keep it vague so the onus was on me to figure it out, and absolve them of any liability. Whatever, I’ll figure it out.

I notice their tools have air hose couplers on them, and aren’t quite what people would have in their garage. These also aren’t the tools that were listed in the installation documentation. I pause the video and spend 10 minutes digging out my cordless Dremel, my right-angle drill (yeah, I actually have one) & my bit index, and 5 more minutes trying to find my 10mm socket which is the social butterfly of my socket set. Total time elapsed so far: about 20 minutes.

I also notice that the car in the video is on a lift, but they omitted a lift from this list of tools. This becomes significant later.

Begin by opening the back hatch. So far, so good.

Remove the floor coverings, storage trays, and rear scuff panel. The woman in the video pops all that stuff out in 5 seconds on the video. Things she didn’t cover include the mystery of the fasteners keeping the storage trays attached to the chassis, five additional panels she didn’t remove but are in the way for me, and the subwoofer back there with its 400 bolts. Back to the web, and luckily eTrailer.com has a 10 minute video on this same topic and doesn’t skip these things. If you ever write documentation these are the sort of details that really matter. Total elapsed time: 50 minutes, and that’s with the help of my Milwaukee impact driver, which also wasn’t on the tool list.

Disconnect the negative battery terminal. Done. 55 minutes.

Remove the rear cargo loops & Phillips screws. Luckily the other video I found had advice on this, because the woman in the Curt video took 10 seconds to pull these things out of their CX-9. I’m up to 65 minutes.

Disconnect the taillights and insert the wiring harness connectors. This seems straightforward but in the video the rear vehicle trim stays popped out so the woman can work on it. I conclude she must have a prehensile tail.  Us normal humans use a scrap 2×4 cut to hold the panel open. Elapsed time: 85 minutes. Note that “miter saw” and “2×4” are not on the list of tools.

Route the green wire over to the other side and hook it up. The other video had good advice on this, too, as there were nuances completely skipped by the Curt video, and by the time you’d realize that you’d have everything all closed up already. 95 minutes.

Grind off some paint, drill a pilot hole, and use the included screw to connect the ground wire to the chassis. “Be mindful of what you drill into and what is behind it.” No kidding. I love my Dremel, by the way. If you get a Dremel get a cordless one with a lithium-ion battery (so it holds a charge over time), the engraving handle accessory (I never take it off), a combo pack of bits, and extra cut-off wheels. You will be unstoppable. 100 minutes.

Find a suitable mounting location for the converter box, and use the tape to attach it to the chassis. Unfortunately for me my CX-9 has a whole gob of mechanical stuff right where she stuck her converter in the video, so I have to figure that one out. In the process I also notice that the green wire is looped outside something where it shouldn’t be, so I have to go back to the other side, unwire it, and rerun the wire. 120 minutes.

Strip the power wire and use the included butt splice connectors to crimp the fuse holder on. No sweat. 125 minutes.

Remove the positive accessory nut on the battery cable. Connect the fuse holder’s eyelet to it & refasten the nut. Done. 130 minutes.

Route the black power wire down past the engine block and towards the back of the car, keeping away from moving parts and excessive heat sources. Um, excuse me?

I mentioned this before, but perhaps you notice something peculiar about the woman in this image:

Yeah, she’s standing underneath the damn car. As I am neither 6 inches tall nor in possession of a car lift I go and get my set of vehicle ramps and get the Mazda up on it. I don’t consider myself an idiot but I can only guess what gets hot or not under there. She ties her power wire to some HVAC connections, but I surmise that one of them probably gets hot at times.

The other video runs the cable from the trunk side first, and in looking at things that seems like a better plan, so I take it all apart, cut the splice connectors off, and run it through a rubber grommet in the trunk. There was some black silicone sealant in the kit which is never mentioned anywhere, so I use that to glue the grommet back down so the cable doesn’t move & rub & short out, and to coat the ground wire connection I made earlier so the chassis doesn’t rust there.

I end up taking that black tray you see behind her hands off the car and running the cord through there. The kit includes the world’s worst zip ties, especially when you’re upside down under a car, and only about half as many as I ended up using, because I absolutely don’t want this wire snagging on something, nor being an obstacle in the future when some mechanic is hacking away at the car. At this point I’m wondering why I don’t just plug this into the accessory outlet in the back, but trailer wiring is often sketchy, especially for boat trailers, and I’d rather not blow my accessory fuses.

In the video it takes her 11 seconds to do this. ELEVEN. SECONDS. Total elapsed time for me, out here in this hell I call reality: 210 minutes. That’s 3.5 hours, and they said it’d take me 1.

Locate the rubber grommet that gains access to the trunk. Punch a hole in the grommet and run the wire through. Yeah, I did this already.

Strip the power wire and use a splice connector to connect it to the converter box. You have got to be kidding me — I don’t have any more butt splice connectors. I ponder riding my bicycle to the hardware store but I carefully secure everything, reconnect the battery, and drive over there. I look like I’ve rolled around under a car all day, which amuses the staff. Local hardware stores rule, by the way. Try asking a Home Depot employee where the butt splice connectors are and you’ll get a blank look and a “this isn’t my department” comment (to which I always retort “so why are you standing here, then?”). I was in & out of my local True Value in 2 minutes. Anyhow, 230 minutes, and I now own 47 more connectors than I need.

Route the “four flat” under the trim. Replace any previously removed vehicle parts. Put the fuse in. Test it using an electrical tester or a properly wired trailer. I love how they use jargon here, “four flat.” A few more seconds and their demo CX-9 is all back together. It takes me more than that; I mess around and find a better way to route the connector through the spare tire area so that it can be stored out of the way. They also don’t mention that getting the interior of the vehicle reinstalled means 15 minutes of fighting to get it underneath the trunk door gasket again, which is hard because it’s squirrelly. I end up using some levers designed for changing bicycle tires.

Total elapsed time: 270 minutes. 4.5 hours.

It’s one thing to let experts write documentation, but it absolutely needs to be tested by novices. What would have helped here?

  • A video that isn’t abridged. Show the whole process, even if it’s long. Do not skip any steps.
  • Better paper documentation and a complete list of required tools.
  • Acknowledgement that at some point you are going to need to be under the vehicle.
  • Documentation for all the parts in the kit. Nowhere did anything mention the silicone sealant, so at the end you’d be left wondering if you screwed up.
  • Spare parts in the kit. A couple more splice connectors and more higher-quality zip ties would have helped immensely.
  • Better time estimates. Frankly it would have been better to omit the estimate altogether than seriously understate the time commitment like they did.

Better product design would have removed the need for a lot of this, too. As it turns out the CX-9 comes pre-wired for trailer wiring, and a product that plugs directly into the harness in the back would have saved immense amounts of time under the car. In the future I know that’s the route I’ll go, choosing the more expensive OEM part that is way easier & faster to install. Opportunity cost is a real thing.

Intel’s Memory Drive Implementation for Optane Guarantees its Doom

A few weeks ago Intel started releasing their Optane product, a commercialization of the 3D XPoint (Crosspoint) technology they’ve been talking about for a few years. Predictably, there has been a lot of commentary in all directions. Did you know it’s game changing, or that it’s a solution looking for a problem? It’s storage. It isn’t storage. It’s RAM. It isn’t RAM. It’s too slow to be RAM. It’s too small for storage. It’s useful now. Nobody will use it for years.

Yup. Confusion. It’s because Optane is a bunch of different things. It’s consumer and enterprise, and it’s both storage and memory.

There are plenty of articles out there on the technology itself. There’s a small M.2 version for desktops that acts as a cache, which is thoroughly uninteresting to me. I’d rather have a real SSD in one of my precious M.2 slots than a cache that I overrun with three photos from my Nikon SLR. Not to mention I need a 7th generation Intel Core CPU (Kaby Lake) to do this at all.

The real action is with the data center version, the P4800X. The first version is a 375 GB PCIe NVMe card. 375 GB isn’t very much space, but Intel says they’ll have 750 GB and 1.5 TB models out this year. The technology is a lot faster than the NAND flash typically found in SSDs, and the endurance is a lot higher, too (writes to SSDs use voltages that stress, and eventually destroy, the cells in the SSD). Intel says this thing can do 500,000 write IOPS, which makes it a hell of a write cache for something like VMware vSAN, even if it is a bit small. As a storage device, though, Optane is interesting but really just an evolution of NVMe flash technology.

Memory Drive Technology

What’s really interesting to me is the “Memory Drive” component, which seems intent on blurring the lines between memory and storage. You can use the P4800X to create a pool of something that looks to an OS like memory. It is an order of magnitude slower than regular DRAM, but several orders of magnitude faster than an SSD. Given that you could theoretically put 24 TB of Optane in a two-socket server, for a lot less money than 24 TB of DRAM, there are some pretty interesting implications. Think about being able to hold a whole enterprise database in memory. The best I/O is one you don’t do, and having all that data close by means a lot less read traffic on your storage, not to mention it being a lot faster.

There aren’t a lot of details about Memory Drive, though. The product brief says it’s Linux only, and that it’s a software layer of some sort. Recently, though, I found a piece over at AnandTech which actually had details around this (link below, kudos to the author, Mr. Tallis, for digging into this). That post indicates it’s a paid add-on, and something like a hypervisor that boots from a USB device or an IDE controller before the OS loads.

Amateur Hour at Intel

USB or IDE? An extra hypervisor? Paid? What is this, amateur hour? Intel wants me to pay extra for the privilege of booting my servers from a $5 USB drive, which can’t be mirrored or otherwise protected, so that I can load a software layer that basically makes my OS completely unsupported and more complicated? Oooh, sign me up.

Here’s my prediction: no self-respecting enterprise will use this because it is an operational disaster (lack of boot device redundancy, lack of IDE devices, lack of support for popular operating systems, lack of visibility into the Memory Drive layer, even just the nightmare of hardware licensing). As such, nobody will buy the add-on software. A company like Intel charges for features like this to gauge interest, and Intel will eventually incorrectly conclude that the lack of sales is an indicator that nobody is interested. They will then discontinue the product, and because Intel is effectively a monopoly that’ll be the end of this technology. Long live the status quo! Death to the unholy union of DRAM and storage!

On a parallel track, because the poor implementation means little interest from enterprise users, OS vendors won’t be pressured by users, application vendors, or Intel to develop anything for this new layer of addressable storage. That’s a damn shame because there’s real promise here. If Optane support were simply built into the server CPUs and chipsets moving forward, as a native part of what we get for paying the Intel price premiums, people would use it en masse. It should be as easy as plugging an Optane card in and flipping a switch in the BIOS to make it SSD or memory, non-volatile or volatile.

If that happened we’d start seeing real support for it in OSes, applications adapting to use it, and real, interesting, and positive change happening in our data centers. As it stands, though, I fear that Memory Drive is destined to die a slow death for the wrong reasons, at the hands of the ignorant-of-their-customers Ferengis running Intel.

————
