Intel’s Memory Drive Implementation for Optane Guarantees its Doom

A few weeks ago Intel started releasing their Optane product, a commercialization of the 3D Xpoint (Crosspoint) technology they’ve been talking about for a few years. Predictably, there has been a lot of commentary in all directions. Did you know it’s game changing, or that it’s a solution looking for a problem? It’s storage. It isn’t storage. It’s RAM. It isn’t RAM. It’s too slow to be RAM. It’s too small for storage. It’s useful now. Nobody will use it for years.

Yup. Confusion. It’s because Optane is a bunch of different things. It’s consumer and enterprise, and it’s both storage and memory.

There are plenty of articles out there on the technology itself. There’s a small M.2 version for desktops that acts as a cache, which is thoroughly uninteresting to me. I’d rather have a real SSD in one of my precious M.2 slots than a cache that I overrun with three photos from my Nikon SLR. Not to mention I need a 7th generation Intel Core CPU (Kaby Lake) to do this at all.

The real action is with the data center version, the P4800X. The first version is a 375 GB PCIe NVMe card. 375 GB isn’t very much space, but Intel says they’ll have 750 GB and 1.5 TB models out this year. The technology is a lot faster than the NAND flash typically found in SSDs, and the endurance is a lot higher, too (writes to SSDs use voltages that stress, and eventually destroy, the cells in the SSD). Intel says this thing can do 500,000 write IOPS, which makes it a hell of a write cache for something like VMware vSAN, even if it is a bit small. As a storage device, though, Optane is interesting but really just an evolution of NVMe flash technology.

Memory Drive Technology

What’s really interesting to me is the “Memory Drive” component, which seems intent on blurring the lines between memory and storage. You can use the P4800X to create a pool of something that looks to an OS like memory. It is an order of magnitude slower than regular DRAM, but several orders of magnitude faster than an SSD. Given that you could theoretically put 24 TB of Optane in a two socket server — for a lot less money than 24 TB of DRAM – there are some pretty interesting implications. Think about being able to hold a whole enterprise database in memory. The best I/O is one you don’t do, and having all that data close by means a lot less read traffic on your storage, not to mention it being a lot faster.

There aren’t a lot of details about Memory Drive, though. The product brief says it’s Linux only, and that it’s a software layer of some sort. Recently, though, I found a piece over at AnandTech which actually had details around this (link below, kudos to the author, Mr. Tallis, for digging into this). That post indicates it’s a paid add-on, and something like a hypervisor that boots from a USB device, or an IDE controller before the OS loads.

Amateur Hour at Intel

USB or IDE? An extra hypervisor? Paid? What is this, amateur hour? Intel wants me to pay extra for the privilege of booting my servers from a $5 USB drive, which can’t be mirrored or otherwise protected, so that I can load a software layer that basically makes my OS completely unsupported and more complicated? Oooh, sign me up.

Here’s my prediction: no self-respecting enterprise will use this because it is an operational disaster (lack of boot device redundancy, lack of IDE devices, lack of support for popular operating systems, lack of visibility into the Memory Drive layer, even just the nightmare of hardware licensing). As such, nobody will buy the add-on software. A company like Intel charges for features like this to gauge interest, and Intel will eventually incorrectly conclude that the lack of sales is an indicator that nobody is interested. They will then discontinue the product, and because Intel is effectively a monopoly that’ll be the end of this technology. Long live the status quo! Death to the unholy union of DRAM and storage!

On a parallel track, because the poor implementation means little interest from enterprise users, OS vendors won’t be pressured by users, application vendors, or Intel to develop anything for this new layer of addressable storage. That’s a damn shame because there’s real promise here. If Optane support were simply built into the server CPUs and chipsets moving forward, as a native part of what we get for paying the Intel price premiums, people would use it en masse. It should be as easy as plugging an Optane card in and flipping a switch in the BIOS to make it SSD or memory, non-volatile or volatile.

If that happened we’d start seeing real support for it in OSes, applications adapting to use it, and real, interesting, and positive change happening in our data centers. As it stands, though, I fear that Memory Drive is destined to die a slow death for the wrong reasons, at the hands of the ignorant-of-their-customers Ferengis running Intel.

————

Install the vCenter Server Appliance (VCSA) Without Ephemeral Port Groups

Trying to install VMware vCenter in appliance/VCSA form straight to a new ESXi host? Having a problem where it isn’t listing any networks, and it’s telling you that “Non-ephemeral distributed virtual port groups are not supported” in the little informational bubble next to it?

Thinking this is Chicken & Egg 101, because you can’t have an ephemeral port group without a Distributed vSwitch, and you can’t have a dvSwitch without a vCenter, so how do you install vCenter when you need something that only vCenter can create?

Yeah, me too. Here’s the secret, though: don’t remove the default “VM Network” port group, or if you did, put it back, and restart the installer (or just back up to select the host again).

Ah, that’s better. I’d removed it in favor of adding another port group with the right VLAN and such. I should have just customized it in place.

In other news, it’s apparently been a while since I’ve done a completely bare-metal install! As much as I hate to admit it, in my frustration I actually broke down and called VMware Support about this. My reputation is safe, though, since they had absolutely no idea what I was talking about, and I figured it out while they were trying to apply their ponderous & regimented support process to me. Just makes me long for Business Critical Support again. The cost/benefit was wrong for us when we renewed the ELA it was part of, but you could ask those folks ANYTHING and get an instant & dead accurate answer, and usually an offer for a WebEx to fix it.

vCenter 6.5b Resets Root Password Expiration Settings

I’m starting to update all my 6.x vCenters and vROPS, pending patches being released. You should be doing this, too, since they’re vulnerable to the Apache Struts 2 critical security holes. One thing I noted in my testing is that after patching the 6.5 appliances, their root password expiration settings go back to the defaults. In this case I’d set them to not expire, but it’s clearly not that way anymore:

Depending on your security requirements this might not be what you want. It’s bad form on VMware’s part, changing something that had been explicitly set. I also didn’t test to see if it resets the actual password age, or just the expiry. You might have far less than 365 days before it expires.

While it’s a good idea to rotate passwords, I also hate being locked out of my infrastructure, especially since I usually discover it in the middle of another problem… But to each their own. Good luck!

How Not To Quit Your Job

I’ve thought a lot lately about Michael Thomas, a moron who caused criminal amounts of damage to his former employer in the process of quitting. From The Register[0]:

As well as deleting ClickMotive’s backups and notification systems for network problems, he cut off people’s VPN access and “tinkered” with the Texas company’s email servers. He deleted internal wiki pages, and removed contact details for the organization’s outside tech support, leaving the automotive software developer scrambling.

The real-life BOFH then left his keys, laptop, and entry badge behind with a letter of resignation and an offer to stay on as a consultant.

More than a decade ago I did some consulting for a company that had this happen. They fired their sysadmin and he basically ransomed them, logging in through dozens of back doors to disrupt their business. My first call was to the local police department. This was before these types of crimes were very prevalent; we were lucky that the larger Californian city these crimes were in had a detective with an idea of what to do. Let me tell you: hiring the guy back was never on the list (though pretending to, and meeting up with the guy to grab him, was what the FBI wanted to do). If you do this to someone and they invite you back in to talk or rehire you, and you go, you deserve everything you get because you’re dumb.

Whistleblowing aside, if you’re playing Michael Thomas in a story like this there is absolutely nothing you can say to law enforcement to keep them from throwing you in jail. Think about it. On one side you have a business with a demonstrable material loss because of your actions. On the other side, you’re saying “BUT THEY WERE MEAN TO ME.” And unlike my story above, set in the early ‘oughts, there are actually laws and law enforcement professionals now that will bust your ass and make the charges stick. The process will be years long, too. Mr. Thomas pulled his stunt in 2011, and they finally got around to convicting him. Do you really want to waste that much of your life, with something like that hanging over your head that’ll ultimately destroy your life and career, because of something that felt good for a few minutes?[1]

Beyond all of that, what bugs me the most is how many ways this guy could have screwed with them and gotten away with it. I’m bothered for two reasons:

1. It speaks to how much trust we place in system administrators, and how system administrators need impeccable ethics as well as good judgement. We can implement all the security in the world and, usually, it still comes down to needing to trust a person. Hiring the right people is SO important.

2. It also bothers me because the guy was JUST. SO. DUMB. In a couple minutes over lunch some colleagues and I had ten different, solid, ideas for ways to screw with someone’s systems, mostly based in real-life experience with well-meaning dumbasses. Some highlights were: change the netmasks in their DHCP pools to non-standard ones (e.g. 254.192.138.0) so it’s pretty random what works and what doesn’t, any manner of trickery with scheduled tasks/at/cron, off-hours system shutdowns that look like scripting errors, and redefining localhost (we just had this happen in our Active Directory with someone trying to join an Ubuntu host… OMFG). Extra points if it all just looks like errors, or makes them think you’re an idiot if & when they find the problem. Though in smaller communities that may backfire — people do talk to one another.

Interestingly enough, though, nothing any of us suggested was inherently destructive, just annoying. And when it comes down to it, none of us would actually do any of it, choosing instead to drink a beer and move on with our lives. That, perhaps, is the biggest lesson in the Michael Thomas story. As cathartic as it may be to stick it to the man, if you don’t like your job it’s always a better choice to just simply find a different one and politely move on.

 


[0] “I was authorized to trash my employer’s network, sysadmin tells court” – The Register, 23 Feb 2017

[1] Get your mind out of the gutter, kids are great.

Standards, to and with Resolve

“You can have any color as long as it’s black” – Henry Ford (Image (C) Michael LoCascio, via Wikimedia Commons)

As the holiday season has progressed I’ve spent a bunch of time in the car, traveling three hours at a crack to see friends and family in various parts of Midwestern USA. Much of that travel has been alone, my family having decided to ensconce themselves with my in-laws for the full duration of the week. That has left me ample time to sing aloud in the car, take unplanned detours to collect growlers of beer from esteemed breweries, and to think.

I don’t do New Year’s resolutions. I’m not against them, per se, but I just think they’re too conveniently abandoned. I like the noun form of “resolve” better — a firm determination to do something. I aspire to have resolve, whether I am deciding firmly on a course of action, or settling or finding a solution to a problem, dispute, or contentious matter.

So to what issue should I bring my resolve to bear? What is it that I want to work on in 2017?

As I thought about this, I always crept back to the idea that IT just isn’t the game I signed up for a few decades ago. It seems a lot less technical, at least at the infrastructure level. A lot of the new infrastructure, whether it’s on site or in the cloud, is just simpler. Storage is getting simpler because SSDs are now cheaper than rotational media. Hyperconverged infrastructure has removed a number of pain points as well, including things like discrete SANs. Compute is getting ridiculously dense. What was possible in a 4U server is now possible in essentially a half rack unit (something like a Dell FX2).

With all that, a lot of the crap we’ve dealt with over the years just evaporates.

So what do I work on? What’s the biggest, most fundamental problem around, lying at the core of everything?

Standards.

That’s it. Standards. Without standards you cannot automate, and cannot remove many of the remaining problems at the infrastructure level. Without standards there are bad assumptions, and the inevitable human error and downtime that follow. The foundation of a modern IT operation is standards.

As it turns out, standards aren’t a technical problem, either. The way I see it, they’re usually a financial problem, insofar as someone didn’t budget enough money to do something the way everybody else does, and now it needs to work. Or perhaps it’s a difference of opinion, or a technical requirement that is incompatible with things. Maybe a time constraint. Or a workflow problem, where the workflow should have included IT but didn’t until it was too late. Regardless, though, I see standards as the foundation of IT moving forward, transcending clouds, containers, applications, networking, everything.

So that’s what I’m going to work on: finding a way to enable deep automation and staff time savings with standardization, without unduly limiting projects or adding financial burdens. I urge you to do the same with the copious free time you now have because of flash disk and hyperconvergence.

:)

esxupdate Error Code 99

So I’ve got a VMware ESXi 6.0 host that’s been causing me pain lately. It had some storage issues, and now it won’t let VMware Update Manager scan it, throwing the error:

The host returns esxupdate error code:99. An unhandled exception was encountered. Check the Update Manager log files and esxupdate log files for more details.

A little Google action later and it’s clear there isn’t a lot of documentation, recent or otherwise, about this out there. People suggest rebuilding Update Manager, or copying files from other hosts to repair them. The VMware KB has documentation of the particular error but only in context of the Cisco Nexus 1000V, and only for ESXi 5.0 and 5.1. Here’s another thought, if you’re in my same situation.

1. First, do what it says: check esxupdate.log. Log into the console of the ESXi host (SSH or otherwise) and “tail -f /var/log/esxupdate.log”

2. Scan the host with Update Manager so that the log has fresh data in it. You should see it pop up. In my case it showed:

2016-05-27T15:54:52Z esxupdate: esxupdate: ERROR: An unexpected exception was caught:
 2016-05-27T15:54:52Z esxupdate: esxupdate: ERROR: Traceback (most recent call last):
 2016-05-27T15:54:52Z esxupdate: esxupdate: ERROR: File "/usr/sbin/esxupdate", line 238, in main
 2016-05-27T15:54:52Z esxupdate: esxupdate: ERROR: cmd.Run()
 2016-05-27T15:54:52Z esxupdate: esxupdate: ERROR: File "/build/mts/release/bora-3620759/bora/build/esx/release/vmvisor/sys-boot/lib/python2.7/site-packages/vmware/esx5update/Cmdline.py", line 113, in Run
 2016-05-27T15:54:52Z esxupdate: esxupdate: ERROR: File "/build/mts/release/bora-3620759/bora/build/esx/release/vmvisor/sys-boot/lib/python2.7/site-packages/vmware/esx5update/MetadataScanner.py", line 244, in Scan
 2016-05-27T15:54:52Z esxupdate: esxupdate: ERROR: File "/build/mts/release/bora-3620759/bora/build/esx/release/vmvisor/sys-boot/lib/python2.7/site-packages/vmware/esx5update/MetadataScanner.py", line 106, in _generateOperationData
 2016-05-27T15:54:52Z esxupdate: esxupdate: ERROR: File "/build/mts/release/bora-3620759/bora/build/esx/release/vmvisor/sys-boot/lib/python2.7/site-packages/vmware/esx5update/MetadataScanner.py", line 89, in _getInstallProfile
 2016-05-27T15:54:52Z esxupdate: esxupdate: ERROR: File "/build/mts/release/bora-3620759/bora/build/esx/release/vmvisor/sys-boot/lib/python2.7/site-packages/vmware/esximage/ImageProfile.py", line 627, in ScanVibs
 2016-05-27T15:54:52Z esxupdate: esxupdate: ERROR: File "/build/mts/release/bora-3620759/bora/build/esx/release/vmvisor/sys-boot/lib/python2.7/site-packages/vmware/esximage/VibCollection.py", line 62, in __add__
 2016-05-27T15:54:52Z esxupdate: esxupdate: ERROR: File "/build/mts/release/bora-3620759/bora/build/esx/release/vmvisor/sys-boot/lib/python2.7/site-packages/vmware/esximage/VibCollection.py", line 79, in AddVib
 2016-05-27T15:54:52Z esxupdate: esxupdate: ERROR: File "/build/mts/release/bora-3620759/bora/build/esx/release/vmvisor/sys-boot/lib/python2.7/site-packages/vmware/esximage/Vib.py", line 627, in MergeVib
 2016-05-27T15:54:52Z esxupdate: esxupdate: ERROR: ValueError: Cannot merge VIBs Dell_bootbank_OpenManage_8.3.0.ESXi600-0000, Dell_bootbank_OpenManage_8.3.0.ESXi600-0000 with unequal payloads attributes: ([OpenManage: 7807.439 KB], [OpenManage: 7809.081 KB])
 2016-05-27T15:54:52Z esxupdate: esxupdate: DEBUG: <<<

Ctrl-C will end the “tail” command.

3. It looks like something about the OpenManage VIB became corrupt during the storage issues, and now it thinks there are two copies with different payload sizes. You know what? I can just remove this VIB and reinstall it (rather than having to rebuild the host or do some other complicated fix). I issue an “esxcli software vib list | grep -i dell” command to find the name of the VIB:

[root@GOAT:/var/log] esxcli software vib list | grep -i dell
OpenManage 8.3.0.ESXi600-0000 Dell PartnerSupported 2016-05-04 
iSM        2.3.0.ESXi600-0000 Dell PartnerSupported 2016-05-04

4. Then we need a simple “esxcli software vib remove --vibname=OpenManage”

[root@GOAT:/var/log] esxcli software vib remove --vibname=OpenManage
Removal Result
 Message: The update completed successfully, but the system needs to be rebooted for the changes to be effective.
 Reboot Required: true
 VIBs Installed: 
 VIBs Removed: Dell_bootbank_OpenManage_8.3.0.ESXi600-0000
 VIBs Skipped:

5. Do what it says and reboot, then scan to see if it works. In my case it did, then I reinstalled the missing extension, and patched to the latest version like normal.
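If you’d rather not eyeball the traceback, the VIB name can be pulled out of the error line with sed. This is only a sketch: vib_from_merge_error is a made-up helper name, and it assumes the Vendor_bootbank_Name_Version ID format seen in my log, with no underscores in the VIB name itself. It prints the removal command for review rather than running it.

```shell
#!/bin/sh
# Extract the VIB name from an esxupdate "Cannot merge VIBs" error line.
# Assumes IDs look like Vendor_bootbank_Name_Version (as in the log above)
# and that the VIB name contains no underscores.
vib_from_merge_error() {
  sed -n 's/.*Cannot merge VIBs [^_]*_bootbank_\([^_]*\)_.*/\1/p'
}

# Sample line trimmed from the traceback above.
line='ERROR: ValueError: Cannot merge VIBs Dell_bootbank_OpenManage_8.3.0.ESXi600-0000, Dell_bootbank_OpenManage_8.3.0.ESXi600-0000 with unequal payloads attributes'
vib=$(printf '%s\n' "$line" | vib_from_merge_error)

# Print the removal command for review instead of running it.
echo "esxcli software vib remove --vibname=$vib"
```

On a real host you would feed it the log instead of the sample line, something like “grep 'Cannot merge VIBs' /var/log/esxupdate.log | vib_from_merge_error”.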

Use Microsoft Excel For Your Text Manipulation Needs

I’m just going to lay it out there: sysadmins should use Microsoft Excel more.

I probably will be labeled a traitor and a heathen for this post. It’s okay, I have years of practice having blasphemous opinions on various IT religious beliefs. Do I know how to use the UNIX text tools like sed, awk, xargs, find, cut, and so on? Yes. Do I know how to use regular expressions? Yes. Do I know how to use Perl and Python to manipulate text, and do poor-man’s extract-transform-load sorts of things? Absolutely.

It’s just that I rarely need such complicated tools in my daily work. I often just have a short list of something that I need to turn into a bunch of one-off commands. And many times I’m sharing it with others of varying proficiency, so readability is key. As it turns out, Excel has some very worthwhile text manipulation functions. Couple that with the ability to import CSV and autofill, and it’s a pretty decent solution. Let me give you some examples.

First, we need some text to manipulate. In cells A1 through D1 we have Goats, Sheep, Clowns, and Fire. Some people have Alice & Bob, I have goats & sheep.

First, we can concatenate strings very easily in Excel, as well as insert new strings. This is very handy for building commands you can then paste into a CLI, especially for doing one-off sorts of things. We do this with the ampersand, ‘&’.

=C1&" eat "&B1&" that are on "&D1

="puppet cert sign "&A1&".domain.com"

Oh, you’re doing something that needs the text in all upper- or lower-case? No problem. We have UPPER() and LOWER() functions. Suck it, /usr/bin/tr.

=UPPER(C1)&" eat "&LOWER(B1)&" that are on "&UPPER(D1)

Maybe we have a list and we need the first or last few characters from each. There’s LEFT() and RIGHT(), which will return a certain number of characters from those sides of the string.

=LEFT(A1,2)

=RIGHT(C1,4)

Perhaps you have a list of domain names, and want to grab the first part. We can use FIND() with LEFT() and RIGHT(). We can add or subtract 1 to get what we want.

=LEFT(A17,FIND(".",A17))

=LEFT(A17,FIND(".",A17)-1)

Maybe we need to do some autofilling, perhaps for a quick way to take some snapshots through VMware’s PowerCLI. I had the list on the left, then incorporated it into a larger command, dragging down to autofill all the names. Copy & paste that into a PowerCLI window and you’re set. Ad-hoc PowerCLI commands on small lists are actually my #1 use case.

="New-Snapshot -Name Pre-Patch -VM "&A30&" -Confirm:$false"

Autofill automatically adjusts cell references, too, so if you specified A1 and dragged down it’ll use A2, A3, A4, and so on. If that’s not what you want you can preface parts of the reference with a dollar sign, ‘$’, to make it a static reference. I made it completely static with $A$1, but you can do $A1 or A$1, too.

=A30&"="&$A$1&".domain.com"

Excel knows how to autofill just about anything ending in a number or a letter sequence. If it doesn’t catch on with one, try selecting two cells, then filling down. And if it really doesn’t catch on just insert a new column, autofill there, then concatenate that column with your others. In a pinch I’ve built BIND DNS zone files in Excel this way.
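For comparison with the UNIX tools I mentioned up top, here’s roughly what the New-Snapshot autofill looks like as a shell loop. It’s a sketch, not a recommendation: snapshot_cmds and the VM names are made up for illustration, and it only prints the PowerCLI commands so you can paste them wherever you like.

```shell
#!/bin/sh
# Turn a list of VM names (one per line on stdin) into PowerCLI commands,
# the same job the autofilled New-Snapshot formula does in a spreadsheet.
# snapshot_cmds and the VM names are hypothetical.
snapshot_cmds() {
  while read -r vm; do
    [ -n "$vm" ] || continue   # skip blank lines
    printf 'New-Snapshot -Name Pre-Patch -VM %s -Confirm:$false\n' "$vm"
  done
}

printf '%s\n' goat01 sheep01 clown01 | snapshot_cmds
```

It works, but notice how much more there is to explain to a coworker than “drag the cell down.”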

I think you get the idea. There’s a good reference in the Excel help, too – hit F1 and then search for “text functions.” The “Text Functions (reference)” result will show more commands, like LEN() for string length, MID() for getting substrings from the middle of a cell, SUBSTITUTE() for replacing text, and so on.

Next time you are tempted to assemble a list of commands by hand save yourself time, keystrokes, and potential errors by doing it in Excel instead!

Here’s my sample workbook, too, if you want to look at these examples yourself. Have fun!

Big Trouble in Little Changes

I was making a few changes today when I ran across this snippet of code. It bothers me.

/bin/mkdir /var/lib/docker
/bin/mount /dev/Volume00/docker_lv /var/lib/docker
echo "/dev/Volume00/docker_lv /var/lib/docker ext4 defaults 1 2" >> /etc/fstab

“Why does it bother you, Bob?” you might ask. “They’re just mounting a filesystem.”

My problem is that any change that affects booting is high risk, because fixing startup problems is a real pain. And until the system reboots the person who executes this won’t know that it works. If it doesn’t work it’ll stop during the boot, sitting there waiting for someone with a root password to come fix it. So you’ll have to get a console on the machine and dig up the root password. Then you need to type it in. If it’s anything like my root passwords it’s 20+ characters long and horrible to type, especially on crappy cloud console applets that tend to repeat characters because they’re written in Java by a high schooler on a reliable, near-zero latency network, twelve versions of Chrome ago.

Once you’re in you need to figure out what the problem is, and that’s an even bigger rub. It might be months or, God help you, years between when these commands run and when they get tested in a reboot. So there’s no correlation, and you’ll have no idea what the problem is aside from a filesystem issue. And all the while it’s burning up your maintenance window and your chance to do the maintenance you actually intended & scheduled, making you look bad.

But what if we just change it a little?

/bin/mkdir /var/lib/docker
echo "/dev/Volume00/docker_lv /var/lib/docker ext4 defaults 1 2" >> /etc/fstab
/bin/mount -a

Now, when it runs it’ll actually test the entry in /etc/fstab, and you’ll know right away if it’s wrong.

Slick, eh?
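You can take the same idea one step further and back the change out automatically when the test fails. What follows is a minimal sketch, not something from the original script: try_fstab_entry is a hypothetical helper, and the verification command is passed in so it can be tried against a scratch file first.

```shell
#!/bin/sh
# Append an fstab entry, verify it immediately, and roll back on failure.
# $1 = fstab path, $2 = entry to append, remaining args = verification command.
# try_fstab_entry is a hypothetical helper name, not part of any tool.
try_fstab_entry() {
  fstab=$1
  entry=$2
  shift 2
  cp "$fstab" "$fstab.bak" || return 1
  printf '%s\n' "$entry" >> "$fstab"
  if "$@"; then
    echo "verified: $entry"
  else
    cp "$fstab.bak" "$fstab"
    echo "verification failed, restored $fstab" >&2
    return 1
  fi
}

# Dry run against a scratch file, with "true" standing in for the real check.
scratch=$(mktemp)
try_fstab_entry "$scratch" "/dev/Volume00/docker_lv /var/lib/docker ext4 defaults 1 2" true

# On a real host (as root) the call might instead look like:
#   try_fstab_entry /etc/fstab "/dev/Volume00/docker_lv /var/lib/docker ext4 defaults 1 2" /bin/mount -a
```

One caveat: “mount -a” fails if any entry in the file is broken, so a rollback here might be triggered by somebody else’s old mistake rather than your new line. Either way you find out now, not during a maintenance window.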

Are you properly assessing the risk of your changes? Anything that affects booting is high risk, in my opinion. Rebooting properly is the foundation of good patching practices, disaster recovery, automated deployments, and so on.

How do you know the change you’re making actually works? Not just because it worked on a test system, either. How do you know, without a doubt, that it works on each machine you changed?

Configuration management tools help immensely, too, but there’s no substitute for thinking critically about the change you’re making, big or seemingly small.

Interesting Dell iDRAC Tricks

Deploying a bunch of machines all at once? Know your way around for loops in shell scripts, or Excel enough to do some basic text functions & autofill? You, too, can set up a few hundred servers in one shot. Here are some interesting things I’ve done in the recent past using the Dell iDRAC out-of-band hardware management controllers.

You need to install the racadm utility on your Windows or Linux host. I’ll leave this up to you, but you probably want to look in the Dell Downloads for your server, under “Systems Management.” I recently found it as “Dell OpenManage DRAC Tools, includes Racadm” in 32- and 64-bit flavors.

Basic Command

The basic racadm command I’ll represent with $racadm from now on is:

racadm -r hostname.or.ip.com -u iDRACuser -p password

Set a New Root Password

I don’t know how many times I see people with iDRACs on a network and the root password is still ‘calvin.’ If you do nothing else change that crap right away:

$racadm set iDRAC.Users.2.Password newpassword

The number ‘2’ indicates the user ID on the iDRAC. The root user is 2 by default.

If you have special characters in your password, and you should, you may need to escape them or put them in single quotes. You will want to test this on an iDRAC that has another admin user on it, or where you have console access or access through a blade chassis, for when you screw up the root password and lock yourself out. Not that I’ve ever done this, not even in the course of writing this post. Nope, not admitting anything.

Dump & Restore Machine Configurations

Once upon a time I embarked on a quest to configure a server solely with racadm ‘set’ commands. Want to know a secret? That was a complete waste of a few hours of my life. What I do now is take one server and run through all the BIOS, PERC, and iDRAC settings via the console and/or the web interface, then dump the configuration with a command:

$racadm get -t xml -f idrac-r730xd.xml

That’ll generate an XML file of all the settings, which you can then load back into the other servers with:

$racadm set -t xml -f idrac-r730xd.xml -b graceful -w 600

This tells it to gracefully shut the OS down, if there is one, before rebooting to reload the configurations. It also says to wait 600 seconds for the job to complete. The default is 300 seconds but with an OS shutdown, long reboot, memory check, etc. it gets tight. There are other reboot options, check out the help via:

$racadm help set

You can also edit the XML file to remove parts that you don’t want, such as when you want to preconfigure a new model of server with common iDRAC settings but do the BIOS & RAID configs on your own. That XML file will also give you clues to all the relevant configuration options, too, which you can then use via the normal iDRAC ‘get’ and ‘set’ methods.
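The “few hundred servers in one shot” part is just a loop around the basic command. Here’s a sketch that prints one racadm command per iDRAC so you can review the batch before running it. IDRAC_USER, IDRAC_PASS, racadm_each, and the hostnames are all names I made up for illustration; only the racadm arguments come from the examples above.

```shell
#!/bin/sh
# Print (not run) one racadm command per iDRAC hostname read from stdin.
# IDRAC_USER, IDRAC_PASS, racadm_each, and the hostnames are hypothetical.
IDRAC_USER=${IDRAC_USER:-root}
IDRAC_PASS=${IDRAC_PASS:-changeme}

racadm_each() {
  while read -r host; do
    [ -n "$host" ] || continue   # skip blank lines
    # Note: a password with special characters would still need quoting
    # before these printed commands are pasted into a shell.
    printf '%s\n' "racadm -r $host -u $IDRAC_USER -p $IDRAC_PASS $*"
  done
}

printf '%s\n' idrac01.example.com idrac02.example.com |
  racadm_each set -t xml -f idrac-r730xd.xml -b graceful -w 600
```

Printing first and running second is deliberate. A typo multiplied by two hundred iDRACs is a bad afternoon.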

Upload New SSL Certificates

I like knowing that the SSL certificates on my equipment aren’t the defaults (and I get tired of all the warnings). With access to a certificate authority you can issue some valid certs for your infrastructure. However, I don’t want to manage SSL certificates for hundreds of servers. Where I can I’ll get a wildcard certificate, or if that’s expensive or difficult I’ll abuse the Subject Alternate Name (SAN) features of SSL certificates to generate one with all my iDRAC names in it. Then I can upload new keys and certificates, and reset the iDRAC to make it effective:

$racadm sslkeyupload -t 1 -f idrac.key
$racadm sslcertupload -t 1 -f idrac.cer
$racadm racreset

Ta-dum, green valid certificates for a few years with only a bit of work. If you don’t have your own CA it’s probably worth creating one. You can load the CA certificate as a trusted root into your desktop OS and make the warnings go away, and you know that your SSL certs aren’t the vendor defaults. What’s the point of crypto when everybody has the same key as you?
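Building the SAN list by hand gets old quickly, so here’s a sketch of generating it from a list of names. san_list and the example.com hostnames are hypothetical, and the printed openssl command assumes OpenSSL 1.1.1 or newer for the -addext option. Whether the SANs survive signing depends on how your CA is configured, so check the issued certificate.

```shell
#!/bin/sh
# Join hostnames into a subjectAltName value: DNS:a,DNS:b,...
# san_list and the example.com names are hypothetical.
san_list() {
  san=""
  for h in "$@"; do
    san="${san:+$san,}DNS:$h"   # add a comma only between entries
  done
  printf '%s\n' "$san"
}

names=$(san_list idrac01.example.com idrac02.example.com)

# Print a CSR command for review. -addext needs OpenSSL 1.1.1 or newer.
echo "openssl req -new -key idrac.key -out idrac.csr -subj /CN=idrac01.example.com -addext subjectAltName=$names"
```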

There are lots of cool things you can do with the iDRAC, so if you’re doing something manually via the console or iDRAC web interface you might think about looking it up in the Dell iDRAC RACADM Command Line Reference first.

10 Years

Ten years ago I wrote the first post on this blog. 3:43 AM. I’m a late night kinda guy, I guess. Actually, I probably came home from a bar, installed WordPress 1.5.1, and started writing.

Ten years seems like an awfully long time ago. So much has changed in my life. I like my job, most days. That wasn’t true back then. That’s part of why this started, as a way to vent. I have a wife and a kid now… almost two kids, just a couple days more until it is man-to-man coverage around Chez Plankers.

I’ve been a little burnt out lately, with work and kids and life, and slacked off on writing in almost every way. As such, it’s been interesting to look back at some of my first posts here. Ugh. I wonder if, in ten years, my recent posts will be as irrelevant as those early posts are now. They’re not bad, per se, but I hadn’t found focus yet. There’s even recipes back in the archives. Hell, I made the panang the other night. And to this day my #1 post is the one where I show how to reassemble a faucet aerator. No kidding. #2 is how to disable Teredo, 6to4, and whatnot under Windows.

I am definitely a better writer now, though. It is true about Carnegie Hall — you get there with practice.

I wasn’t part of the virtualization community, early on. My goal was to write about system administration, mostly. I’d been virtualizing things for a couple of years at that point, but it was only when I discovered that EMC wasn’t recommending that people align the partitions on their disks, and that there were serious negative performance implications there, that I started writing about VMware. We had Dell PowerEdge 6650s and EMC Clariion CX3s at the time, ESX 1.5, vMotion but nothing more. vMotion made us laugh the first time we set it up. I think we spent an hour moving things back & forth, in a shared area, and by the time my friend & coworker Rich and I were done we’d accumulated a lot of our coworkers around us, witnessing the beginning of the next phase of IT.

So I started writing about it, among other things. I owe two people thanks for support in those early years. John Troyer, who forged the next generation of vendor communities. He reached out to me early and encouraged me to write more and often. He used the term “bully pulpit” at least once with me, but in that I found balance and moderation. He may also have been the first one to tell me I was a good writer, in front of a lot of other people.

The other is Marc Farley, who surprised me once at an early Las Vegas VMworld by reaching out, inviting me to dinner, and drinking tequila with me. I had no idea what to think when he first made contact, but by the end of the night I had gained a sense of the possible community and friendships. Also, tequila, which would repeat itself a few times here and there. Not nearly enough, though, mostly due to proximity.

Thank you guys.

There are so many more out there that encourage me, that have encouraged me, and give me hope and inspiration, reminded me there’s a point to this stuff. People I’ve enjoyed times with over the years, people I’m happy to call friends, even if we don’t see each other all that much anymore. Damian Karlson and an intoxicated evening in the Venetian. Frank Denneman and Duncan Epping and late night hot dogs in Copenhagen. Ed Czerwin, Chris Dearden, and Christian Hobbel, the vSoup guys, for ongoing support and love. Jason Boche, Todd Scalzott, Chris Wahl, Drew Denson, and Rich Lingk, people I can smoke cigars and talk about anything into the wee hours of the morning. Michael Keen, Stu Miniman, and Ganesh Padmanabhan, always up for a Moscow Mule. People I don’t even know how I know them anymore, who I love seeing, people like Julia Weatherby and Jay Weinshenker, Gina Minks, GS Khalsa, and Matt Vogt. Edward Haletky, Bernd Herzog, and all the TVP crew past & present. Stephen Foskett, Claire Chaplais, Tom Hollingsworth, Matt Simmons, Ben Freedman, all the TFD crew, and all the repeat offenders I meet at conferences, like Justin Warren, Howard Marks, Ethan Banks, Greg Ferro, Alastair Cooke, Keith Townsend, John Obeto, Curtis Preston, and more. The TechTarget folks, Nick Martin, Alex Barrett, Colin Steele, and Lauren Horwitz, who have taken my writing to the next level. And of course all the folks with vendors that keep good track of me, and allow me to see some of these people from time to time. Doug Hazelman, Sarah Vela, Jason Collier, Rick Vanover, Melanie Boyer, Eric Nielsen, and more.

It’s late and I’ve forgotten people in this list. People who are important. I’m sorry, and I’m thankful. Thank you to everybody who still works for and in this community of bloggers. Thank you to everybody who has encouraged me. Thank you to everybody who reads my writings. Thank you, all.
