There’s been a fair amount of commentary & impatience from IT staff as we wait for vendors to patch their products for the OpenSSL Heartbleed vulnerability. Why don’t they hurry up? They’ve had 10 days now, what’s taking so long? How big of a deal is it to change a few libraries?

Perhaps, to understand this, we need to consider how software development works.

The Software Development Life Cycle

[Image: Software Development Life Cycle. Courtesy of the Wikimedia Commons.]

To understand why vendors take a while to do their thing we need to understand how they work. In short, there are a few different phases they work through when designing a new system or responding to bug reports.

Requirement Analysis is where someone figures out precisely what the customer wants and what the constraints are, like budget. It’s a lot of back & forth between stakeholders, end users, and the project staff. In the case of a bug report, like “OMFG OPENSSL LEAKING DATA INTERNET HOLY CRAP” the requirements are often fairly clear. Bugs aren’t always clear, though, which is why you sometimes get a lot of questions from support guys.

Design is where the technical details of implementation show up. The project team takes the customer requirements and turns them into a technical design. In the case of a bug the team figures out how to fix the problem without breaking other stuff. That’s sometimes a real art. Read bugs filed against the kernel in Red Hat’s Bugzilla if you want to see guys try very hard to fix problems without breaking other things.

Implementation is where someone sits down and codes whatever was designed, or implements the agreed-upon fix.

The testing phase can be a variety of things. For new code it’s often full system testing, integration testing, and end-user acceptance testing. But if this is a bug, the testing is often Quality Assurance. Basically a QA team is trying to make sure that whoever coded a fix didn’t introduce more problems along the way. If they find a problem, called a regression, they work with the Engineering team to get it resolved before it ships.

Evolution is basically just deploying what was built. For software vendors there’s a release cycle, and then the process starts again.

So what? Why can’t they just fix the OpenSSL problem?

[Image: Git Branching Model. Borrowed from Maescool’s Git Branching Model Tutorial.]

The problem is that in an organization with a lot of coders, a sudden need for an unplanned release really messes with a lot of things, short-circuiting the requirements, design, and implementation phases and wreaking havoc in testing.

Using this fine graphic I’ve borrowed from a Git developer we can get an idea of how this happens. In this case there’s a “master” branch of the code that customer releases are done from. Feeding that, there’s a branch called “release” that is likely owned by the QA guys. When the developers think they’re ready for a release they merge “develop” up into “release” and QA tests it. If it is good it moves on to “master.”

Developers who are adding features and fixing bugs create their own branches (“feature/xxx” etc.) where they can work, and then merge into “develop.” At each level there are usually senior coders and project managers acting as gatekeepers, doing review and managing the flow of updates. On big code bases there are sometimes hundreds of branches open at any given time.
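
To make that flow concrete, here is a minimal sketch that plays it out in a throwaway repository. The function name demo_branch_flow, the branch feature/openssl-fix, and the commit messages are made up for illustration; a real vendor will have its own branch names, review gates, and tooling.

```shell
# Hypothetical walk-through of the feature -> develop -> release -> master flow.
demo_branch_flow() {
    repo=$1
    # Small wrapper so every git call targets the demo repo with a dummy identity.
    g() { git -C "$repo" -c user.name=demo -c user.email=demo@example.com "$@"; }

    git init -q "$repo"
    g symbolic-ref HEAD refs/heads/master      # name the first branch "master"
    g commit -q --allow-empty -m "initial release"

    g checkout -q -b develop                   # shared integration branch
    g checkout -q -b feature/openssl-fix       # a developer's private branch
    g commit -q --allow-empty -m "link against patched OpenSSL"

    g checkout -q develop                      # gatekeepers merge the work upward
    g merge -q --no-ff -m "merge feature/openssl-fix into develop" feature/openssl-fix

    g checkout -q -b release                   # QA would test this branch
    g checkout -q master                       # only tested code reaches master
    g merge -q --no-ff -m "release tested code" release

    g log --oneline --graph
}

# Run as: sh demo.sh /tmp/some-empty-directory
if [ "$#" -gt 0 ]; then
    demo_branch_flow "$1"
fi
```

Note that anything else already merged into “develop” rides along into “release” with the fix, which is the crux of the QA problem.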

So now imagine that you’re a company like VMware, and you’ve just done a big software release, like VMware vSphere 5.5 Update 1, that has huge new functionality in it (VSAN).[0] There’s a lot of coding activity against your code base because you’re fixing new bugs that are coming in. You’re probably also adding features, and you’re doing all this against multiple major versions of the product. You might have had a plan for a maintenance release in a couple of months, but suddenly this OpenSSL thing pops up. It’s such a basic system library that it affects everything, so everybody will need to get involved at some level.

On top of that, the QA team is in hell because it isn’t just the OpenSSL fix that needs testing. A ton of other stuff was checked in, and is in the queue to be released. But all that needs testing, first. And if they find a regression they might not even be able to jettison the problem code, because it’ll be intertwined with other code in the version control system. So they need to sort it out, and test more, and sort more out, and test again, until it works like it should. The best way out is through, but the particular OpenSSL fix can’t get released until everything else is ready.

This all takes time, to communicate and resolve problems and coordinate hundreds of people. We need to give them that time. While the problem is urgent, we don’t really want software developers doing poor work because they’re burnt out. We also don’t want QA to miss steps or burn out, either, because this is code that we need to work in our production environments. Everybody is going to run this code, because they have to. If something is wrong it’ll create a nightmare for customers and support, bad publicity, and ill will.

So let’s not complain about the pace of vendor-supplied software updates appearing, without at least recognizing our hypocrisy. Let’s encourage them to fix the problem correctly, doing solid QA and remediation so the problem doesn’t get worse. Cut them some slack for a few more days while we remember that this is why we have mitigating controls, and defense-in-depth. Because sometimes one of the controls fails, for an uncomfortably long time, and it’s completely out of our control.

-----

[0] This is 100% speculative. While I have experience with development teams, I have no insight into VMware or IBM or any of the other companies I’m waiting for patches from.

I see a lot of misinformation floating around about the OpenSSL Heartbleed bug. In case you’ve been living under a rock, OpenSSL versions 1.0.1 through 1.0.1f are vulnerable to a condition where a particular feature will leak the contents of memory. This is bad, because memory often contains things like the private half of public-key cryptographic exchanges (which should always stay private), protected information, parts of your email, instant messenger conversations, and other information such as logins and passwords for things like web applications.

This problem is bad, but freaking out about it, and talking out of our duffs about it, adds to the problem.

You can test if you’re vulnerable with http://filippo.io/Heartbleed/ – just specify a host and a port, or with http://s3.jspenguin.org/ssltest.py from the command line with Python.

1. Not all versions of OpenSSL are vulnerable. Only fairly recent ones, and given the way enterprises patch you might be just fine. Verify the problem before you start scheduling remediations.
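
A quick first pass is to compare the reported version against the range mentioned above (1.0.1 through 1.0.1f). This is a rough sketch: is_heartbleed_range is a hypothetical helper name, and the version string alone is not conclusive, because a distribution may ship a patched build that still reports an in-range version. Treat a match as a reason to look closer, not as a verdict.

```shell
# Hypothetical helper: classify a version string against 1.0.1 through 1.0.1f.
is_heartbleed_range() {
    # Drop any build suffix such as "-fips" before comparing.
    case "${1%%-*}" in
        1.0.1|1.0.1[a-f]) echo "in vulnerable range" ;;
        *)                echo "outside vulnerable range" ;;
    esac
}

# Check whichever openssl binary is first on the PATH, if there is one.
# "openssl version" prints the version as its second field.
if command -v openssl >/dev/null 2>&1; then
    installed=$(openssl version | awk '{print $2}')
    echo "$installed: $(is_heartbleed_range "$installed")"
fi
```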

2. Heartbleed doesn’t leak all system memory. It only leaks information from the affected process, like a web server running with a flawed version of OpenSSL. A modern operating system prevents one process from accessing another’s memory space. The big problem is for things like IMAP servers and web applications that process authentication data, where that authentication information will be present in the memory space of the web server. That’s why this is bad, but it doesn’t automatically mean that things like your SSH-based logins to a host are compromised, or that just anything stored on a vulnerable server is exposed.

Of course, it’s always a good idea to change your passwords on a regular basis.

3. People are focusing on web servers being vulnerable, but many services can be, including your email servers (imapd, sendmail, etc.), databases (MySQL), snmpd, and so on. Some of these services have sensitive authentication information, too. There’s lots of email that I wouldn’t want others to gain access to, like password reset tokens, what my wife calls me, etc.

4. A good way, under Linux, to see what’s running and using the crypto libraries is the lsof command:

$ sudo lsof | grep -E "libssl|libcrypto" | cut -f 1 -d " " | sort -u
cupsd
dovecot
dsmc
httpd
imap-logi
java
mysqld
named
nmbd
ntpd
sendmail
smbd
snmpd
snmptrapd
spamd
squid
ssh
sshd
sudo
tuned
vsftpd

This does not list things that aren’t running that depend on the OpenSSL libraries. For that you might try mashing up find with ldd, mixing in -perm and -type a bit.
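
One way that find-and-ldd mash-up might look, as a sketch rather than a polished tool. The function name find_openssl_users is made up, -perm /111 is GNU find syntax for “any execute bit set,” and ldd should only be pointed at binaries you trust, because some versions can end up running code from the file they inspect.

```shell
# Hypothetical helper: list executables under the given directories that are
# dynamically linked against libssl or libcrypto.
find_openssl_users() {
    # -xdev stays on one filesystem; -type f and -perm /111 keep only
    # regular files with at least one execute bit set.
    find "$@" -xdev -type f -perm /111 2>/dev/null |
    while IFS= read -r file; do
        # ldd fails quietly on scripts and static binaries; those are skipped.
        if ldd "$file" 2>/dev/null | grep -Eq 'libssl|libcrypto'; then
            printf '%s\n' "$file"
        fi
    done
}

# Run as: sh scan.sh /usr/sbin /usr/bin
if [ "$#" -gt 0 ]; then
    find_openssl_users "$@"
fi
```

This only catches dynamic linking. A statically linked binary, or one that loads the library at run time, will not show up.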

5. Just because you patched doesn’t mean that the applications using those libraries are safe. Applications load a copy of the library into memory when they start, so replacing the files on disk means almost nothing unless you restart the applications, too. In item #4 above, all of those processes have a copy of libcrypto or libssl, and all would need to restart to load the fixed version.

Furthermore, some OSes, like AIX, maintain a shared library cache, so it’s not even enough to replace it on disk. In AIX’s case you need to run /usr/sbin/slibclean as well to purge the flawed library from the cache and reread it from disk.

In most cases so far I have chosen to reboot the OSes rather than try to find and restart everything. Nuke it from orbit, it’s the only way to be sure.
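
If a reboot isn’t an option, Linux can usually tell you which processes still hold the old library. When an update replaces a shared library on disk, the old file is typically unlinked, and processes that still have it mapped show the path with a “(deleted)” suffix in /proc/PID/maps. The sketch below leans on that behavior. has_stale_openssl is a hypothetical helper name, and you need root to see other users’ processes.

```shell
# Hypothetical helper: succeeds if the given maps file shows a libssl or
# libcrypto mapping whose file has been deleted from disk.
has_stale_openssl() {
    grep -Eq 'lib(ssl|crypto)[^ ]* \(deleted\)' "$1" 2>/dev/null
}

# Walk every process and print the PID and command name of each one that
# still maps an old copy. Unreadable or vanished processes are skipped.
for maps in /proc/[0-9]*/maps; do
    if has_stale_openssl "$maps"; then
        pid=${maps#/proc/}
        pid=${pid%/maps}
        printf '%s\t%s\n' "$pid" "$(cat "/proc/$pid/comm" 2>/dev/null)"
    fi
done
```

An empty result after patching and restarting services is a good sign, though it says nothing about applications that bundle their own copy of the library.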

6. Patching the system libraries is one thing, but many applications deliver libraries as part of their installations. You should probably use a command like find to search for them:

$ sudo find / -name libssl\*; sudo find / -name libcrypto\*
/opt/tivoli/tsm/client/ba/bin/libssl.so.0.9.8
/opt/tivoli/tsm/client/api/bin64/libssl.so.0.9.8
/home/plankers/pfs/openssl-1.0.1e/libssl.a
/home/plankers/pfs/openssl-1.0.1e/libssl.pc
/usr/lib/libssl.so.10
/usr/lib/libssl.so.1.0.1e
/usr/lib64/libssl.so.10
/usr/lib64/libssl3.so
/usr/lib64/libssl.so
/usr/lib64/pkgconfig/libssl.pc
/usr/lib64/libssl.so.1.0.1e
/opt/tivoli/tsm/client/ba/bin/libcrypto.so.0.9.8
/opt/tivoli/tsm/client/api/bin64/libcrypto.so.0.9.8
/home/plankers/pfs/openssl-1.0.1e/libcrypto.a
/home/plankers/pfs/openssl-1.0.1e/libcrypto.pc
/usr/lib/libcrypto.so.1.0.1e
/usr/lib/libcrypto.so.10
/usr/lib64/libcrypto.so.1.0.1e
/usr/lib64/libcrypto.so.10
/usr/lib64/libcrypto.so
/usr/lib64/pkgconfig/libcrypto.pc

In this example you can see that the Tivoli Storage Manager client has its own copy of OpenSSL, version 0.9.8, which isn’t vulnerable. I’ve got a vulnerable copy of OpenSSL 1.0.1e in my home directory from when I was messing around with Perfect Forward Secrecy. The rest looks like OpenSSL 1.0.1e but I know that it’s a patched copy from Red Hat. I will delete the vulnerable copy so there is no chance something could link against it.
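
To see what version a stray library file claims to be, you can pull the version banner out of the binary itself. This is a sketch under a couple of assumptions: embedded_openssl_version is a made-up name, the -a and -o options are those of GNU grep, and the library is assumed to embed a string of the form “OpenSSL 1.0.1e.” As noted above, a vendor-patched build can carry the same banner as a vulnerable one, so confirm against your package manager’s records before trusting or condemning a file.

```shell
# Hypothetical helper: print the distinct "OpenSSL x.y.z[letter]" banners
# embedded in a file, treating the binary as text (-a) and printing only
# the matching part (-o).
embedded_openssl_version() {
    LC_ALL=C grep -aoE 'OpenSSL [0-9]+\.[0-9]+\.[0-9]+[a-z]*' "$1" 2>/dev/null | sort -u
}

# Run as: sh banner.sh /usr/lib64/libssl.so.10 /opt/tivoli/tsm/client/ba/bin/libssl.so.0.9.8
for lib in "$@"; do
    printf '%s: %s\n' "$lib" "$(embedded_openssl_version "$lib" | tr '\n' ' ')"
done
```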

7. If you were running a vulnerable web, email, or other server application you should change your SSL keys, because the whole point is that nobody but you should know your private keys. If someone knows your private keys they’ll be able to decrypt your traffic, NSA-style, or conduct a man-in-the-middle attack where they insert themselves between your server and a client and pretend to be you. Man-in-the-middle is difficult to achieve, but remember that this vulnerability has been around for about two years (April 19, 2012) so we don’t know who else knew about it. The good assumption is that some bad guys did. So change your keys. Remember that lots of things have SSL keys: mail servers, web servers, Jabber servers, etc.
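
Changing keys means generating a brand-new private key, not just requesting a new certificate for the old one. A minimal sketch with the openssl command line follows. The function name new_key_and_csr, the file names, the 2048-bit key size, and the host name are placeholders of mine, and your certificate authority will have its own process for reissuing certificates.

```shell
# Hypothetical helper: create NAME.key (a fresh RSA private key) and
# NAME.csr (a certificate signing request for HOST) in the current directory.
new_key_and_csr() {
    name=$1
    host=$2
    # The subshell's umask keeps the new key readable by its owner only.
    ( umask 077 && openssl genrsa -out "$name.key" 2048 2>/dev/null ) || return 1
    openssl req -new -key "$name.key" -subj "/CN=$host" -out "$name.csr"
}

# Run as: sh rekey.sh www www.example.com
if [ "$#" -eq 2 ]; then
    new_key_and_csr "$1" "$2"
fi
```

Do this only after the patched library is in place and the service has been restarted, otherwise the new key is exposed to the same leak.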

8. While you’re messing with all your SSL certs, step up your SSL security in general. A great testing tool I use is the Qualys SSL Labs Server Test, and they link to best practices from the results page.

Good luck.

Upgrading to VMware vCenter Server Appliance 5.5 from Windows vCenter 5.1

by Bob Plankers March 17, 2014 How To

My coworkers and I recently undertook the task of upgrading our vSphere 5.1 environment to version 5.5. While upgrades of this nature aren’t really newsworthy, we did something of increasing interest in the VMware world: switched from the Windows-based vCenter Server on a physical host to the vCenter Server Appliance, or vCSA, which is a […]

What Clients Don’t Know (and Why It’s Your Fault)

by Bob Plankers March 12, 2014 People Stuff

“Whether you work with outside clients or whether you’re part of an internal team your job is always, always going to include having to convince someone of something. Because your job isn’t just making things. Believe it or not, that’s the easy part. You’re going to spend 90% of your time convincing people that shit […]

Update to VMware vCenter Server Appliance & NTP Issues

by Bob Plankers February 13, 2014 Security

Earlier today I posted “VMware vCenter Server Appliance 5.5.0 Has An Insecure NTP Server.” One of the reasons I like VMware is that they’re responsive to customer issues. This situation is no different. I just spoke with a few guys involved in VMware security, and this is what I’ve learned. 1. There has been mitigation information available […]

VMware vCenter Server Appliance 5.5.0 Has An Insecure NTP Server

by Bob Plankers February 13, 2014 Security

Update: I have updated this article to reflect some new information provided by VMware. I have also published new notes and discussion as a separate blog post. On January 10, 2014 a vulnerability in ntpd, the Network Time Protocol daemon, was made public (US CERT VU#348126): UDP protocols such as NTP can be abused to […]

The Lone Bookshelf: The Macintosh Way by Guy Kawasaki

by Bob Plankers February 11, 2014 Books

(This is the inaugural post of my Lone Bookshelf series. Find more posts using the “Books” category) Last summer my family moved to a different house. By itself, moving isn’t that big of a deal. Take everything out of the old house, put it on a truck, unload it into the new house. What is […]

New Java Security Settings: More Proof That Oracle Hates You

by Bob Plankers February 6, 2014 Outright Rant

I began the day yesterday updating to Java 7u51, after which absolutely none of my enterprise Java applications worked anymore. I could not reach the consoles of my Rackspace cloud servers. I could not open the iDRAC console on my Dell PowerEdge. They all exited with some error about the Permissions attribute not being set. Being […]
