Big Trouble in Little Changes

I was making a few changes today when I ran across this snippet of code. It bothers me.

/bin/mkdir /var/lib/docker
/bin/mount /dev/Volume00/docker_lv /var/lib/docker
echo "/dev/Volume00/docker_lv /var/lib/docker ext4 defaults 1 2" >> /etc/fstab

“Why does it bother you, Bob?” you might ask. “They’re just mounting a filesystem.”

My problem is that any change that affects booting is high risk, because fixing startup problems is a real pain. The mount command proves the device mounts, but nothing reads the line appended to /etc/fstab until the next boot. So until the system reboots, the person who executes this won’t know whether it works. If it doesn’t, the boot will stop and sit there waiting for someone with the root password to come fix it. So you’ll have to get a console on the machine and dig up the root password. Then you need to type it in. If it’s anything like my root passwords it’s 20+ characters long and horrible to type, especially on crappy cloud console applets that tend to repeat characters because they were written in Java by a high schooler on a reliable, near-zero latency network, twelve versions of Chrome ago.

Once you’re in, you need to figure out what the problem is, and that’s an even bigger rub. It might be months or, God help you, years between when these commands run and when they get tested in a reboot. Nothing ties the failure back to the change, so you’ll have no idea what the problem is aside from a filesystem issue. And all the while it’s burning up your maintenance window and your chance to do the maintenance you actually intended and scheduled, making you look bad.

But what if we just change it a little?

/bin/mkdir /var/lib/docker
echo "/dev/Volume00/docker_lv /var/lib/docker ext4 defaults 1 2" >> /etc/fstab
/bin/mount -a

Now, when it runs, mount -a reads /etc/fstab and tries to mount everything listed there that isn’t already mounted. That means it actually tests the new entry, and you’ll know right away if it’s wrong.

Slick, eh?
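If you want a safety net on top of that, here is a rough sketch of the same idea with a rollback: copy /etc/fstab, append the entry, run the mount test, and put the copy back if the test fails. The function name add_fstab_entry, the .bak suffix, and the optional second and third arguments are assumptions made for illustration; they are not part of the original snippet.

```shell
#!/bin/sh
# Sketch only. Usage: add_fstab_entry ENTRY [FSTAB] [TEST_CMD]
# FSTAB and TEST_CMD are hypothetical overrides so the logic can be tried
# against a scratch file; by default it edits /etc/fstab and runs mount -a.
add_fstab_entry() {
    entry=$1
    fstab=${2:-/etc/fstab}
    test_cmd=${3:-"/bin/mount -a"}

    # Keep a copy so a bad entry can be backed out.
    cp -p "$fstab" "$fstab.bak" || return 1

    echo "$entry" >> "$fstab"

    # mount -a tries every fstab entry that is not already mounted,
    # so a broken new entry shows up now instead of at the next boot.
    # $test_cmd is left unquoted on purpose so "mount -a" splits into words.
    if $test_cmd; then
        return 0
    fi

    echo "mount test failed; restoring $fstab from $fstab.bak" >&2
    cp -p "$fstab.bak" "$fstab"
    return 1
}
```

Run as root, after creating the mount point, it would be called as add_fstab_entry "/dev/Volume00/docker_lv /var/lib/docker ext4 defaults 1 2". Two caveats: mount -a can also fail because of some other, older entry, and restoring the file does not unmount anything that did get mounted, so read the error before trusting the rollback.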

Are you properly assessing the risk of your changes? Anything that affects booting is high risk, in my opinion. Rebooting properly is the foundation of good patching practices, disaster recovery, automated deployments, and so on.

How do you know the change you’re making actually works? Not just because it worked on a test system, either. How do you know, without a doubt, that it works on each machine you changed?
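One cheap way to answer that question on each machine is to compare what /etc/fstab says should be mounted with what the kernel says is mounted. The sketch below does that with awk. The function name unmounted_fstab_entries and its two optional arguments are hypothetical, and it assumes mount points are spelled the same way in both files (no trailing slashes, for instance).

```shell
#!/bin/sh
# Sketch only. Usage: unmounted_fstab_entries [FSTAB] [MOUNTS]
# Prints every fstab mount point that does not appear in the live mount table.
unmounted_fstab_entries() {
    fstab=${1:-/etc/fstab}
    mounts=${2:-/proc/mounts}

    awk '
        # First file: the live mount table. Remember each mount point (field 2).
        FILENAME == ARGV[1] { mounted[$2] = 1; next }

        # Second file: fstab. Skip comments, blank or short lines,
        # swap entries, and anything marked noauto.
        $1 ~ /^#/ || NF < 4 { next }
        $2 == "none" || $3 == "swap" { next }
        $4 ~ /(^|,)noauto(,|$)/ { next }

        # Whatever is left should be mounted; report it if it is not.
        !($2 in mounted) { print $2 }
    ' "$mounts" "$fstab"
}
```

If it prints nothing, every entry that should be mounted is mounted. It is still not a reboot test, but it can be run on every box right after the change.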

Configuration management tools help immensely, too, but there’s no substitute for thinking critically about the change you’re making, big or seemingly small.

Comments on this entry are closed.

  • I totally agree. With regard to critical thinking, it might also be worth running mount with the -f and -v options beforehand to see *what would be done* during the mounting process. If everything looks OK, you can then proceed with the actual mount.

  • FWIW, I like to take the super-safe approach of only doing this type of work in system build scripts and the script always reboots the system one last time before completing and declaring the system built. But that only works if you only build filesystems when the box is being built.
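The dry run suggested in the first comment could look something like this. With -f ("fake"), mount does everything except the actual mount system call, and -v makes it say what it is doing. The wrapper name dry_run_mount and the MOUNT_BIN override are assumptions for illustration, and a dry run will not necessarily catch every problem a real mount would.

```shell
#!/bin/sh
# Sketch only. Usage: dry_run_mount MOUNTPOINT
# Looks the mount point up in /etc/fstab and goes through the motions
# without mounting anything. MOUNT_BIN is a hypothetical override that
# lets the wrapper be tried with a stand-in command.
dry_run_mount() {
    mountpoint=$1
    "${MOUNT_BIN:-/bin/mount}" -f -v "$mountpoint"
}
```

For the example in the post that would be dry_run_mount /var/lib/docker, run as root after the line has been added to /etc/fstab.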