This is post #1 in my December-long series on Linux VM performance tuning, Tuningmas.
What’s the big deal with partition alignment? In short, misalignment is killing your I/O performance. On a traditionally partitioned disk, the Master Boot Record and the reserved space behind it take up 63 sectors, a holdover from old 63-sectors-per-track disk geometry. Together they occupy sectors 0-62 on disk, and the first partition starts at sector 63. With 512-byte sectors that puts the first partition 32,256 bytes into the disk. The number 63 is a persona non grata in the computer world. It isn’t a power of 2, and it certainly doesn’t line up with your storage’s idea of the world (whether that’s a local SSD, a RAID controller, or a big enterprise array). The misaligned partition has blocks that straddle the stripes on the array, so instead of touching a single stripe the array has to read from, or write to, two.
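To make the arithmetic concrete, here is a minimal sketch that checks whether a partition's starting sector lands on a storage block boundary. The 512-byte sector and 4 KiB block sizes are assumptions for illustration; your array's stripe or block size may well differ, so check its manual.

```python
SECTOR_BYTES = 512  # assumed logical sector size


def misalignment(start_sector, chunk_bytes=4096, sector_bytes=SECTOR_BYTES):
    """Return how many bytes past a chunk boundary the partition starts.

    A result of 0 means the partition is aligned to chunk_bytes.
    """
    return (start_sector * sector_bytes) % chunk_bytes


if __name__ == "__main__":
    # 63 is the legacy start; 64 and 2048 are common aligned starts.
    for start in (63, 64, 2048):
        off = misalignment(start)
        status = "aligned" if off == 0 else "misaligned by %d bytes" % off
        print("start sector %d: byte offset %d, %s"
              % (start, start * SECTOR_BYTES, status))
```

Under these assumptions sector 63 lands 3,584 bytes past a 4 KiB boundary, while sectors 64 and 2048 land exactly on one.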
This isn’t a big problem on one or two VMs, but when hundreds of VMs have misaligned I/O the effect is crippling. In the worst case, for every I/O operation you do at the OS level, you’re really doing two on the back end. That hurts performance and deduplication, and on SSDs it reduces lifespan because SSDs have a limited number of writes they can do. Do twice as many writes as you meant to and your SSD lives half as long. Seriously. It also fills your disk cache with twice as much stuff, which means it’ll be half as useful (or less).
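The doubling can be counted directly. The sketch below is a simplified model, not a measurement: it assumes contiguous guest I/Os of a fixed size and a back end that works in fixed-size chunks, and it simply counts how many chunks each I/O touches. The function name and the sizes are made up for illustration.

```python
def backend_ops(start_sector, io_bytes, chunk_bytes, n_ios, sector_bytes=512):
    """Count back-end chunks touched by n_ios contiguous guest I/Os.

    Each guest I/O is io_bytes long and is issued on its own, starting
    at the first byte of the partition and moving forward.
    """
    base = start_sector * sector_bytes
    total = 0
    for i in range(n_ios):
        first = base + i * io_bytes   # first byte of this I/O
        last = first + io_bytes - 1   # last byte of this I/O
        total += last // chunk_bytes - first // chunk_bytes + 1
    return total


if __name__ == "__main__":
    n = 1600
    print("4 KiB chunks, start 63: ", backend_ops(63, 4096, 4096, n))
    print("4 KiB chunks, start 64: ", backend_ops(64, 4096, 4096, n))
    print("64 KiB chunks, start 63:", backend_ops(63, 4096, 65536, n))
```

In this model, 1,600 guest I/Os of 4 KiB become 3,200 back-end operations when the partition starts at sector 63 and the back end uses 4 KiB chunks, versus 1,600 when it starts at sector 64. With 64 KiB chunks only the I/Os that cross a chunk boundary are doubled (1,700 operations), so the exact penalty depends on how your storage lays out its blocks.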
To this day I run into VMs that are misaligned (and people who argue with me that they don’t need to be aligned, which is rarely true for block storage, by the way). I also encounter tuning advice that skips alignment entirely. Please! Read your manuals and align your partitions! If you are using a recent OS, like Red Hat Enterprise Linux 6 or Microsoft Windows 7 or Server 2008, it’ll auto-align things for you. Otherwise refer to your array manual, use good ol’ fdisk to fix things on new installations, or use the NetApp tools mbrscan & mbralign that are part of the NetApp Host Utilities. Nick Weaver also has released UBERAlign, a storage-agnostic tool to help with this problem, though I have not worked with it. It also does not support LVM setups, which is a traditional failing of automated tools.
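If you just want to see whether a running Linux VM has misaligned partitions, a rough sketch like the following can help. It reads the starting sector of each partition from sysfs (the start file under /sys/block, which is reported in 512-byte units) and checks it against an assumed 4 KiB boundary. It is not a replacement for mbrscan or your array vendor's tools, and the function names are my own.

```python
from pathlib import Path


def partition_starts(sys_block=Path("/sys/block")):
    """Yield (partition_name, start_sector) for each partition in sysfs."""
    if not sys_block.is_dir():
        return
    for disk in sorted(sys_block.iterdir()):
        # Partition directories contain a "partition" file; others do not.
        for marker in sorted(disk.glob("*/partition")):
            part_dir = marker.parent
            start = int((part_dir / "start").read_text())
            yield part_dir.name, start


def report(chunk_bytes=4096, sys_block=Path("/sys/block")):
    """Print an alignment verdict for each partition (assumed chunk size)."""
    for name, start in partition_starts(sys_block):
        off = (start * 512) % chunk_bytes  # sysfs start is in 512-byte units
        verdict = "aligned" if off == 0 else "MISALIGNED"
        print("%-10s start sector %-10d %s" % (name, start, verdict))


if __name__ == "__main__":
    report()
```

Pass a larger chunk_bytes to report() if your storage uses a bigger block or stripe size than the 4 KiB assumed here.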
If you are using Logical Volume Management (LVM) on your VM you could also use pvmove to help you align a VM. Add a scratch virtual hard disk, align it properly, pvcreate, and then add it to your volume group. Use pvmove to migrate all the data from your misaligned LVM partition. Use vgreduce to get the misaligned volume out of the volume group, then use fdisk to fix the alignment (you might need to reboot here to pick up the new partition table). Then just pvcreate the re-aligned partition, vgextend, and pvmove back off the scratch volume. Finish with a vgreduce of the scratch partition and a shutdown to remove the scratch disk from the VM. I’ve used this a lot, especially with P2Vs, and while it won’t correct the alignment of /boot there isn’t much I/O for /boot, either, making it a non-issue.
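The pvmove shuffle above can be written down as an ordered plan. This sketch only builds and prints the commands; it does not run anything. The volume group name vg_root and the devices /dev/sda2 (misaligned) and /dev/sdb1 (scratch) are hypothetical placeholders, and the fdisk step stays manual because it depends on your disk layout.

```python
def realign_plan(vg, misaligned_pv, scratch_pv):
    """Return the realignment steps, in order, as shell-style strings.

    Lines starting with '#' are manual steps, not commands.
    """
    return [
        "pvcreate %s" % scratch_pv,                   # prepare aligned scratch PV
        "vgextend %s %s" % (vg, scratch_pv),          # add it to the volume group
        "pvmove %s %s" % (misaligned_pv, scratch_pv), # migrate data off the old PV
        "vgreduce %s %s" % (vg, misaligned_pv),       # drop the old PV from the VG
        "# manual: fix the partition start with fdisk; reboot if needed",
        "pvcreate %s" % misaligned_pv,                # re-initialize the fixed partition
        "vgextend %s %s" % (vg, misaligned_pv),       # add it back
        "pvmove %s %s" % (scratch_pv, misaligned_pv), # migrate data back
        "vgreduce %s %s" % (vg, scratch_pv),          # release the scratch PV
        "# manual: shut down and remove the scratch disk from the VM",
    ]


if __name__ == "__main__":
    # Hypothetical names; substitute your own VG and devices.
    for step in realign_plan("vg_root", "/dev/sda2", "/dev/sdb1"):
        print(step)
```

Printing the plan first and running the steps by hand is deliberate: every one of these commands touches live data, so take a backup or snapshot before you start.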
This series is inspired in part by SysAdvent, organized by Jordan Sissel, which is 25 days of great sysadmin tips & tricks. I suggest going and checking it out, and then checking out the inspiration for that, which is the Perl Advent Calendar. Then come back here tomorrow for more Tuningmas!