Category: System Administration

LMGTFY »

At least once a day I have this conversation, usually with one of the same four people:

“Hey Bob, do you know what ‘pam_tally(sshd:setcred): unknown option: no_magic_root’ means?”

“Have you tried searching for the answer yourself?”

“No. Why?”

<uncomfortable pause goes here>

As such, I adore the creators of “Let Me Google That For You.”

http://lmgtfy.com/

(hat tip to my friend Terry Bradshaw for finding this one).

Windows Losing its Default Printer »

For months now my Windows Vista desktop, and now my Windows Server 2008 desktop, has been losing its default printer every night. I couldn’t figure it out until now: it’s the Remote Desktop Client remapping my printers when I connect from home.

There are three fixes for this:

1. You can tell your RDP client to not map printers, in the “Local Resources” options tab. This is easy but you have to remember to do it.

2. On the host side on Windows Server 2008 you can go into Administrative Tools->Terminal Services->Terminal Services Configuration, right click the RDP-TCP connection, pick “Properties,” and disable printer mapping under the “Client Settings” tab.

3. On the host side on Windows Vista you can follow Microsoft KB article 268065 and add a registry value in HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Terminal Server\Wds\rdpwd.

Now you know, and as they say in G.I. Joe, knowing is half the battle.

Why I Like Wikis »

Around the places I work (and the non-profits I work with, too) we use Atlassian’s Confluence wiki software. With Atlassian offering $5 licenses for Confluence and Jira through the end of the week I thought it’d be a great time to write about how we use our wiki.

We love wikis because:

1. They let us easily add & maintain documentation. Let’s say that someone finds an error in a document I wrote (it happens). With old static documentation, like HTML, someone would need to find me and tell me how to fix it. Then I’d need to actually fix it, and re-publish it. With a wiki the person who finds the problem can click “edit” and fix it themselves without using two staff members’ time.

2. I can tell who edited what, and what they changed. When I discover that someone changed my document I can tell who did it, when they did it, and I can compare their version to my version to see what changed. I can also “watch” certain documents so I get notified when they do change.

3. Meeting agendas are easy to maintain. Need something on the agenda for the weekly team meeting? Add it yourself. Likewise, during the meeting I can take notes right in the meeting agenda page, and avoid needing to transcribe my notes later (if ever).

4. I can integrate other media, like mailing lists, calendars, and checklists, with my documents. Many of the teams I work with have a web presence, a documentation repository, and a mailing list. A wiki like Confluence can pull all those together into a single searchable archive. Where was that list of servers with expiring warranties? Email? Wiki? Who cares — just search for it. Need a checklist? Just create one as a wiki page. Need a project calendar? Just use the built-in calendar plugin.

5. No more format fights. Some of my coworkers insist on using text files for documentation. Some use text files with RCS. Some with CVS. Some with SVN. Some do the same with HTML. Some love Microsoft Word. Some use OpenOffice.org Calc, or Excel. And with Excel are you saving as .xls or .xlsx? ARGGGH. The wiki ended all of those format fights, especially since tables can have sortable column headings. Additionally, we licensed the Gliffy plugin, which lets us do basic diagramming right in a wiki page. No, it isn’t as powerful as Visio, but most of the time we don’t need all that power, either.

6. No more document storage fights. Before the wiki we had documents in home directories. In with applications themselves. Out on the file server shares. As HTML. On desktops of servers. On desktop workstations. It was a mess, especially when someone was out of the office and we needed that data. Now we have a single place for our documentation, which gets backed up centrally, and is easily searchable.

7. While wikis are best when open, sometimes you only want to open them to a few people. Confluence has a permission model that lets you give some folks access and keep others out. It also has good anti-spam features, including methods like CAPTCHA, to go with the discussion features.

Atlassian has always offered limited-time evaluation licenses, but for us it’s taken over a year to really get all of the different parts of our organization using the wiki. A $5 renewable starter license for 5 users is a great, zero-risk way to get started with a world-class wiki application, especially for a small IT department. Heck, you get support, too. The same goes for Jira, a great issue tracker that integrates with Confluence. All in all, something you probably shouldn’t pass up.

Servers Too Cold? »

Dear Readers,

Has anybody ever had a server get too cold? I’ve seen them get too warm, but there’s very little data about the cold end of things. Can anybody tell me what happens?

I’m mainly talking about servers. We do have some IBM tape drives that don’t like the cold, but that’s understandable. In the cold I’d expect issues with fans, especially the cheap ones with sleeve bearings. What else?

Useful Error Messages »

The Dell Server Update Utility script on Linux is really helpful when it can’t run:

The Software Update Utility was unable to collect inventory on this system.
[/mnt/suu/./bin/Linux/invcol]
which: no lockfile in (/usr/local/sbin:/sbin:/bin:/usr/sbin:/usr/bin)
invcol Error: Cannot find utilities on the system to execute Inventory
Collector.
Make sure the following utilities are in the path: tar gzip tail rm mkdir
chmod ls basename wc lockfile stat

exiting SUU application ...

This is exactly how errors should be: informative. Instead of quitting suddenly and quietly it’s nice that it tells you what utilities should be installed and in the path, and the “which” error tells you exactly what isn’t there. Excellent.
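The same kind of up-front check is easy to sketch in your own scripts. What follows is a minimal Python illustration, not Dell's actual implementation: the function name `missing_utilities` is made up for this example, and the utility list is simply copied from the error message above.

```python
import shutil

# Utility names copied from the SUU error message above.
REQUIRED = ["tar", "gzip", "tail", "rm", "mkdir", "chmod",
            "ls", "basename", "wc", "lockfile", "stat"]


def missing_utilities(names, path=None):
    """Return the names that cannot be found on the search path.

    With path=None, shutil.which() searches the PATH environment variable.
    """
    return [name for name in names if shutil.which(name, path=path) is None]


if __name__ == "__main__":
    missing = missing_utilities(REQUIRED)
    if missing:
        # Name exactly what is missing instead of failing quietly.
        print("Cannot find these utilities in the path:", " ".join(missing))
    else:
        print("All required utilities were found.")
```

Reporting the specific missing names, rather than a generic failure, is what makes the original message so useful.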

Arbitrary Milestones »

Scott Lowe’s post this morning echoes something I learned a few weeks ago myself: Windows Server 2008 was released as SP1. And I had the same thought he did: WTF.

For this we have to thank all of those organizations that have chosen arbitrary releases as the first time they’ll touch new software. From operating systems to disk array firmware, it seems that a lot of system administrators will only start looking at a new version once it’s had a service pack or patch set released for it. And as a result we now have software vendors gaming the system by releasing first versions as SP1.

I’d like to share with you all a little secret: all software has bugs. Service packs introduce new and potentially buggy features and drivers along with all the fixes for known problems. Heck, sometimes even the fixes need fixes, because the service pack didn’t get as much testing as the beta for the OS itself. As such, waiting for an arbitrary release means that you trade time in the lifecycle of the software for the perception of stability. The time spent waiting for SP1 would be better used developing quality test environments, and quality patch management infrastructure. Then you’d be able to tell for yourself if SP1 is actually any better than the original release was, deploy it promptly when you decide it’s good, and deploy fixes efficiently when you need them. Facts & strategies, versus hearsay & mythology.

As for Microsoft’s gaming the system, I don’t blame them at all. Organizations have decided to be completely arbitrary about software evaluation, so why can’t vendors do the same?

How File Deletions Work »

Q: I deleted a bunch of files from one of my virtual machines yesterday. Deduplication happened overnight, but the total disk space in use didn’t go down. That doesn’t make any sense.

Q: I completely evacuated one of the LUNs on my NetApp array, but the NetApp still says that the LUN is almost completely full, even after deduplication. How can that be?

A: To understand what is happening you need to know a little bit about how a file system works. A simple way to explain it is that a file system stores the data in a file as data blocks, and it stores the name of the file (and other data, like access times, etc.) in a directory block. The entry in a directory block points to where all the file’s data blocks are.

Files are found on disk via their directories, and when you move a file from one directory to another that file’s entry gets added to the new directory block, and removed from the old one. All the data blocks that belong to the file stay put, because it’s only the directory blocks that need to change.

When you rename a file the file system just changes its name in the directory block, and leaves all the data alone.

When you want to delete a file all the file system has to do is remove the entry in the correct directory block. Once that happens the file is “gone.” However, the file system does nothing to remove the data blocks that were part of that file. They’re still out there on disk, just not visible to the file system.

This is why deduplication doesn’t instantly shrink your disk space. All that data is still out there, just like it was before; it’s just that your OS can’t see it anymore. In the case of VMware you also have to remember that not only do you have file systems in your VMs, but VMFS is a file system, too, with these same properties. That is why it’s possible to have a completely empty VMFS volume while your NetApp array complains that the LUN/Volume/etc. is full.
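To make the directory-block idea concrete, here is a toy model in Python. It is a deliberately simplified sketch, not how any real file system (or VMFS, or a NetApp) is implemented. The class `ToyFS`, its methods, and the free-block list are all invented for illustration.

```python
class ToyFS:
    """Toy file system: a directory maps file names to data block numbers."""

    def __init__(self, total_blocks):
        self.blocks = [b""] * total_blocks     # what the storage array sees
        self.directory = {}                    # name -> list of block numbers
        self.free = list(range(total_blocks))  # assumed free-block list

    def write(self, name, chunks):
        numbers = []
        for chunk in chunks:
            n = self.free.pop(0)
            self.blocks[n] = chunk
            numbers.append(n)
        self.directory[name] = numbers

    def rename(self, old, new):
        # Only the directory entry changes; the data blocks stay put.
        self.directory[new] = self.directory.pop(old)

    def delete(self, name):
        # Drop the directory entry and mark the blocks reusable,
        # but leave the old contents sitting in those blocks.
        self.free.extend(self.directory.pop(name))

    def blocks_used_per_fs(self):
        return sum(len(nums) for nums in self.directory.values())

    def blocks_with_data_per_array(self):
        return sum(1 for b in self.blocks if b.strip(b"\0"))


fs = ToyFS(total_blocks=8)
fs.write("vm.log", [b"aaaa", b"bbbb", b"cccc"])
fs.delete("vm.log")
print(fs.blocks_used_per_fs())          # 0: the file system thinks it is empty
print(fs.blocks_with_data_per_array())  # 3: the array still sees old data
```

The gap between those two numbers is the space the array cannot reclaim on its own.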

If you want to have deduplication reclaim the space you have to actually overwrite that old data with something that’s easy to deduplicate, like a huge file full of zeros. On Linux you’d do something like what Leo Raikhman suggests as his zero-out script, and on Windows you can use sdelete to do the same.
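The zero-fill idea can be sketched in a few lines of Python. This is not Leo Raikhman's script and not what sdelete does internally; it is a minimal illustration under my own assumptions. The function name `zero_fill`, the file name `zero.fill`, and the `max_bytes` safety limit are all hypothetical. Pass `max_bytes=None` only if you really mean to fill the volume, and remember that a full volume can upset anything else writing to it.

```python
import os


def zero_fill(directory, max_bytes, chunk_size=1024 * 1024):
    """Write zeros into a scratch file in `directory`, sync it, then delete it.

    Stops after max_bytes, or when the volume is full if max_bytes is None.
    Returns the number of bytes written.
    """
    path = os.path.join(directory, "zero.fill")  # hypothetical scratch file name
    zeros = bytes(chunk_size)
    written = 0
    try:
        # Unbuffered, so each write() reports how many bytes it really wrote.
        with open(path, "wb", buffering=0) as f:
            try:
                while max_bytes is None or written < max_bytes:
                    if max_bytes is None:
                        n = chunk_size
                    else:
                        n = min(chunk_size, max_bytes - written)
                    written += f.write(zeros[:n])
            except OSError:
                # Usually "No space left on device": the free space is zeroed.
                pass
            os.fsync(f.fileno())  # push the zeros to storage before deleting
    finally:
        if os.path.exists(path):
            os.remove(path)
    return written
```

A cautious first run inside a VM might be `zero_fill("/tmp", 100 * 1024 * 1024)`, which overwrites at most 100 MB of free space with zeros and then removes the scratch file.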

Almost 1234567890 »

OMFG, if it hadn’t been for my friend Maitri I would have completely missed the UNIX timestamp becoming 1234567890.

It isn’t too late, though, to script a loop around:

perl -e 'print time(),"\n"'

and witness computing history. :-) That’s a lot of seconds since midnight on January 1, 1970.
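For anyone who would rather count down than stare at raw numbers, here is a small Python sketch along the same lines. The function names `seconds_until`, `as_utc`, and `watch` are my own inventions for this example.

```python
import time
from datetime import datetime, timezone

TARGET = 1234567890  # the timestamp everyone is waiting for


def seconds_until(target, now=None):
    """Whole seconds from now (or a supplied timestamp) until target."""
    now = time.time() if now is None else now
    return target - int(now)


def as_utc(timestamp):
    """Convert a UNIX timestamp to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


def watch(count, delay=1.0):
    """Print the current UNIX timestamp `count` times, `delay` seconds apart."""
    for i in range(count):
        print(int(time.time()))
        if i < count - 1:
            time.sleep(delay)


if __name__ == "__main__":
    print(as_utc(TARGET).isoformat())  # 2009-02-13T23:31:30+00:00
    print(seconds_until(TARGET), "seconds to go (negative means you missed it)")
    watch(3)
```

In UTC the big moment lands at 23:31:30 on February 13, 2009, so adjust for your own time zone.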