Audit By Shutting Stuff Down

Sometimes the only way to tell if someone is using something is to shut it off. You can send notices, post web pages, and announce in meetings that you’re going to take a web server down, delete an email alias, or decommission a server. The hour after you’re done someone will come up to you and need it turned back on. Or, sometimes it takes a month. Maybe I’m not saying the right words in my announcements. Maybe I should pore over tcpdump output to determine who is using something. Maybe I should trust my users less to know what they are doing. What I do know is that a trial outage will always be part of my plan from …

Wait As Long As Possible To Purchase Hardware

It seems to me that the longer you wait to purchase server hardware, the more bang for your buck you’ll get. Whole generations of technology come and go inside a single calendar year. Vendors release new products constantly, driving the price of the old technology down. The $25,000 you were going to spend to get X performance will get you 1.5 * X in a year. I’m putting together quotes for $7,000 replacements for four-year-old $50,000 servers. Another way to look at it is to buy only what you need at the time. Deploying an application and need a development environment? Buy just a development environment. Wait to buy the production hardware until you need it, and run …

I Heart Small Teams

Why do I like small teams? Communication overhead, for starters, which is something that Frederick Brooks writes about in The Mythical Man-Month (it gets compared to Metcalfe’s Law, though that one is really about the value of a network). Basically, the number of communication paths in a group of n people is n(n-1)/2, so the amount of communication necessary grows approximately as the square of the number of people in the group. As you add people to the group you start needing larger meetings, wikis, politics, etc. All of that takes time and energy, and that time and energy isn’t going directly towards the end result. Second, the more people on the team, the harder it is to manage roles. It is likely that two members will have overlapping skill sets. When people have similar skill sets, sometimes they end up in competition with each other. The other …

Spreadsheets As System Administration Tools

I think spreadsheet software is an oft-overlooked system administration tool. You can do simple lists of things, do budgets, graph performance, and most of all you can use the autofill features as a substitute for one-off, run-once scripts. Sure, I think everybody should know how to use Perl and the various shell utilities like awk to do things, but why pull out the big guns when you need to do something once? Besides, a lot of folks have trouble with recursion, and it’s easier to see exactly what will happen (or at least the commands you’re about to run) with a spreadsheet. I might be a heretic to some, but I’d rather folks, especially junior admins, feel comfortable with their …

Meetings Cost You Money

Number of staff from my group in your meeting: 2
Number of meetings you have per year: 50
Scheduled length of meeting (what you get billed for): 1 hour
Average actionable/discussable agenda items per meeting: 0.25 (once a month)
Total cost to you for us to attend: 2 * 50 * $80 = $8000
Total cost to you for monthly meetings: $2000
Savings: $6000
Your team’s budget shortfall (which started this whole discussion): $3000
Amount you still save after the shortfall is removed: $3000
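
If you want to plug in your own numbers, here is a rough sketch of the same arithmetic in Python. The $80 hourly rate is the one implied by the figures above; the function and variable names are made up for illustration, and the "monthly" figure is computed the way the numbers above suggest: only a quarter of the 50 meetings (12.5 per year) have something actionable on the agenda.

```python
HOURLY_RATE = 80  # dollars per person-hour, taken from the "$80" in the figures above


def meeting_cost(staff, meetings_per_year, hours_per_meeting=1, hourly_rate=HOURLY_RATE):
    """Yearly bill for sending `staff` people to a recurring meeting."""
    return staff * meetings_per_year * hours_per_meeting * hourly_rate


# Current arrangement: 2 people, 50 one-hour meetings a year.
weekly_cost = meeting_cost(staff=2, meetings_per_year=50)

# Only 0.25 of those meetings have an actionable item, so meet that often instead.
useful_meetings = 50 * 0.25
monthly_cost = meeting_cost(staff=2, meetings_per_year=useful_meetings)

savings = weekly_cost - monthly_cost
budget_shortfall = 3000
left_over = savings - budget_shortfall

print(f"Weekly meetings:  ${weekly_cost:,.0f}")
print(f"Monthly meetings: ${monthly_cost:,.0f}")
print(f"Savings:          ${savings:,.0f}")
print(f"After shortfall:  ${left_over:,.0f}")
```

Change the staff count, the rate, or the fraction of useful meetings and the savings line moves accordingly.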

Delete Accounts You Don't Need

Some of my coworkers believe that accounts should never be deleted, just locked. Some of my coworkers believe that accounts should always be deleted. I like a combination of the two. First, lock the account. This should tell you if there are programs running as the user, crontab entries, etc. After a few weeks remove the account. If the account is gone there is no chance it’ll get unlocked somehow, get hacked, send spam, conflict with another UID, or make your life difficult in the future.
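
As a rough sketch of that lock-first, delete-later routine, the Python below only builds the list of commands for one account; it does not run anything. It assumes a Linux box with the usual shadow-utils tools (`usermod`, `userdel`) plus `ps` and `crontab`; the function name, the 21-day grace period, and the example account name are made up for illustration, so adjust them for your own site and OS.

```python
import shlex


def retirement_plan(username, grace_days=21):
    """Return (phase, command) steps for retiring an account. Nothing is executed."""
    user = shlex.quote(username)  # guard against odd characters in the name
    return [
        # Step 1: lock the password and expire the account so logins fail.
        ("lock", f"usermod --lock --expiredate 1 {user}"),
        # Step 2: see what is still running or scheduled as that user.
        ("audit", f"ps -u {user}"),
        ("audit", f"crontab -u {user} -l"),
        # Step 3: wait and listen for complaints (grace_days is an assumption).
        ("wait", f"# wait {grace_days} days"),
        # Step 4: remove the account for good.
        ("remove", f"userdel {user}"),
    ]


if __name__ == "__main__":
    for phase, command in retirement_plan("olduser"):
        print(f"{phase:7} {command}")
```

Printing the plan instead of running it means a second admin can review the commands before anything is locked or removed.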

Nagios: Sun T2000 vs. Dell PowerEdge 2950

Hey web, I am doing a Nagios deployment. I need to decide on hardware, but I can’t. In short, do I run Nagios on a Sun T2000 or a Dell PowerEdge 2950? It’d be Nagios 2.9, with about 30,000 services monitored (500 hosts * 60 service checks). The T2000 would be something like eight 1 GHz cores (32 “CoolThreads”), 8 GB RAM, etc. with Solaris 10. The 2950 would be something like dual quad-core Intel X5355s, 8 GB RAM, etc. with Red Hat Enterprise Linux 5. Is there anybody out there who has anything to say either way about this? The only information I can find about Nagios on a T2000 is an old white paper about OpsWare, and they …

My Server Naming Scheme Rules

It seems like everybody has their own idea of the best way to name computers. Some people like functional names, like HQPRINT01 or MSP-SALES. Some people like unique names, like LARRY, VOLTRON, or STRONGBAD. I like to mix the two, with each host getting a unique name plus any “service” names it might need for customer interaction. That way if I repurpose a machine, or add a service, I can just give it a new service name. Likewise, if I move a service I can move the DNS entry for the service without having to reconfigure every client. Regardless of what you like individually a sysadmin team should have a clear policy about naming things, so names are uniform. RFC …

Mozilla Minefield

Being a fellow who leaves his browsers open with 50+ tabs, I’ve decided that I’m tired of some of the problems with Firefox 2.0, most notably the memory leaks. I refuse to go back to IE, so I’ve been trying the nightly Firefox builds, dubbed “Minefield.” Sure, some things don’t work quite right sometimes (I am without a Flash plugin, oh no![0]), but then again some of the things I hate about 2.0 have gone away. The coolest thing about Minefield is that the auto-updater works for new nightly builds, too. Every couple of days I get a window asking me to install the latest Minefield 3.0a5pre. Sweet! Check it out for yourself if you’re feeling adventurous.[1] [0] I …

IBM Competes With VMware's VMotion in POWER6 Hosts

IBM announced a bunch of stuff yesterday:

1) IBM announced their POWER6 CPUs, with clock speeds up to 4.2 GHz. Sweet. These machines will run AIX, SUSE Linux Enterprise Server 10, and Red Hat Enterprise Linux 4. The mainframe folks at IBM have been working on getting the open systems outfitted with a lot of the technology that makes mainframes so reliable, and all that work is paying off.

2) They announced AIX 6, which will feature Solaris Zone-like functionality, called “Workload Partitions,” among a number of incremental updates to support the new hardware. They are also going to do an open beta. Looks like they are learning from Sun.

3) IBM announced their “Live Partition Mobility” where you can …
