SAGE - Sage feature


On Reliability - What About Yourself?


by John Sellens
<[email protected]>

John Sellens has recently joined the Network Engineering group at UUNET Canada in Toronto after 11 years as a system administrator and project leader at the University of Waterloo.


In past articles on reliability, I've talked about general principles of reliability, computing hardware, networking, and some aspects of system administration. Most of those things are really quite tangible: if you can't put your hands on them physically, you can at least copy them to a printer or a tape drive and hold them in your hands that way.

Since I wrote the last article (for the December issue, publishing deadlines being what they are), I've been to the 11th LISA conference in San Diego, where we spent a lot of time (more than usual) talking about management, motivation, and people issues. Since returning home, I've found myself doing some reading on management and people and thinking more about the people issues that we face in our jobs (and other activities). And I've spent a heck of a lot of time in meetings, working with people, and thinking about motivation, coordination, and how people can really enjoy their work.

So I find myself here with my laptop on my daily commute on the intercity bus and with the Christmas holidays and a new year looming up before me, composing a reliability article with a different flavor this month. I'm compelled to consider, from a purely amateur point of view, personal reliability. By that I mean to consider how we interact with our co-workers, vendors, customers, and, to a lesser extent, friends and families. How does one act "reliably"?

How is this relevant to system administrators and computing professionals in general? How does this help to make our computer systems and networks run better and more effectively? System administration is very closely tied to personal interaction, with individuals and with groups, and sometimes with people that you will never see or talk to directly. I'll try to give a few examples of why I think that is the case and why reliability and trust are important.

System administration is a service activity: we supply the computing resources so that other people can do their work (or play). We solve problems for people, we design systems and software to serve people, and we help people learn to accomplish their computing tasks in the most effective ways. Any time we install a new command, send out a notice or advisory message, or answer the phone on the help desk, the underlying end product is (almost always) a service for some person or group. When we take a system down for maintenance, submit a request for more funding for more equipment, design a mission-critical computing environment, start fixing a computer or network problem, or propose a solution to suit someone's needs, we're asking for trust: trust that we are using good judgment, trust that we are knowledgeable and competent, and trust that our intentions are good. In short (and I'm sure you've been waiting for this), we are asking others to rely on us. And that's where reliability comes into things this time around.

Why is it important to be reliable? Quite simply, if we are to call ourselves "professionals," we must rely on our reputations, and the most important part of a (positive) reputation is the trust that people can place in us, our judgment, and our abilities. If we cannot be relied upon, all of our experience and abilities will be far less valuable to our customers and co-workers. The ability of others to rely on us is the foundation of the value that we bring to the profession of system administration.

How do you demonstrate your reliability? How do you earn the trust of your constituents? I think the most important piece of advice is to avoid the "us vs. them" mentality that we see (or hear about) all too often. Recognize that you and your users are (or should be) working toward the same goals and toward the success of your enterprise. Although the goals and needs of different groups sometimes seem to be at odds, a little goodwill and effort to understand will make it far easier to work together toward the best solutions.

Consider the other people in your organization, and work to understand their concerns and needs. System administration is not done in a vacuum: a system is only as worthwhile as the systems and solutions that it provides. A beautiful, carefully designed, "perfect" computing system is useless if it is conceptually pure but unsuited to solving the problems at hand.

When interacting with customers or others in your organization, be honest and open. If there's a problem, admit it; and if it's a result of something you did (or didn't do), own up to it, and take responsibility. Any short-term pain will be far outweighed by the long-term gain as your users trust and rely on you. Say what the problem is (or was) and what you've done to keep it from happening again. Give advance warning when you're about to change something, and be realistic about expected downtimes. And remember to follow through: do what you said you would do, when you said you would do it. And finally, be proactive: talk with your users, solicit their feedback and concerns, and act on them. Earn their trust, and you'll be far better off in the long run.

And if the word "lusers" is a part of your vocabulary, you might want to reconsider your use of it.

If you're a manager or leader of system administrators, can the people in your group rely on you? Are you supportive, understanding, fair? Do you send people home when they are sick, or do you tell them to "tough it out"? Are you an advocate for your co-workers? Do you defend them if they're being attacked (deservedly or not)? Do you champion them in interactions with other groups and higher-ups? Do you fight for appropriate conference and training budgets and extra pay or comp time when they work overtime? Can the people in your group rely on you? And finally, allow me to offer some words from Dee Hock, founder and CEO emeritus of Visa: "If you don't understand that you work for your mislabeled 'subordinates,' then you know nothing of leadership. You know only tyranny."

When I started in system administration years ago, I spent a lot more time concentrating on my "relationship" with the machines. These days, I spend a lot less time dealing with machines and a lot more time dealing with the people who surround them. I'm starting to learn which of those relationships is the more complicated and the more rewarding and where the true value and the true satisfaction lie. (The machines really don't care whether I'm reliable or not, so long as I keep the AC power coming and the backup tapes loaded.)

Well, that's enough of that. I suspect that I've been "preaching to the choir" a little bit here. Next time I promise something a little more concrete that you can sink your teeth into: backups, restores, and disaster recovery.


4th February 1998 efc