


STEP 2: END USER SUPPORT
Every seasoned NT administrator has experienced this one: You're at your desk, on the verge of solving that server performance problem that's been nagging you for weeks, when the VP of Finance walks up and rather testily informs you that she is unable to print. Read: hold everything! It's a fire! Next thing you know, you've spent an hour removing and reinstalling the VP's print drivers and undoing the damage her free online service software did to the PC's network setup, and you can't even remember your fantastic server performance solution.
These fires are sometimes frustrating, but it's key for Windows NT administrators and engineers to remember that the end users are our raison d'être. They are our customers: why we do what we do. Many of them also approve our budget. Even if you're lucky enough to be insulated from the end users by a first-tier help desk, chances are that you'll be called upon to provide escalation support at least once a day.
Here are some techniques I've learned that help me dispatch fires as they come up and hopefully prevent new ones.
When a support issue comes up, whether it's bad print drivers or users unable to connect to the network because of a server crash, always jot down a little note to yourself before jumping up from your desk to fight the fire. Five times out of six, before I started doing this, I'd forget what I was working on when I got back to my desk after correcting the problem.
Remember your manners
Even if you're interrupted in the middle of a major breakthrough by what you may consider a minor problem, remember that what seems minor to you may be a major problem for the user. Think back to the days when you were an end user, and how you felt if a technical support person blew you off because they considered your problem inconsequential. Nobody wants to be Dogbert the mean network administrator, and in the real world, brushing off end users' problems can be severely career limiting. Keep your role as doctor in mind, and solve the problem quickly and professionally. You'll get back to your breakthrough soon enough.
Educate the user
If possible, explain to the user the steps you're taking to solve their problem. An educated user is a happy user. Obviously some users will not be interested in this level of detail, preferring to be notified when the darn thing works, but many users will appreciate this extra information. The same goes for first-level tech support people: If they learn how to solve the problem, chances are they won't be escalating it to you next time. Also, many help desk staffers are grateful for the chance to observe more senior administrators. I started my IT career on a help desk, and every chance I got, I went into the server room and watched as the Windows NT administrators solved a problem. This exposure proved very valuable in later years.
Document your solutions
After fighting a fire, it's always a good policy to write an e-mail message or note to yourself explaining the problem and the solution, so you'll have that knowledge available in case the problem crops up again. I can't count the times I've been trying to solve a nagging support issue and got the feeling that I had dealt with the same thing several months before. Had I documented the problem and resolution, I wouldn't have had to reinvent the wheel. If your company has a help desk application or a homegrown solutions database, enter the situation so that the information will be available next time. If the solution was particularly interesting or you think the issue will come up again, share the information with other members of your group, so they'll be in the know if they're tapped to solve the problem later.
Like it or not, end user support is a big aspect of any Windows NT Server administration career. If you have the right attitude and follow the proper procedures, you can learn a lot from those day-to-day fires.
STEP 3: SERVER TROUBLESHOOTING
Another big part of any Windows NT professional's job is server troubleshooting. Despite what Microsoft would have you believe, NT doesn't run trouble-free at all times, and external influences as well as internal software errors can cause problems with the system. When these issues come up, it's key to remember that you, as the administrator, have many different tools at your disposal. I'll touch on each of these briefly a little later.
Right now, my coworkers and I are trying to narrow down the software villain that's causing our NT server to crash every couple of weeks. A reality of Windows NT networking is that no matter how fault-tolerant you make your server hardware (adding redundant power supplies, RAID arrays, load-balanced network adapters, and so on), there's usually a software bug that can bring all of that crashing down.
When this particular server dies, user impact is very heavy, as the majority of users keep their data on this box. Every two weeks or so, with no regularity or predictability, the server's network response time will begin to slow down, and soon it will quit servicing network users entirely. We'll run into the server room, and the console is unresponsive. There's nothing left to do but perform a dirty reboot. When the server comes back up, we immediately check the NT event logs and the Compaq hardware logs for errors, and yes, you guessed it: nothing. These are the server problems that try administrators' souls and sometimes cause us to question our motives for going into such a crazy business. Luckily, the problem-solving tools are there. Here is how we are applying them on my network.
Documentation
One of the first things to ask when a new server problem crops up is whether anything changed on the server just before the problem began. Windows NT Server can be a fickle system, and even the most innocuous change has the potential to send it into a tailspin, sometimes for unexplained reasons. Keeping a detailed log of any and all changes to each server on your network can save you countless hours of troubleshooting. In my office, we have an Excel spreadsheet with separate sheets for each server, and we record the date, time, and nature of each change to the server, from installing new software to scheduled or unscheduled reboots, adding new drives or other hardware, and so on. These logs have proved very valuable in the past, when a change we made one week caused problems the next. Unfortunately, this technique hasn't helped us with the current challenge, so we moved on to the process of elimination.
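The change log described above lives in an Excel spreadsheet; the same idea can be sketched in a few lines of Python for anyone who prefers a plain CSV file. This is only an illustration under stated assumptions: the column layout, the function names record_change and changes_since, and the server names in the usage note are hypothetical, not part of the original spreadsheet.

```python
import csv
import io
from datetime import datetime

# Hypothetical column layout: the source only says that the date, time,
# and nature of each change are recorded per server.
FIELDS = ["server", "date", "time", "change"]


def record_change(log_file, server, change, when=None):
    """Append one change entry to an open, writable CSV file object."""
    when = when or datetime.now()
    csv.writer(log_file).writerow(
        [server, when.strftime("%Y-%m-%d"), when.strftime("%H:%M"), change]
    )


def changes_since(log_file, server, cutoff):
    """Return the entries for `server` stamped on or after `cutoff`."""
    log_file.seek(0)  # reread the whole log from the start
    recent = []
    for row in csv.DictReader(log_file, fieldnames=FIELDS):
        stamp = datetime.strptime(
            row["date"] + " " + row["time"], "%Y-%m-%d %H:%M"
        )
        if row["server"] == server and stamp >= cutoff:
            recent.append(row)
    return recent
```

When a problem appears, calling changes_since(log, "FILESRV1", some_cutoff) on a log opened for reading and appending (or on an io.StringIO object while experimenting) lists what changed on that server shortly before the trouble began, which is the first question this section recommends asking.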
Logical deduction and process of elimination
At first glance, my group thought the server might be crashing because of a network problem. However, all the servers reside on a relatively low-usage 100 Mbps Ethernet segment, and none of the other boxes had any problems servicing users: the Exchange server and our main application server were humming along just fine. Also, we weren't seeing an excessive number of collisions on that server's port on the switch. Thus, a broadcast storm or other network problem was eliminated. Since then, we've tried to simplify the server's configuration as much as possible: stopping and disabling nonessential third-party services we've added, moving print services off to another server, and watching closely to see whether these steps make any difference. It's been less than two weeks since our last server crash, so the jury's still out, but we're hopeful that removing some of the load from the server will solve our problem. Then, we'll slowly begin adding the third-party services back in and testing to see which of them might have caused the crashes.
Microsoft TechNet
I guess I can't mention this resource enough. Microsoft TechNet and its online partner Support Online are fantastic resources for Windows NT networking professionals (see Figure 11-7). Chances are if you're having a strange problem with your server, someone else has had the same problem and reported it to Microsoft. When Microsoft identifies a problem, they issue a support article on it, and thousands of those articles are gathered together with white papers and other information on the TechNet CD each month. A TechNet subscription isn't cheap at approximately $300 per year, but it's well worth it: The access to service packs, late-breaking technical information, and upgrades can be priceless.
Ironically, one of TechNet's strengths is also one of its weaknesses. There is so much information that it can take most of an afternoon to search through all the articles that may come up on a search. A search for the words "server crash," for example, brings up 710 articles in the November 1998 TechNet. When you are using TechNet or Support Online, the more specific your query, the better.
Support Online can be found at support.microsoft.com/support and offers access to the same Knowledge Base articles as a TechNet subscription does, but without many of TechNet's other features (see Figure 11-8).
Internet support groups
In my opinion, the best aspect of the Internet revolution is the ability it has given private citizens to gather together in clubs and user groups to share information and solve problems quickly. Whether it's a car club sharing information about how to go faster at the racetrack or computer types trying to solve a problem, private parties now have much more information at their fingertips than they did just three or four years ago. Rather than waiting for a once-a-month user group meeting, Windows NT professionals have access to many ongoing support discussions, where they can bounce problems off other working administrators. Often, they can get solutions in minutes or hours instead of waiting days or weeks for an answer from Microsoft or another vendor.
This ability to tap into the knowledge of hundreds or thousands of other administrators and ask "Any of you guys ever seen this one before?" has proved priceless on several occasions in my NT administration career. Someone halfway across the world may have solved, a month ago, the same problem you're facing now, or may be able to help you look at the problem in another way that might propel you toward a solution.
Internet support groups come in two flavors: e-mail mailing lists and Web site-based solutions databases. It's worth any NT professional's time to join one of the mailing lists and search these sites for answers.
Windows NT Server troubleshooting is definitely a challenging aspect of any NT administrator's daily task list, but it can be made easier if you take advantage of all the tools that are available.
STEP 4: TAPE BACKUPS
Back in Chapter 3, I discussed backups as an essential element of any well-run Windows NT network. However, backups are only as good as the media they're recorded on. Even if your network has a bulletproof backup plan in place, you must verify that the proper files were actually backed up and make sure the backups are good: that is, that the files will be intact should you need to restore them.
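One way to check that backed-up files really are intact is to restore a sample to a scratch directory and compare it with the originals. The following Python sketch shows the idea; it is an assumption-laden illustration rather than a feature of any backup product, and the function names file_digest and verify_restore are hypothetical. Files that users have modified since the backup ran will legitimately show up as different, so a check like this is most meaningful on data that has not changed in the meantime.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in 64 KB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_dir, restored_dir):
    """Compare every file under source_dir with its restored copy.

    Returns a list of (relative_path, problem) tuples, where problem is
    "missing" or "differs". An empty list means every file matched.
    """
    source_dir, restored_dir = Path(source_dir), Path(restored_dir)
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue  # skip directories
        rel = src.relative_to(source_dir)
        restored = restored_dir / rel
        if not restored.is_file():
            problems.append((rel.as_posix(), "missing"))
        elif file_digest(src) != file_digest(restored):
            problems.append((rel.as_posix(), "differs"))
    return problems
```

An empty result from verify_restore means the restored sample matched the originals byte for byte; anything else is worth investigating before the tape is trusted.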
Defining a backup schedule
This was touched on in Chapter 3, and I'll review it here. An important part of implementing any backup scheme for your network is defining a schedule and sticking to it. Depending on the size and complexity of your network, and the volatility of your users' data, you may need to do nightly backups, or once a week might be sufficient. Once you determine how often backups will be run, you must ensure that this process happens each day or each week. Most third-party backup solutions have an automatic scheduling facility built in, and Windows NT's native NTBACKUP program can be automated using NT's AT command scheduling utility. Once this task is complete, backing up can become as easy as swapping a few tapes daily. However, even this can have its pitfalls: In my network, we rely on end users in field offices to change tapes each day and send the old tape out for offsite storage. On a couple of occasions, a user assigned that task got busy and forgot to put the new tape in after removing the old one. Luckily the server didn't crash the day after one of these mishaps, as we would have had severe data exposure, but the incidents made us nervous nonetheless back at the home office.
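A missed tape swap like the one described above is easier to catch if something flags any site whose last good backup is older than the schedule allows. Here is a minimal Python sketch of that check. It assumes you can obtain the time of each site's last successful backup from somewhere (a backup log, for instance); the function name overdue_sites, the six-hour grace period, and the office names in the usage note are all hypothetical.

```python
from datetime import datetime, timedelta


def overdue_sites(last_success_by_site, now, interval,
                  grace=timedelta(hours=6)):
    """Return the sites whose last successful backup is too old.

    last_success_by_site maps a site name to the datetime of its last
    good backup. A site is overdue when more than interval + grace has
    passed since then. The grace period is an arbitrary allowance for
    backups that simply run long.
    """
    deadline = interval + grace
    return sorted(
        site
        for site, last_success in last_success_by_site.items()
        if now - last_success > deadline
    )
```

For a nightly schedule, calling overdue_sites with interval=timedelta(days=1) each morning would list any field office (say, a hypothetical "Austin" site) where the new tape never went in, while there is still time to phone someone before the next backup window.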