Abstract
This chapter will help you determine what to do if something goes wrong with your TCP/IP network. The authors combine a review of the most important TCP/IP topics with some helpful troubleshooting guidelines. They also cover the major TCP/IP troubleshooting utilities and discuss how to use them most efficiently. Topics covered in this chapter summarize common TCP/IP-related problems, symptoms, and possible causes, as well as the concrete steps required to troubleshoot them.
GENERAL CONSIDERATIONS
When something goes wrong, we often try to choose a tool that can immediately solve our problem. Before deciding which utility to use, however, you should determine the source of the problem. A number of problems turn out not to be TCP/IP-related (for example, a network interface card malfunction) and need to be solved by other methods. In this chapter, however, we will speak only of TCP/IP-related problems.
TCP/IP problems can be grouped by category, as shown in Table 1.
When a problem occurs, you might want to ask yourself these simple questions:
What should work?
What does work?
What does not work?
What has changed since it last worked?
Answering these questions will help you choose the right tool to isolate the problem. For example, suppose that Mary complains that she is unable to connect to remote NetBIOS hosts by their computer name. After speaking with her, you find out that she recently deleted some files from the %systemroot%\system32\drivers\etc\ folder on her computer by accident. Knowing what actions led to the problem, you need not waste time on low-level connectivity checks, but can go directly to that folder to check for an LMHOSTS file.
WINDOWS NT DIAGNOSTIC TOOLS OVERVIEW
Microsoft Windows NT Server and Workstation have many useful utilities to diagnose and troubleshoot TCP/IP. Many powerful utilities are included in the Windows NT Resource Kit, for example, Browstat (a command line utility that can be used to force browser elections for a specified domain), Browmon (a graphical utility that can be used to view browsers for selected domains), and Wntipcfg (a graphical utility with the same functionality as IPCONFIG). In addition, as you may already know, Microsoft Systems Management Server includes an advanced version of Network Monitor, a great program to trace and monitor your network at the packet level. Table 2 lists common diagnostic utilities that are included in Microsoft TCP/IP.
Each utility can diagnose only one part of a problem. None will solve an entire problem alone. Later in this chapter, you will learn how to use these utilities together to troubleshoot your network.
TCP/IP TROUBLESHOOTING GUIDELINES
There is no fixed sequence of steps to troubleshoot TCP/IP-related problems; everything depends on the particular scenario. There are, however, some basic guidelines that fit most situations.
The first thing you should do is to ensure the physical connection is functioning. It’s useless to employ a host of troubleshooting utilities if the office hub is malfunctioning. When link reliability is in question, for example, when the WAN link is malfunctioning, you may want to ping various remote hosts to check connectivity.
Once you’re sure the links are functioning properly, you should start testing the local host’s configuration parameters, then examine routing configurations, and finally check name resolution issues. The hierarchy of troubleshooting steps is illustrated in Figure 1. Note that you test the lower layers of the TCP/IP stack first. Once the low-level TCP/IP functions are working correctly, move to the higher levels.
Identifying the TCP/IP Configuration
Checking the TCP/IP configuration is the most basic troubleshooting step. You might want to ensure the TCP/IP parameters have been entered without mistakes or that DHCP has set them correctly. You should begin by checking the TCP/IP configuration on the computer that appears to be experiencing problems.
A good starting point is the IPCONFIG command line utility. IPCONFIG displays the IP address, subnet mask, and default gateway, as well as other advanced TCP/IP parameters, such as the WINS server IP address and the node type.
You should use the IPCONFIG utility with the /all switch because it produces a detailed report concerning the current TCP/IP configuration. The following is an example of the output from IPCONFIG:
In some cases, just reviewing this report can resolve the problem. For example, if the DHCP client could not obtain the IP address, running IPCONFIG returns the IP address and subnet mask of 0.0.0.0.
This listing could indicate that the DHCP server is down or that there are no free IP addresses in the DHCP server’s scope.
Incorrect IP Address Assignment
To determine if a computer has been assigned a valid IP address, you can use the following guidelines:
Check that the IP address is from the correct subnet.
Check that the IP address is not duplicated.
Check that the IP address is not a broadcast address for the given subnet (host ID is all-ones).
Check that the IP address is not the subnet address (host ID is all-zeroes).
Figure 2 shows a network where computers have IP addressing problems. Computer 1 and Computer 3 have duplicate IP addresses. Computer 2 has the IP address that is the broadcast address for subnet 172.20.0.0, mask 255.255.255.0. Computer 4 has an IP address from another subnet.
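The four checks above can be sketched in a few lines of Python. This is only an illustration, not a tool that ships with Windows NT: the function name check_host_address, the used parameter, and the sample host addresses are hypothetical, and only the subnet 172.20.0.0 with mask 255.255.255.0 comes from the text.

```python
import ipaddress

def check_host_address(ip, network, used=()):
    """Return a list of problems found with a host's IP address.

    ip      -- the address assigned to the host (hypothetical sample values)
    network -- the subnet the host is supposed to be on, e.g. "172.20.0.0/255.255.255.0"
    used    -- addresses already assigned to other hosts on the subnet
    """
    addr = ipaddress.IPv4Address(ip)
    net = ipaddress.IPv4Network(network)
    problems = []
    if addr not in net:
        problems.append("address is not in the expected subnet")
    elif addr == net.network_address:
        problems.append("address is the subnet address (host ID all zeroes)")
    elif addr == net.broadcast_address:
        problems.append("address is the subnet broadcast address (host ID all ones)")
    if addr in {ipaddress.IPv4Address(u) for u in used}:
        problems.append("address is already assigned to another host")
    return problems

# Broadcast address of subnet 172.20.0.0, mask 255.255.255.0
print(check_host_address("172.20.0.255", "172.20.0.0/255.255.255.0"))
```

An empty list means none of the four checks found a problem; it does not prove the address is otherwise correct.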
Subnet Mask Problems
Subnet mask problems are very hard to diagnose and isolate. This is mainly because, depending on the actual numbers, an invalid subnet mask can have no negative impact or can make the entire network unreachable for a particular computer. In some cases, an incorrect subnet mask could cause some computers to become unreachable, while the rest of the network remains operational. In Figure 3, Computer 2 has an incorrect subnet mask (displayed in bold). Although it can successfully establish a connection with Computer 1, Computer 3, and the default gateway, it fails to communicate with Computer 4.
There are two common problems with subnet masks:
The configured subnet mask is shorter than needed (too few bits are reserved for network and subnet ID).
The configured subnet mask is longer than needed (too many bits are reserved for network and subnet ID).
Improper subnet mask configuration is often the result of inaccurate planning or of mistyping the subnet mask during manual TCP/IP parameter assignment. These problems are particularly prevalent when we implement custom subnet masking. (You’ll remember the tediousness in counting ones and zeroes in Chapter 4 and can appreciate how easy it is to make a mistake in doing so.)
Let’s look at some symptoms that can indicate these two problems. Suppose we have the class C network 192.168.18.0. We divide it into eight subnets using the subnet mask 255.255.255.224. (See Figure 4.) Now, what happens if we assign a particular computer a shorter subnet mask, 255.255.255.192 (arrow 1)? That computer would think: "The shorter my subnet mask, the more computers I recognize as being inside my subnet." If the subnet mask is 255.255.255.0, this computer will think the network is not divided into subnets at all.
On the other hand, if we assign the computer a longer subnet mask (arrow 2), let’s say 255.255.255.240, it will think some computers on the local network segment are outside its subnet.
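To make the effect concrete, the following sketch computes which subnet a host believes it belongs to under each of the masks mentioned above. It uses Python's standard ipaddress module; the host address 192.168.18.40 and the function name perceived_subnet are assumptions chosen only for illustration.

```python
import ipaddress

def perceived_subnet(ip, mask):
    """Return the subnet a host with this address and mask considers local."""
    return ipaddress.IPv4Interface(f"{ip}/{mask}").network

# 192.168.18.40 is a hypothetical host on the 192.168.18.32 subnet.
for mask in ("255.255.255.224",   # correct mask
             "255.255.255.192",   # too short
             "255.255.255.0",     # much too short: no subnetting at all
             "255.255.255.240"):  # too long
    net = perceived_subnet("192.168.18.40", mask)
    # num_addresses includes the subnet and broadcast addresses.
    print(f"{mask:<16} {str(net):<18} {net.num_addresses - 2} host addresses")
```

With the correct mask the host sees 30 neighbors' worth of address space; with 255.255.255.192 it sees 62, and with 255.255.255.240 only 14, so some computers on its own segment appear to be remote.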
The following example illustrates both of these subnet masking problems. Figure 5 shows a properly planned and configured network. Let’s introduce some subnet masking errors to demonstrate how they affect network communication.
If we begin to lengthen the subnet mask of Computer 6, simulating a data entry error, the computer starts experiencing connectivity problems. Computers inaccessible to Computer 6 are dimmed in Figure 6.
With a subnet mask of 255.255.255.192, Computer 6 thinks that Server 1 is located on a remote network. (You might remember that the computer determines whether the target host is located in the local or remote network by ANDing the IP addresses with its subnet mask and comparing the results.)
                 Computer 6         Server 1
IP address       172.20.0.20        172.20.0.70
Subnet mask      255.255.255.192    255.255.255.192
AND operation    172.20.0.0         172.20.0.64
Since the ANDed results do not match, Computer 6 will send all the packets for Server 1 to the router instead of making a direct connection. The larger the subnet mask grows, the more computers jump out of reach of Computer 6. Finally, when Computer 6’s subnet mask is 255.255.255.240, even the router is outside Computer 6’s subnet and communications to remote networks are impossible.
                 Computer 6         Default gateway
IP address       172.20.0.20        172.20.0.1
Subnet mask      255.255.255.240    255.255.255.240
AND operation    172.20.0.16        172.20.0.0
Note that an incorrect subnet mask of 255.255.255.128 causes no apparent problem. Although this incorrect subnet mask changes Computer 6’s conception of the network, it will not prevent communication with neighboring computers.
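The local-or-remote decision described above is a bitwise AND followed by a comparison. The sketch below reproduces it for Computer 6 using the addresses and masks from the preceding comparisons; the function names and_mask and is_local are ours, not part of any Windows NT utility.

```python
import ipaddress

def and_mask(ip, mask):
    """AND a dotted-decimal IP address with a dotted-decimal subnet mask."""
    result = int(ipaddress.IPv4Address(ip)) & int(ipaddress.IPv4Address(mask))
    return ipaddress.IPv4Address(result)

def is_local(source_ip, target_ip, mask):
    """True if the source host, using its own mask, treats the target as local."""
    return and_mask(source_ip, mask) == and_mask(target_ip, mask)

computer6 = "172.20.0.20"
# Server 1 with mask 255.255.255.192: results differ, so traffic goes to the router.
print(and_mask(computer6, "255.255.255.192"), and_mask("172.20.0.70", "255.255.255.192"))
print(is_local(computer6, "172.20.0.70", "255.255.255.192"))
# Default gateway with mask 255.255.255.240: even the router looks remote.
print(is_local(computer6, "172.20.0.1", "255.255.255.240"))
# Mask 255.255.255.128 is wrong too, but Server 1 still looks local.
print(is_local(computer6, "172.20.0.70", "255.255.255.128"))
```

Note that the check uses only the sender's mask for both addresses, which is why one misconfigured computer can misjudge neighbors that are themselves configured correctly.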
If the subnet mask is too short, we are likely to experience problems in trying to contact remote computers. (See Figure 7.)
When Computer 6’s subnet mask is 255.255.0.0, it treats Server 8 as a computer on its local subnet.
                 Computer 6         Server 8
IP address       172.20.0.20        172.20.1.45
Subnet mask      255.255.0.0        255.255.0.0
AND operation    172.20.0.0         172.20.0.0
Since both ANDed results match, Computer 6 tries to send IP packets to Server 8 directly without using a router and fails. If the subnet mask is shorter still, even more remote computers are considered to be directly reachable.
                 Computer 6         Server 9
IP address       172.20.0.20        172.18.0.88
Subnet mask      255.0.0.0          255.0.0.0
AND operation    172.0.0.0          172.0.0.0
When you troubleshoot your IP network, it helps to have a copy of your subnetting plan close by. This enables you to look up the broadcast addresses and the subnet addresses, and check that they are not assigned to hosts.
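If no written plan is at hand, one can be regenerated from the network address and the subnet mask. The sketch below does this for the class C network 192.168.18.0 divided with mask 255.255.255.224, as in the earlier example. The function name subnet_plan and the row layout are assumptions for illustration.

```python
import ipaddress

def subnet_plan(network, subnet_mask):
    """List (subnet address, first host, last host, broadcast) for each subnet."""
    base = ipaddress.IPv4Network(network)
    # Convert the dotted-decimal mask to a prefix length (255.255.255.224 -> 27).
    prefix = ipaddress.IPv4Network(f"0.0.0.0/{subnet_mask}").prefixlen
    rows = []
    for sub in base.subnets(new_prefix=prefix):
        rows.append((str(sub.network_address),
                     str(sub.network_address + 1),
                     str(sub.broadcast_address - 1),
                     str(sub.broadcast_address)))
    return rows

for row in subnet_plan("192.168.18.0/24", "255.255.255.224"):
    print("{:<16}{:<16}{:<16}{:<16}".format(*row))
```

The first and last columns are the addresses that must never be assigned to a host; everything between the two middle columns is available.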
TESTING IP COMMUNICATIONS
Once your computer has obtained an IP address and a subnet mask, you should test the IP communications. PING is the utility that can be used for verifying IP-level connectivity. As you may remember, PING sends the ICMP echo request to the destination host and analyzes ICMP echo replies.
The recommended sequence of pings is the following:
Ping the loopback address. (If you are unable to ping the loopback address, it may indicate the computer has not been restarted after TCP/IP was installed and configured — restart and try again.)
Ping the IP address of the local computer. (If you cannot ping the local IP address, check to ensure your computer has a valid IP address that is not duplicated elsewhere on the network.)
Ping the IP address of the default gateway. (If this step is unsuccessful, check the subnet mask on your computer.)
Ping the IP address of the remote host. (If this step is unsuccessful, check the default gateway address configured on the local computer, the functionality of the link between routers, and the remote computer’s availability.)
Ping the remote host by name. (If this step fails, check host name resolution.)
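The sequence above lends itself to a small script that stops at the first failing step and reports the matching hint. The following is a sketch under several assumptions: the function names, the step descriptions, and the sample addresses are hypothetical; the system ping command is invoked with -n on Windows (as in the usage listing in this chapter) and with -c elsewhere; and an exit status of 0 is taken to mean success, which some ping implementations also return for replies such as "Destination host unreachable".

```python
import platform
import subprocess

def ping_once(target, timeout_s=5):
    """Send one echo request with the system ping; True if it exits with status 0."""
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    try:
        result = subprocess.run(["ping", count_flag, "1", target],
                                stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL,
                                timeout=timeout_s)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0

def build_steps(local_ip, gateway_ip, remote_ip, remote_name):
    """Return the recommended ping sequence as (description, target, hint) tuples."""
    return [
        ("loopback address", "127.0.0.1",
         "restart the computer after installing TCP/IP"),
        ("local IP address", local_ip,
         "check for an invalid or duplicate IP address"),
        ("default gateway", gateway_ip,
         "check the subnet mask on this computer"),
        ("remote host by address", remote_ip,
         "check the default gateway setting, the router links, and the remote host"),
        ("remote host by name", remote_name,
         "check host name resolution"),
    ]

def first_failure(steps, ping=ping_once):
    """Run the steps in order; return (description, hint) of the first failure."""
    for description, target, hint in steps:
        if not ping(target):
            return description, hint
    return None

# Example (addresses and host name are placeholders for your own network):
# print(first_failure(build_steps("172.20.0.20", "172.20.0.1", "172.20.1.45", "server8")))
```

Passing the ping function as a parameter keeps the sequence logic separate from the actual network call, so the order and hints can be checked without sending any packets.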
The PING utility has many switches that can be used to expand its functionality. To view the available command line options, type PING -?
C:\WINNT>ping -?
Usage: ping [-t] [-a] [-n count] [-l size] [-f] [-i TTL] [-v TOS]
            [-r count] [-s count] [[-j host-list] | [-k host-list]]
            [-w timeout] destination-list
Options:
    -t             Ping the specified host until interrupted.
    -a             Resolve addresses to hostnames.
    -n count       Number of echo requests to send.
    -l size        Send buffer size.
    -f             Set Don't Fragment flag in packet.
    -i TTL         Time To Live.
    -v TOS         Type Of Service.
    -r count       Record route for count hops.
    -s count       Timestamp for count hops.
    -j host-list   Loose source route along host-list.
    -k host-list   Strict source route along host-list.
    -w timeout     Timeout in milliseconds to wait for each reply.
For example, you can specify the size of the packets to use, how many packets to send, and how much time to wait for a response. There are some more advanced options of the PING utility such as specifying the type of service (TOS), the initial TTL value, and the source route. These options directly affect the header of the IP packet in which the ICMP message is encapsulated. For example, the command
will cause the following ICMP packets to be sent (note that by specifying the PING options -v and -j we set the values of some fields in the IP header, printed in bold typeface):
+ FRAME: Base frame properties
+ ETHERNET: ETYPE = 0x0800 : Protocol = IP: DOD Internet Protocol
IP: ID = 0x2155; Proto = ICMP; Len: 76
IP: Version = 4 (0x4)
IP: Header Length = 36 (0x24)
IP: Service Type = 12 (0xC)
IP: Precedence = 0x0C
IP: ...0.... = Normal Delay
IP: ....1... = High Throughput
IP: .....1.. = High Reliability
IP: Total Length = 76 (0x4C)
IP: Identification = 8533 (0x2155)
+ IP: Flags Summary = 0 (0x0)
IP: Fragment Offset = 0 (0x0) bytes
IP: Time to Live = 252 (0xFC)
IP: Protocol = ICMP - Internet Control Message
IP: Checksum = 0xF92C
IP: Source Address = 193.232.80.66
IP: Destination Address = 194.226.192.52
IP: Option Fields = 131 (0x83)
IP: Loose Source Routing Option = 131 (0x83)
IP: Option Length = 15 (0xF)
IP: Routing Pointer = 16 (0x10)
IP: Route Traveled = 194 (0xC2)
IP: Gateway = 194.85.36.30
IP: Gateway = 194.85.165.169
IP: Gateway = 194.226.192.33
IP: End of Options = 0 (0x0)
IP: Data: Number of data bytes remaining = 40 (0x0028)
+ ICMP: Echo Reply, To 194.226.192.52 From 193.232.80.66
By using the above PING command, we’ve checked whether the target host is reachable with a specific type of service (high throughput and high reliability) and along the selected route (194.226.192.33, 194.85.165.169, 194.85.36.30).
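The Service Type value 12 (0xC) in the capture can be decoded by hand using the type-of-service layout defined in RFC 791: the top three bits are the precedence, followed by one bit each for delay, throughput, and reliability. The sketch below performs that decoding; the function name decode_tos and the dictionary keys are ours.

```python
def decode_tos(tos):
    """Split an IP type-of-service byte into its RFC 791 fields."""
    return {
        "precedence": tos >> 5,                 # bits 0-2 (most significant)
        "low_delay": bool(tos & 0x10),          # bit 3
        "high_throughput": bool(tos & 0x08),    # bit 4
        "high_reliability": bool(tos & 0x04),   # bit 5
    }

# 12 = 0000 1100 in binary: throughput and reliability bits set, delay bit clear.
print(decode_tos(12))
```

The result agrees with the three flag lines of the capture: normal delay, high throughput, and high reliability.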
Routing Problems
Even when your computer is properly configured, a malfunctioning router can cause difficulties. The usual cause is an improperly configured route (in this case, improperly configured could also mean not configured). Remember, if the Windows NT router does not have an interface on a given subnet, it will need a route to that subnet. You can provide one by adding a static route or by using a multi-protocol router (MPR). If a router is implemented on a Windows NT computer, you can check the existing routes by using the ROUTE utility. If inconsistencies are found in the routing table, you can correct them by using the ROUTE ADD and ROUTE DELETE commands.
Note: Having multiple network adapters on a Windows NT computer allows you to add a default route for each network card. Although this will create several 0.0.0.0 routes, only one default route will actually be used. You should configure only one card to have a default gateway; this will reduce confusion and ensure the results you intended.
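To see why a router needs a route for every subnet it is not directly attached to, consider a simplified model of a routing table lookup. The sketch below picks the most specific matching route (longest prefix) and returns nothing when no route matches. It deliberately ignores metrics and interfaces and is not a description of the Windows NT implementation; the function name select_route and all table entries are hypothetical.

```python
import ipaddress

def select_route(routes, destination):
    """Return the gateway of the most specific route that covers destination.

    routes is a list of (network, gateway) pairs; returns None if nothing matches.
    """
    dest = ipaddress.IPv4Address(destination)
    best = None
    for network, gateway in routes:
        net = ipaddress.IPv4Network(network)
        if dest in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, gateway)
    return None if best is None else best[1]

# Hypothetical table: one attached subnet and one static route, no default route.
routes = [
    ("172.20.0.0/24", "on-link"),
    ("172.20.1.0/24", "172.20.0.2"),
]
print(select_route(routes, "172.20.1.45"))   # static route applies
print(select_route(routes, "172.21.0.55"))   # no route: destination unreachable
```

Adding a single ("0.0.0.0/0", gateway) entry to the table gives every otherwise unmatched destination a path, which is the role the default gateway plays.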
The TRACERT Utility
The TRACERT command can be used to determine where a packet stopped on the network because of an improper router configuration or link failure. In the following example, the second router has determined that there is not a valid path for host 172.21.0.55. There is probably a router configuration problem or the 172.21.0.0 network does not exist (a bad IP address).
C:\>tracert 172.21.0.55
Tracing route to 172.21.0.55 over a maximum of 30 hops
1. <10 ms <10 ms <10 ms 172.20.0.1
2. 172.18.6.54 reports: Destination net unreachable.
Trace complete.
TRACERT is useful for troubleshooting large networks where several paths can be taken to arrive at the same point, or where many intermediate systems (routers or bridges) are involved.
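When TRACERT output is collected from many computers, it can be convenient to pull out the hop that reported a failure automatically. The sketch below scans captured output for a line of the form shown in the example above ("... reports: ..."). The regular expression is an assumption based only on that sample and may need adjusting for other TRACERT versions; the function name find_failure is ours.

```python
import re

# Matches lines such as "2. 172.18.6.54 reports: Destination net unreachable."
REPORT_LINE = re.compile(r"^\s*(\d+)\.?\s+(\S+)\s+reports:\s*(.+?)\s*$")

def find_failure(tracert_output):
    """Return (hop number, reporting router, message) or None if no hop reported an error."""
    for line in tracert_output.splitlines():
        match = REPORT_LINE.match(line)
        if match:
            return int(match.group(1)), match.group(2), match.group(3)
    return None

sample = """Tracing route to 172.21.0.55 over a maximum of 30 hops
1. <10 ms <10 ms <10 ms 172.20.0.1
2. 172.18.6.54 reports: Destination net unreachable.
Trace complete."""
print(find_failure(sample))
```

The router named in the result is the one whose routing table should be examined first.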
If you can ping across the router, but cannot establish a session, check to see if the router is able to pass large packets. The PING utility sends its data in 74-byte blocks, but NET requests can be significantly larger. You can use the PING -l command to send a larger packet. To correct this problem, you may want to edit the Registry to specify a smaller packet size. This must be done on every problem computer.
Note: Although we said PING sends its data in 74-byte blocks, you’ll see PING indicate that it’s using only 32 bytes of data. This is because PING reports only its data block length. The actual ICMP packet is 74 bytes: 32-byte data block + 14 bytes for the Ethernet header + 20 bytes for the IP header + 8 bytes for the ICMP header.
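The arithmetic in the note generalizes to any data size given with the -l option. The sketch below computes the size of the Ethernet frame for a given data length and, in the other direction, the largest data block that fits in one packet for a given MTU. It assumes an IP header without options (20 bytes); the Ethernet MTU of 1500 bytes used in the example is likewise an assumption, not a value from this chapter.

```python
ETHERNET_HEADER = 14   # bytes
IP_HEADER = 20         # bytes, assuming no IP options
ICMP_HEADER = 8        # bytes

def frame_size(data_len):
    """Size of the Ethernet frame carrying a ping with data_len bytes of data."""
    return ETHERNET_HEADER + IP_HEADER + ICMP_HEADER + data_len

def max_unfragmented_data(mtu):
    """Largest ping data block that fits in a single IP packet of size mtu."""
    return mtu - IP_HEADER - ICMP_HEADER

print(frame_size(32))               # default ping data size
print(max_unfragmented_data(1500))  # assumed Ethernet MTU
```

Combining -l with the -f (Don't Fragment) option and trying sizes around the computed limit is one way to find out whether a router on the path refuses large packets.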
Study Break: Editing the Registry to Specify a Smaller Packet Size
All of the TCP/IP parameters are Registry values located under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters or HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\ \Parameters\Tcpip. In the second path, the component after Services\ is the subkey for a network adapter that is bound to the TCP/IP protocol.
There are two Registry entries that can affect TCP/IP packet size. The first one is found in the Tcpip\Parameters subkey and is called EnablePMTUDiscovery (REG_DWORD). This entry can be set to 0 (False) or 1 (True); its default is 1. When set to 1, it directs TCP/IP to attempt to discover the Maximum Transmission Unit (MTU—largest packet size) over the path to a remote host. This permits TCP/IP to eliminate fragmentation at routers along the path that connect networks with different MTUs. If you set this value to 0, an MTU of 576 will be used for connections to all machines that are not on the local subnet.
The other Registry entry is found in the network adapter’s Parameters\Tcpip subkey and is called MTU (REG_DWORD). MTU can be set anywhere between 68 and the actual MTU of the underlying network. (The 68 minimum is required to provide space for the transport header; using a value less than this will result in an MTU of 68.) Setting this parameter overrides the default MTU for the network interface.
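The interaction of the two entries can be summarized as a pair of rules. The sketch below models them in Python; it does not read or write the Registry. The function names are hypothetical, and two behaviors are assumptions not stated above: that a configured MTU larger than the network's MTU falls back to the network's MTU, and that the 576-byte value never raises an interface MTU that is already smaller.

```python
MIN_MTU = 68        # minimum described for the MTU entry
FALLBACK_MTU = 576  # used for non-local hosts when EnablePMTUDiscovery is 0

def interface_mtu(network_mtu, configured_mtu=None):
    """MTU used on an interface, given the optional MTU Registry value."""
    if configured_mtu is None:
        return network_mtu
    if configured_mtu < MIN_MTU:
        return MIN_MTU
    if configured_mtu > network_mtu:
        return network_mtu   # assumption: oversized values are not honored
    return configured_mtu

def initial_path_mtu(iface_mtu, remote_is_local, pmtu_discovery=True):
    """MTU assumed for a connection before any path MTU discovery takes place."""
    if pmtu_discovery or remote_is_local:
        return iface_mtu
    return min(iface_mtu, FALLBACK_MTU)   # assumption: never exceeds the interface MTU

print(interface_mtu(1500, configured_mtu=40))
print(initial_path_mtu(1500, remote_is_local=False, pmtu_discovery=False))
```

Because the MTU entry is stored per adapter, the same reasoning has to be repeated for each adapter bound to TCP/IP.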