Troubleshooting is the fine art of figuring out what precisely is failing when a complex system isn't working to your expectations. A complex system has many parts, just like your connection to a TCP/IP network. If any one part breaks, the overall connection between you and your destination is broken. When that happens, you can reach into TCP/IP's bag of tools and start testing one component at a time in the network until you find exactly where the problem lies. Then you can hold your head up high and call in an informed trouble report to your network's help desk.
The first step in diagnosing a problem is to map out the system. After you have identified all its parts, you can then start systematically testing each one. By testing each component, you can start isolating the source of the problem by eliminating the components that still work. This process of elimination is a simple but effective way to find a problem's source.
To see how this works, walk through a basic example and see how to use some of TCP/IP's tools to start that step-by-step process of eliminating potential causes. For the sake of example, assume you are just communicating across a LAN with a nearby web server. That web server should be serving up a web page but isn't. Figure 14-9 shows you all the components in this scenario that could be causing this problem.
As you can see in Figure 14-9, everything starts with the application software: your web browser. Immediately below that is your network protocol: TCP/IP. TCP/IP runs on top of your operating system, and the operating system relies upon a LAN (Ethernet) for connectivity to the web server.
The web server features almost the same set of points of failure in reverse: Ethernet, operating system, TCP/IP, and the web service. Any one of these components, on either machine, could cause the entire communications session to fail. Troubleshooting should always begin at the top, close to home, and iterate outward toward the destination. At the top of the stack is the application software. In this case, that's the browser. Figure 14-10 shows you which of the potential problem areas you check first.
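The process of elimination described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the component names come from the scenario, but the `first_failure` helper and the stand-in checks are hypothetical placeholders for the real tests covered in the rest of this chapter.

```python
from typing import Callable, List, Optional, Tuple

# A check pairs a component name with a function that returns True
# when that component passes its test.
Check = Tuple[str, Callable[[], bool]]


def first_failure(checks: List[Check]) -> Optional[str]:
    """Run each check in order and return the name of the first one that fails.

    Returns None if every component passes, which means none of the
    tested components has been identified as the culprit.
    """
    for name, check in checks:
        if not check():
            return name
    return None


if __name__ == "__main__":
    # Hypothetical stand-in checks, listed from the top of your own stack
    # outward to the server. Each lambda would be replaced by a real test,
    # such as browsing a local file or running ipconfig.
    checks: List[Check] = [
        ("browser", lambda: True),
        ("TCP/IP", lambda: True),
        ("operating system", lambda: True),
        ("Ethernet (your computer)", lambda: False),
        ("Ethernet (web server)", lambda: True),
        ("operating system (web server)", lambda: True),
        ("TCP/IP (web server)", lambda: True),
        ("web service", lambda: True),
    ]
    print(first_failure(checks))
```

The list order encodes the "start close to home, work outward" habit. Because each check is independent, the same loop still isolates the failing component if you have to test the parts in a different order.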
Your browser is a fairly bulletproof piece of software. Chances are really good that unless you have been deleting files at random from your hard drive, your browser is just fine. It is, however, still a good habit to systematically test and eliminate the potential sources of failure by starting at the top and working your way down through the stack.
You can check the integrity of your browser by using it to access a file on your own computer. This type of test allows you to check whether the browser is working by testing it alone, not it plus a whole bunch of other things. Figure 14-11 shows you how to do this.
In a Windows XP environment, it is as simple as opening up an Internet Explorer or Netscape Navigator session and entering c: in the address bar. Figure 14-11 shows c: entered in Internet Explorer's address bar. Figure 14-12 shows what you can expect to get after you press Enter.
If you can successfully browse your hard drive and see your local folder structure and files, you have proven that both your browser and your computer's operating system are working just fine. Notice that when troubleshooting, you don't always have the luxury of working systematically from the top down. This example rules out the operating system and the browser but does not exonerate TCP/IP. That's okay as long as you follow a systematic approach to eliminating possible causes. A process of elimination, when carefully applied, gradually leads you to the source of the problem regardless of the order in which you have eliminated the possible causes.
The next level down in your stack of potential problem areas is TCP/IP. A quick look at Figure 14-13 shows you visually which area to focus on next.
TCP/IP, too, is fairly reliable. However, it is possible to misconfigure it. Thus, your next step should be to run the ipconfig command to verify that your connection is set up properly. You have already seen how to do that earlier in this chapter. Some of the things to check are whether TCP/IP is actually running, what your IP address is, and whether you are using the right subnet mask.
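One of the checks above, whether your IP address and subnet mask make sense for the LAN you share with the web server, can be sketched with Python's standard `ipaddress` module. The addresses and mask shown are made-up example values, not values from this chapter, and `same_subnet` is a hypothetical helper name. Substitute the values that ipconfig reports on your machine.

```python
import ipaddress


def same_subnet(my_ip: str, subnet_mask: str, other_ip: str) -> bool:
    """Return True if other_ip falls inside the subnet defined by my_ip and subnet_mask."""
    # strict=False allows a host address (rather than the network address)
    # to be combined with the mask.
    network = ipaddress.ip_network(f"{my_ip}/{subnet_mask}", strict=False)
    return ipaddress.ip_address(other_ip) in network


if __name__ == "__main__":
    # Example values only (assumed for illustration).
    print(same_subnet("192.168.1.20", "255.255.255.0", "192.168.1.80"))
    print(same_subnet("192.168.1.20", "255.255.255.0", "192.168.2.80"))
```

With these example values, the first call prints True and the second prints False. In the scenario used here, where the web server sits on the same LAN, a False result would suggest that either the address or the mask on your computer deserves a second look.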
Many other nuggets of wisdom are waiting to be mined from netstat -s. In fact, an entire chapter could be written on just this command! For the purposes of this chapter, which is to help you understand a basic approach to troubleshooting connectivity problems in a TCP/IP network, I limit the exploration of TCP/IP performance statistics to those in Table 14-2. Now it's time to move further down the stack in pursuit of isolating the source of the problem.
Check the LAN
After checking the operating efficiency of your computer's TCP/IP protocol stack, check whether your LAN connection is working. Figure 14-16 shows that you are now on the last of the potential sources of failure that can be isolated on your computer.
Figure 14-16. The Last Step: Check the LAN
Running netstat -e at your computer's command prompt shows you exactly how well or how poorly your LAN connection is performing. Figure 14-17 shows you the results of just such a test. It is important to note that this test only checks the LAN connection on your computer! It is possible that the network itself might be having problems or that the server you are trying to reach is ailing. The purpose of netstat -e is to eliminate your computer's connection to the LAN as a potential source of failure.
Figure 14-17. netstat Ethernet Statistics
Looking at Figure 14-17, you can see that you don't get quite as much information as you did by running netstat -s, but what you do get is quite useful. The information is presented in two columns: Sent and Received. That should be fairly self-explanatory. Your computer tracks Ethernet information based on what was sent from your computer and what was received by your computer.
The first line in this table is the total number of bytes processed by your computer. It is possible that, due to a wiring problem or other configuration problem, your computer might be sending but not receiving data. This is indicated by large numbers in the Sent column and 0s in the Received column. If that's the case, it's time for a call to your friendly neighborhood network administrator because there's not much you can do to fix this type of problem. Of course, if you really want to score points with your technical support person, make sure the wire is physically connected to your computer before calling!
The next thing to look for is a trend. Generally speaking, you receive far more bytes than you send. The netstat Ethernet statistics show you both the total number of bytes received and the total number of packets. Don't worry too much about the unicast versus non-unicast packets. A unicast packet is one that was sent to a single, specific destination. A non-unicast packet, in this context, means packets that were broadcast to multiple machines at the same time. Remember: These metrics focus on Ethernet and do not count TCP/IP packets.
Sometimes Ethernet delivers packets (frames, really, but the netstat tool convention is followed) that are poorly formed or damaged. Those packets are discarded by their recipient and a running tally of those discarded packets is recorded by the machine. As you might have guessed, netstat -e shows you these under the heading Discards.
The last item of interest in diagnosing your Ethernet connection is the Errors row.
Ethernet is subject to many types of transmission errors. For example, the data you send might actually collide with data sent by another machine on your network. When such a collision occurs, your computer keeps a running tally and displays that total in the Errors field. If your computer is displaying multiple errors or discards, there is likely something seriously wrong with your network or your network connection. All this information is invaluable to your network administrator in further diagnosing what ails your network connection.
One caveat: If your computer has been running for a while, you might be looking at an accumulation of errors that date back in time. These numbers get reset to 0s every time you reboot. Consequently, such accumulated errors might be completely unrelated to your current inability to access the web server. Many people leave their computers on forever because it saves the effort of rebooting. If you are one of those people, here's a great reason to invest the extra effort to power down your computer whenever you finish using it.
If your Ethernet LAN connection appears to be working fine and you can access other computers or resources, it is time for the next step: Is the problem somewhere beyond your computer and LAN? Such errors might well be beyond your ability to control, but that doesn't mean you are completely helpless!
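The netstat -e checks described in this section can also be sketched in Python. Treat this as a rough illustration rather than a definitive tool: the `SAMPLE` text uses made-up numbers laid out like English-language Windows output, and `parse_netstat_e`, `find_problems`, and `counters_delta` are hypothetical helper names. Comparing two snapshots is an addition of this sketch, offered as one way to separate old accumulated errors from new ones without rebooting.

```python
import re
from typing import Dict, List

Stats = Dict[str, Dict[str, int]]

# Made-up numbers, for illustration only. On Windows, real text could be
# captured with subprocess.run(["netstat", "-e"], capture_output=True, text=True).stdout
SAMPLE = """Interface Statistics

                           Received            Sent

Bytes                       9876543         1234567
Unicast packets               12000            9000
Non-unicast packets             300              40
Discards                          0               0
Errors                            0               2
Unknown protocols                 0
"""

# A row is a label followed by exactly two counters.
ROW = re.compile(r"^\s*(\D+?)\s+(\d+)\s+(\d+)\s*$")


def parse_netstat_e(text: str) -> Stats:
    """Parse the two-column counter rows, reading column order from the header line."""
    columns: List[str] = []
    stats: Stats = {}
    for line in text.splitlines():
        words = line.split()
        if not columns:
            if sorted(words) == ["Received", "Sent"]:
                columns = words
            continue
        match = ROW.match(line)
        if match:  # rows with a single counter are skipped
            label, first, second = match.groups()
            stats[label] = {columns[0]: int(first), columns[1]: int(second)}
    return stats


def find_problems(stats: Stats) -> List[str]:
    """Flag the symptoms the text describes: one-way traffic, discards, and errors."""
    problems: List[str] = []
    bytes_row = stats.get("Bytes", {})
    if bytes_row.get("Sent", 0) > 0 and bytes_row.get("Received", 0) == 0:
        problems.append("sending but not receiving: check the cable first")
    for label in ("Discards", "Errors"):
        for direction, count in stats.get(label, {}).items():
            if count > 0:
                problems.append(f"{label} ({direction}): {count} since last reboot")
    return problems


def counters_delta(before: Stats, after: Stats) -> Stats:
    """Subtract an earlier snapshot from a later one to see only recent activity."""
    return {
        label: {col: after[label][col] - before[label][col] for col in after[label]}
        for label in after
        if label in before
    }


if __name__ == "__main__":
    for problem in find_problems(parse_netstat_e(SAMPLE)):
        print(problem)
```

Reading the column order from the header line, rather than hard-coding it, keeps the Sent and Received values from being swapped if the layout differs from the sample.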