2.3. The Internet Control Message Protocol
IP has a companion protocol that we
haven't talked about yet. This is the
Internet Control
Message Protocol (ICMP),
used by the kernel networking code to communicate error messages to
other hosts. For instance, assume that you are on erdos again and send a UDP datagram to port 12345
on quark, but
there's no process listening on that port. When the
datagram arrives on quark, the networking layer will recognize
this arrival and immediately return an ICMP message to erdos stating "Port
Unreachable."
The ICMP
protocol provides several different messages, many of which deal with
error conditions. However, there is one very interesting message
called the Redirect message. It is generated by the routing module
when it detects that another host is using it as a gateway, even
though a much shorter route exists. For example, after booting, the
routing table of sophus may be
incomplete. It might contain the routes to the math
department's network, to the FDDI backbone, and the
default route pointing at the Groucho Computing
Center's gateway (gcc1). Thus, packets for quark would be sent to gcc1 rather than to niels, the gateway to the physics department.
When receiving such a datagram, gcc1
will notice that this is a poor choice of route and will forward the
packet to niels, meanwhile returning
an ICMP Redirect message to sophus
telling it of the superior route.
This seems to be a very clever way to
avoid manually setting up any but the most basic routes. However, be
warned that relying on dynamic routing schemes, be it RIP or ICMP
Redirect messages, is not always a good idea. ICMP Redirect and RIP
offer you little or no means of verifying that some routing
information is indeed authentic. This situation allows malicious
good-for-nothings to disrupt your entire network traffic, or even
worse. Consequently, the Linux networking code treats Network
Redirect messages as if they were Host Redirects. This minimizes the
damage of an attack by restricting it to just one host, rather than
the whole network. On the flip side, it means that a little more
traffic is generated in the event of a legitimate condition, as each
host causes the generation of an ICMP Redirect message. It is
generally considered bad practice to rely on ICMP redirects for
anything these days.
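The redirect scenario above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the kernel's implementation: the addresses are made-up documentation addresses standing in for gcc1, niels, and quark, the helper names (lookup, build_redirect, apply_redirect) are hypothetical, and the ICMP checksum and most IP header fields are left at zero. The only protocol facts it relies on are that Redirect is ICMP type 5 and that the message carries the new gateway address followed by the header of the offending datagram.

```python
import ipaddress
import struct

ICMP_REDIRECT = 5  # ICMP type 5; codes 0 and 1 are network and host redirects

def lookup(routes, dest):
    """Longest-prefix match: the most specific route containing dest wins."""
    addr = ipaddress.ip_address(dest)
    matches = [net for net in routes if addr in net]
    best = max(matches, key=lambda net: net.prefixlen)
    return routes[best]

def build_redirect(code, gateway, original_dest):
    """Build a simplified Redirect: 8-byte ICMP header + 20-byte IP header.

    The checksum and all IP header fields except the destination are zero.
    """
    icmp_header = struct.pack("!BBH4s", ICMP_REDIRECT, code, 0,
                              ipaddress.ip_address(gateway).packed)
    ip_header = bytearray(20)
    ip_header[16:20] = ipaddress.ip_address(original_dest).packed
    return icmp_header + bytes(ip_header)

def apply_redirect(routes, packet):
    """Install a host route from a Redirect, regardless of its code."""
    icmp_type, _code, _checksum, gateway = struct.unpack("!BBH4s", packet[:8])
    if icmp_type != ICMP_REDIRECT:
        return
    # Destination address of the embedded IP header (offset 8 + 16).
    dest = ipaddress.ip_address(packet[24:28])
    host_route = ipaddress.ip_network(f"{dest}/32")
    routes[host_route] = str(ipaddress.ip_address(gateway))

# Hypothetical addresses: gcc1 and niels on the local net, quark elsewhere.
GCC1, NIELS = "192.0.2.1", "192.0.2.2"
QUARK = "198.51.100.7"

# sophus after booting: its own network plus a default route via gcc1.
routes = {
    ipaddress.ip_network("192.0.2.0/24"): "direct",
    ipaddress.ip_network("0.0.0.0/0"): GCC1,
}

before = lookup(routes, QUARK)
# gcc1 sends a Network Redirect (code 0) naming niels as the better gateway.
apply_redirect(routes, build_redirect(0, NIELS, QUARK))
after = lookup(routes, QUARK)
neighbour = lookup(routes, "198.51.100.8")  # another host on quark's network
print(before, after, neighbour)
```

Because the redirect is installed as a /32 route, only traffic to quark switches to niels. The neighbouring host still goes through gcc1 until it triggers a redirect of its own, which is the trade-off described above.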
2.3.1. Resolving Hostnames
As described earlier in this chapter,
addressing in TCP/IP networking, at least for IP Version 4, revolves
around 32-bit numbers. However, you will have a hard time remembering
more than a few of these numbers. Therefore, hosts are generally
known by "ordinary" names, such as
gauss or strange. It becomes the
application's duty to find the IP address
corresponding to this name. This process is called
hostname resolution. When
an application needs to find the IP address of a given host, it
relies on the library functions gethostbyname(3)
and gethostbyaddr(3). Traditionally, these and a
number of related procedures were grouped in a separate library
called the resolver library; on Linux, these
functions are part of the standard libc.
Colloquially, this collection of functions is therefore referred to
as "the resolver." Resolver name
configuration is detailed in Chapter 5.
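To make the two resolver calls concrete, here is a minimal sketch using Python's socket module, which exposes functions of the same names. The wrapper names resolve and reverse are hypothetical, and what the calls return depends entirely on how the local resolver is configured, so no particular output should be expected.

```python
import socket

def resolve(name):
    """Return the IPv4 address for name, or None if it cannot be resolved."""
    try:
        return socket.gethostbyname(name)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None

def reverse(address):
    """Return the primary hostname for address, or None if there is none."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(address)
        return hostname
    except OSError:  # covers socket.herror and socket.gaierror
        return None

# Forward lookup (name to address) and reverse lookup (address to name).
print(resolve("localhost"))
print(reverse("127.0.0.1"))
```

A string that is already an IPv4 address is returned unchanged by gethostbyname, so resolve("127.0.0.1") involves no lookup at all.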
On a
small network like an Ethernet or even a cluster of Ethernets, it is
not very difficult to maintain tables mapping hostnames to addresses.
This information is usually kept in a file named
/etc/hosts. When adding or removing hosts, or
reassigning addresses, all you have to do is update the
hosts file on all hosts. Obviously, this will
become burdensome with networks that comprise more than a handful of
machines.
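The hosts file format is simple enough to read in a few lines: each entry gives an address, the host's name, and optional aliases, with # starting a comment. The following sketch parses such a table from a string. The entries are hypothetical (the addresses are documentation placeholders and example.org is a stand-in domain), and parse_hosts is an illustrative helper, not how libc reads the file.

```python
SAMPLE_HOSTS = """\
# Hypothetical entries; the addresses are documentation placeholders.
127.0.0.1      localhost
192.0.2.10     gauss.example.org    gauss
192.0.2.11     strange.example.org  strange   # trailing comment
"""

def parse_hosts(text):
    """Map every name and alias in hosts-format text to its address."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            # Keep the first entry seen for a name; later duplicates lose.
            table.setdefault(name.lower(), address)
    return table

hosts = parse_hosts(SAMPLE_HOSTS)
print(hosts["gauss"], hosts["strange.example.org"])
```

To read the real table instead of the sample, pass in the contents of /etc/hosts, for example parse_hosts(open("/etc/hosts").read()).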
On the Internet,
address information was initially stored in a single
HOSTS.TXT database, too. This file was
maintained at the NIC, and had to be downloaded and installed by all
participating sites. When the network grew, several problems with
this scheme arose. Besides the administrative overhead involved in
installing HOSTS.TXT regularly, the load on the
servers that distributed it became too high. Even more severe, all
names had to be registered with the NIC, which made sure that no name
was issued twice. This is why a new name resolution scheme
was adopted in 1984: the Domain
Name System. DNS was
designed by Paul Mockapetris and addresses both problems
simultaneously. We discuss the Domain Name System in detail in Chapter 5.