Linux Network Administrator's Guide (3rd Edition)

1.2. TCP/IP Networks

Modern
networking applications require a sophisticated approach to carry
data from one machine to another. If you are managing a Linux machine
that has many users, each of whom may wish to simultaneously connect
to remote hosts on a network, you need a way of allowing them to
share your network connection without interfering with each other.
The approach that a large number of modern networking protocols use
is called packet switching. A packet is a small
chunk of data that is transferred from one machine to another across
the network. The switching occurs as the packet is carried across
each link in the network. A packet-switched network shares a single
network link among many users by alternately sending packets from one
user to another across that link.
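
The alternation described above can be sketched in a few lines of
Python. This is only a toy model under stated assumptions: the sender
names, the 4-byte packet size, and the helper functions packetize,
share_link, and reassemble are all invented for illustration and do not
correspond to any real protocol.

```python
from itertools import zip_longest


def packetize(sender, data, size):
    """Split data into (sender, sequence number, chunk) packets."""
    return [(sender, seq, data[i:i + size])
            for seq, i in enumerate(range(0, len(data), size))]


def share_link(*streams):
    """Alternate between senders, one packet at a time, on a single link."""
    link = []
    for round_ in zip_longest(*streams):
        # A sender that has run out of packets contributes None; skip it.
        link.extend(packet for packet in round_ if packet is not None)
    return link


def reassemble(link, sender):
    """Pick one sender's packets off the link and rebuild its data."""
    chunks = sorted((seq, chunk) for s, seq, chunk in link if s == sender)
    return b"".join(chunk for _, chunk in chunks)


# Two hypothetical users sharing one link.
alice = packetize("alice", b"hello from alice", 4)
bob = packetize("bob", b"hi, bob here", 4)
link = share_link(alice, bob)

for sender, seq, chunk in link:
    print(sender, seq, chunk)
```

Each user's data arrives intact even though the packets of both users
are interleaved on the same link, which is the property that lets many
users share one connection.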

The solution that Unix systems, and
subsequently many non-Unix systems, have adopted is known as TCP/IP.
When learning about TCP/IP networks, you will hear the term
datagram, which technically has a special
meaning but is often used interchangeably with packet. In this
section, we will have a look at underlying concepts of the TCP/IP
protocols.

1.2.1. Introduction to TCP/IP Networks

TCP/IP traces its origins to a research
project funded by the United States Defense Advanced Research
Projects Agency (DARPA) in 1969. The ARPANET was an experimental
network that was converted into an operational one in 1975 after it
had proven to be a success.

In 1983, the new protocol suite TCP/IP was
adopted as a standard, and all hosts on the network were required to
use it. When ARPANET finally grew into the Internet (with ARPANET
itself passing out of existence in 1990), the use of TCP/IP had
spread to networks beyond the Internet itself. Many companies have
now built corporate TCP/IP networks, and the Internet has become a
mainstream consumer technology. It is difficult to read a newspaper
or magazine now without seeing references to the Internet; almost
everyone can use it now.

For something concrete to look at as we discuss TCP/IP throughout the
following sections, we will consider Groucho Marx University (GMU),
situated somewhere in Freedonia, as an example. Most departments run
their own Local Area Networks, while some share one and others run
several of them. They are all interconnected and hooked to the
Internet through a single high-speed link.

Suppose your Linux box is connected to a LAN of Unix hosts at the
mathematics department, and its name is erdos. To access a host at the physics
department, say quark, you enter the
following command:

$ ssh quark.school.edu
Enter password:
Last login: Wed Dec  3 18:21:25 2003 from 10.10.0.1
quark$

At the prompt, you enter your password. You are then given a
shell[2] on quark, to which you can type as if you were
sitting at the system's console. After you exit the
shell, you are returned to your own machine's
prompt. You have just used one of the instantaneous, interactive
applications that use TCP/IP: secure shell.

[2] The shell is a command-line interface to the
Unix operating system. It's similar to the DOS
prompt in a Microsoft Windows environment, albeit much more
powerful.

While logged
in to quark, you might also
want to run a graphical user interface application, like a word
processing program, a graphics drawing program, or even a World Wide
Web browser. The X Window System is a fully network-aware graphical
user environment, and it is available for many different computing
systems. To have such an application display its windows
on your host's screen, you will need to
make sure that your SSH server and client are
capable of tunneling X. To do this, you can check the
sshd_config file on the server, which should
contain a line like this:

X11Forwarding yes

If you now start your application over the SSH connection, its
windows will be tunneled so that they are displayed on your X server
instead of quark's.
Of course, this requires that you have X11 running on erdos. The point here is that TCP/IP allows
quark and erdos to send X11 packets back and forth to
give you the illusion that you're on a single
system. The network is almost transparent here.

Of course, these are only examples of what you can do with TCP/IP
networks. The possibilities are almost limitless, and
we'll introduce you to more as you read on through
the book.

We will now have a closer look at the way TCP/IP works. This
information will help you understand how and why you have to
configure your machine. We will start by examining the hardware and
slowly work our way up.

1.2.2. Ethernets

The
most common type of LAN hardware is known as
Ethernet. In its simplest form, it consists of a
single cable with hosts attached to it through connectors, taps, or
transceivers. Simple Ethernets are relatively inexpensive to install,
which, together with net transfer rates of 10, 100, 1,000, and now
even 10,000 megabits per second (Mbps), accounts for much of their
popularity.

Ethernets come in many flavors:
thick, thin, and
twisted pair. Older
Ethernet types, such as thin and thick Ethernet, are rarely in use
today. Each uses a coaxial cable, and they differ in diameter and in
the way you attach a host to the cable. Thin Ethernet uses a T-shaped
"BNC" connector, which you insert
into the cable and twist onto a plug on the back of your computer.
Thick Ethernet requires that you drill a small hole into the cable
and attach a transceiver using a "vampire
tap." One or more hosts can then be connected to the
transceiver. Thin and thick Ethernet cable can run for a maximum of
about 200 and 500 meters, respectively, and are also called 10-base2 and
10-base5. The "base" refers to
"baseband modulation" and simply
means that the data is directly fed onto the cable without any modem.
The number at the start refers to the speed in megabits per second,
and the number at the end is the maximum length of the cable in
hundreds of meters (rounded up for 10-base2, whose actual limit is
185 meters). Twisted pair uses a cable made of two pairs of
copper wires and usually requires additional hardware known as
active hubs. Twisted pair
is also known as 10-baseT, the "T"
meaning twisted pair. The 100 Mbps version is known as 100-baseT, and
not surprisingly, 1000 Mbps is called 1000-baseT or
gigabit.

To add a host to a thin Ethernet
installation, you have to disrupt network service for at least a few
minutes because you have to cut the cable to insert the connector.
Although adding a host to a thick Ethernet system is a little
complicated, it does not typically bring down the network. Twisted
pair Ethernet is even simpler. It uses a device called a
hub or switch that serves
as an interconnection point. You can insert and remove hosts from a
hub or switch without interrupting any other users at all.

Thick and thin Ethernet deployments are now difficult to find
because they have been mostly replaced by twisted pair
deployments. Twisted pair has likely become the standard because of
cheap networking cards and cables, not to mention that
it's almost impossible to find a BNC connector
on a modern laptop.

Wireless LANs are also very popular. These
are based on the 802.11a/b/g specification and provide Ethernet over
radio transmission. Offering similar functionality to its wired
counterpart, wireless Ethernet has been subject to a number of
security issues, namely surrounding encryption. However, advances in
the protocol specification combined with different encryption keying
methods are quickly helping to alleviate some of the more serious
security concerns. Wireless networking for Linux is discussed in
detail in Chapter 18.

Ethernet works like a
bus system, where a host may send packets (or
frames) of up to 1,500 bytes to another host on
the same Ethernet. A host is addressed by a 6-byte address hardcoded
into the firmware of its Ethernet network interface card (NIC). These
addresses are usually written as a sequence of two-digit hex numbers
separated by colons, as in aa:bb:cc:dd:ee:ff.
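
As a small illustration of this notation, the following Python sketch
converts between the 6 raw bytes of an Ethernet address and its
colon-separated text form. The function names format_mac and parse_mac
are made up for this example and are not part of any standard library.

```python
def format_mac(raw: bytes) -> str:
    """Render 6 raw bytes as two-digit hex numbers separated by colons."""
    if len(raw) != 6:
        raise ValueError("an Ethernet address is exactly 6 bytes")
    return ":".join(f"{byte:02x}" for byte in raw)


def parse_mac(text: str) -> bytes:
    """Turn a string such as aa:bb:cc:dd:ee:ff back into 6 raw bytes."""
    parts = text.split(":")
    if len(parts) != 6 or any(len(part) != 2 for part in parts):
        raise ValueError(f"not a colon-separated 6-byte address: {text!r}")
    return bytes(int(part, 16) for part in parts)


raw = parse_mac("aa:bb:cc:dd:ee:ff")
print(list(raw))
print(format_mac(raw))
```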

A frame sent by
one station is seen by all attached stations, but only the
destination host actually picks it up and processes it. If two
stations try to send at the same time, a
collision occurs. Collisions on an Ethernet are
detected very quickly by the electronics of the interface cards and
are resolved by the two stations aborting the send, each waiting a
random interval and re-attempting the transmission.
You'll hear lots of stories about collisions on
Ethernet being a problem, and claims that utilization of Ethernets is
only about 30 percent of the available bandwidth because of them.
In fact, collisions on Ethernet are a normal phenomenon,
and on a very busy Ethernet network you shouldn't be
surprised to see collision rates of up to about 30 percent.
Utilization of an Ethernet network can realistically reach about 60
percent before you need to start worrying about it.[3]

[3] The
Ethernet FAQ at
http://www.faqs.org/faqs/LANs/ethernet-faq/ talks
about this issue, and a wealth of detailed historical and technical
information is available at Charles Spurgeon's
Ethernet web site at
http://www.ethermanage.com/ethernet/ethernet/.

1.2.3. Other Types of Hardware

In larger installations, or in legacy corporate environments,
Ethernet is usually not the only type of equipment used. There are
many other data communications protocols available and in use. All of
the protocols listed are supported by Linux, but due to space
constraints we'll describe them only briefly. Many of the
protocols have HOWTO documents that describe them in detail, so you
should refer to those if you're interested in
exploring those that we don't describe in this book.

One older and quickly
disappearing technology is IBM's Token Ring network.
Token Ring is used as an alternative to Ethernet in some LAN
environments, and runs at lower speeds (4 Mbps or 16 Mbps). In Linux,
Token Ring networking is configured in almost precisely the same way
as Ethernet, so we don't cover it specifically.

Many national networks operated
by telecommunications companies support packet-switching protocols.
Previously, the most popular of these was a standard named X.25. It
defines a set of networking protocols that describes how data
terminal equipment, such as a host, communicates with data
communications equipment (an X.25 switch). X.25 requires a
synchronous data link and therefore special synchronous serial port
hardware. It is possible to use X.25 with normal serial ports if you
use a special device called a Packet Assembler
Disassembler (PAD). The PAD is a standalone device that
provides asynchronous serial ports and a synchronous serial port. It
manages the X.25 protocol so that simple terminal devices can make
and accept X.25 connections. X.25 is often used to carry other
network protocols, such as TCP/IP. Since IP datagrams cannot simply
be mapped onto X.25 (or vice versa), they are encapsulated in X.25
packets and sent over the network. There is an implementation of the
X.25 protocol available for Linux, but it will not be discussed in
depth here.

A protocol commonly used by
telecommunications companies is called Frame
Relay. The Frame Relay protocol shares a number
of technical features with the X.25 protocol, but is much more like
the IP protocol in behavior. Like X.25, Frame Relay requires special
synchronous serial hardware. Because of their similarities, many
cards support both of these protocols. An alternative is available
that requires no special internal hardware, again relying on an
external device called a Frame Relay Access Device (FRAD) to manage
the encapsulation of Ethernet packets into Frame Relay packets for
transmission across a network. Frame Relay is ideal for carrying
TCP/IP between sites. Linux provides drivers that support some types
of internal Frame Relay devices.

If you need higher-speed
networking that can carry many different types of data, such as
digitized voice and video, alongside your usual data,
Asynchronous Transfer Mode (ATM) is probably
what you'll be interested in. ATM is a new network
technology that has been specifically designed to provide a
manageable, high-speed, low-latency means of carrying data and
control over the Quality of Service (QoS). Many telecommunications
companies are deploying ATM network infrastructure because it allows
the convergence of a number of different network services into one
platform, in the hope of achieving savings in management and support
costs. ATM is often used to carry TCP/IP. The Networking
HOWTO offers information on the Linux support available
for ATM.

Frequently,
radio amateurs use their radio equipment to network their computers;
this is commonly called packet
radio. One of the protocols used by amateur
radio operators is called AX.25 and is loosely derived from X.25.
Amateur radio operators use the AX.25 protocol to carry TCP/IP and
other protocols, too. AX.25, like X.25, requires serial hardware
capable of synchronous operation, or an external device called a
Terminal Node Controller to convert packets transmitted via an
asynchronous serial link into packets transmitted synchronously.
There are a variety of different sorts of interface cards available
to support packet radio operation; these cards are generally referred
to as being "Z8530 SCC based,"
named after the most popular type of communications controller used
in the designs. Two of the other protocols that are commonly carried
by AX.25 are the NetRom and Rose protocols, which are network layer
protocols. Since these protocols run over AX.25, they have the same
hardware requirements. Linux supports a fully featured implementation
of the AX.25, NetRom, and Rose protocols. The AX25
HOWTO is a good source of information on the Linux
implementation of these protocols.

Other types of Internet access involve dialing up a central system
over slow but cheap serial lines (telephone, ISDN, and so on). These
require yet another protocol for transmission of packets, such as
SLIP or PPP, which will be described later.

1.2.4. The Internet Protocol

Of course, you
wouldn't want your networking to be limited to one
Ethernet or one point-to-point data link. Ideally, you would want to
be able to communicate with a host computer regardless of what type
of physical network it is connected to. For example, in larger
installations such as Groucho Marx University, you usually have a
number of separate networks that have to be connected in some way. At
GMU, the math department runs two Ethernets: one with fast machines
for professors and graduates, and another with slow machines for
students.

The connection between these networks is handled by a
dedicated host called a gateway that handles
incoming and outgoing packets by copying them between the two
Ethernets and the FDDI fiber optic cable that serves as the backbone
network. For example, if you are at
the math department and want to access quark on the physics
department's LAN from your Linux box, the networking
software will not send packets to quark directly because it is not on the same
Ethernet. Therefore, it has to rely on the gateway to act as a
forwarder. The gateway (named sophus) then forwards these packets to its
peer gateway niels at the physics
department, using the backbone network, with niels delivering them to the destination
machine. Data flow between erdos and
quark is shown in Figure 1-1.

Figure 1-1. The three steps of sending a datagram from erdos to quark

This scheme of directing data to a remote
host is called routing, and packets are often
referred to as datagrams in this context. To facilitate things,
datagram exchange is governed by a single protocol that is
independent of the hardware used: IP, or
Internet Protocol. In Chapter 2, we will cover IP and the issues of routing
in greater detail.
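
The three steps from erdos to quark can be mimicked with a toy model in
Python. This is only a sketch: the NETWORKS table and the route function
are invented for illustration, the hosts are identified by name rather
than by address, and the breadth-first search stands in for the real IP
routing mechanism described in Chapter 2.

```python
from collections import deque

# Which hosts are attached to which physical network (names from the text).
# The gateways sophus and niels each sit on two networks.
NETWORKS = {
    "math Ethernet": {"erdos", "sophus"},
    "backbone": {"sophus", "niels"},
    "physics Ethernet": {"niels", "quark"},
}


def route(source, destination):
    """Return the hosts a datagram visits, found by breadth-first search."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        host = path[-1]
        if host == destination:
            return path
        for members in NETWORKS.values():
            if host in members:
                # Every unseen host on a shared network is one hop away.
                for neighbor in sorted(members - seen):
                    seen.add(neighbor)
                    queue.append(path + [neighbor])
    return None  # no chain of networks connects the two hosts


path = route("erdos", "quark")
print(" -> ".join(path))
```

The resulting path has three hops: erdos hands the datagram to sophus,
sophus forwards it across the backbone to niels, and niels delivers it
to quark.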

The main
benefit of IP is that it turns physically dissimilar networks into
one apparently homogeneous network. This is called internetworking,
and the resulting "meta-network" is
called an internet. Note the subtle difference
here between an internet and
the Internet. The latter is the official name of
one particular global internet.

Of course, IP also requires a
hardware-independent addressing scheme. This is achieved by assigning
each host a unique 32-bit number called the IP
address. An IP address is usually written as
four decimal numbers, one for each 8-bit portion, separated by dots.
For example, quark might have an IP
address of 0x954C0C04, which would
be written as 149.76.12.4. This
format is also called dotted
decimal notation and
sometimes dotted quad
notation. It is increasingly going under the
name IPv4 (for Internet Protocol, Version 4) because a new standard
called IPv6 offers much more flexible addressing, as well as other
modern features. It will be at least a year after the release of this
edition before IPv6 is in use.
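
The relationship between the 32-bit number and its dotted decimal form
can be checked with a short Python sketch. The helper to_dotted_quad is
written here for illustration only; the standard ipaddress module
performs the same conversion and is used as a cross-check.

```python
import ipaddress


def to_dotted_quad(value: int) -> str:
    """Write a 32-bit address as four decimal numbers, one per 8-bit portion."""
    octets = [(value >> shift) & 0xFF for shift in (24, 16, 8, 0)]
    return ".".join(str(octet) for octet in octets)


quark = 0x954C0C04
print(to_dotted_quad(quark))                      # 149.76.12.4
print(int(ipaddress.IPv4Address("149.76.12.4")) == quark)
```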

You will notice that we now have three different types of addresses:
first there is the host's name, like quark, then there is an IP address, and
finally, there is a hardware address, such as the 6-byte Ethernet
address. All these addresses somehow have to match so that when you
type ssh quark, the
networking software can be given quark's IP address; and when
IP delivers any data to the physics department's
Ethernet, it somehow has to find out what Ethernet address
corresponds to the IP address.

We will
deal with these situations in Chapter 2. For
now, it's enough to remember that these steps of
finding addresses are called hostname
resolution, for mapping hostnames onto IP
addresses, and address
resolution, for mapping the latter to hardware
addresses.
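
Hostname resolution is easy to try from a program. The following Python
sketch asks the system resolver for the IPv4 addresses of a name using
the standard socket.getaddrinfo call; the wrapper name resolve_ipv4 is
an invention for this example, and localhost is used only because it
should resolve on most systems without a network. Address resolution,
the second step, happens inside the kernel and is not shown.

```python
import socket


def resolve_ipv4(hostname):
    """Return the IPv4 addresses the system resolver knows for hostname."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, (address, port)).
    return sorted({info[4][0] for info in infos})


print(resolve_ipv4("localhost"))
```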

1.2.5. IP over Serial Lines

On serial lines, a "de
facto" standard exists known as Serial
Line IP (SLIP). A modification of SLIP, known as
Compressed SLIP (CSLIP), performs compression of
IP headers to make better use of the relatively low bandwidth
provided by most serial links. Another serial protocol is
Point-to-Point Protocol (PPP). PPP is more
modern than SLIP and includes a number of features that make it more
attractive. Its main advantage over SLIP is that it
isn't limited to transporting IP datagrams, but is
designed to allow just about any protocol to be carried across it.
This book discusses PPP in Chapter 6.

1.2.6. The Transmission Control Protocol

Sending
datagrams from one host to another is not the whole story. If you log
in to quark, you want to have a
reliable connection between your ssh process on
erdos and the shell process on
quark. Thus, the information sent to
and fro must be split into packets by the sender and reassembled into
a character stream by the receiver. Trivial as it seems, this
involves a number of complicated tasks.

A very important thing to know about IP is that, by intent, it is not
reliable. Assume that 10 people on your Ethernet started downloading
the latest release of the Mozilla web browser source code from
GMU's FTP server. The amount of traffic generated
might be too much for the gateway to handle because
it's too slow and it's tight on
memory. Now if you happen to send a packet to quark, sophus
might be out of buffer space for a moment and therefore unable to
forward it. IP solves this problem by simply discarding it. The
packet is irrevocably lost. It is therefore the responsibility of the
communicating hosts to check the integrity and completeness of the
data and retransmit it in case of error.

This process is performed by yet another protocol,
Transmission Control
Protocol (TCP), which builds a reliable service
on top of IP. The essential property of TCP is that it uses IP to
give you the illusion of a simple connection between the two
processes on your host and the remote machine so that you
don't have to care about how and along which route
your data actually travels. A TCP connection works essentially like a
two-way pipe that both processes may write to and read from. Think of
it as a telephone
conversation.

TCP identifies the end points of such a
connection by the IP addresses of the two hosts involved and the
number of a port on each host. Ports may be
viewed as attachment points for network connections. If we are to
strain the telephone example a little more, and you imagine that
cities are like hosts, one might compare IP addresses to area codes
(where numbers map to cities), and port numbers to local codes (where
numbers map to individual people's telephones). An
individual host may support many different services, each
distinguished by its own port number.

In the
ssh example, the client application
(ssh) opens a port on erdos and connects to port 22 on quark, on which the sshd server is known to
listen. This action establishes a TCP connection. Using this
connection, sshd performs the authentication
procedure and then spawns the shell. The shell's
standard input and output are redirected to the TCP connection so
that anything you type to ssh on your machine
will be passed through the TCP stream and be given to the shell as
standard input.
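
To make the idea of a TCP connection as a two-way pipe concrete, here
is a minimal Python sketch that runs a tiny server and a client on the
same machine. It is not how sshd works: the uppercase-echo service, the
function names, and the use of the loopback address with a
kernel-chosen port are all assumptions made so the example can run
anywhere without special privileges.

```python
import socket
import threading


def run_echo_server(listener):
    """Accept one connection and send back whatever arrives, uppercased."""
    conn, _peer = listener.accept()
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:          # the client closed its end of the pipe
                break
            conn.sendall(data.upper())


def demo(message=b"hello quark"):
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the kernel pick a free port
    listener.listen(1)
    server_port = listener.getsockname()[1]
    worker = threading.Thread(target=run_echo_server, args=(listener,))
    worker.start()

    with socket.create_connection(("127.0.0.1", server_port)) as client:
        local = client.getsockname()    # (IP address, port) of our end
        remote = client.getpeername()   # (IP address, port) of the server's end
        client.sendall(message)
        reply = b""
        while len(reply) < len(message):  # TCP is a byte stream: read until complete
            chunk = client.recv(1024)
            if not chunk:
                break
            reply += chunk
    worker.join()
    listener.close()
    return local, remote, reply


local, remote, reply = demo()
print("client end:", local)
print("server end:", remote)
print("reply:", reply)
```

The two address and port pairs printed at the end are exactly what TCP
uses to identify the connection.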

1.2.7. The User Datagram Protocol

Of course, TCP
isn't the only user protocol in TCP/IP networking.
Although suitable for applications like ssh, the
overhead involved is prohibitive for applications like NFS, which
instead uses a sibling protocol of TCP called
User Datagram
Protocol (UDP). Just like TCP, UDP allows an
application to contact a service on a certain port of the remote
machine, but it doesn't establish a connection for
this. Instead, you use it to send single packets to the destination
service, hence its name.

Assume that you want to request a small
amount of data from a database server. It takes at least three
datagrams to establish a TCP connection, another three to send and
confirm a small amount of data each way, and another three to close
the connection. UDP provides us with a means of using only two
datagrams to achieve almost the same result. UDP is said to be
connectionless, and it doesn't require us to
establish and close a session. We simply put our data into a datagram
and send it to the server; the server formulates its reply, puts the
data into a datagram addressed back to us, and transmits it back.
While this is both faster and more efficient than TCP for simple
transactions, UDP was not designed to deal with datagram loss. It is
up to the application, a nameserver, for example, to take care of
this.
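
The two-datagram exchange can be sketched in Python as follows. The
function name udp_round_trip, the "answer to" reply format, and the
2-second timeout are assumptions for illustration; both ends run in one
process on the loopback interface, which works because the kernel
buffers each datagram until it is read.

```python
import socket


def udp_round_trip(question: bytes) -> bytes:
    """Send one datagram to a local 'server' socket and read one reply."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        server.bind(("127.0.0.1", 0))   # kernel picks a free UDP port
        # UDP never retransmits, so without a timeout a lost datagram
        # would leave recvfrom waiting forever.
        server.settimeout(2.0)
        client.settimeout(2.0)

        client.sendto(question, server.getsockname())         # datagram 1
        request, client_addr = server.recvfrom(1024)
        server.sendto(b"answer to " + request, client_addr)   # datagram 2
        reply, _server_addr = client.recvfrom(1024)
        return reply
    finally:
        client.close()
        server.close()


print(udp_round_trip(b"ping"))
```

No connection is opened or closed here. If either datagram were lost,
recvfrom would raise a timeout error, and it would be up to the
application to decide whether to send the request again.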

1.2.8. More on Ports

Ports
may be viewed as attachment points for network connections. If an
application wants to offer a certain service, it attaches itself to a
port and waits for clients (this is also called
listening on the port). A client who wants to
use this service allocates a port on its local host and connects to
the server's port on the remote host. The same port
may be open on many different machines, but on each machine only one
process can open a port at any one time.

An important property of ports is
that once a connection has been established between the client and
the server, another copy of the server may attach to the server port
and listen for more clients. This property permits, for instance,
several concurrent remote logins to the same host, all using the same
port 22. TCP is able to tell these connections from one another
because they all come from different ports or hosts. For example, if
you log in twice to quark from
erdos, the first
ssh client may use the local port 6464, and the
second one could use port 4235. Both, however, will connect to the
same port 22 on quark. The two
connections will be distinguished by the port numbers used at
erdos.
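
This behavior can be observed with a short Python sketch that opens two
connections from one machine to the same listening port. The function
name two_logins is invented for the example, and an unprivileged port
chosen by the kernel on the loopback address stands in for the real
server port, since binding to low-numbered ports normally requires root
privileges.

```python
import socket


def two_logins():
    """Open two client connections to one server port and compare ports."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # stand-in for the well-known server port
    listener.listen(2)
    server_addr = listener.getsockname()

    # The kernel completes both handshakes and queues them for accept().
    first = socket.create_connection(server_addr)
    second = socket.create_connection(server_addr)
    conn_a, peer_a = listener.accept()
    conn_b, peer_b = listener.accept()

    result = {
        "server_port": server_addr[1],
        "client_ports": sorted([first.getsockname()[1],
                                second.getsockname()[1]]),
        "peers_seen_by_server": sorted([peer_a[1], peer_b[1]]),
        "accepted_local_ports": {conn_a.getsockname()[1],
                                 conn_b.getsockname()[1]},
    }
    for sock in (first, second, conn_a, conn_b, listener):
        sock.close()
    return result


print(two_logins())
```

Both accepted connections use the same port on the server side; only
the client-side port numbers differ, and that difference is what keeps
the two connections apart.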

This example shows the use of ports as
rendezvous points, where a client contacts a specific port to obtain
a specific service. In order for a client to know the proper port
number, an agreement has to be reached between the administrators of
both systems on the assignment of these numbers. For services that
are widely used, such as ssh, these numbers have
to be administered centrally. This is done by the Internet
Engineering Task Force (IETF), which regularly releases an RFC titled
Assigned Numbers
(RFC-1700). It describes, among other things, the port numbers
assigned to well-known services. Linux uses a file called
/etc/services that maps service names to
numbers.

It
is worth noting that, although both TCP and UDP rely on
ports, these numbers do not conflict. This means that TCP port 22,
for example, is different from UDP port 22.
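
The following Python sketch parses text in the format used by
/etc/services and shows that a port number is only meaningful together
with its protocol. The SAMPLE text is a hand-written excerpt for
illustration (the file on a real system is much longer and may differ),
and parse_services is a hypothetical helper, not a system function.

```python
# Hand-written excerpt in /etc/services format: name, port/protocol, aliases.
SAMPLE = """\
# service-name  port/protocol  [aliases]
ssh             22/tcp
domain          53/tcp
domain          53/udp
login           513/tcp
who             513/udp   whod
"""


def parse_services(text):
    """Map (service name, protocol) pairs to port numbers."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        name, port_proto, *aliases = line.split()
        port, proto = port_proto.split("/")
        for service in (name, *aliases):
            table[(service, proto)] = int(port)
    return table


services = parse_services(SAMPLE)
print(services[("ssh", "tcp")])
print(services[("login", "tcp")], services[("who", "udp")])
```

In this excerpt, port 513 belongs to login over TCP but to who over
UDP. On a running system, the standard library call
socket.getservbyname("ssh", "tcp") performs the same lookup against the
system's own services database.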

1.2.9. The Socket Library

In Unix operating systems, the software
performing all the tasks and protocols described above is usually
part of the kernel, and so it is in Linux. The programming interface
most common in the Unix world is the Berkeley Socket Library. Its
name derives from a popular analogy that views ports as sockets and
connecting to a port as plugging in. It provides calls to
specify a host, a transport protocol, and a service that a
program can connect or listen to: bind names the local end of a
socket, and connect, listen, and accept set up the connection itself.
The socket library is somewhat more general in that it provides not
only a class of TCP/IP-based sockets (the
AF_INET sockets), but also a class that handles
connections local to the machine (the AF_UNIX
class). Some implementations can also handle other classes, like the
Xerox Networking System (XNS) protocol or X.25.

In Linux, the socket library is part of
the standard libc C library. It supports the
AF_INET and AF_INET6
sockets for TCP/IP and AF_UNIX for Unix domain
sockets. It also supports AF_IPX for
Novell's network protocols, AF_X25 for the X.25 network protocol,
AF_ATMPVC and AF_ATMSVC for
the ATM network protocol, and AF_AX25,
AF_NETROM, and AF_ROSE sockets for Amateur Radio protocol support.
Other protocol families are being developed and will be added in
time.
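
Python's socket module is a thin wrapper around the same library, so
two of these address families can be tried with a short sketch. The
function name family_demo is made up for this example, and it assumes a
Unix-like system such as Linux, where AF_UNIX is available.

```python
import socket


def family_demo():
    """Create one TCP/IP socket and one pair of Unix domain sockets."""
    # AF_INET: a TCP/IP socket (created but never connected in this sketch).
    inet = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    inet_family = inet.family
    inet.close()

    # AF_UNIX: two already-connected sockets that never leave this machine.
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with left, right:
        unix_family = left.family
        left.sendall(b"local hello")
        received = right.recv(1024)
    return inet_family, unix_family, received


print(family_demo())
```

The calls used to send and receive data are the same for both
families; only the family constant passed at creation time changes.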
