An Internet Primer
You can't write a good Winsock program without understanding the concept of a socket, which is used to send and receive packets of data across the network. To fully understand sockets, you need a thorough knowledge of the underlying Internet protocols. This section contains a concentrated dose of Internet theory. It should be enough to get you going, but you might want to refer to one of the TCP/IP textbooks if you want more theory.
Network Protocols and Layering
All networks use layering for their transmission protocols, and the collection of layers is often called a stack. The application program talks to the top layer, and the bottom layer talks to the network. Figure 28-1 shows the stack for a local area network (LAN) that's running TCP/IP. Each layer is logically connected to the corresponding layer at the other end of the communications channel. The server program, which is shown at the right side of the figure, continuously listens on one end of the channel, while the client program, shown on the left, periodically connects with the server to exchange data. You can think of the server as an HTTP-based World Wide Web (WWW) server, and you can think of the client as a browser program running on your computer.
![](/image/library/english/10211_fig28_01.jpg)
Figure 28-1: The stack for a LAN running TCP/IP.
IP
The IP layer is the best place to start in your quest to understand TCP/IP. The IP protocol defines packets called datagrams that are fundamental units of Internet communication. These packets, which are typically less than 1000 bytes in length, go bouncing all over the world when you open a Web page, download a file, or send e-mail. Figure 28-2 shows a simplified layout of an IP datagram.
![](/image/library/english/10211_fig28_02.jpg)
Figure 28-2: A simple IP datagram layout.
Notice that the IP datagram contains 32-bit addresses for both the source and destination computers. These IP addresses uniquely identify computers on the Internet and are used by routers (specialized computers that act like telephone switches) to direct the individual datagrams to their destinations. The routers don't care about what's inside a datagram; they're interested only in the datagram's destination address and total length. The routers' job is to resend the datagram as quickly as possible.
The IP layer doesn't tell the sending program whether a datagram has successfully reached its destination. That's a job for the next layer up the stack. The receiving program can look only at the checksum to determine whether the IP datagram header was corrupted.
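To make the header checksum idea concrete, here is a minimal sketch of the one's-complement checksum used for the IPv4 header. The function names internetChecksum and headerLooksIntact are hypothetical, and the code assumes the header bytes are already sitting in a buffer in network order. It is an illustration only, not the code that Windows itself runs.

```cpp
#include <cstddef>
#include <cstdint>

// One's-complement sum of 16-bit big-endian words over a byte buffer.
// The caller passes the raw header bytes exactly as they appear on the wire.
std::uint16_t internetChecksum(const std::uint8_t* data, std::size_t length)
{
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i + 1 < length; i += 2) {
        sum += (static_cast<std::uint32_t>(data[i]) << 8) | data[i + 1];
    }
    if (length % 2 != 0) {
        // An odd trailing byte is treated as the high half of a final word.
        sum += static_cast<std::uint32_t>(data[length - 1]) << 8;
    }
    while (sum >> 16) {
        sum = (sum & 0xFFFF) + (sum >> 16);   // fold the carries back in
    }
    return static_cast<std::uint16_t>(~sum & 0xFFFF);
}

// A received header passes the test if the checksum computed over the whole
// header, including the stored checksum field, comes out as zero.
bool headerLooksIntact(const std::uint8_t* header, std::size_t length)
{
    return internetChecksum(header, length) == 0;
}
```

A sender would compute internetChecksum with the checksum field set to zero and store the result in that field. Notice that the check covers only the header, which is why it says nothing about whether the data arrived safely.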
UDP
The TCP/IP protocol should really be called TCP/UDP/IP because it includes User Datagram Protocol (UDP), which is a peer of TCP. All IP-based transport protocols store their own headers and data inside the IP data block. Figure 28-3 shows the UDP layout.
![](/image/library/english/10211_fig28_03.jpg)
Figure 28-3: A simple UDP layout.
A complete UDP/IP datagram is shown in Figure 28-4.
![](/image/library/english/10211_fig28_04.jpg)
Figure 28-4: The relationship between the IP datagram and the UDP datagram.
UDP is only a small step up from IP, but applications never use IP directly. Like IP, UDP doesn't tell the sender when the datagram has arrived. That's up to the application. The sender can, for example, require that the receiver send a response, and the sender can retransmit the datagram if a response doesn't arrive within, say, 20 seconds. UDP is good for simple, one-shot messages and is used by the Internet Domain Name System (DNS), which we'll look at later in this chapter. (UDP is also used for transmitting live audio and video, for which some lost or out-of-sequence data is not a big problem.)
Figure 28-3 shows that the UDP header does convey some additional information, namely the source and destination port numbers. The application programs on each end use these 16-bit numbers. For example, a datagram that a client program sends to a server could have a source port number of 1701 and a destination port number of 1700. The server program will listen for any datagram that includes 1700 in its destination port number, and when it finds one, it can respond by sending another datagram back to the client, which will listen for a datagram that includes 1701 in its destination port number.
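The port exchange described above can be sketched in a few lines. This example assumes the standard 8-byte UDP header (source port, destination port, length, and checksum, each 16 bits, high-order byte first). The names UdpHeader, packUdpHeader, unpackUdpHeader, and makeReplyHeader are hypothetical, and the checksum is simply left at zero.

```cpp
#include <array>
#include <cstdint>

// The four 16-bit fields of a UDP header, held in host byte order.
struct UdpHeader {
    std::uint16_t sourcePort;
    std::uint16_t destinationPort;
    std::uint16_t length;      // header plus data, in bytes
    std::uint16_t checksum;    // left at zero in this sketch
};

static std::uint8_t highByte(std::uint16_t v) { return static_cast<std::uint8_t>(v >> 8); }
static std::uint8_t lowByte(std::uint16_t v)  { return static_cast<std::uint8_t>(v & 0xFF); }

// Lay the fields out as they travel on the wire: high-order byte first.
std::array<std::uint8_t, 8> packUdpHeader(const UdpHeader& h)
{
    return {{ highByte(h.sourcePort),      lowByte(h.sourcePort),
              highByte(h.destinationPort), lowByte(h.destinationPort),
              highByte(h.length),          lowByte(h.length),
              highByte(h.checksum),        lowByte(h.checksum) }};
}

// Read the fields back out of the wire bytes.
UdpHeader unpackUdpHeader(const std::array<std::uint8_t, 8>& b)
{
    auto word = [&b](int i) {
        return static_cast<std::uint16_t>((b[i] << 8) | b[i + 1]);
    };
    return UdpHeader{ word(0), word(2), word(4), word(6) };
}

// A reply swaps the two ports: what was the source becomes the destination.
UdpHeader makeReplyHeader(const UdpHeader& request, std::uint16_t replyDataLength)
{
    return UdpHeader{ request.destinationPort, request.sourcePort,
                      static_cast<std::uint16_t>(8 + replyDataLength), 0 };
}
```

With the port numbers from the text, a request carries source port 1701 and destination port 1700, and makeReplyHeader produces a header with 1700 as the source and 1701 as the destination.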
IP Address Format: Network Byte Order
You know that IP addresses are 32 bits long. You might think that 2^32 (more than 4 billion) uniquely addressed computers could exist on the Internet, but that's not true. Part of the address identifies the LAN on which the host computer is located, and part of it identifies the host computer within the network. Most IP addresses are Class C addresses, which are formatted as shown in Figure 28-5.
![](/image/library/english/10211_fig28_05.jpg)
Figure 28-5: The layout of a Class C IP address.
This means that slightly more than 2 million networks can exist, and each of those networks can have 2^8 (256) addressable host computers. The Class A and Class B IP addresses, which allow more host computers on a network, are all used up.
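The arithmetic behind those numbers can be checked with a short sketch. It assumes the conventional classful layout: a Class C address starts with the bits 110, followed by a 21-bit network number and an 8-bit host number, which is consistent with the counts given above. The names AddressClass, makeAddress, classify, classCNetwork, and classCHost are hypothetical.

```cpp
#include <cstdint>

enum class AddressClass { A, B, C, Other };

// Pack four dotted-decimal bytes into one 32-bit value, first byte on top.
constexpr std::uint32_t makeAddress(std::uint8_t b0, std::uint8_t b1,
                                    std::uint8_t b2, std::uint8_t b3)
{
    return (static_cast<std::uint32_t>(b0) << 24) |
           (static_cast<std::uint32_t>(b1) << 16) |
           (static_cast<std::uint32_t>(b2) << 8)  |
            static_cast<std::uint32_t>(b3);
}

// The class is determined by the leading bits of the address.
AddressClass classify(std::uint32_t address)
{
    if ((address >> 31) == 0x0) return AddressClass::A;   // 0...
    if ((address >> 30) == 0x2) return AddressClass::B;   // 10...
    if ((address >> 29) == 0x6) return AddressClass::C;   // 110...
    return AddressClass::Other;
}

// For a Class C address: 21 bits of network number, 8 bits of host number.
std::uint32_t classCNetwork(std::uint32_t address) { return (address >> 8) & 0x1FFFFF; }
std::uint32_t classCHost(std::uint32_t address)    { return address & 0xFF; }

// 2^21 possible Class C networks, each with 2^8 host numbers.
constexpr std::uint32_t kClassCNetworkCount = 1u << 21;   // 2,097,152
constexpr std::uint32_t kClassCHostCount    = 1u << 8;    // 256
```

For example, classify(makeAddress(192, 168, 198, 201)) reports a Class C address whose host number is 201.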
Note | The Internet powers that be have recognized the shortage of IP addresses, so they have proposed a new standard, the IPv6 protocol (sometimes referred to as IP Next Generation, or IPng for short). IPv6 defines a new IP datagram format that uses 128-bit addresses instead of 32-bit addresses. With IPv6, you'll be able, for example, to assign a unique Internet address to each light switch in your house so you can switch off your bedroom light from your portable computer from anywhere in the world. |
By convention, IP addresses are written in dotted-decimal format. The four parts of the address refer to the individual byte values. An example of a Class C IP address is 192.168.198.201. In a computer with an Intel CPU, the address bytes are stored low-order-to-the-left, in so-called little endian order. In most other computers, including the UNIX machines that first supported the Internet, bytes are stored high-order-to-the-left, in big endian order. Because the Internet imposes a machine-independent standard for data interchange, all multibyte numbers must be transmitted in big endian order. This means that programs running on Intel-based machines must convert between network byte order (big endian) and host byte order (little endian). This rule applies to 2-byte port numbers as well as to 4-byte IP addresses.
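Winsock supplies the functions htons, htonl, ntohs, and ntohl for these conversions, and a real program should call them. The sketch below shows only what such a conversion amounts to. The names hostIsLittleEndian, swapBytes16, swapBytes32, and the hostToNetwork/networkToHost helpers are hypothetical stand-ins written in portable C++.

```cpp
#include <cstdint>

// Look at how the value 1 is laid out in memory to learn the host byte order.
bool hostIsLittleEndian()
{
    const std::uint16_t probe = 1;
    return *reinterpret_cast<const std::uint8_t*>(&probe) == 1;
}

// Reverse the two bytes of a port number.
std::uint16_t swapBytes16(std::uint16_t v)
{
    return static_cast<std::uint16_t>((v << 8) | (v >> 8));
}

// Reverse the four bytes of an IP address.
std::uint32_t swapBytes32(std::uint32_t v)
{
    return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
           ((v >> 8) & 0x0000FF00u)  |  (v >> 24);
}

// Network byte order is big endian, so only little endian hosts must swap.
std::uint16_t hostToNetwork16(std::uint16_t v) { return hostIsLittleEndian() ? swapBytes16(v) : v; }
std::uint32_t hostToNetwork32(std::uint32_t v) { return hostIsLittleEndian() ? swapBytes32(v) : v; }

// The reverse conversion is the same operation.
std::uint16_t networkToHost16(std::uint16_t v) { return hostToNetwork16(v); }
std::uint32_t networkToHost32(std::uint32_t v) { return hostToNetwork32(v); }
```

On an Intel-based machine, hostToNetwork16(1700) swaps the two bytes of the port number. On a big endian machine, the same call returns its argument unchanged.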
TCP
You've learned about the limitations of UDP. What you really need is a protocol that supports error-free transmission of large blocks of data. Obviously, you want the receiving program to be able to reassemble the bytes in the exact sequence in which they were transmitted, even though the individual datagrams might arrive in the wrong sequence. TCP is that protocol, and it's the principal transport protocol for all Internet applications, including HTTP and File Transfer Protocol (FTP). Figure 28-6 shows the layout of a TCP segment. (It's not called a datagram.) The TCP segment fits inside an IP datagram, as shown in Figure 28-7.
![](/image/library/english/10211_fig28_06.jpg)
Figure 28-6: A simple layout of a TCP segment.
![](/image/library/english/10211_fig28_07.jpg)
Figure 28-7: The relationship between an IP datagram and a TCP segment.
The TCP protocol establishes a full-duplex, point-to-point connection between two computers, and a program at each end of this connection uses its own port. The combination of an IP address and a port number is called a socket. The connection is first established with a three-way handshake. The initiating program sends a segment with the SYN flag set, the responding program sends a segment with both the SYN and ACK flags set, and then the initiating program sends a segment with the ACK flag set.
After the connection is established, each program can send a stream of bytes to the other program. TCP uses the sequence number fields together with ACK flags to control this flow of bytes. The sending program doesn't wait for each segment to be acknowledged but instead sends a number of segments together and then waits for the first acknowledgment. If the receiving program has data to send back to the sending program, it can piggyback its acknowledgment and outbound data together in the same segments.
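The three-way handshake can be modeled as a tiny state machine. This is a toy simulation, not what the Windows TCP/IP stack does internally: it ignores sequence numbers, timeouts, and every flag except SYN and ACK. The names ConnectionState, Endpoint, startConnect, and onSegment are hypothetical. The flag bit values are the ones used in the TCP header.

```cpp
#include <cstdint>

// Flag bits as they appear in the TCP header.
constexpr std::uint8_t kSyn    = 0x02;
constexpr std::uint8_t kAck    = 0x10;
constexpr std::uint8_t kSynAck = kSyn | kAck;

enum class ConnectionState { Closed, Listening, SynSent, SynReceived, Established };

struct Endpoint {
    ConnectionState state;
};

// The initiating program opens the exchange with a SYN segment.
std::uint8_t startConnect(Endpoint& client)
{
    client.state = ConnectionState::SynSent;
    return kSyn;
}

// React to an incoming segment's flags. Returns the flags of the segment to
// send in reply, or 0 if nothing needs to be sent.
std::uint8_t onSegment(Endpoint& endpoint, std::uint8_t flags)
{
    switch (endpoint.state) {
    case ConnectionState::Listening:          // server sees SYN, answers SYN+ACK
        if (flags == kSyn) {
            endpoint.state = ConnectionState::SynReceived;
            return kSynAck;
        }
        break;
    case ConnectionState::SynSent:            // client sees SYN+ACK, answers ACK
        if (flags == kSynAck) {
            endpoint.state = ConnectionState::Established;
            return kAck;
        }
        break;
    case ConnectionState::SynReceived:        // server sees the final ACK
        if (flags == kAck) {
            endpoint.state = ConnectionState::Established;
        }
        break;
    default:
        break;
    }
    return 0;
}
```

Feeding the result of startConnect to the server, the server's reply to the client, and the client's reply back to the server leaves both endpoints in the Established state after exactly three segments.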
The sending program's sequence numbers are not segment indexes but rather indexes into the byte stream. The receiving program sends back the sequence numbers (in the acknowledgment number field) to the sending program, thereby ensuring that all bytes are received and assembled in sequence. The sending program resends unacknowledged segments.
Each program closes its end of the TCP connection by sending a segment with the FIN flag set, which must be acknowledged by the program on the other end. A program can no longer receive bytes on a connection that has been closed by the program on the other end.
Don't worry about the complexity of the TCP protocol. The Winsock and WinInet APIs hide most of the details, so you don't have to worry about ACK flags and sequence numbers. Your program calls a function to transmit a block of data, and Windows takes care of splitting the block into segments and stuffing them inside IP datagrams. Windows also takes care of delivering the bytes on the receiving end, but that gets tricky, as you'll see later in this chapter.
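To see why byte-stream sequence numbers let the receiver put things back in order, consider this simplified sketch. The class name StreamReassembler and its members are hypothetical. The code assumes that segments never overlap and that duplicates of already-delivered data don't arrive, which a real TCP implementation can't assume.

```cpp
#include <cstdint>
#include <map>
#include <string>

class StreamReassembler {
public:
    explicit StreamReassembler(std::uint32_t initialSequence)
        : next_(initialSequence) {}

    // Accept a segment whose first byte carries the given sequence number.
    void receive(std::uint32_t sequence, const std::string& payload)
    {
        pending_[sequence] = payload;          // hold it until it is contiguous
        auto it = pending_.find(next_);
        while (it != pending_.end()) {
            delivered_ += it->second;          // hand bytes over in stream order
            next_ += static_cast<std::uint32_t>(it->second.size());
            pending_.erase(it);
            it = pending_.find(next_);
        }
    }

    // The value to send back in the acknowledgment number field:
    // the sequence number of the next byte the receiver expects.
    std::uint32_t acknowledgment() const { return next_; }

    const std::string& delivered() const { return delivered_; }

private:
    std::uint32_t next_;
    std::map<std::uint32_t, std::string> pending_;
    std::string delivered_;
};
```

If a receiver that expects byte 1000 first gets the segment starting at 1005, it keeps acknowledging 1000 and delivers nothing. When the segment starting at 1000 arrives, both segments are delivered in order and the acknowledgment jumps to 1010.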
DNS
When we surf the Web, we don't use IP addresses. Instead, we use human-friendly names such as microsoft.com or www.cnn.com. A significant portion of Internet resources is consumed when host names (such as microsoft.com) are translated into IP addresses that TCP/IP can use. A distributed network of name server (domain server) computers performs this translation by processing DNS queries. The entire Internet namespace is organized into domains, starting with an unnamed root domain. Under the root is a series of top-level domains such as com, edu, gov, and org.
Note | Don't confuse Internet domains with Windows NT/2000/XP domains. The latter are logical groups of networked computers that share a common security database. |
Servers and Domain Names
Let's look at the server end first. Suppose a company named Consolidated Messenger has two host computers connected to the Internet, one for WWW service and the other for FTP service. Following convention, these host computers are named www.consolidatedmessenger.com and ftp.consolidatedmessenger.com, respectively, and both are members of the second-level domain consolidatedmessenger, which Consolidated Messenger has registered with an organization called InterNIC. (See http://www.internic.net.)
Now Consolidated Messenger must designate two (or more) host computers as its name servers. Each name server for the com domain has a database entry for the consolidatedmessenger domain, and that entry contains the names and IP addresses of Consolidated Messenger's two name servers. Each of the two consolidatedmessenger name servers has database entries for both of Consolidated Messenger's host computers. These servers might also have database entries for hosts in other domains, and they might have entries for name servers in third-level domains. Thus, if a name server can't provide a host's IP address directly, it can redirect the query to a lower-level name server. Figure 28-8 illustrates Consolidated Messenger's domain configuration.
![](/image/library/english/10211_fig28_08.jpg)
Figure 28-8: Consolidated Messenger's domain configuration.
Note | A top-level name server runs on its own host computer. InterNIC manages (at last count) 13 computers that serve the root domain and top-level domains. Lower-level name servers can be programs running on host computers anywhere on the Internet. Consolidated Messenger's Internet service provider (ISP), A.Datum Corporation, can furnish one of Consolidated Messenger's name servers. If the ISP is running Windows NT/2000 Server, the name server is usually the DNS program that comes bundled with the operating system. That name server might be designated ns1.adatum.com. |
Clients and Domain Names
Now for the client side. A user types http://www.consolidatedmessenger.com in the browser. (The http:// prefix tells the browser to use the HTTP protocol when it eventually finds the host computer.) The browser must then resolve www.consolidatedmessenger.com into an IP address, so it uses TCP/IP to send a DNS query to the DNS server IP address for which TCP/IP is configured. This address identifies a local name server, which might have the needed host IP address in its cache. If not, the local name server relays the DNS query up to one of the root name servers. The root server looks up consolidatedmessenger in its database and sends the query back down to one of Consolidated Messenger's designated name servers. In the process, the IP address for www.consolidatedmessenger.com is cached for later use if it was not cached already. If you want to go the other way, name servers are also capable of converting an IP address to a name.
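The top-down order in which name servers are consulted follows directly from the host name itself. The sketch below splits a name into its labels, starting at the top-level domain and working down. The function name delegationPath is hypothetical, and the code performs no network access. It shows only the order of the lookup.

```cpp
#include <string>
#include <vector>

// Split a host name into labels, ordered from the top-level domain downward.
// Empty labels (for example, from a trailing dot) are skipped.
std::vector<std::string> delegationPath(const std::string& hostName)
{
    std::vector<std::string> labels;
    std::string::size_type end = hostName.size();
    while (end > 0) {
        const std::string::size_type dot = hostName.rfind('.', end - 1);
        const std::string::size_type begin =
            (dot == std::string::npos) ? 0 : dot + 1;
        if (begin < end) {
            labels.push_back(hostName.substr(begin, end - begin));
        }
        if (dot == std::string::npos) {
            break;                 // reached the leftmost label
        }
        end = dot;                 // continue with the text left of this dot
    }
    return labels;
}
```

For www.consolidatedmessenger.com, the result is com, then consolidatedmessenger, then www, which mirrors the path from the com name servers to Consolidated Messenger's own name servers to the host entry.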
HTTP
We'll do some Winsock programming soon, but just sending raw byte streams back and forth isn't very interesting. You need to use a higher-level protocol in order to be compatible with existing Internet servers and browsers. HTTP is a good place to start because it's the protocol of the Web and it's relatively simple.
HTTP is built on TCP, and this is the way it works: First, a server program listens on port 80. Then a client program (typically a browser) connects to the server (www.consolidatedmessenger.com in this case) after receiving the server's IP address from a name server. Using its own port number, the client sets up a two-way TCP connection to the server. When the connection is established, the client sends a request to the server, which might look like this:
GET /customers/newproducts.html HTTP/1.0
The server identifies the request as a GET, the most common type, and it concludes that the client wants a file named newproducts.html that's located in a server directory known as /customers (which might or might not be \customers on the server's hard disk). Immediately following are request headers, which mostly describe the client's capabilities:
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg,
image/x-jg, */*
Accept-Language: en
UA-pixels: 1024x768
UA-color: color8
UA-OS: Windows NT 5.0
UA-CPU: x86
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; AK; Windows NT 5.0)
Host: www.consolidatedmessenger.com
Connection: Keep-Alive
If-Modified-Since: Wed, 24 Apr 2002 20:23:04 GMT
(blank line)
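A request like this one is just text, with each line ended by a carriage return and linefeed pair and a blank line closing the header block. The following sketch assembles such a request and tests whether a buffer of received bytes contains the closing blank line yet. The function names buildGetRequest and requestIsComplete are hypothetical, and no sockets are involved.

```cpp
#include <string>
#include <utility>
#include <vector>

using HeaderList = std::vector<std::pair<std::string, std::string>>;

// Assemble an HTTP/1.0 GET request: request line, headers, then a blank line.
std::string buildGetRequest(const std::string& path, const HeaderList& headers)
{
    std::string request = "GET " + path + " HTTP/1.0\r\n";
    for (const auto& header : headers) {
        request += header.first + ": " + header.second + "\r\n";
    }
    request += "\r\n";             // the blank line that ends the request
    return request;
}

// A server reading from an open connection can stop receiving once the
// bytes collected so far include the blank line.
bool requestIsComplete(const std::string& received)
{
    return received.find("\r\n\r\n") != std::string::npos;
}
```

A server would append each chunk it receives to a buffer and call requestIsComplete after every read, because one request can arrive in several pieces.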
The If-Modified-Since header tells the server not to bother to transmit newproducts.html unless the file has been modified since April 24, 2002. This implies that the browser already has a dated copy of this file stored in its cache. The blank line at the end of the request is crucial; it provides the only way for the server to tell that it is time to stop receiving and start transmitting, and that's because the TCP connection stays open.
Now the server springs into action. It sends newproducts.html, but first it sends an OK response:
HTTP/1.0 200 OK
This is immediately followed by some response header lines:
Server: Microsoft-IIS/6.0
Date: Thu, 25 Apr 2002 17:33:12 GMT
Content-Type: text/html
Accept-Ranges: bytes
Last-Modified: Wed, 24 Apr 2002 20:23:04 GMT
Content-Length: 407
(blank line)
The contents of newproducts.html immediately follow the blank line:
<html>
<head><title>Consolidated Messenger's New Products</title></head>
<body background="/images/clouds.jpg">
<h1><center>Welcome to Consolidated Messenger's New Products List
</center></h1><p>
Unfortunately, budget constraints have prevented Consolidated Messenger
from introducing any new products this year. We suggest you keep
enjoying the old products.<p>
<a href=">Consolidated Messenger's Home Page</a><p>
</body>
</html>
You're looking at elementary HTML code here, and the resulting Web page won't win any prizes. We won't go into the details because dozens of HTML books are already available. From these books, you'll learn that HTML tags are contained in angle brackets and that there's often an end tag (with a / character) for every start tag. Some tags, such as <a> (hypertext anchor), have attributes. In the example above, the following line creates a link to another HTML file:
<a href=">Consolidated Messenger's Home Page</a><p>
The user clicks on Consolidated Messenger's Home Page, and the browser requests
The HTTP standard includes a PUT request type that enables a client program to upload a file to the server. Client programs and server programs seldom implement PUT.
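Returning to the server's response shown earlier in this section, a client has to pick it apart in much the same way the server picks apart a request. The sketch below pulls out the status code, the Content-Length value, and the body. The names HttpResponse and parseResponse are hypothetical. The code matches the Content-Length header name with exact case as a simplification, even though HTTP header names are not case-sensitive.

```cpp
#include <exception>
#include <string>

struct HttpResponse {
    int statusCode = 0;
    long contentLength = -1;       // -1 when no Content-Length header is found
    std::string body;
};

// Returns false if the blank line that ends the headers hasn't arrived yet
// or if the status line or Content-Length value can't be read.
bool parseResponse(const std::string& raw, HttpResponse& out)
{
    const std::string::size_type headerEnd = raw.find("\r\n\r\n");
    if (headerEnd == std::string::npos) {
        return false;
    }
    const std::string head = raw.substr(0, headerEnd);

    // Status line, for example: HTTP/1.0 200 OK
    const std::string::size_type space = head.find(' ');
    if (space == std::string::npos) {
        return false;
    }
    try {
        out.statusCode = std::stoi(head.substr(space + 1));
    } catch (const std::exception&) {
        return false;
    }

    // Simplification: exact-case match on the header name.
    const std::string name = "\r\nContent-Length:";
    const std::string::size_type pos = head.find(name);
    out.contentLength = -1;
    if (pos != std::string::npos) {
        try {
            out.contentLength = std::stol(head.substr(pos + name.size()));
        } catch (const std::exception&) {
            return false;
        }
    }

    out.body = raw.substr(headerEnd + 4);   // everything after the blank line
    return true;
}
```

A client can compare the size of body with contentLength to decide whether the whole file has arrived or more data is still on the way.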
FTP
FTP handles the uploading and downloading of server files plus directory navigation and browsing. A Windows command-line program called ftp (it doesn't work through a Web proxy server) lets you connect to an FTP server using UNIX-like keyboard commands. Browser programs usually support the FTP protocol (for downloading files only) in a more user-friendly manner. You can protect an FTP server's directories with a username/password combination, but both strings will be passed over the Internet as clear text. FTP is based on TCP. Two separate connections are established between the client and server, one for control and one for data.
Internet vs. Intranet
Up to now, we've assumed that client and server computers were connected to the Internet. The fact is, you can run exactly the same client and server software on a local intranet. An intranet is often implemented on a company's LAN and is used for distributed applications. Users see the familiar browser interface at their client computers, and server computers supply simple Web-like pages or do complex data processing in response to user input.
An intranet offers a lot of flexibility. If, for example, you know that all your computers are Intel-based, you can use ActiveX controls and ActiveX document servers that provide ActiveX document support. If necessary, your server and client computers can run custom TCP/IP software that allows communication beyond HTTP and FTP. To secure your company's data, you can separate your intranet completely from the Internet or you can connect it through a firewall, which is a security system that protects your company's network from external threats.