Java Network Programming (3rd ed) [Electronic resources], text version

Harold, Elliotte Rusty

3.3 HTTP


HTTP is the
standard protocol for communication between web browsers and web
servers. HTTP specifies how a client and server establish a
connection, how the client requests data from the server, how the
server responds to that request, and finally, how the connection is
closed. HTTP connections use the TCP/IP protocol for data transfer.
For each request from client to server, there is a sequence of four
steps:

Making the connection


The client establishes a TCP connection to the server on port 80, by
default; other ports may be specified in the URL.


Making a request


The client sends a message to the server requesting the page at a
specified URL. The format of this request is typically something
like:

GET /index.html HTTP/1.0

GET specifies the operation being requested. The
operation requested here is for the server to return a representation
of a resource. /index.html is a relative URL that
identifies the resource requested from the server. This resource is
assumed to reside on the machine that receives the request, so there
is no need to prefix it with
http://www.thismachine.com/. HTTP/1.0
is the version of the protocol that the client understands. The
request is terminated with two carriage return/linefeed pairs
(\r\n\r\n in Java parlance), regardless of how
lines are terminated on the client or server platform.

Although the GET line is all that is required, a
client request can include other information as well. This takes the
following form:

Keyword: Value

The most common such keyword is Accept, which
tells the server what kinds of data the client can handle (though
servers often ignore this). For example, the following line says that
the client can handle four MIME media types, corresponding to HTML
documents, plain text, and JPEG and GIF images:

Accept: text/html, text/plain, image/gif, image/jpeg

User-Agent is another common keyword that lets the
server know what browser is being used, allowing the server to send
files optimized for the particular browser type. The line below says
that the request comes from Version 2.4 of the Lynx browser:

User-Agent: Lynx/2.4 libwww/2.1.4

All but the oldest first-generation browsers also include a
Host field specifying the
server's name, which allows web servers to
distinguish between different named hosts served from the same IP
address. Here's an example:

Host: www.cafeaulait.org

Finally, the request is terminated with a blank line, that is,
two carriage return/linefeed pairs, \r\n\r\n. A
complete request might look like this:

GET /index.html HTTP/1.0
Accept: text/html, text/plain, image/gif, image/jpeg
User-Agent: Lynx/2.4 libwww/2.1.4
Host: www.cafeaulait.org

In addition to GET, there are several other
request types. HEAD retrieves only the header for
the file, not the actual data. This is commonly used to check the
modification date of a file, to see whether a copy stored in the
local cache is still valid. POST sends form data
to the server, PUT uploads a resource to the
server, and DELETE removes a resource from the
server.
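The request format described above can be assembled mechanically. The following is a minimal Java sketch, not code from the book: the class name RequestBuilder and its build method are hypothetical names chosen for illustration. It assumes an HTTP/1.0 request with no body, so it suits requests such as GET and HEAD but not POST or PUT, which must also send data.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RequestBuilder {

    // Builds an HTTP/1.0 request: request line, header lines, blank line.
    // Every line ends with \r\n regardless of the local platform's line separator.
    public static String build(String method, String path, Map<String, String> headers) {
        StringBuilder request = new StringBuilder();
        request.append(method).append(' ').append(path).append(" HTTP/1.0\r\n");
        for (Map.Entry<String, String> header : headers.entrySet()) {
            request.append(header.getKey()).append(": ")
                   .append(header.getValue()).append("\r\n");
        }
        request.append("\r\n"); // the blank line that terminates the request
        return request.toString();
    }

    public static void main(String[] args) {
        Map<String, String> headers = new LinkedHashMap<>(); // keeps insertion order
        headers.put("Accept", "text/html, text/plain, image/gif, image/jpeg");
        headers.put("User-Agent", "Lynx/2.4 libwww/2.1.4");
        headers.put("Host", "www.cafeaulait.org");
        // Print the line terminators as visible escapes so they can be inspected.
        System.out.print(build("GET", "/index.html", headers)
                .replace("\r\n", "\\r\\n\n"));
    }
}
```

Passing "HEAD" instead of "GET" as the first argument produces the corresponding HEAD request with the same header fields.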


The response


The server sends a response to the client. The response begins with a
response code, followed by a header full of metadata, a blank line,
and the requested document or an error message. Assuming the
requested document is found, a typical response looks like this:

HTTP/1.1 200 OK
Date: Mon, 15 Sep 2003 21:06:50 GMT
Server: Apache/2.0.40 (Red Hat Linux)
Last-Modified: Tue, 15 Apr 2003 17:28:57 GMT
Connection: close
Content-Type: text/html; charset=ISO-8859-1
Content-length: 107

<html>
<head>
<title>
A Sample HTML file
</title>
</head>
<body>
The rest of the document goes here
</body>
</html>

The first line indicates the protocol the server is using
(HTTP/1.1), followed by a response code.
200 OK is the most common
response code, indicating that the request was successful. Table 3-1
is a complete list of the response codes used by HTTP 1.0; HTTP 1.1
adds many more to this list. The other header lines identify the date
the request was made in the server's time frame, the
server software (Apache 2.0.40), the date this document was last
modified, a promise that the server will close the connection when
it's finished sending, the MIME content type, and
the length of the document delivered (not counting this
header), in this case 107 bytes.
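To make the layout of a response concrete, here is a small Java sketch that splits a raw response into its status line, header fields, and body. It is an illustration only: the class ParsedResponse and its field names are hypothetical, and it assumes the whole response has already been read into a String with \r\n line endings. A robust client would also have to cope with malformed input.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class ParsedResponse {

    public final String version;
    public final int code;
    public final String reason;
    public final Map<String, String> headers = new LinkedHashMap<>();
    public final String body;

    public ParsedResponse(String raw) {
        // The first blank line separates the status line and header from the body.
        int split = raw.indexOf("\r\n\r\n");
        String head = split >= 0 ? raw.substring(0, split) : raw;
        body = split >= 0 ? raw.substring(split + 4) : "";

        String[] lines = head.split("\r\n");
        // Status line: protocol version, response code, reason phrase.
        String[] status = lines[0].split(" ", 3);
        version = status[0];
        code = Integer.parseInt(status[1]);
        reason = status.length > 2 ? status[2] : "";

        for (int i = 1; i < lines.length; i++) {
            int colon = lines[i].indexOf(':');
            if (colon > 0) {
                // Header names are case-insensitive, so store them in lowercase.
                headers.put(lines[i].substring(0, colon).trim().toLowerCase(Locale.ROOT),
                            lines[i].substring(colon + 1).trim());
            }
        }
    }

    public static void main(String[] args) {
        String raw = "HTTP/1.1 200 OK\r\n"
                + "Server: Apache/2.0.40 (Red Hat Linux)\r\n"
                + "Content-Type: text/html; charset=ISO-8859-1\r\n"
                + "Content-length: 13\r\n"
                + "\r\n"
                + "<html></html>";
        ParsedResponse response = new ParsedResponse(raw);
        System.out.println(response.code + " " + response.reason);
        System.out.println(response.headers.get("content-type"));
        System.out.println(response.body);
    }
}
```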


Closing the connection


Either the client or the server or both close the connection. Thus, a
separate network connection is used for each request. If the client
reconnects, the server retains no memory of the previous connection
or its results. A protocol that retains no memory of past requests is
called stateless; in contrast, a
stateful protocol such as FTP can process many
requests before the connection is closed. The lack of state is both a
strength and a weakness of HTTP.
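The four steps above map directly onto java.net.Socket. The sketch below is one possible way to write them down, under a few stated assumptions: the class SimpleHttpClient and its methods are hypothetical names, the request is a plain HTTP/1.0 GET, and the server is expected to close the connection when it has finished sending, so the client simply reads to end of stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class SimpleHttpClient {

    // Steps 2 and 3: send the request, then read the response until end of stream.
    public static String exchange(OutputStream out, InputStream in,
                                  String host, String path) throws IOException {
        String request = "GET " + path + " HTTP/1.0\r\n"
                + "Host: " + host + "\r\n"
                + "\r\n"; // blank line terminates the request
        out.write(request.getBytes(StandardCharsets.US_ASCII));
        out.flush();

        ByteArrayOutputStream response = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
            response.write(buffer, 0, n);
        }
        // ISO-8859-1 maps every byte to one char, so no bytes are lost in decoding.
        return new String(response.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    public static String get(String host, int port, String path) throws IOException {
        // Step 1: make the TCP connection.
        try (Socket socket = new Socket(host, port)) {
            socket.setSoTimeout(15000); // give up if the server stays silent for 15 seconds
            return exchange(socket.getOutputStream(), socket.getInputStream(), host, path);
        } // Step 4: try-with-resources closes the connection, even after an exception.
    }

    public static void main(String[] args) throws IOException {
        String host = args.length > 0 ? args[0] : "www.cafeaulait.org";
        System.out.print(get(host, 80, "/")); // port 80 is the HTTP default
    }
}
```

Separating exchange from get keeps the protocol logic independent of the network, so it can be exercised against in-memory streams. Note that many present-day servers answer a plain HTTP request on port 80 with a redirect rather than the page itself.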



Table 3-1. HTTP 1.0 response codes

Response code


Meaning


2xx Successful


Response codes between 200 and 299 indicate that the request was
received, understood, and accepted.


200 OK


This is the most common response code. If the request used
GET or POST, the requested data
is contained in the response along with the usual headers. If the
request used HEAD, only the header information is
included.


201 Created


The server has created a data file at a URL specified in the body of
the response. The web browser should now attempt to load that URL.
This is sent only in response to POST requests.


202 Accepted


This rather uncommon response indicates that a request (generally
from POST) is being processed, but the processing
is not yet complete so no response can be returned. The server should
return an HTML page that explains the situation to the user, provides
an estimate of when the request is likely to be completed, and,
ideally, has a link to a status monitor of some kind.


204 No Content


The server has successfully processed the request but has no
information to send back to the client. This is usually the result of
a poorly written form-processing program that accepts data but does
not return a response to the user indicating that it has finished.


3xx Redirection


Response codes from 300 to 399 indicate that the web browser needs to
go to a different page.


300 Multiple Choices


The page requested is available from one or more locations. The body
of the response includes a list of locations from which the user or
web browser can pick the most appropriate one. If the server prefers
one of these locations, the URL of this choice is included in a
Location header, which web browsers can use to
load the preferred page.


301 Moved Permanently


The page has moved to a new URL. The web browser should automatically
load the page at this URL and update any bookmarks that point to the
old URL.


302 Moved Temporarily


This unusual response code indicates that a page is temporarily at a
new URL but that the document's location will change
again in the foreseeable future, so bookmarks should not be updated.


304 Not Modified


The client has performed a GET request but used
the If-Modified-Since header to indicate that it
wants the document only if it has been recently updated. This status
code is returned because the document has not been updated. The web
browser will now load the page from a cache.


4xx Client Error


Response codes from 400 to 499 indicate that the client has erred in
some fashion, although the error may as easily be the result of an
unreliable network connection as of a buggy or nonconforming web
browser. The browser should stop sending data to the server as soon
as it receives a 4xx response. Unless it is responding to a
HEAD request, the server should explain the error
status in the body of its response.


400 Bad Request


The client request to the server used improper syntax. This is rather
unusual, although it is likely to happen if you're
writing and debugging a client.


401 Unauthorized


Authorization, generally username and password controlled, is
required to access this page. Either the username and password have
not yet been presented or the username and password are invalid.


403 Forbidden


The server understood the request but is deliberately refusing to
process it. Authorization will not help. One reason this occurs is
that the client asks for a directory listing but the server is not
configured to provide it, as shown in Figure 3-1.


404 Not Found


This most common error response indicates that the server cannot find
the requested page. It may indicate a bad link, a page that has moved
with no forwarding address, a mistyped URL, or something similar.


5xx Server Error


Response codes from 500 to 599 indicate that something has gone wrong
with the server, and the server cannot fix the problem.


500 Internal Server Error


An unexpected condition occurred that the server does not know how to
handle.


501 Not Implemented


The server does not have the feature that is needed to fulfill this
request. A server that cannot handle POST requests
might send this response to a client that tried to
POST form data to it.


502 Bad Gateway


This response is applicable only to servers that act as proxies or
gateways. It indicates that the proxy received an invalid response
from a server it was connecting to in an effort to fulfill the
request.


503 Service Unavailable


The server is temporarily unable to handle the request, perhaps as a
result of overloading or maintenance.

HTTP 1.1 more than doubles the number of response codes. However, a
response code from 200 to 299 always indicates success, a response
code from 300 to 399 always indicates redirection, one from 400 to
499 always indicates a client error, and one from 500 to 599
indicates a server error.
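Because the first digit alone determines the class of a response, a client can handle codes it has never seen before. A minimal sketch of that idea in Java follows; the class ResponseCodes, the Category enum, and the categorize method are hypothetical names introduced here for illustration.

```java
public class ResponseCodes {

    public enum Category { SUCCESSFUL, REDIRECTION, CLIENT_ERROR, SERVER_ERROR, OTHER }

    // Classifies a response code by its hundreds digit.
    public static Category categorize(int code) {
        switch (code / 100) {
            case 2: return Category.SUCCESSFUL;
            case 3: return Category.REDIRECTION;
            case 4: return Category.CLIENT_ERROR;
            case 5: return Category.SERVER_ERROR;
            default: return Category.OTHER; // anything outside the four ranges listed above
        }
    }

    public static void main(String[] args) {
        int[] samples = {200, 302, 404, 503};
        for (int code : samples) {
            System.out.println(code + " -> " + categorize(code));
        }
    }
}
```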

HTTP 1.0 is documented in the informational RFC 1945; it is not an
official Internet standard because it was primarily developed outside
the IETF by early browser and server vendors. HTTP 1.1 is a proposed
standard being developed by the W3C and the HTTP working group of the
IETF. It provides for much more flexible and powerful communication
between the client and the server. It's also a lot
more scalable. It's documented in RFC 2616. HTTP 1.0
is the basic version of the protocol. All current web servers and
browsers understand it. HTTP 1.1 adds numerous features to HTTP 1.0,
but doesn't change the underlying design or
architecture in any significant way. For the purposes of this book,
it will usually be sufficient to understand HTTP 1.0.

The primary improvement in HTTP 1.1 is connection
reuse. HTTP 1.0 opens a new connection for every request.
In practice, the time taken to open and close all the connections in
a typical web session can outweigh the time taken to transmit the
data, especially for sessions with many small documents. HTTP 1.1
allows a browser to send many different requests over a single
connection; the connection remains open until it is explicitly
closed. The requests and responses are all asynchronous. A browser
doesn't need to wait for a response to its first
request before sending a second or a third. However, it remains tied
to the basic pattern of a client request followed by a server
response. Each request and response has the same basic form: a header
line, an HTTP header containing metadata, a blank line, and then the
data itself.
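One consequence of connection reuse is that a client can no longer treat the end of the stream as the end of a response, because the connection stays open. A common way to find the boundary is the Content-length field that appeared in the sample response earlier. The Java sketch below illustrates the idea on an in-memory stream standing in for a reused connection. The class PersistentReader and its methods are hypothetical names, and the code assumes every response carries a Content-length field; it does not handle the chunked transfer encoding that HTTP 1.1 also allows.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class PersistentReader {

    // Reads one line terminated by \r\n and returns it without the terminator.
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') break;
            if (b != '\r') line.write(b);
        }
        if (b == -1 && line.size() == 0) throw new EOFException("stream ended");
        return new String(line.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    // Reads exactly one response from the stream and returns its body,
    // leaving the stream positioned at the start of the next response.
    public static String readBody(InputStream in) throws IOException {
        readLine(in); // status line, e.g. "HTTP/1.1 200 OK"; not inspected in this sketch
        int length = 0;
        String header;
        while (!(header = readLine(in)).isEmpty()) { // the blank line ends the header
            if (header.toLowerCase(Locale.ROOT).startsWith("content-length:")) {
                length = Integer.parseInt(header.substring(header.indexOf(':') + 1).trim());
            }
        }
        byte[] body = new byte[length];
        int read = 0;
        while (read < length) { // a single read() may return fewer bytes than requested
            int n = in.read(body, read, length - read);
            if (n == -1) throw new EOFException("body truncated");
            read += n;
        }
        return new String(body, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) throws IOException {
        // Two responses back to back, as they would arrive on one reused connection.
        String twoResponses =
                "HTTP/1.1 200 OK\r\nContent-length: 5\r\n\r\nfirst"
              + "HTTP/1.1 200 OK\r\nContent-length: 7\r\n\r\nsecond!";
        InputStream in = new ByteArrayInputStream(
                twoResponses.getBytes(StandardCharsets.ISO_8859_1));
        System.out.println(readBody(in));
        System.out.println(readBody(in));
    }
}
```

Each call to readBody consumes exactly one response, so the second call starts cleanly at the next status line.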

There are a lot of other, smaller improvements in HTTP 1.1. Requests
include a Host header field so that one web server
can easily serve different sites at different URLs. Servers and
browsers can exchange compressed files and particular byte ranges of
a document, both of which decrease network traffic. And HTTP 1.1 is
designed to work much better with proxy servers. HTTP 1.1 is a
superset of HTTP 1.0, so HTTP 1.1 web servers have no trouble
interacting with older browsers that only speak HTTP 1.0, and vice
versa.

