15.11 HttpURLConnection
The
java.net.HttpURLConnection class is an abstract
subclass of URLConnection; it provides some
additional methods that are helpful when working specifically with
http URLs:
public abstract class HttpURLConnection extends URLConnection

In particular, it contains methods to get and set the request method,
decide whether to follow redirects, get the response code and
message, and figure out whether a proxy server is being used. It also
includes several dozen mnemonic constants matching the various HTTP
response codes. Finally, it overrides the getPermission() method from the
URLConnection superclass, although it
doesn't change the semantics of this method at all.

Since this class is abstract and its only constructor is protected,
you can't directly create instances of
HttpURLConnection. However, if you construct a
URL object using an http URL
and invoke its openConnection( ) method, the
URLConnection object returned will be an instance
of HttpURLConnection. Cast that
URLConnection to
HttpURLConnection like this:
URL u = new URL("http://www.amnesty.org/");
URLConnection uc = u.openConnection( );
HttpURLConnection http = (HttpURLConnection) uc;

Or, skipping a step, like this:

URL u = new URL("http://www.amnesty.org/");
HttpURLConnection http = (HttpURLConnection) u.openConnection( );
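The object that openConnection( ) returns is actually an instance of a
concrete, implementation-specific subclass, declared along these lines:

public class HttpURLConnection extends java.net.HttpURLConnection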
15.11.1 The Request Method
When a web client contacts a web
server, the first thing it sends is a request line. Typically, this
line begins with GET and is followed by the name of the file that the
client wants to retrieve and the version of the HTTP protocol that
the client understands. For example:
GET /catalog/jfcnut/index.html HTTP/1.0

However, web clients can do more than simply GET files from web
servers. They can POST responses to forms. They can PUT a file on a
web server or DELETE a file from a server. And they can ask for just
the HEAD of a document. They can ask the web server for a list of the
OPTIONS supported at a given URL. They can even TRACE the request
itself. All of these are accomplished by changing the request method
from GET to a different keyword. For example, here's
how a browser asks for just the header of a document using HEAD:
HEAD /catalog/jfcnut/index.html HTTP/1.1
User-Agent: Java/1.4.2_05
Host: www.oreilly.com
Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
Connection: close

By default, HttpURLConnection uses the GET method.
However, you can change this with the setRequestMethod( ) method:

public void setRequestMethod(String method) throws ProtocolException

The method argument should be one of these seven case-sensitive
strings:

GET
POST
HEAD
PUT
OPTIONS
DELETE
TRACE

If it's some other method, then a
java.net.ProtocolException, a subclass of
IOException, is thrown. However,
it's generally not enough to simply set the request
method. Depending on what you're trying to do, you
may need to adjust the HTTP header and provide a message body as
well. For instance, POSTing a form requires you to provide a
Content-length header. We've already explored the
GET and POST methods. Let's look at the other five
possibilities.
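As a quick check of the rule above, here is a minimal sketch that tries a few method names and reports which ones setRequestMethod( ) accepts. The class name RequestMethodCheck, the helper isAccepted( ), and the www.example.com URL are placeholders invented for this illustration. No request is actually sent, because openConnection( ) does not connect to the server.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.ProtocolException;
import java.net.URL;

public class RequestMethodCheck {

  // Returns true if HttpURLConnection accepts the method name and false
  // if setRequestMethod( ) rejects it with a ProtocolException.
  // (Hypothetical helper; the URL is only a placeholder.)
  public static boolean isAccepted(String method) throws IOException {
    URL u = new URL("http://www.example.com/");
    HttpURLConnection http = (HttpURLConnection) u.openConnection( );
    try {
      http.setRequestMethod(method);
      return true;
    }
    catch (ProtocolException ex) {
      return false;
    }
  }

  public static void main(String[] args) throws IOException {
    // "head" differs from "HEAD" only in case; "FETCH" is not an HTTP method
    String[] candidates = {"HEAD", "head", "FETCH"};
    for (int i = 0; i < candidates.length; i++) {
      System.out.println(candidates[i] + ": " + isAccepted(candidates[i]));
    }
  }
}
```

Because the strings are case-sensitive, the lowercase name should be rejected just as the unknown one is.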
15.11.1.1 HEAD
The HEAD method
is possibly the simplest of all the request methods. It behaves much
like GET. However, it tells the server only to return the HTTP
header, not to actually send the file. The most common use of this
method is to check whether a file has been modified since the last
time it was cached. Example 15-9 is a simple program
that uses the HEAD request method and prints the last time a file on
a server was modified.
Example 15-9. Get the time when a URL was last changed
import java.net.*;
import java.io.*;
import java.util.*;
public class LastModified {
  public static void main(String args[]) {
    for (int i=0; i < args.length; i++) {
      try {
        URL u = new URL(args[i]);
        HttpURLConnection http = (HttpURLConnection) u.openConnection( );
        // ask for the header only, not the document itself
        http.setRequestMethod("HEAD");
        System.out.println(u + " was last modified at "
         + new Date(http.getLastModified( )));
      } // end try
      catch (MalformedURLException ex) {
        System.err.println(args[i] + " is not a URL I understand");
      }
      catch (IOException ex) {
        System.err.println(ex);
      }
      System.out.println( );
    } // end for
  } // end main
} // end LastModified

Here's the output from one run:

D:\JAVA\JNP3\examples\15>java LastModified http://www.ibiblio.org/xml/
http://www.ibiblio.org/xml/ was last modified at Thu Aug 19 06:06:57 PDT 2004

It wasn't absolutely necessary to use the HEAD
method here. We'd have gotten the same results with
GET. But if we used GET, the entire file at http://www.ibiblio.org/xml/ would have been
sent across the network, whereas all we cared about was one line in
the header. When you can use HEAD, it's much more
efficient to do so.
15.11.1.2 OPTIONS
The OPTIONS
request method asks what options are supported for a particular URL.
If the request URL is an asterisk (*), the request applies to the
server as a whole rather than to one particular URL on the server.
For example:
OPTIONS /xml/ HTTP/1.1
User-Agent: Java/1.4.2_05
Host: www.ibiblio.org
Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
Connection: close

The server responds to an OPTIONS request by sending an HTTP header
with a list of the commands allowed on that URL. For example, when
the previous command was sent, here's what Apache
responded:

Date: Thu, 21 Oct 2004 18:06:10 GMT
Server: Apache/1.3.4 (Unix) PHP/3.0.6 mod_perl/1.17
Content-Length: 0
Allow: GET, HEAD, POST, PUT, DELETE, CONNECT, OPTIONS, PATCH, PROPFIND,
PROPPATCH, MKCOL, COPY, MOVE, LOCK, UNLOCK, TRACE
Connection: close

The list of legal commands is found in the Allow field. However, in
practice these are just the commands the server understands, not
necessarily the ones it will actually perform on that URL. For
instance, let's look at what happens when you try
the DELETE request method.
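The exchange above can be reproduced from Java with a few lines. This is only a sketch: the class name AllowedMethods and the helper parseAllow( ) are invented for illustration, and it assumes the server returns the list in a single Allow field, as Apache did above.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

public class AllowedMethods {

  // Splits an Allow field value such as "GET, HEAD, POST" into its parts.
  // A missing field (null) produces an empty list. (Hypothetical helper.)
  public static List<String> parseAllow(String allow) {
    List<String> methods = new ArrayList<String>( );
    if (allow == null) return methods;
    String[] pieces = allow.split(",");
    for (int i = 0; i < pieces.length; i++) {
      String method = pieces[i].trim( );
      if (method.length( ) > 0) methods.add(method);
    }
    return methods;
  }

  public static void main(String[] args) throws IOException {
    for (int i = 0; i < args.length; i++) {
      URL u = new URL(args[i]);
      HttpURLConnection http = (HttpURLConnection) u.openConnection( );
      http.setRequestMethod("OPTIONS");
      // getHeaderField( ) connects and sends the request if necessary
      String allow = http.getHeaderField("Allow");
      System.out.println(u + " allows " + parseAllow(allow));
    }
  }
}
```

Run it with one or more http URLs on the command line. Servers that ignore OPTIONS will simply produce an empty list.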
15.11.1.3 DELETE
The DELETE
method removes a file at a specified URL from a web server. Since
this request is an obvious security risk, not all servers will be
configured to support it, and those that are will generally demand
some sort of authentication. A typical DELETE request looks like
this:
DELETE /javafaq/2004march.html HTTP/1.1
User-Agent: Java/1.4.2_05
Host: www.ibiblio.org
Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
Connection: close

The server is free to refuse this request or ask for identification.
For example:

Date: Thu, 19 Aug 2004 14:32:15 GMT
Server: Apache/1.3.4 (Unix) PHP/3.0.6 mod_perl/1.17
Allow: GET, HEAD, POST, PUT, DELETE, CONNECT, OPTIONS, PATCH, PROPFIND,
PROPPATCH, MKCOL, COPY, MOVE, LOCK, UNLOCK, TRACE
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html
content-length: 313
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<HTML><HEAD>
<TITLE>405 Method Not Allowed</TITLE>
</HEAD><BODY>
<H1>Method Not Allowed</H1>
The requested method DELETE is not allowed for the
URL /javafaq/2004march.html.<P>
<HR>
<ADDRESS>Apache/1.3.4 Server at www.ibiblio.org Port 80</ADDRESS>
</BODY></HTML>

Even if the server accepts this request, its response is
implementation-dependent. Some servers may delete the file; others
simply move it to a trash directory. Others simply mark it as not
readable. Details are left up to the server vendor.
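Here is a hedged sketch of sending a DELETE request from Java and interpreting the result. The class name DeleteResource and the helper describe( ) are invented for this illustration, and the grouping of response codes is a simplification rather than anything the HTTP specification mandates. Only point it at a URL you really are prepared to lose.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class DeleteResource {

  // Turns a response code into a rough description of what happened.
  // (Hypothetical helper; the grouping is a simplification.)
  public static String describe(int code) {
    switch (code) {
      case HttpURLConnection.HTTP_OK:
      case HttpURLConnection.HTTP_ACCEPTED:
      case HttpURLConnection.HTTP_NO_CONTENT:
        return "accepted";
      case HttpURLConnection.HTTP_UNAUTHORIZED:
      case HttpURLConnection.HTTP_FORBIDDEN:
        return "authentication or permission required";
      case HttpURLConnection.HTTP_BAD_METHOD:
        return "method not allowed";
      default:
        return "other response";
    }
  }

  public static void main(String[] args) throws IOException {
    for (int i = 0; i < args.length; i++) {
      URL u = new URL(args[i]);
      HttpURLConnection http = (HttpURLConnection) u.openConnection( );
      http.setRequestMethod("DELETE");
      // getResponseCode( ) sends the request and reads the status line
      int code = http.getResponseCode( );
      System.out.println(u + ": " + code + " " + describe(code));
    }
  }
}
```

A 405 response like the one Apache sent above falls into the "method not allowed" case.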
15.11.1.4 PUT
Many HTML editors and other programs that want to store files on a
web server use the PUT method. It
allows clients to place documents in the abstract hierarchy of the
site without necessarily knowing how the site maps to the actual
local filesystem. This contrasts with FTP, where the user has to know
the actual directory structure as opposed to the
server's virtual directory structure.

Here's how a browser might PUT a file on a web server:

PUT /hello.html HTTP/1.0
Connection: Keep-Alive
User-Agent: Mozilla/4.6 [en] (WinNT; I)
Pragma: no-cache
Host: www.ibiblio.org
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*
Accept-Encoding: gzip
Accept-Language: en
Accept-Charset: iso-8859-1,*,utf-8
Content-Length: 364
<!doctype html public "-//w3c//dtd html 4.0 transitional//en">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta name="Author" content="Elliotte Rusty Harold">
<meta name="GENERATOR" content="Mozilla/4.6 [en] (WinNT; I) [Netscape]">
<title>Mine</title>
</head>
<body>
<b>Hello</b>
</body>
</html>

As with deleting files, allowing arbitrary users to PUT files on your
web server is a clear security risk. Generally, some sort of
authentication is required and the server must be specially
configured to support PUT. The details are likely to vary from server
to server. Most web servers do not include full support for PUT out
of the box. For instance, Apache requires you to install an
additional module just to handle PUT requests.
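From Java, a PUT request might be assembled roughly as follows. This is a sketch under several assumptions: the class name PutDocument and the helper prepare( ) are invented here, the document body is a fixed placeholder, and the target server is assumed to accept unauthenticated PUT requests, which, as just noted, most do not.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class PutDocument {

  // Configures, but does not yet open, a connection that will PUT to u.
  // (Hypothetical helper.)
  public static HttpURLConnection prepare(URL u) throws IOException {
    HttpURLConnection http = (HttpURLConnection) u.openConnection( );
    http.setRequestMethod("PUT");
    http.setDoOutput(true);  // a PUT request carries a message body
    http.setRequestProperty("Content-Type", "text/html");
    return http;
  }

  public static void main(String[] args) throws IOException {
    // args[0] is the URL to store the document at; the body is a placeholder
    byte[] body = "<html><body><b>Hello</b></body></html>".getBytes("ISO-8859-1");
    HttpURLConnection http = prepare(new URL(args[0]));
    OutputStream out = http.getOutputStream( );
    try {
      out.write(body);
    }
    finally {
      out.close( );
    }
    // Reading the response code completes the exchange
    System.out.println(http.getResponseCode( ) + " " + http.getResponseMessage( ));
  }
}
```

Java fills in the Content-length field itself once the output stream is closed, so the sketch does not set it by hand.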
15.11.1.5 TRACE
The TRACE
request method asks the server to echo the HTTP header that it received
from the client. The main reason to ask for this is to see what any
proxy servers between the server and client might be changing. For
example, suppose this TRACE request is sent:
TRACE /xml/ HTTP/1.1
Hello: Push me
User-Agent: Java/1.4.2_05
Host: www.ibiblio.org
Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
Connection: close

The server should respond like this:

Date: Thu, 19 Aug 2004 17:50:02 GMT
Server: Apache/1.3.4 (Unix) PHP/3.0.6 mod_perl/1.17
Connection: close
Transfer-Encoding: chunked
Content-Type: message/http
content-length: 169
TRACE /xml/ HTTP/1.1
Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
Connection: close
Hello: Push me
Host: www.ibiblio.org
User-Agent: Java/1.4.2_05

The first six lines are the server's normal response
HTTP header. The lines from TRACE
/xml/ HTTP/1.1 on are the echo
of the original client request. In this case, the echo is faithful,
although out of order. However, if there were a proxy server between
the client and server, it might not be.
15.11.2 Disconnecting from the Server
Recent versions of HTTP support
what's known as Keep-Alive.
Keep-Alive enhances the performance of some web connections by
allowing multiple requests and responses to be sent in a series over
a single TCP connection. A client indicates that
it's willing to use HTTP Keep-Alive by including a
Connection field in the HTTP request header with the value
Keep-Alive:
Connection: Keep-Alive

However, when Keep-Alive is used, the server can no longer close the
connection simply because it has sent the last byte of data to the
client. The client may, after all, send another request.
Consequently, it is up to the client to close the connection when
it's done.

Java marginally supports HTTP Keep-Alive, mostly by piggybacking on
top of browser support. It doesn't provide any
convenient API for making multiple requests over the same connection.
However, in anticipation of a day when Java will better support
Keep-Alive, the HttpURLConnection class adds a
disconnect( ) method that allows the client to
break the connection:
public abstract void disconnect( )

In practice, you rarely if ever need to call this.
15.11.3 Handling Server Responses
The first line of an HTTP
server's response includes a numeric code and a
message indicating what sort of response is made. For instance, the
most common response is 200 OK, indicating that the requested
document was found. For example:
HTTP/1.1 200 OK
Date: Fri, 20 Aug 2004 15:33:40 GMT
Server: Apache/1.3.4 (Unix) PHP/3.0.6 mod_perl/1.17
Last-Modified: Sun, 06 Jun 1999 16:30:33 GMT
ETag: "28d907-657-375aa229"
Accept-Ranges: bytes
Content-Length: 1623
Connection: close
Content-Type: text/html
<HTML>
<HEAD>
rest of document follows...

Another response that you're undoubtedly all too
familiar with is 404 Not Found, indicating that the URL you requested
no longer points to a document. For example:
HTTP/1.1 404 Not Found
Date: Fri, 20 Aug 2004 15:39:16 GMT
Server: Apache/1.3.4 (Unix) PHP/3.0.6 mod_perl/1.17
Last-Modified: Mon, 20 Sep 1999 19:25:05 GMT
ETag: "5-14ab-37e68a11"
Accept-Ranges: bytes
Content-Length: 5291
Connection: close
Content-Type: text/html
<html>
<head>
<title>Lost ... and lost</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<body bgcolor="#FFFFFF">
<div align="left">
<h1>404 FILE NOT FOUND</h1>
Rest of error message follows...

There are many other, less common responses. For instance, code 301
indicates that the resource has permanently moved to a new location
and the browser should redirect itself to the new location and update
any bookmarks that point to the old location. For example:
HTTP/1.1 301 Moved Permanently
Date: Fri, 20 Aug 2004 15:36:44 GMT
Server: Apache/1.3.4 (Unix) PHP/3.0.6 mod_perl/1.17
Location: http://www.ibiblio.org/javafaq/books/beans/index.html
Connection: close
Content-Type: text/html
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<HTML><HEAD>
<TITLE>301 Moved Permanently</TITLE>
</HEAD><BODY>
<H1>Moved Permanently</H1>
The document has moved <A HREF="http://www.ibiblio.org/javafaq/books/beans/index.html">here</A>.<P>
<HR>
<ADDRESS>Apache/1.3.4 Server at www.ibiblio.org Port 80</ADDRESS>
</BODY></HTML>

The first line of this response is called the response
message. It will not be returned by the
various getHeaderField( ) methods in
URLConnection. However,
HttpURLConnection has a method to read and return
just the response message. This is the aptly named
getResponseMessage():
public String getResponseMessage( ) throws IOException

Often all you need from the response message is the numeric response
code. HttpURLConnection also has a getResponseCode( ) method to
return this as an int:

public int getResponseCode( ) throws IOException

HTTP 1.0 defines 16 response codes. HTTP 1.1 expands this to 40
different codes. While some numbers, notably 404, have become slang
almost synonymous with their semantic meaning, most of them are less
familiar. The HttpURLConnection class includes 36
named constants representing the most common response codes. These
are summarized in Table 15-3.
Example 15-10 is a revised source viewer program that
now includes the response message. The lines added since
SourceViewer2 are in bold.
Example 15-10. A SourceViewer that includes the response code and message
import java.net.*;
import java.io.*;
public class SourceViewer3 {
  public static void main (String[] args) {
    for (int i = 0; i < args.length; i++) {
      try {
        //Open the URLConnection for reading
        URL u = new URL(args[i]);
        HttpURLConnection uc = (HttpURLConnection) u.openConnection( );
        int code = uc.getResponseCode( );
        String response = uc.getResponseMessage( );
        System.out.println("HTTP/1.x " + code + " " + response);
        // start at 1; some implementations treat field 0 as the status line
        for (int j = 1; ; j++) {
          String header = uc.getHeaderField(j);
          String key = uc.getHeaderFieldKey(j);
          if (header == null || key == null) break;
          System.out.println(key + ": " + header);
        } // end for
        InputStream in = new BufferedInputStream(uc.getInputStream( ));
        // chain the InputStream to a Reader
        Reader r = new InputStreamReader(in);
        int c;
        while ((c = r.read( )) != -1) {
          System.out.print((char) c);
        }
      }
      catch (MalformedURLException ex) {
        System.err.println(args[i] + " is not a parseable URL");
      }
      catch (IOException ex) {
        System.err.println(ex);
      }
    } // end for
  } // end main
} // end SourceViewer3

The only thing this program doesn't read that the
server sends is the version of HTTP the server is using.
There's currently no method to return that. If you
need it, you'll just have to use a raw socket
instead. Consequently, in this example, we just fake it as
"HTTP/1.x", like this:
% java SourceViewer3 http://www.oreilly.com
HTTP/1.x 200 OK
Server: WN/1.15.1
Date: Mon, 01 Nov 1999 23:39:19 GMT
Last-modified: Fri, 29 Oct 1999 23:40:06 GMT
Content-type: text/html
Title: www.oreilly.com -- Welcome to O'Reilly & Associates! --
computer books, software, online publishing
Link: <mailto:webmaster@ora.com>; rev="Made"
<HTML>
<HEAD>
...
15.11.3.1 Error conditions
On occasion, the server encounters
an error but returns useful information in the message body
nonetheless. For example, when a client requests a nonexistent page
from the www.ibiblio.org web site, rather than
simply returning a 404 error code, the server sends the search page
shown in Figure 15-2 to help the user figure out
where the missing page might have gone.
Figure 15-2. IBiblio's 404 page

The getErrorStream( ) method returns an
InputStream containing this data or
null if no error was encountered or no data
returned:
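public InputStream getErrorStream( ) // Java 1.2

In practice, don't count on getInputStream( ) to return this data as
well. In Sun's implementation and its OpenJDK descendants,
getInputStream( ) throws an IOException (a FileNotFoundException for a
404) once the response code is 400 or higher, so getErrorStream( ) is
the reliable way to read the body of an error response.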
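The following sketch picks the stream to read according to the response code, which works whether or not a particular implementation also hands error bodies back from getInputStream( ). The class name ErrorBody and the helper readAll( ) are invented for this illustration, and decoding the body as ISO-8859-1 is an assumption, not something taken from the response.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.HttpURLConnection;
import java.net.URL;

public class ErrorBody {

  // Reads a stream to the end and closes it. A null stream, which is what
  // getErrorStream( ) returns when there is no error data, yields "".
  // (Hypothetical helper; ISO-8859-1 is an assumed encoding.)
  public static String readAll(InputStream in) throws IOException {
    if (in == null) return "";
    Reader r = new InputStreamReader(new BufferedInputStream(in), "ISO-8859-1");
    StringBuffer sb = new StringBuffer( );
    try {
      int c;
      while ((c = r.read( )) != -1) sb.append((char) c);
    }
    finally {
      r.close( );
    }
    return sb.toString( );
  }

  public static void main(String[] args) throws IOException {
    for (int i = 0; i < args.length; i++) {
      URL u = new URL(args[i]);
      HttpURLConnection http = (HttpURLConnection) u.openConnection( );
      int code = http.getResponseCode( );
      // Codes of 400 and up are errors; their body is on the error stream
      InputStream in = (code >= 400) ? http.getErrorStream( )
                                     : http.getInputStream( );
      System.out.println(code + " " + http.getResponseMessage( ));
      System.out.println(readAll(in));
    }
  }
}
```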
15.11.3.2 Redirects
The
300-level response codes all indicate some sort of redirect; that is,
the requested resource is no longer available at the expected
location but it may be found at some other location. When
encountering such a response, most browsers automatically load the
document from its new location. However, this can be a security risk,
because it has the potential to move the user from a trusted site to
an untrusted one, perhaps without the user even noticing.

By default, an HttpURLConnection follows
redirects. However, the HttpURLConnection class
has two static methods that let you decide whether to follow
redirects:
public static boolean getFollowRedirects( )
public static void setFollowRedirects(boolean follow)

The getFollowRedirects( ) method returns
true if redirects are being followed,
false if they aren't. With an
argument of true, the setFollowRedirects( ) method
makes HttpURLConnection objects follow redirects.
With an argument of false, it prevents them from
following redirects. Since these are static methods, they change the
behavior of all HttpURLConnection objects
constructed after the method is invoked. The
setFollowRedirects( ) method may throw a
SecurityException if the security manager
disallows the change. Applets especially are not allowed to change
this value.

Java has two methods to configure redirection on an
instance-by-instance basis. These are:
public boolean getInstanceFollowRedirects( ) // Java 1.3
public void setInstanceFollowRedirects(boolean followRedirects) // Java 1.3

If setInstanceFollowRedirects( ) is not invoked on
a given HttpURLConnection, that
HttpURLConnection simply follows the default
behavior as set by the class method
HttpURLConnection.setFollowRedirects( ).
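As an illustration, this sketch turns off redirect following for a single connection and prints where the server wanted to send the client. The class name RedirectPolicy and the helper openWithoutRedirects( ) are invented for this example. It also assumes the server names the new location in a Location field, as in the 301 response shown earlier.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class RedirectPolicy {

  // Opens (without connecting) a connection that will not follow
  // redirects, leaving the class-wide default untouched.
  // (Hypothetical helper.)
  public static HttpURLConnection openWithoutRedirects(URL u)
   throws IOException {
    HttpURLConnection http = (HttpURLConnection) u.openConnection( );
    http.setInstanceFollowRedirects(false);
    return http;
  }

  public static void main(String[] args) throws IOException {
    for (int i = 0; i < args.length; i++) {
      HttpURLConnection http = openWithoutRedirects(new URL(args[i]));
      int code = http.getResponseCode( );
      if (code >= 300 && code < 400) {
        // The redirect was not followed, so its target can be inspected
        System.out.println(args[i] + " redirects to "
         + http.getHeaderField("Location"));
      }
      else {
        System.out.println(args[i] + ": " + code);
      }
    }
  }
}
```

Because only the instance setting is changed, other HttpURLConnection objects in the same program keep following redirects.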
15.11.4 Proxies
Many
users behind firewalls or using AOL or other high-volume ISPs access
the web through proxy servers. The usingProxy( )
method tells you whether the particular
HttpURLConnection is going through a proxy server:
public abstract boolean usingProxy( ) // Java 1.3

It returns true if a proxy is being used,
false if not. In some contexts, the use of a proxy
server may have security implications.
15.11.5 Streaming Mode
Every request sent to an HTTP server has
an HTTP header. One field in this header is the Content-length; that
is, the number of bytes in the body of the request. The header comes
before the body. However, to write the header you need to know the
length of the body, which you may not have yet. Normally the way Java
solves this Catch-22 is by caching everything you write onto the
OutputStream retrieved from the
HttpURLConnection until the stream is closed. At
that point, it knows how many bytes are in the body so it has enough
information to write the Content-length header.

This scheme is fine for small requests sent in response to typical
web forms. However, it's burdensome for responses to
very long forms or some SOAP messages. It's very
wasteful and slow for medium-to-large documents sent with HTTP PUT.
It's much more efficient if Java
doesn't have to wait for the last byte of data to be
written before sending the first byte of data over the network. Java
1.5 offers two solutions to this problem. If you know the size of
your data (for instance, you're uploading a file of known size using
HTTP PUT), you can tell the HttpURLConnection object the size of that
data. If you don't know the size of the data in advance, you can use
chunked transfer encoding instead. In chunked transfer
encoding, the body of the request is sent in multiple pieces, each
with its own separate content length. To turn on chunked transfer
encoding, just pass the size of the chunks you want to the
setChunkedStreamingMode( ) method before you
connect the URL.
public void setChunkedStreamingMode(int chunkLength) // Java 1.5

Java will then use a slightly different form of HTTP than the
examples in this book. However, to the Java programmer the difference
is irrelevant. As long as you're using the
URLConnection class instead of raw sockets and as
long as the server supports chunked transfer encoding, it should all
just work without any further changes to your code. However, not all
servers support chunked encoding, though most of the late-model,
major ones do. Even more importantly, chunked transfer encoding does
get in the way of authentication and redirection. If
you're trying to send chunked files to a redirected
URL or one that requires password authentication, an
HttpRetryException will be thrown.
You'll then need to retry the request at the new URL
or at the old URL with the appropriate credentials; and this all
needs to be done manually without the full support of the HTTP
protocol handler you normally have. Therefore, don't
use chunked transfer encoding unless you really need it. As with most
performance advice, this means you shouldn't
implement this optimization until measurements prove the
non-streaming default is a bottleneck.

If you do happen to know the size of the request data in advance,
Java 1.5 lets you optimize the connection by providing this
information to the HttpURLConnection object. If
you do this Java can start streaming the data over the network
immediately. Otherwise, it has to cache everything you write in order
to determine the content length, and only send it over the network
after you've closed the stream. If you know exactly
how big your data is, pass that number to the
setFixedLengthStreamingMode( ) method:
public void setFixedLengthStreamingMode(int contentLength)

Java will use this number in the HTTP Content-length header
field. However, if you then try to write more or less than the number
of bytes given here, Java will throw an
IOException. Of course, that will happen later,
when you're writing data, not when you first call
this method. The setFixedLengthStreamingMode( )
method itself will throw an
IllegalArgumentException if you pass in a negative
number, or an IllegalStateException if the
connection is connected or has already been set to chunked transfer
encoding. (You can't use both chunked transfer
encoding and fixed-length streaming mode on the same request.)

Fixed-length streaming mode is transparent on the server side.
Servers neither know nor care how the Content-length was set as long
as it's correct. However, like chunked transfer
encoding, streaming mode does interfere with authentication and
redirection. If either of these is required for a given URL, an
HttpRetryException will be thrown; you have to
manually retry. Therefore, don't use this mode
unless you really need it.
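To make the choice between the two modes concrete, here is a minimal sketch. The class name StreamingModes, the helpers prepareUpload( ) and modesConflict( ), the www.example.com URL, and the 4096-byte chunk size are all assumptions made for this illustration. Nothing is sent over the network, since both modes must be selected before the connection is opened.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class StreamingModes {

  // Prepares a PUT in fixed-length streaming mode when the length is
  // known (length >= 0) and in chunked mode otherwise.
  // (Hypothetical helper; 4096 is an arbitrary chunk size.)
  public static HttpURLConnection prepareUpload(URL u, int length)
   throws IOException {
    HttpURLConnection http = (HttpURLConnection) u.openConnection( );
    http.setRequestMethod("PUT");
    http.setDoOutput(true);
    if (length >= 0) {
      http.setFixedLengthStreamingMode(length);
    }
    else {
      http.setChunkedStreamingMode(4096);
    }
    return http;
  }

  // Returns true if asking for fixed-length mode on a connection already
  // set to chunked mode is refused with an IllegalStateException.
  public static boolean modesConflict(URL u) throws IOException {
    HttpURLConnection http = prepareUpload(u, -1);  // chunked mode
    try {
      http.setFixedLengthStreamingMode(100);
      return false;
    }
    catch (IllegalStateException ex) {
      return true;
    }
  }

  public static void main(String[] args) throws IOException {
    URL u = new URL("http://www.example.com/upload");  // placeholder URL
    System.out.println("Modes conflict: " + modesConflict(u));
  }
}
```

A real upload would go on to write exactly the promised number of bytes to the stream returned by getOutputStream( ) and be ready to catch HttpRetryException.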