4.2 Input Streams
Java's basic input class is java.io.InputStream:

    public abstract class InputStream

This class provides the fundamental methods needed to read data as
raw bytes. These are:

    public abstract int read( ) throws IOException
    public int read(byte[] input) throws IOException
    public int read(byte[] input, int offset, int length) throws IOException
    public long skip(long n) throws IOException
    public int available( ) throws IOException
    public void close( ) throws IOException

Concrete subclasses of InputStream use these
methods to read data from particular media. For instance, a
FileInputStream reads data from a file. A
TelnetInputStream reads data from a network
connection. A ByteArrayInputStream reads data from
an array of bytes. But whichever source you're
reading, you mostly use only these same six methods. Sometimes you
don't know exactly what kind of stream
you're reading from. For instance,
TelnetInputStream is an undocumented class hidden
inside the sun.net package. Instances of it are
returned by various methods in the java.net
package: for example, the openStream( ) method of
java.net.URL. However, these methods are declared
to return only InputStream, not the more specific
subclass TelnetInputStream.
That's polymorphism at work once again. The instance
of the subclass can be used transparently as an instance of its
superclass. No specific knowledge of the subclass is required.

The basic method of InputStream is the no-argument
read( )
method. This method reads a single byte of data from the input
stream's source and returns it as an
int from 0 to 255. End of stream is signified by
returning -1. The read( ) method waits and blocks
execution of any code that follows it until a byte of data is
available and ready to be read. Input and output can be slow, so if
your program is doing anything else of importance, try to put I/O in
its own thread.

The read( ) method is declared abstract because
subclasses need to change it to handle their particular medium. For
instance, a ByteArrayInputStream can implement
this method with pure Java code that copies the byte from its array.
However, a TelnetInputStream needs to use a native
library that understands how to read data from the network interface
on the host platform.

The following code fragment reads 10 bytes from the
InputStream in and stores them
in the byte array input.
However, if end of stream is detected, the loop is terminated early:

    byte[] input = new byte[10];
    for (int i = 0; i < input.length; i++) {
      int b = in.read( );
      if (b == -1) break;
      input[i] = (byte) b;
    }

Although read( ) only reads a byte, it returns an
int. Thus, a cast is necessary before storing the
result in the byte array. Of course, this produces a signed byte from
-128 to 127 instead of the unsigned byte from 0 to 255 returned by
the read( ) method. However, as long as
you're clear about which one you're
working with, this is not a major problem. You can convert a signed
byte b to its unsigned value (held in an int) like this:

    int i = b >= 0 ? b : 256 + b;

Reading a byte at a time is as inefficient as writing data one byte
at a time. Consequently, there are two overloaded read( ) methods that fill a specified array with multiple bytes
of data read from the stream, read(byte[] input)
and read(byte[] input,
int offset,
int length). The first method
attempts to fill the specified array input. The
second attempts to fill the specified subarray of
input, starting at offset and
continuing for length bytes.

Notice I said these methods attempt to fill the
array, not that they necessarily succeed. An attempt may fail in
several ways. For instance, it's not unheard of that
while your program is reading data from a remote web server over a
PPP dialup link, a bug in a switch at a phone company central office
will disconnect you and several thousand of your neighbors from the
rest of the world. This would cause an
IOException. More commonly, however, a read
attempt won't completely fail but
won't completely succeed, either. Some of the
requested bytes may be read, but not all of them. For example, you
may try to read 1,024 bytes from a network connection, when only 512
have actually arrived from the server; the rest are still in transit.
They'll arrive eventually, but they
aren't available at this moment. To account for
this, the multibyte read methods return the number of bytes actually
read. For example, consider this code fragment:

    byte[] input = new byte[1024];
    int bytesRead = in.read(input);

It attempts to read 1,024 bytes from the
InputStream in into the array
input. However, if only 512 bytes are available,
that's all that will be read, and
bytesRead will be set to 512. To guarantee that
all the bytes you want are actually read, place the read in a loop
that reads repeatedly until the array is filled. For example:

    int bytesRead = 0;
    int bytesToRead = 1024;
    byte[] input = new byte[bytesToRead];
    while (bytesRead < bytesToRead) {
      bytesRead += in.read(input, bytesRead, bytesToRead - bytesRead);
    }

This technique is especially crucial for network streams. Chances are
that if a file is available at all, all the bytes of a file are also
available. However, since networks move much more slowly than CPUs,
it is very easy for a program to empty a network buffer before all
the data has arrived. If one of these two methods tries to
read from a temporarily empty but open network stream with a nonzero
length, it blocks until at least one byte arrives, just as the
single-byte read( ) method does, and then returns however many bytes
are available at that point, up to the requested length. A return
value of 0 is possible only when the requested length is 0.

All three read( ) methods return -1 to signal the
end of the stream. If the stream ends while there's
still data that hasn't been read, the multibyte
read( ) methods return the data until the buffer
has been emptied. The next call to any of the read( ) methods
will return -1. The -1 is never placed in the
array. The array only contains actual data. The previous code
fragment had a bug because it didn't consider the
possibility that all 1,024 bytes might never arrive (as opposed to
not being immediately available). Fixing that bug requires testing
the return value of read( ) before adding it to
bytesRead. For example:

    int bytesRead = 0;
    int bytesToRead = 1024;
    byte[] input = new byte[bytesToRead];
    while (bytesRead < bytesToRead) {
      int result = in.read(input, bytesRead, bytesToRead - bytesRead);
      if (result == -1) break;
      bytesRead += result;
    }

If you do not want to wait until all the bytes you need are
immediately available, you can use the available( ) method to determine how many bytes can
be read without blocking. This returns the minimum number of bytes
you can read. You may in fact be able to read more, but you will be
able to read at least as many bytes as available( ) suggests. For example:

    int bytesAvailable = in.available( );
    byte[] input = new byte[bytesAvailable];
    int bytesRead = in.read(input, 0, bytesAvailable);
    // continue with rest of program immediately...

In this case, you can assert that bytesRead is
exactly equal to bytesAvailable. You cannot,
however, assert that bytesRead is greater than
zero. It is possible that no bytes were available. On end of stream,
available( ) returns 0. Generally,
read(byte[] input, int
offset, int length)
returns -1 on end of stream; but if length is 0,
then it does not notice the end of stream and returns 0 instead.

On rare occasions, you may want to skip over data without reading it.
The skip( ) method accomplishes this
task. It's less useful on network connections than
when reading from files. Network connections are sequential and
normally quite slow, so it's not significantly more
time-consuming to read data than to skip over it. Files are random
access so that skipping can be implemented simply by repositioning a
file pointer rather than processing each byte to be skipped.

As with output streams, once your program has finished with an input
stream, it should close it by invoking its close( ) method. This releases any resources associated with the
stream, such as file handles or ports. Once an input stream has been
closed, further reads from it throw IOExceptions.
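
As a minimal sketch of this close-then-fail behavior, the example below closes a stream with try-with-resources and then checks whether a later read throws. The class name CloseDemo and both helper methods are hypothetical, introduced only for illustration, and a BufferedInputStream over a byte array stands in for a real file or network stream.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo class; not part of any library.
class CloseDemo {

    // Returns the first byte of the data as an int from 0 to 255, or -1 if
    // the data is empty. try-with-resources calls close() on the stream
    // even if read() throws.
    static int firstByte(byte[] data) throws IOException {
        try (InputStream in = new BufferedInputStream(new ByteArrayInputStream(data))) {
            return in.read();
        }
    }

    // Closes the stream, then reports whether a subsequent read throws
    // an IOException.
    static boolean readAfterCloseFails(InputStream in) throws IOException {
        in.close();
        try {
            in.read();
            return false;
        } catch (IOException expected) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {(byte) 200, 1, 2};
        System.out.println("First byte: " + firstByte(data));
        InputStream buffered = new BufferedInputStream(new ByteArrayInputStream(data));
        System.out.println("Read after close fails: " + readAfterCloseFails(buffered));
    }
}
```

Not every stream class enforces this. Closing a ByteArrayInputStream, for example, has no effect, so reads from it still succeed afterward.
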
However, some kinds of streams may still allow you to do things with
the object. For instance, you generally won't want
to get the message digest from a
java.security.DigestInputStream until after the
data has been read and the stream closed.
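
The DigestInputStream case can be sketched as follows. The class name DigestAfterClose and the method digestOf are hypothetical, and the choice of SHA-256 and a 4,096-byte buffer is an assumption made only for this example. The point is the ordering: the digest is requested only after the data has been read and the stream closed.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical demo class; not part of any library.
class DigestAfterClose {

    // Reads the source to its end, closes it, and only then computes
    // the digest of everything that was read.
    static byte[] digestOf(InputStream source)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
        try (DigestInputStream in = new DigestInputStream(source, sha256)) {
            byte[] buffer = new byte[4096];
            while (in.read(buffer) != -1) {
                // Each read updates the digest as a side effect;
                // the data itself is discarded in this sketch.
            }
        }
        // The stream is closed here, but the MessageDigest is still usable.
        return sha256.digest();
    }

    public static void main(String[] args) throws Exception {
        byte[] data = "hello, stream".getBytes(StandardCharsets.UTF_8);
        byte[] digest = digestOf(new ByteArrayInputStream(data));
        System.out.println("Digest length in bytes: " + digest.length);
    }
}
```
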
4.2.1 Marking and Resetting
The InputStream
class also has three less commonly used methods that allow programs
to back up and reread data they've already read.
These are:

    public void mark(int readAheadLimit)
    public void reset( ) throws IOException
    public boolean markSupported( )

In order to reread data, mark the current position in the stream with
the mark( ) method. At a later point, you can
reset the stream to the marked position using the reset( ) method. Subsequent reads then return data starting from
the marked position. However, you may not be able to reset as far
back as you like. The number of bytes you can read from the mark and
still reset is determined by the readAheadLimit
argument to mark( ). If you try to reset too far
back, an IOException is thrown. Furthermore, there
can be only one mark in a stream at any given time. Marking a second
location erases the first mark.

Marking and resetting are usually implemented by storing every byte
read from the marked position on in an internal buffer. However, not
all input streams support this. Before trying to use marking and
resetting, check to see whether the markSupported() method returns true. If it does, the stream supports
marking and resetting. Otherwise, mark( ) will do
nothing and reset( ) will throw an
IOException.
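
The check-then-mark pattern described above might look like the following sketch. The class MarkResetDemo and its peek method are hypothetical names used only for illustration. The method looks at the next few bytes of a stream and then resets, so that later reads see the same bytes again. It gives up and returns null when markSupported( ) is false.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical demo class; not part of any library.
class MarkResetDemo {

    // Returns up to n upcoming bytes without consuming them, or null
    // if the stream does not support marking.
    static byte[] peek(InputStream in, int n) throws IOException {
        if (!in.markSupported()) {
            return null;
        }
        in.mark(n); // at most n bytes are read before reset()
        byte[] buffer = new byte[n];
        int total = 0;
        while (total < n) {
            int result = in.read(buffer, total, n - total);
            if (result == -1) break; // stream ended early
            total += result;
        }
        in.reset(); // back to the marked position
        return Arrays.copyOf(buffer, total);
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[] {10, 20, 30, 40});
        byte[] head = peek(in, 2);
        System.out.println("Peeked " + head.length
            + " bytes; next read returns " + in.read());
    }
}
```

Because the read loop never goes past n bytes, the reset stays within the readAheadLimit passed to mark( ).
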

The only two input stream classes in java.io that
always support marking are BufferedInputStream and
ByteArrayInputStream. However, other input streams
such as TelnetInputStream may support marking if
they're chained to a buffered input stream
first.
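
Chaining to a buffered stream can be sketched like this. The class BufferedMarkDemo and the helper markable are hypothetical, and a PushbackInputStream (whose markSupported( ) method returns false) stands in here for a stream such as TelnetInputStream that cannot be marked on its own.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

// Hypothetical demo class; not part of any library.
class BufferedMarkDemo {

    // Returns a stream that supports marking, wrapping the argument
    // in a BufferedInputStream only when that is necessary.
    static InputStream markable(InputStream in) {
        return in.markSupported() ? in : new BufferedInputStream(in);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a stream without mark support.
        InputStream raw = new PushbackInputStream(
            new ByteArrayInputStream(new byte[] {1, 2, 3}));
        System.out.println("Raw stream supports mark: " + raw.markSupported());

        InputStream in = markable(raw);
        in.mark(16);          // remember the current position
        int first = in.read();
        in.read();            // read one more byte past the mark
        in.reset();           // return to the marked position
        System.out.println("Reread matches: " + (in.read() == first));
    }
}
```

After wrapping, all reads should go through the buffered stream rather than the original one, since the buffer may already hold bytes taken from the underlying stream.
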