4.4 Readers and Writers
Many programmers have a bad habit
of writing code as if all text were ASCII or at least in the native encoding of
the platform. While some older, simpler network protocols, such as
daytime, quote of the day, and chargen, do specify ASCII encoding for
text, this is not true of HTTP and many other more modern protocols,
which allow a wide variety of localized encodings, such as KOI8-R
Cyrillic, Big-5 Chinese, and ISO 8859-2 for most Central European
languages. Java's native character set is the UTF-16
encoding of Unicode. When the encoding is no longer ASCII, the
assumption that bytes and chars are essentially the same things also
breaks down. Consequently, Java provides an almost complete mirror of
the input and output stream class hierarchy designed for working with
characters instead of bytes.

In this mirror image hierarchy, two abstract superclasses define the
basic API for reading and writing characters. The
java.io.Reader class specifies the API by which
characters are read. The java.io.Writer class
specifies the API by which characters are written. Wherever input and
output streams use bytes, readers and writers use Unicode characters.
Concrete subclasses of Reader and
Writer allow particular sources to be read and
targets to be written. Filter readers and writers can be attached to
other readers and writers to provide additional services or
interfaces.

The most important concrete subclasses of Reader
and Writer are the
InputStreamReader and the
OutputStreamWriter classes. An
InputStreamReader contains an underlying input
stream from which it reads raw bytes. It translates these bytes into
Unicode characters according to a specified encoding. An
OutputStreamWriter receives Unicode characters
from a running program. It then translates those characters into
bytes using a specified encoding and writes the bytes onto an
underlying output stream.

In addition to these two classes, the java.io
package provides several raw reader and writer classes that read
characters without directly requiring an underlying input stream,
including:

FileReader
FileWriter
StringReader
StringWriter
CharArrayReader
CharArrayWriter

The first two classes in this list work with files and the last four
work inside Java, so they aren't of great use for
network programming. However, aside from different constructors,
these classes have pretty much the same public interface as all other
reader and writer classes.
4.4.1 Writers
The Writer class
mirrors the java.io.OutputStream class.
It's abstract and has two protected constructors.
Like OutputStream, the Writer
class is never used directly; instead, it is used polymorphically,
through one of its subclasses. It has five write() methods as well as a flush( ) and a
close( ) method:

protected Writer( )
protected Writer(Object lock)
public abstract void write(char[] text, int offset, int length)
    throws IOException
public void write(int c) throws IOException
public void write(char[] text) throws IOException
public void write(String s) throws IOException
public void write(String s, int offset, int length) throws IOException
public abstract void flush( ) throws IOException
public abstract void close( ) throws IOException

The write(char[] text, int offset, int length) method is the base
method in terms of which the other four write( )
methods are implemented. A subclass must override at least this
method as well as flush( ) and close( ), although most override
some of the other write( ) methods as well in order to provide more efficient
implementations. For example, given a Writer
object w, you can write the string
"Network" like this:

char[] network = {'N', 'e', 't', 'w', 'o', 'r', 'k'};
w.write(network, 0, network.length);

The same task can be accomplished with these other methods, as well:

w.write(network);
for (int i = 0; i < network.length; i++) w.write(network[i]);
w.write("Network");
w.write("Network", 0, 7);

All of these examples are different ways of expressing the same
thing. Which you use in any given situation is mostly a matter of
convenience and taste. However, how many and which bytes are written
by these lines depends on the encoding w uses. If
it's using big-endian UTF-16, it will write these 14
bytes (shown here in hexadecimal) in this order:

00 4E 00 65 00 74 00 77 00 6F 00 72 00 6B

On the other hand, if w uses little-endian UTF-16,
this sequence of 14 bytes is written:

4E 00 65 00 74 00 77 00 6F 00 72 00 6B 00

If w uses Latin-1, UTF-8, or MacRoman, this
sequence of seven bytes is written:

4E 65 74 77 6F 72 6B

Other encodings may write still different sequences of bytes. The
exact output depends on the encoding.

Writers may be buffered, either directly by being chained to a
BufferedWriter or indirectly because their
underlying output stream is buffered. To force a write to be
committed to the output medium, invoke the flush() method:

w.flush( );

The close( ) method behaves similarly to the
close( ) method of
OutputStream. close( ) flushes
the writer, then closes the underlying output stream and releases any
resources associated with it:

public abstract void close( ) throws IOException

After a writer has been closed, further writes throw
IOExceptions.
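
The byte counts quoted above for the string "Network" can be checked with a short program. What follows is only a sketch, not code from this chapter: the class name EncodedLengthDemo and its encode( ) helper are hypothetical, and it assumes a JDK recent enough to support try-with-resources. It chains an OutputStreamWriter to an in-memory ByteArrayOutputStream and uses the encoding names UTF-16BE, UTF-16LE, ISO-8859-1 (Latin-1), and UTF-8, which every Java platform is required to support.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical demo class; not part of java.io or of this book's examples.
class EncodedLengthDemo {

    // Writes text through an OutputStreamWriter that uses the named
    // encoding and returns the bytes that reached the underlying stream.
    static byte[] encode(String text, String encoding) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        // Closing the writer flushes any characters it is still holding.
        try (Writer w = new OutputStreamWriter(bytes, encoding)) {
            w.write(text);
        } catch (IOException ex) {
            // Not expected for an in-memory stream and a standard encoding.
            throw new UncheckedIOException(ex);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        String[] encodings = {"UTF-16BE", "UTF-16LE", "ISO-8859-1", "UTF-8"};
        for (String encoding : encodings) {
            byte[] result = encode("Network", encoding);
            StringBuilder hex = new StringBuilder();
            for (byte b : result) {
                hex.append(String.format("%02X ", b));
            }
            System.out.println(encoding + ": " + result.length
                + " bytes: " + hex.toString().trim());
        }
    }
}
```

The same seven characters produce 14 bytes in either UTF-16 byte order and 7 bytes in Latin-1 or UTF-8, which is why a program should never assume that the number of characters written equals the number of bytes sent.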
4.4.2 OutputStreamWriter
OutputStreamWriter is the most important concrete subclass
of Writer. An
OutputStreamWriter receives characters from a Java
program. It converts these into bytes according to a specified
encoding and writes them onto an underlying output stream. Its
constructor specifies the output stream to write to and the encoding
to use:

public OutputStreamWriter(OutputStream out, String encoding)
    throws UnsupportedEncodingException
public OutputStreamWriter(OutputStream out)

Valid encodings are listed in the documentation for
Sun's native2ascii tool included with the JDK and
available from http://java.sun.com/j2se/1.4.2/docs/guide/intl/encoding.docl.
If no encoding is specified, the default encoding for the platform is
used. (In the United States, the default encoding is ISO Latin-1 on
Solaris and Windows, MacRoman on the Mac.) For example, this code
fragment writes a string in the Cp1253 Windows Greek encoding:

OutputStreamWriter w = new OutputStreamWriter(
    new FileOutputStream("OdysseyB.txt"), "Cp1253");
w.write("");

Other than the constructors, OutputStreamWriter
has only the usual Writer methods (which are used
exactly as they are for any Writer class) and one
method to return the encoding of the object:
public String getEncoding( )
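
Since the two-argument constructor throws an UnsupportedEncodingException for a name the VM does not recognize, a program that takes its encoding from configuration or from a protocol header may want to test the name first. The following is a minimal sketch under that assumption. The class EncodingCheck, its isUsable( ) method, and the name X-No-Such-Encoding are all made up for illustration. Note also that getEncoding( ) may report a historical alias rather than the name you passed in, so the sketch prints the value without relying on it.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UnsupportedEncodingException;

// Hypothetical helper class, shown only to illustrate the constructor's
// behavior with known and unknown encoding names.
class EncodingCheck {

    // Returns true if an OutputStreamWriter can be built for this name.
    static boolean isUsable(String encoding) {
        try {
            new OutputStreamWriter(new ByteArrayOutputStream(), encoding);
            return true;
        } catch (UnsupportedEncodingException ex) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("UTF-8 usable: " + isUsable("UTF-8"));
        // "X-No-Such-Encoding" is an invented name that no VM should know.
        System.out.println("X-No-Such-Encoding usable: "
            + isUsable("X-No-Such-Encoding"));

        OutputStreamWriter w =
            new OutputStreamWriter(new ByteArrayOutputStream(), "UTF-8");
        // The reported name may be an alias of the name requested.
        System.out.println("getEncoding reports: " + w.getEncoding());
        w.close();
    }
}
```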
4.4.3 Readers
The Reader class
mirrors the java.io.InputStream class.
It's abstract with two protected constructors. Like
InputStream and Writer, the
Reader class is never used directly, only through
one of its subclasses. It has three
read() methods, as well as skip( ),
close( ), ready( ),
mark( ), reset( ), and
markSupported( ) methods:

protected Reader( )
protected Reader(Object lock)
public abstract int read(char[] text, int offset, int length)
    throws IOException
public int read( ) throws IOException
public int read(char[] text) throws IOException
public long skip(long n) throws IOException
public boolean ready( ) throws IOException
public boolean markSupported( )
public void mark(int readAheadLimit) throws IOException
public void reset( ) throws IOException
public abstract void close( ) throws IOException

The read(char[] text, int offset, int length) method is the
fundamental method through which the other two read( )
methods are implemented. A subclass must override at
least this method as well as close( ), although
most will override some of the other read( )
methods as well in order to provide more efficient implementations.

Most of these methods are easily understood by analogy with their
InputStream counterparts. The read() method returns a single Unicode character as an
int with a value from 0 to 65,535 or -1 on end of
stream. The read(char[] text)
method tries to fill the array text with
characters and returns the actual number of characters read or -1 on
end of stream. The read(char[]
text, int
offset, int
length) method attempts to read
length characters into the subarray of
text beginning at offset and
continuing for length characters. It also returns
the actual number of characters read or -1 on end of stream. The
skip(long n) method skips
n characters. The mark( ) and
reset( ) methods allow some readers to reset back
to a marked position in the character sequence. The
markSupported( ) method tells you whether the
reader supports marking and resetting. The close(
) method closes the reader and any underlying input stream
so that further attempts to read from it throw
IOExceptions.

The exception to the rule of similarity is ready( ), which has the same general purpose as
available( ) but not quite the same semantics,
even modulo the byte-to-char conversion. Whereas available(
) returns an int specifying a minimum
number of bytes that may be read without blocking, ready(
) only returns a boolean indicating
whether the reader may be read without blocking. The problem is that
some character encodings, such as UTF-8, use different numbers of
bytes for different characters. Thus, it's hard to
tell how many characters are waiting in the network or filesystem
buffer without actually reading them out of the buffer.

InputStreamReader is the most important concrete
subclass of Reader. An
InputStreamReader reads bytes from an underlying
input stream such as a FileInputStream or
TelnetInputStream. It converts these into
characters according to a specified encoding and returns them. The
constructor specifies the input stream to read from and the encoding
to use:

public InputStreamReader(InputStream in)
public InputStreamReader(InputStream in, String encoding)
    throws UnsupportedEncodingException

If no encoding is specified, the default encoding for the platform is
used. If an unknown encoding is specified, then an
UnsupportedEncodingException is thrown.

For example, this method reads an input stream and converts it all to
one Unicode string using the MacCyrillic encoding:

public static String getMacCyrillicString(InputStream in)
    throws IOException {
    InputStreamReader r = new InputStreamReader(in, "MacCyrillic");
    StringBuffer sb = new StringBuffer( );
    int c;
    while ((c = r.read( )) != -1) sb.append((char) c);
    r.close( );
    return sb.toString( );
}
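
The three-argument read( ) method described earlier can do the same job a block of characters at a time. The sketch below is an illustration rather than a replacement for the method above. The class ChunkedRead and its methods are hypothetical names, the 256-character buffer size is arbitrary, and the sample uses UTF-8 instead of MacCyrillic because UTF-8 is guaranteed to be available on every Java platform.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical example class; the names here are not from java.io.
class ChunkedRead {

    // Decodes the whole stream using the named encoding, reading up to
    // 256 characters per call instead of one character at a time.
    static String readAll(InputStream in, String encoding) throws IOException {
        Reader r = new InputStreamReader(in, encoding);
        StringBuilder sb = new StringBuilder();
        char[] buffer = new char[256];
        int n;
        // read() returns the number of chars actually stored, or -1 at
        // end of stream; it may return fewer than buffer.length.
        while ((n = r.read(buffer, 0, buffer.length)) != -1) {
            sb.append(buffer, 0, n);
        }
        r.close();
        return sb.toString();
    }

    // Convenience wrapper for in-memory data, where I/O errors are not expected.
    static String decode(byte[] data, String encoding) {
        try {
            return readAll(new ByteArrayInputStream(data), encoding);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        // A Cyrillic sample word, written with escapes so that the
        // source file itself stays pure ASCII.
        String original = "\u041f\u0440\u0438\u0432\u0435\u0442";
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        String decoded = decode(utf8, "UTF-8");
        System.out.println(utf8.length + " bytes decoded to "
            + decoded.length() + " chars");
    }
}
```

The loop appends only the n characters that each call actually delivered. Appending the whole buffer on every pass is a common mistake, since the final read usually fills only part of it.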
4.4.4 Filter Readers and Writers
The
InputStreamReader and
OutputStreamWriter classes act as decorators on
top of input and output streams that change the interface from a
byte-oriented interface to a character-oriented interface. Once this
is done, additional character-oriented filters can be layered on top
of the reader or writer using the
java.io.FilterReader and
java.io.FilterWriter classes. As with filter
streams, there are a variety of subclasses that perform specific
filtering, including:

BufferedReader
BufferedWriter
LineNumberReader
PushbackReader
PrintWriter

4.4.4.1 Buffered readers and writers
The BufferedReader and BufferedWriter
classes are the character-based equivalents of the byte-oriented
BufferedInputStream and
BufferedOutputStream classes. Where
BufferedInputStream and
BufferedOutputStream use an internal array of
bytes as a buffer, BufferedReader and
BufferedWriter use an internal array of chars.

When a program reads from a BufferedReader, text
is taken from the buffer rather than directly from the underlying
input stream or other text source. When the buffer empties, it is
filled again with as much text as possible, even if not all of it is
immediately needed, making future reads much faster. When a program
writes to a BufferedWriter, the text is placed in
the buffer. The text is moved to the underlying output stream or
other target only when the buffer fills up or when the writer is
explicitly flushed, which can make writes much faster than would
otherwise be the case.

BufferedReader and
BufferedWriter have the usual methods associated
with readers and writers, like read( ),
ready( ), write( ), and
close( ). They each have two constructors that
chain the BufferedReader or
BufferedWriter to an underlying reader or writer
and set the size of the buffer. If the size is not set, the default
size of 8,192 characters is used:

public BufferedReader(Reader in, int bufferSize)
public BufferedReader(Reader in)
public BufferedWriter(Writer out)
public BufferedWriter(Writer out, int bufferSize)

For example, the earlier getMacCyrillicString( )
example was less than efficient because it read characters one at a
time. Since MacCyrillic is a 1-byte character set, it also read bytes
one at a time. However, it's straightforward to make
it run faster by chaining a BufferedReader to the
InputStreamReader, like this:

public static String getMacCyrillicString(InputStream in)
    throws IOException {
    Reader r = new InputStreamReader(in, "MacCyrillic");
    r = new BufferedReader(r, 1024);
    StringBuffer sb = new StringBuffer( );
    int c;
    while ((c = r.read( )) != -1) sb.append((char) c);
    r.close( );
    return sb.toString( );
}

All that was needed to buffer this method was one additional line of
code. None of the rest of the algorithm had to change, since the only
InputStreamReader methods used were the
read( ) and close( ) methods
declared in the Reader superclass and shared by
all Reader subclasses, including
BufferedReader.

The BufferedReader class also has a
readLine( ) method that reads a single line of
text and returns it as a string:

public String readLine( ) throws IOException

This method is supposed to replace the deprecated readLine( )
method in DataInputStream, and it has
mostly the same behavior as that method. The big difference is that
by chaining a BufferedReader to an
InputStreamReader, you can correctly read lines in
character sets other than the default encoding for the platform.
Unfortunately, this method shares the same bugs as the
readLine( ) method in
DataInputStream, discussed earlier in this
chapter. That is, readLine( ) tends to hang its
thread when reading streams where lines end in carriage returns, as
is commonly the case when the streams derive from a Macintosh or a
Macintosh text file. Consequently, you should scrupulously avoid this
method in network programs.

It's not all that difficult, however, to write a
safe version of this class that correctly implements the
readLine( ) method. Example 4-1 is such a
SafeBufferedReader class. It has exactly the same public
interface as BufferedReader; it just has a
slightly different private implementation. I'll use
this class in future chapters in situations where
it's extremely convenient to have a
readLine( ) method.
Example 4-1. The SafeBufferedReader class

package com.macfaq.io;

import java.io.*;

public class SafeBufferedReader extends BufferedReader {

    public SafeBufferedReader(Reader in) {
        this(in, 1024);
    }

    public SafeBufferedReader(Reader in, int bufferSize) {
        super(in, bufferSize);
    }

    // True when the previous line ended with a carriage return, so that
    // a linefeed read next is the second half of that same line ending.
    private boolean lookingForLineFeed = false;

    public String readLine( ) throws IOException {
        StringBuffer sb = new StringBuffer("");
        while (true) {
            int c = this.read( );
            if (c == -1) { // end of stream
                // StringBuffer does not compare equal to a String,
                // so test the length instead
                if (sb.length( ) == 0) return null;
                return sb.toString( );
            }
            else if (c == '\n') {
                if (lookingForLineFeed) {
                    lookingForLineFeed = false;
                    continue;
                }
                else {
                    return sb.toString( );
                }
            }
            else if (c == '\r') {
                lookingForLineFeed = true;
                return sb.toString( );
            }
            else {
                lookingForLineFeed = false;
                sb.append((char) c);
            }
        }
    }
}

The BufferedWriter class adds one new method
not included in its superclass, called newLine( ),
also geared toward writing lines:

public void newLine( ) throws IOException

This method inserts a platform-dependent line-separator string into
the output. The line.separator system property
determines exactly what the string is: probably a linefeed on Unix
and Mac OS X, a carriage return on Mac OS 9, and a carriage
return/linefeed pair on Windows. Since network protocols generally
specify the required line-terminator, you should not use this method
for network programming. Instead, explicitly write the
line-terminator the protocol requires.
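
As an illustration of writing the terminator explicitly, here is a minimal sketch for an imagined protocol that requires each line to end with a carriage return/linefeed pair. The class ExplicitLineEndings, its sendLines( ) method, and the sample commands are assumptions made up for this example. The output goes to a ByteArrayOutputStream standing in for a socket's output stream.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical example class; the protocol and commands are invented.
class ExplicitLineEndings {

    // Writes each line followed by "\r\n", regardless of the value of
    // the line.separator system property on the local platform.
    static byte[] sendLines(String... lines) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Writer w = new BufferedWriter(
                new OutputStreamWriter(sink, "US-ASCII"))) {
            for (String line : lines) {
                w.write(line);
                w.write("\r\n"); // explicit terminator instead of newLine()
            }
            w.flush(); // push buffered text to the underlying stream
        } catch (IOException ex) {
            // Not expected when writing to an in-memory stream.
            throw new UncheckedIOException(ex);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        byte[] sent = sendLines("HELO example.com", "QUIT");
        String text = new String(sent, StandardCharsets.US_ASCII);
        // Make the invisible terminators visible when printing.
        System.out.println(text.replace("\r", "\\r").replace("\n", "\\n"));
    }
}
```

Because the terminator is a string literal, the bytes sent are the same whether the program runs on Unix, Windows, or a Mac.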
4.4.4.2 LineNumberReader
LineNumberReader is a subclass of
BufferedReader that keeps track of the current
line number. This can be retrieved at any time with the
getLineNumber( ) method:

public int getLineNumber( )

By default, the first line number is 0. However, the number of the
current line and all subsequent lines can be changed with the
setLineNumber( ) method:

public void setLineNumber(int lineNumber)

This method adjusts only the line numbers that
getLineNumber( ) reports. It does not change the
point at which the stream is read.

The LineNumberReader's
readLine( ) method shares the same bug as
BufferedReader and
DataInputStream's, and is not
suitable for network programming. However, the line numbers are also
tracked if you use only the regular read( )
methods, and these do not share that bug. Besides these methods and
the usual Reader methods,
LineNumberReader has only these two constructors:

public LineNumberReader(Reader in)
public LineNumberReader(Reader in, int bufferSize)

Since LineNumberReader is a subclass of
BufferedReader, it has an internal character
buffer whose size can be set with the second constructor. The default
size is 8,192 characters.
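
To show line tracking through the regular read( ) method rather than readLine( ), here is a small sketch. The class LineCounter and its countLines( ) method are hypothetical. The sample input ends every line with a terminator on purpose, since how a final unterminated line is counted may differ between JDK releases.

```java
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical example class, not part of java.io.
class LineCounter {

    // Reads the text one character at a time with read() and reports
    // the line number the reader has reached at end of stream.
    static int countLines(String text) {
        try (LineNumberReader r = new LineNumberReader(new StringReader(text))) {
            while (r.read() != -1) {
                // Discard the characters; only the line count matters here.
            }
            return r.getLineNumber();
        } catch (IOException ex) {
            // Not expected when reading from a string.
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        // A carriage return/linefeed pair counts as one line ending.
        System.out.println(countLines("one\r\ntwo\r\nthree\r\n"));
        // Bare carriage returns are recognized as line endings too.
        System.out.println(countLines("a\rb\r"));
    }
}
```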
4.4.4.3 PushbackReader
The
PushbackReader class is the mirror image of the
PushbackInputStream class. As usual, the main
difference is that it pushes back chars rather than bytes. It
provides three unread( ) methods that push
characters onto the reader's input buffer:

public void unread(int c) throws IOException
public void unread(char[] text) throws IOException
public void unread(char[] text, int offset, int length)
    throws IOException

The first unread( ) method pushes a single
character onto the reader. The second pushes an array of characters.
The third pushes the specified subarray of characters, starting with
text[offset] and continuing through
text[offset+length-1].

By default, the size of the pushback buffer is only one character.
However, the size can be adjusted in the second constructor:

public PushbackReader(Reader in)
public PushbackReader(Reader in, int bufferSize)

Trying to unread more characters than the buffer will hold throws an
IOException.
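
A typical use of a pushback reader is to peek at the next character without consuming it. The sketch below assumes nothing beyond the methods listed above. The class PeekDemo and its methods are hypothetical names chosen for the illustration.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical example class, not part of java.io.
class PeekDemo {

    // Returns the next character without consuming it, or -1 at end of
    // stream. Needs only the default one-character pushback buffer.
    static int peek(PushbackReader r) throws IOException {
        int c = r.read();
        if (c != -1) r.unread(c);
        return c;
    }

    // Shows the peeked character in brackets, then everything that the
    // reader still delivers afterward (including the peeked character).
    static String peekThenReadAll(String text) {
        try (PushbackReader r = new PushbackReader(new StringReader(text))) {
            StringBuilder sb = new StringBuilder();
            int first = peek(r);
            if (first != -1) sb.append('[').append((char) first).append(']');
            int c;
            while ((c = r.read()) != -1) sb.append((char) c);
            return sb.toString();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    // Pushing back two characters overflows the default buffer of one.
    static boolean overflowThrows() {
        try (PushbackReader r = new PushbackReader(new StringReader("abc"))) {
            r.unread(new char[] {'x', 'y'});
            return false;
        } catch (IOException ex) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(peekThenReadAll("abc"));
        System.out.println("overflow throws: " + overflowThrows());
    }
}
```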
4.4.4.4 PrintWriter
The PrintWriter class is a replacement for Java
1.0's PrintStream class that
properly handles multibyte character sets and international text. Sun
originally planned to deprecate PrintStream in
favor of PrintWriter but backed off when it
realized this step would invalidate too much existing code,
especially code that depended on System.out.
Nonetheless, new code should use PrintWriter
instead of PrintStream.

Aside from the constructors, the PrintWriter class
has an almost identical collection of methods to
PrintStream. These include:

public PrintWriter(Writer out)
public PrintWriter(Writer out, boolean autoFlush)
public PrintWriter(OutputStream out)
public PrintWriter(OutputStream out, boolean autoFlush)
public void flush( )
public void close( )
public boolean checkError( )
protected void setError( )
public void write(int c)
public void write(char[] text, int offset, int length)
public void write(char[] text)
public void write(String s, int offset, int length)
public void write(String s)
public void print(boolean b)
public void print(char c)
public void print(int i)
public void print(long l)
public void print(float f)
public void print(double d)
public void print(char[] text)
public void print(String s)
public void print(Object o)
public void println( )
public void println(boolean b)
public void println(char c)
public void println(int i)
public void println(long l)
public void println(float f)
public void println(double d)
public void println(char[] text)
public void println(String s)
public void println(Object o)

Most of these methods behave the same for
PrintWriter as they do for
PrintStream. The exceptions are the five
write( ) methods, which write characters rather
than bytes; also, if the underlying writer properly handles character
set conversion, so do all the methods of the
PrintWriter. This is an improvement over the
noninternationalizable PrintStream class, but
it's still not good enough for network programming.
PrintWriter still has the problems of platform
dependency and minimal error reporting that plague
PrintStream.

It isn't hard to write a
PrintWriter class that does work for network
programming. You simply have to require the programmer to specify a
line separator and let the IOExceptions fall where
they may. Example 4-2 demonstrates. Notice that all the constructors
require an explicit line-separator string to be provided.
Example 4-2. SafePrintWriter

/*
 * @(#)SafePrintWriter.java 1.0 04/06/28
 *
 * Placed in the public domain
 * No rights reserved.
 */
package com.macfaq.io;

import java.io.*;

/**
 * @version 1.1, 2004-06-28
 * @author Elliotte Rusty Harold
 * @since Java Network Programming, 2nd edition
 */
public class SafePrintWriter extends Writer {

    protected Writer out;

    private boolean autoFlush = false;
    private String lineSeparator;
    private boolean closed = false;

    public SafePrintWriter(Writer out, String lineSeparator) {
        this(out, false, lineSeparator);
    }

    public SafePrintWriter(Writer out, char lineSeparator) {
        this(out, false, String.valueOf(lineSeparator));
    }

    public SafePrintWriter(Writer out, boolean autoFlush, String lineSeparator) {
        super(out);
        this.out = out;
        this.autoFlush = autoFlush;
        if (lineSeparator == null) {
            throw new NullPointerException("Null line separator");
        }
        this.lineSeparator = lineSeparator;
    }

    public SafePrintWriter(OutputStream out, boolean autoFlush,
        String encoding, String lineSeparator)
        throws UnsupportedEncodingException {
        this(new OutputStreamWriter(out, encoding), autoFlush, lineSeparator);
    }

    public void flush( ) throws IOException {
        synchronized (lock) {
            if (closed) throw new IOException("Stream closed");
            out.flush( );
        }
    }

    public void close( ) throws IOException {
        try {
            this.flush( );
        }
        catch (IOException ex) {
        }
        synchronized (lock) {
            out.close( );
            this.closed = true;
        }
    }

    public void write(int c) throws IOException {
        synchronized (lock) {
            if (closed) throw new IOException("Stream closed");
            out.write(c);
        }
    }

    public void write(char[] text, int offset, int length) throws IOException {
        synchronized (lock) {
            if (closed) throw new IOException("Stream closed");
            out.write(text, offset, length);
        }
    }

    public void write(char[] text) throws IOException {
        synchronized (lock) {
            if (closed) throw new IOException("Stream closed");
            out.write(text, 0, text.length);
        }
    }

    public void write(String s, int offset, int length) throws IOException {
        synchronized (lock) {
            if (closed) throw new IOException("Stream closed");
            out.write(s, offset, length);
        }
    }

    public void print(boolean b) throws IOException {
        if (b) this.write("true");
        else this.write("false");
    }

    public void println(boolean b) throws IOException {
        if (b) this.write("true");
        else this.write("false");
        this.write(lineSeparator);
        if (autoFlush) out.flush( );
    }

    public void print(char c) throws IOException {
        this.write(String.valueOf(c));
    }

    public void println(char c) throws IOException {
        this.write(String.valueOf(c));
        this.write(lineSeparator);
        if (autoFlush) out.flush( );
    }

    public void print(int i) throws IOException {
        this.write(String.valueOf(i));
    }

    public void println(int i) throws IOException {
        this.write(String.valueOf(i));
        this.write(lineSeparator);
        if (autoFlush) out.flush( );
    }

    public void print(long l) throws IOException {
        this.write(String.valueOf(l));
    }

    public void println(long l) throws IOException {
        this.write(String.valueOf(l));
        this.write(lineSeparator);
        if (autoFlush) out.flush( );
    }

    public void print(float f) throws IOException {
        this.write(String.valueOf(f));
    }

    public void println(float f) throws IOException {
        this.write(String.valueOf(f));
        this.write(lineSeparator);
        if (autoFlush) out.flush( );
    }

    public void print(double d) throws IOException {
        this.write(String.valueOf(d));
    }

    public void println(double d) throws IOException {
        this.write(String.valueOf(d));
        this.write(lineSeparator);
        if (autoFlush) out.flush( );
    }

    public void print(char[] text) throws IOException {
        this.write(text);
    }

    public void println(char[] text) throws IOException {
        this.write(text);
        this.write(lineSeparator);
        if (autoFlush) out.flush( );
    }

    public void print(String s) throws IOException {
        if (s == null) this.write("null");
        else this.write(s);
    }

    public void println(String s) throws IOException {
        if (s == null) this.write("null");
        else this.write(s);
        this.write(lineSeparator);
        if (autoFlush) out.flush( );
    }

    public void print(Object o) throws IOException {
        if (o == null) this.write("null");
        else this.write(o.toString( ));
    }

    public void println(Object o) throws IOException {
        if (o == null) this.write("null");
        else this.write(o.toString( ));
        this.write(lineSeparator);
        if (autoFlush) out.flush( );
    }

    public void println( ) throws IOException {
        this.write(lineSeparator);
        if (autoFlush) out.flush( );
    }
}

This class actually extends Writer rather than
FilterWriter, unlike
PrintWriter. It could extend
FilterWriter instead; however, this would save
only one field and one line of code, since this class needs to
override every single method in FilterWriter
(close( ), flush( ), and all
three write( ) methods). The reason for this is
twofold. First, the PrintWriter class has to be
much more careful about synchronization than the
FilterWriter class. Second, some of the classes
that may be used as an underlying Writer for this
class, notably CharArrayWriter, do not implement
the proper semantics for close( ) and allow
further writes to take place even after the writer is closed.
Consequently, programmers have to handle the checks for whether the
stream is closed in this class rather than relying on the underlying
Writer out to do it for them.
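
The difference in close( ) semantics is easy to observe. The following sketch uses a hypothetical class named ClosedWriterDemo to contrast a CharArrayWriter, which keeps accepting text after close( ), with a BufferedWriter, which rejects it.

```java
import java.io.BufferedWriter;
import java.io.CharArrayWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical example class, not part of java.io.
class ClosedWriterDemo {

    // Writes to a CharArrayWriter both before and after closing it and
    // returns everything the writer accepted.
    static String writeAfterClose() {
        CharArrayWriter w = new CharArrayWriter();
        try {
            w.write("ab");
            w.close();    // has no effect on a CharArrayWriter
            w.write("x"); // still accepted
        } catch (IOException ex) {
            // Not expected for an in-memory writer.
            throw new UncheckedIOException(ex);
        }
        return w.toString();
    }

    // Returns true if a closed BufferedWriter refuses further writes.
    static boolean bufferedWriterRejectsWriteAfterClose() {
        Writer w = new BufferedWriter(new CharArrayWriter());
        try {
            w.close();
            w.write("x");
            return false;
        } catch (IOException ex) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("CharArrayWriter holds: " + writeAfterClose());
        System.out.println("BufferedWriter rejects write after close: "
            + bufferedWriterRejectsWriteAfterClose());
    }
}
```

A wrapper that tracks its own closed flag, as SafePrintWriter does, behaves the same way no matter which kind of writer sits underneath it.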
Java Network Programming, 3rd Edition, by Elliotte Rusty Harold. Publisher: O'Reilly. Pub Date: October 2004. ISBN: 0-596-00721-3. Pages: 706.
