Java Examples in a Nutshell (3rd Edition)

O'Reilly Media, Inc

3.7 Filtering Character Streams


FilterReader is an abstract class that defines a
null filter: it reads characters from a specified
Reader and returns them unmodified. In other words,
FilterReader's methods simply pass each call through
to the underlying Reader. A subclass must override at
least the two read() methods (the no-argument version
and the three-argument version) to perform whatever
filtering is necessary, and may override other methods
as well. Example 3-6 shows RemoveHTMLReader,
a custom subclass of FilterReader that
reads HTML text from a stream and filters out all of the HTML tags
from the text it returns.
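
To make the "null filter" behavior concrete, here is a minimal sketch that is not part of the book's examples. The class name NullFilterDemo and the helper readAll() are hypothetical names chosen for illustration. Because the FilterReader constructor is protected, the sketch creates an anonymous subclass that overrides nothing, then reads through it.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical demo class, not from the book.
class NullFilterDemo {
    /** Drain a Reader into a String using the three-argument read(). */
    static String readAll(Reader r) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[16];
        int n;
        while ((n = r.read(buf, 0, buf.length)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // An anonymous subclass that overrides nothing: every call is
        // passed straight through to the wrapped StringReader.
        Reader nullFilter =
            new FilterReader(new StringReader("<b>bold</b> text")) { };
        System.out.println(readAll(nullFilter)); // the input, tags and all
    }
}
```

The tags come through untouched, which is why a useful filter has to override the read() methods.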

In the example, we implement the HTML tag filtration in the
three-argument version of read(), and then implement the
no-argument version in terms of that more complicated version.
The example includes a nested Test class with a main() method
that shows how you might use the RemoveHTMLReader class.

Note that we could also define a RemoveHTMLWriter
class by performing the same filtration in a
FilterWriter subclass. Or, to filter a byte stream
instead of a character stream, we could subclass
FilterInputStream and
FilterOutputStream.

RemoveHTMLReader is only one example of a filter
stream. Other possibilities include streams that count the number of
characters or bytes processed, convert characters to uppercase,
extract URLs, perform search-and-replace operations, convert
Unix-style LF line terminators to Windows-style CRLF line
terminators, and so on.
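
As an illustration of one of these other possibilities, the following is a minimal sketch of an uppercase-converting filter. It is not from the book: the names UpperCaseReader and UpperCaseDemo are hypothetical, and the conversion uses Character.toUpperCase(), which works one char at a time and so does not handle supplementary characters or locale-specific rules.

```java
import java.io.BufferedReader;
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical filter, not from the book: uppercases everything it reads.
class UpperCaseReader extends FilterReader {
    UpperCaseReader(Reader in) { super(in); }

    @Override
    public int read(char[] buf, int from, int len) throws IOException {
        int n = in.read(buf, from, len);          // may be -1 at EOF
        // The loop body never runs when n is -1 or 0.
        for (int i = from; i < from + n; i++) {
            buf[i] = Character.toUpperCase(buf[i]);
        }
        return n;
    }

    @Override
    public int read() throws IOException {
        int c = in.read();
        return (c == -1) ? -1 : Character.toUpperCase(c);
    }
}

class UpperCaseDemo {
    public static void main(String[] args) throws IOException {
        BufferedReader in = new BufferedReader(
            new UpperCaseReader(new StringReader("hello, World")));
        System.out.println(in.readLine()); // prints HELLO, WORLD
    }
}
```

Unlike the tag stripper, this filter never changes the number of characters, so it needs no loop to guard against returning zero characters.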

Example 3-6. RemoveHTMLReader.java

package je3.io;
import java.io.*;

/**
 * A simple FilterReader that strips HTML tags (or anything between
 * pairs of angle brackets) out of a stream of characters.
 **/
public class RemoveHTMLReader extends FilterReader {
    /** A trivial constructor. Just initialize our superclass */
    public RemoveHTMLReader(Reader in) { super(in); }

    boolean intag = false;  // Used to remember whether we are "inside" a tag

    /**
     * This overrides the pass-through read() method of FilterReader.
     * It calls in.read() to get a buffer full of characters, then strips
     * out the HTML tags. (in is a protected field of the superclass).
     **/
    @Override
    public int read(char[] buf, int from, int len) throws IOException {
        // A zero-length request would make in.read() return 0 every time
        // and the loop below would never terminate, so handle it here.
        if (len == 0) return 0;
        int numchars = 0;   // how many characters have been read
        // Loop, because we might read a bunch of characters, then strip them
        // all out, leaving us with zero characters to return.
        while (numchars == 0) {
            numchars = in.read(buf, from, len);   // Read characters
            if (numchars == -1) return -1;        // Check for EOF and handle it.
            // Loop through the characters we read, stripping out HTML tags.
            // Characters not in tags are copied over previous tags
            int last = from;                      // Index of last non-HTML char
            for (int i = from; i < from + numchars; i++) {
                if (!intag) {                          // If not in an HTML tag
                    if (buf[i] == '<') intag = true;   // check for tag start
                    else buf[last++] = buf[i];         // and copy the character
                }
                else if (buf[i] == '>') intag = false; // check for end of tag
            }
            numchars = last - from;   // Figure out how many characters remain
        }                             // And if it is more than zero characters
        return numchars;              // Then return that number.
    }

    /**
     * This is the other pass-through read() method we have to override. We
     * implement it in terms of the method above. Our superclass implements
     * the remaining read() methods in terms of these two.
     **/
    @Override
    public int read() throws IOException {
        char[] buf = new char[1];
        int result = read(buf, 0, 1);
        if (result == -1) return -1;
        else return (int) buf[0];
    }

    /** This class defines a main() method to test the RemoveHTMLReader */
    public static class Test {
        /** The test program: read a text file, strip HTML, print to console */
        public static void main(String[] args) {
            try {
                if (args.length != 1)
                    throw new IllegalArgumentException("Wrong number of args");
                // Create a stream to read from the file and strip tags from it.
                // try-with-resources closes the stream even if reading fails.
                try (BufferedReader in = new BufferedReader(
                         new RemoveHTMLReader(new FileReader(args[0])))) {
                    // Read line by line, printing lines to the console
                    String line;
                    while ((line = in.readLine()) != null)
                        System.out.println(line);
                }
            }
            catch (Exception e) {
                System.err.println(e);
                // The class lives in package je3.io, so use its full name.
                System.err.println("Usage: java je3.io.RemoveHTMLReader$Test" +
                                   " <filename>");
            }
        }
    }
}
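
The RemoveHTMLWriter mentioned before Example 3-6 might look something like the following. This is a minimal sketch under stated assumptions, not code from the book: the class names RemoveHTMLWriter and RemoveHTMLWriterDemo are hypothetical here, and the tag-tracking logic simply mirrors the intag flag used above. All three write() methods are overridden, since FilterWriter passes each of them straight to the underlying Writer.

```java
import java.io.FilterWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical sketch, not from the book: strips tags on the way out.
class RemoveHTMLWriter extends FilterWriter {
    private boolean intag = false;  // are we currently inside a tag?

    RemoveHTMLWriter(Writer out) { super(out); }

    /** Write one character unless it belongs to a tag. */
    @Override
    public void write(int c) throws IOException {
        if (!intag) {
            if (c == '<') intag = true;     // tag starts: stop writing
            else out.write(c);              // ordinary text: pass it on
        }
        else if (c == '>') intag = false;   // tag ends: resume writing
    }

    /** Implemented in terms of the single-character version. */
    @Override
    public void write(char[] buf, int from, int len) throws IOException {
        for (int i = from; i < from + len; i++) write(buf[i]);
    }

    /** Implemented in terms of the single-character version. */
    @Override
    public void write(String s, int from, int len) throws IOException {
        for (int i = from; i < from + len; i++) write(s.charAt(i));
    }
}

class RemoveHTMLWriterDemo {
    public static void main(String[] args) throws IOException {
        StringWriter sink = new StringWriter();
        Writer w = new RemoveHTMLWriter(sink);
        w.write("<p>Hello, <b>wor");   // a tag may be split across writes
        w.write("ld</b>!</p>");
        w.flush();
        System.out.println(sink);      // prints Hello, world!
    }
}
```

Writing one character at a time keeps the sketch short. A production version would more likely copy runs of non-tag characters in a single call to the underlying Writer.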

