20.20. Program: htmlsub
This program makes substitutions in HTML files so that changes happen
only in normal text, never inside tags or their attribute values. If you
had the file scoobyl that contained:
<HTML><HEAD><TITLE>Hi!</TITLE></HEAD>
<BODY><H1>Welcome to Scooby World!</H1>
I have <A HREF=">pictures</A> of the crazy dog
himself. Here's one!<P>
<IMG SRC="/image/library/english/10159_/image
/library/english/10159_scooby.jpg" ><P>
<BLINK>He's my hero!</BLINK> I would like to meet him some day,
and get my picture taken with him.<P>
P.S. I am deathly ill. <A HREF=">Please send
cards</A>.
</BODY></HTML>
you could use htmlsub to change every occurrence
of the word "picture" in the document text to read "photo". It prints
the new document on STDOUT:
% htmlsub picture photo scoobyl
<HTML><HEAD><TITLE>Hi!</TITLE></HEAD>
<BODY><H1>Welcome to Scooby World!</H1>
I have <A HREF=">photos</A> of the crazy dog
himself. Here's one!<P>
<IMG SRC="/image/library/english/10159_/image
/library/english/10159_scooby.jpg" ><P>
<BLINK>He's my hero!</BLINK> I would like to meet him some day,
and get my photo taken with him.<P>
P.S. I am deathly ill. <A HREF=">Please send
cards</A>.
</BODY></HTML>
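To make the "normal text only" rule concrete, here is a minimal sketch that uses nothing beyond core Perl. It is not how htmlsub works: the helper name sub_in_text and the sample input are invented for illustration, and it assumes every tag is a simple <...> run with no ">" inside attribute values, comments, or scripts. Real HTML breaks that assumption, which is why the actual program relies on a parser.

```perl
#!/usr/bin/perl
# Illustrative sketch only; sub_in_text is a hypothetical helper,
# not part of htmlsub.
use strict;
use warnings;

sub sub_in_text {
    my ($from, $to, $html) = @_;

    # The capturing group makes split return the tags too, so the list
    # alternates: text, tag, text, tag, ...
    my @pieces = split /(<[^>]*>)/, $html;

    for my $i (0 .. $#pieces) {
        next if $i % 2;                       # odd slots hold tags: skip
        $pieces[$i] =~ s/\Q$from\E/$to/g;     # even slots hold plain text
    }
    return join '', @pieces;
}

# Made-up input: the HREF value must survive, the text must change.
print sub_in_text('picture', 'photo',
    qq{<A HREF="picture.html">pictures</A> and my picture\n});
# prints: <A HREF="picture.html">photos</A> and my photo
```

The attribute value keeps the word "picture" because it sits inside a tag, while both occurrences in the running text are rewritten.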
The program is shown in Example 20-12.
Example 20-12. htmlsub
#!/usr/bin/perl -w
# htmlsub - make substitutions in normal text of HTML files
# from Gisle Aas <gisle@aas.no>
sub usage { die "Usage: $0 <from> <to> <file>...\n" }
my $from = shift or usage;
my $to = shift or usage;
usage unless @ARGV;
# Build the HTML::Filter subclass to do the substituting.
package MyFilter;
use HTML::Filter;
@ISA=qw(HTML::Filter);
use HTML::Entities qw(decode_entities encode_entities);
sub text
{
    my $self = shift;
    my $text = decode_entities($_[0]);
    $text =~ s/\Q$from/$to/go;              # most important line
    $self->SUPER::text(encode_entities($text));
}
# Now use the class.
package main;
foreach (@ARGV) {
    MyFilter->new->parse_file($_);
}
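Later releases of the HTML-Parser distribution document HTML::Filter as deprecated in favor of HTML::Parser's own handler interface. The following is a hedged sketch of the same idea in that style, not a replacement supplied by the original author. The helper name substitute_html and the sample input are assumptions made for illustration. Content that the parser flags as CDATA (such as script and style bodies) is passed through unchanged here, which is a design choice of this sketch.

```perl
#!/usr/bin/perl
# Sketch: htmlsub-style substitution with HTML::Parser's version 3 handlers.
# substitute_html is a hypothetical helper name.
use strict;
use warnings;
use HTML::Parser ();
use HTML::Entities qw(decode_entities encode_entities);

sub substitute_html {
    my ($from, $to, $html) = @_;
    my $out = '';

    my $parser = HTML::Parser->new(
        api_version => 3,
        # Text events: decode entities, substitute, re-encode.
        text_h => [ sub {
            my ($raw, $is_cdata) = @_;
            if ($is_cdata) { $out .= $raw; return }   # leave literal content alone
            my $text = decode_entities($raw);
            $text =~ s/\Q$from\E/$to/g;
            $out .= encode_entities($text);
        }, 'text,is_cdata' ],
        # Everything else (tags, comments, declarations) is copied verbatim.
        default_h => [ sub { $out .= shift }, 'text' ],
    );
    $parser->unbroken_text(1);    # avoid splitting a word across text events
    $parser->parse($html);
    $parser->eof;
    return $out;
}

# Made-up input for demonstration.
print substitute_html('picture', 'photo',
    qq{<A HREF="picture.html">pictures</A> &amp; one picture of me\n});
```

Decoding before the substitution means a search string such as "R&D" can match text written as "R&amp;D" in the file, and re-encoding afterward keeps the output valid HTML.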