Perl Cd Bookshelf [Electronic resources]


20.21. Program: hrefsub

hrefsub makes substitutions in HTML files so that the changes apply only to the text in the HREF fields of <A HREF="..."> tags. For instance, if you had the scooby.html file from the previous recipe, and you've moved shergold.html to a new name, you need but pass the old name, the new name, and the file to hrefsub:

% hrefsub shergold.html <to> scooby.html
<HTML><HEAD><TITLE>Hi!</TITLE></HEAD>
<BODY><H1>Welcome to Scooby World!</H1>
I have <A HREF="...">pictures</A> of the crazy dog himself. Here's one!<P>
<IMG SRC="/image/library/english/10159_scooby.jpg"><P>
<BLINK>He's my hero!</BLINK> I would like to meet him some day,
and get my picture taken with him.<P>
P.S. I am deathly ill. <a href="...">Please send cards</A>.
</BODY></HTML>

The HTML::Filter manual page has a BUGS section that says:

Comments in declarations are removed from the declarations and then inserted as separate comments after the declaration. If you turn on strict_comment( ), then comments with embedded "--" are split into multiple comments.

This version of hrefsub (shown in Example 20-13) always lowercases the <a> and the attribute names within this tag when substitution occurs. If $foo is a multiword string, then the text given to MyFilter->text may be broken such that these words do not come together; i.e., the substitution does not work. There should probably be a new option to HTML::Parser to make it not return text until the whole segment has been seen. Also, some people may not be happy with having their 8-bit Latin-1 characters replaced by ugly entities, and hrefsub does that, too, because it passes every attribute value of a rewritten tag through encode_entities.
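The lowercasing and entity replacement described above both come from rebuilding the start tag out of parsed pieces rather than reusing the original text. The following is a minimal sketch of that rebuilding step, not the code from the example. The names rebuild_start_tag and escape_attr are hypothetical, and escape_attr is a simplified stand-in for encode_entities so that the sketch needs no CPAN modules: it emits numeric references where the real function would emit named entities such as &eacute;.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Simplified stand-in for HTML::Entities::encode_entities (an assumption made
# so this sketch runs with core Perl only): escapes & < > " and replaces
# characters in the range 0x80-0xFF with numeric character references.
sub escape_attr {
    my ($value) = @_;
    $value =~ s/([&<>"\x80-\xFF])/'&#' . ord($1) . ';'/ge;
    return $value;
}

# Rebuild a start tag from the pieces a start() handler receives: the
# lowercased tag name, a hash of attribute values keyed by lowercased
# attribute name, and the attribute names in their original order.
sub rebuild_start_tag {
    my ($tag, $attr, $attrseq) = @_;
    my $out = "<$tag";
    for my $name (@$attrseq) {
        $out .= sprintf ' %s="%s"', $name, escape_attr($attr->{$name});
    }
    return "$out>";
}

# Roughly what the parser would hand over for
#   <A HREF="caf\xE9.html" TITLE="Menu">
my %attr = (href => "caf\xE9.html", title => 'Menu');
print rebuild_start_tag('a', \%attr, [qw(href title)]), "\n";
# prints: <a href="caf&#233;.html" title="Menu">
```

The original uppercase A, HREF, and TITLE are gone by the time the handler runs, which is why a rewritten tag cannot preserve them.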

Example 20-13. hrefsub

#!/usr/bin/perl -w
# hrefsub - make substitutions in <A HREF="..."> fields of HTML files
# from Gisle Aas <gisle@aas.no>

sub usage { die "Usage: $0 <from> <to> <file>\n" }

my $from = shift or usage;
my $to   = shift or usage;
usage unless @ARGV;

# The HTML::Filter subclass to do the substitution.

package MyFilter;
use HTML::Filter;
@ISA=qw(HTML::Filter);
use HTML::Entities qw(encode_entities);

# Called for every start tag; $tag and the attribute names arrive lowercased.
sub start {
    my($self, $tag, $attr, $attrseq, $orig) = @_;
    if ($tag eq 'a' && exists $attr->{href}) {
        # \Q makes $from match literally, so "." is not a wildcard.
        if ($attr->{href} =~ s/\Q$from/$to/g) {
            # must reconstruct the start tag based on $tag and $attr.
            # wish we instead were told the extent of the 'href' value
            # in $orig.
            my $tmp = "<$tag";
            for (@$attrseq) {
                my $encoded = encode_entities($attr->{$_});
                # no space before the closing quote, or the value changes
                $tmp .= qq( $_="$encoded");
            }
            $tmp .= ">";
            $self->output($tmp);
            return;
        }
    }
    # Anything else passes through unchanged.
    $self->output($orig);
}

# Now use the class.

package main;
foreach (@ARGV) {
    MyFilter->new->parse_file($_);
}
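The substitution in the listing relies on \Q to treat the old name as literal text instead of a pattern. As a small illustration using core Perl only, the sketch below compares the quoted and unquoted forms. The helper names replace_in_href and replace_unquoted and the filenames old.html and new.html are hypothetical, chosen only for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of hrefsub: applies the same substitution
# that the start() handler applies to an href value.
sub replace_in_href {
    my ($href, $from, $to) = @_;
    # \Q quotes regex metacharacters in $from, so "." matches only a dot.
    (my $copy = $href) =~ s/\Q$from/$to/g;
    return $copy;
}

# The same substitution without \Q, for comparison only.
sub replace_unquoted {
    my ($href, $from, $to) = @_;
    (my $copy = $href) =~ s/$from/$to/g;
    return $copy;
}

print replace_in_href('old.html', 'old.html', 'new.html'), "\n";   # new.html
print replace_in_href('oldxhtml', 'old.html', 'new.html'), "\n";   # oldxhtml
print replace_unquoted('oldxhtml', 'old.html', 'new.html'), "\n";  # new.html
```

Without \Q, the dot in the old name matches any character, so an unrelated link such as oldxhtml would be rewritten by mistake.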




Copyright © 2003 O'Reilly & Associates. All rights reserved.