Perl Cd Bookshelf [Electronic resources], text version

20.7. Finding Stale Links


20.7.1. Problem



You want to check a document for invalid
links.

20.7.2. Solution


Use the technique outlined in Recipe 20.3 to extract each link, and then
use LWP::Simple's head function to make sure that link exists.

20.7.3. Discussion


Example 20-5 is an applied example of the
link-extraction technique. Instead of just printing the name of the
link, we call LWP::Simple's head function on it.
The HEAD method fetches the remote document's metainformation without
downloading the whole document. If it fails, the link is bad, so we
print an appropriate message.
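
As a rough illustration of what head returns, the sketch below formats its result into a status line. In list context, LWP::Simple's head returns the content type, document length, modification time, expiry time, and server name on success, and an empty list on failure. The helper name describe_head is hypothetical, not part of the recipe, and the network is contacted only when a URL is given on the command line.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple qw(head);

# describe_head is a hypothetical helper: it turns the list returned by
# head() into a one-line status. An empty list means the request failed.
sub describe_head {
    my @info = @_;
    return "BAD" unless @info;
    my ($type, $length) = @info;    # remaining fields are ignored here
    $type   = "unknown type" unless defined $type;
    $length = "?"            unless defined $length;
    return "OK ($type, $length bytes)";
}

# Only issue a HEAD request when a URL was supplied.
if (@ARGV) {
    my $url = shift @ARGV;
    print "$url: ", describe_head(head($url)), "\n";
}
```

Servers are not required to send every header, so the type and length may be missing even for a link that is perfectly good.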

Because this program uses the get function from
LWP::Simple, it expects a URL, not a filename. To accept
either, use the URI::Heuristic module described in Recipe 20.1.
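
A minimal sketch of that idea, assuming the uf_uristr function exported on request by URI::Heuristic. The wrapper name normalize_arg is hypothetical. Arguments that already carry a scheme pass through unchanged.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use URI::Heuristic qw(uf_uristr);

# normalize_arg is a hypothetical wrapper: it expands a command-line
# argument (host name, path, or full URL) into a URL string.
sub normalize_arg {
    my ($arg) = @_;
    return uf_uristr($arg);
}

# Show how each argument would be interpreted.
for my $arg (@ARGV) {
    print "$arg -> ", normalize_arg($arg), "\n";
}
```

Bare names without a dot may cause the module to try hostname lookups while guessing, so results for such arguments can depend on the local environment.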

Example 20-5. churl


    #!/usr/bin/perl -w
    # churl - check urls

    use HTML::LinkExtor;
    use LWP::Simple;

    $base_url = shift
        or die "usage: $0 <start_url>\n";
    # Passing $base_url makes the parser return absolute URI objects.
    $parser = HTML::LinkExtor->new(undef, $base_url);
    $html = get($base_url);
    die "Can't fetch $base_url" unless defined($html);
    $parser->parse($html);
    $parser->eof;                   # flush any buffered input
    @links = $parser->links;
    print "$base_url: \n";
    foreach $linkarray (@links) {
        # Each entry is [ tag, attr_name => attr_value, ... ].
        my @element  = @$linkarray;
        my $elt_type = shift @element;
        while (@element) {
            my ($attr_name, $attr_value) = splice(@element, 0, 2);
            # Check only the schemes that head() can probe.
            if ($attr_value->scheme =~ /\b(ftp|https?|file)\b/) {
                print "  $attr_value: ", head($attr_value) ? "OK" : "BAD", "\n";
            }
        }
    }

Here's an example of a program run:

    % churl http://www.wizards.com
    http://www.wizards.com:
      FrontPage/FP_Color.gif: OK
      FrontPage/FP_BW.gif: BAD
      #FP_Map: OK
      Games_Library/Welcome.html: OK

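This program has the same limitation as the HTML::LinkExtor program
in Recipe 20.3: if fetching the start URL involves a redirection, the
links are resolved against the original URL rather than the URL at the
end of the redirection. Fetching the document with LWP::UserAgent and
inspecting the response avoids this.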
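
One way to work around a redirected start URL, sketched here under stated assumptions, is to fetch the page with LWP::UserAgent and take the base from the HTTP::Response object, which reflects the final URL after any redirects. The function names fetch_with_base and absolute_links are hypothetical and are not part of the recipe.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use HTML::LinkExtor;

# Fetch a page; return its HTML and the base URL of the final response.
# Returns an empty list if the request fails.
sub fetch_with_base {
    my ($url) = @_;
    my $response = LWP::UserAgent->new->get($url);
    return unless $response->is_success;
    return ($response->decoded_content, $response->base);
}

# Extract every link from $html, resolved against $base, as strings.
sub absolute_links {
    my ($html, $base) = @_;
    my $parser = HTML::LinkExtor->new(undef, $base);
    $parser->parse($html);
    $parser->eof;
    my @urls;
    for my $link ($parser->links) {
        my ($tag, %attr) = @$link;       # [ tag, name => value, ... ]
        push @urls, map { "$_" } values %attr;
    }
    return @urls;
}

# Only contact the network when a URL was supplied.
if (@ARGV) {
    my ($html, $base) = fetch_with_base($ARGV[0])
        or die "Can't fetch $ARGV[0]\n";
    print "$_\n" for absolute_links($html, $base);
}
```

The resulting URLs could then be passed to head as in churl. The only change from the recipe is where the base URL comes from.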

20.7.4. See Also


The documentation for the CPAN modules HTML::LinkExtor, LWP::Simple,
LWP::UserAgent, and HTTP::Response; Recipe 20.8
