Hacks 1917 Industrial.. Strength Tips and Tools [Electronic resources] (text version)


David A. Karp











Hack 90 Spellcheck All Your Auctions


Ensure that your titles and descriptions are
spelled correctly.

The success of any auction is
largely due to how readily it can be found in eBay searches. As
described in Chapter 2, eBay searches show only
exact matches (with very few exceptions), which means, among other
things, that spelling most definitely counts.

Neither eBay's Sell Your Item form nor Turbo Lister
supports spellchecking of any kind. So it's left to
sellers to scrutinize their titles and auction descriptions, and to
obnoxious bidders to point out any mistakes. Once again, the API
comes to the rescue.


8.10.1 The Script


This script requires the following modules and programs:

Module/program name                      Available at
HTML::FormatText (by Sean M. Burke)      search.cpan.org/perldoc?HTML::FormatText
HTML::TreeBuilder (by Sean M. Burke)     search.cpan.org/perldoc?HTML::TreeBuilder
HTML::Entities (by Gisle Aas)            search.cpan.org/perldoc?HTML::Entities
Lingua::Ispell (by John Porter)          search.cpan.org/perldoc?Lingua::Ispell
ispell program (by Geoff Kuenning)       fmg-www.cs.ucla.edu/geoff/ispell.html

#!/usr/bin/perl
# ebay.pl is expected to supply call_api, print_error, formatdate,
# $user_id, and $localdir.
require 'ebay.pl';
require HTML::TreeBuilder;
require HTML::FormatText;
use Lingua::Ispell qw( spellcheck );
Lingua::Ispell::allow_compounds(1);

$out1 = "";          # errors found in the current auction
$outall = "";        # errors found in all auctions
$numchecked = 0;
$numfound = 0;
$today = &formatdate(time);
$yesterday = &formatdate(time - 86400);

my $page_number = 1;
PAGE:
while (1) {
    # Retrieve one page of auctions started in the last 24 hours.
    my $rsp = call_api({ Verb => 'GetSellerList',               # [1]
                         DetailLevel => 0,
                         UserId => $user_id,
                         StartTimeFrom => $yesterday,
                         StartTimeTo => $today,
                         PageNumber => $page_number
    });
    if ($rsp->{Errors}) {
        print_error($rsp);
        last PAGE;
    }
    foreach (@{$rsp->{SellerList}{Item}}) {
        my %i = %$_;
        $id = $i{Id};
        if (! -e "$localdir/$id") {
            # Retrieve the full listing, including the description.
            my $rsp = call_api({ Verb => 'GetItem',
                                 DetailLevel => 2,
                                 Id => $id
            });
            if ($rsp->{Errors}) {
                print_error($rsp)
            } else {
                my %i = %{$rsp->{Item}[0]};
                my ($title, $description) = @i{qw/Title Description/};
                $spellthis = $title . " " . $description;                   # [2]

                # Convert any HTML in the listing to plain text.
                $tree = HTML::TreeBuilder->new_from_content($spellthis);    # [3]
                $formatter = HTML::FormatText->new();
                $spellthat = $formatter->format($tree);
                $tree = $tree->delete;                                      # [4]

                # Record every word that ispell does not recognize.
                for my $r ( spellcheck( $spellthat ) ) {                    # [5]
                    if ( $r->{'type'} eq 'miss' ) {
                        $out1 = $out1."'$r->{'term'}'";
                        $out1 = $out1." - near misses: @{$r->{'misses'}}\n";
                        $numfound++;
                    }
                    elsif ( $r->{'type'} eq 'guess' ) {
                        $out1 = $out1."'$r->{'term'}'";
                        $out1 = $out1." - guesses: @{$r->{'guesses'}}\n";
                        $numfound++;
                    }
                    elsif ( $r->{'type'} eq 'none' ) {
                        $out1 = $out1."'$r->{'term'}'";
                        $out1 = $out1." - no match.\n";
                        $numfound++;
                    }
                }
                $numchecked++;

                # Add this auction's errors, if any, to the overall report.
                if ($out1 ne "") {
                    $outall = $outall."Errors in #$id '$title':\n";
                    $outall = $outall."$out1\n\n";
                    $out1 = "";
                }
            }
        }
    }
    last PAGE unless $rsp->{SellerList}{HasMoreItems};
    $page_number++;
}

print "$numfound spelling errors found in $numchecked auctions:\n\n";      # [6]
print "$outall\n";

This script is based on the one in [Hack #87], but has a few important additions
and changes.

First, instead of listing recently completed auctions, the
GetSellerList API call (line [1]) is used to retrieve auctions that have
started in the last 24 hours. This will work perfectly if the script
is run every 24 hours, say, at 3:00 P.M. every day, as described in
[Hack #17].
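
The 24-hour window can be sketched on its own as follows. This is only an illustration: formatdate_sketch is a hypothetical stand-in for the formatdate helper in ebay.pl, and the timestamp layout it produces is an assumption, so the real helper may format dates differently.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical stand-in for the formatdate helper; the real one lives in
# ebay.pl and its output format may differ from the one assumed here.
sub formatdate_sketch {
    my ($epoch) = @_;
    return strftime('%Y-%m-%d %H:%M:%S', gmtime($epoch));
}

# Return the (from, to) pair covering the 24 hours that end at $now.
sub last_24_hours {
    my ($now) = @_;
    return (formatdate_sketch($now - 86400), formatdate_sketch($now));
}

my ($from, $to) = last_24_hours(time);
print "Checking auctions started between $from and $to\n";
```

Because the window is exactly 86,400 seconds wide, running the script at the same time each day should leave no gaps between runs and no overlap.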

Second, since we want the auction descriptions, we need to use the
GetItem API call for each auction we spellcheck.
This means that spellchecking a dozen auctions will require 13 API
calls: one call to retrieve the list, and one for each auction.

The code that actually performs the spellcheck starts on
line [2], where the title and description
are concatenated into a single variable,
$spellthis, so that only one spellcheck is
necessary for each auction. Next, the HTML::TreeBuilder and
HTML::FormatText modules are used (lines [3] to [4]) to convert
any HTML-formatted text to plain text.
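
The HTML-to-text step can be tried in isolation with a sketch like the one below. It uses the same HTML::TreeBuilder and HTML::FormatText calls as the script; the html_to_text wrapper, the margin settings, and the sample description are assumptions added for illustration.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;
use HTML::FormatText;

# Convert a fragment of HTML to plain text, as the script does before
# spellchecking. The wrapper name and the margins are illustrative choices.
sub html_to_text {
    my ($html) = @_;
    my $tree = HTML::TreeBuilder->new_from_content($html);
    my $formatter = HTML::FormatText->new(leftmargin => 0, rightmargin => 72);
    my $text = $formatter->format($tree);
    $tree->delete;    # free the parse tree once the text is extracted
    return $text;
}

# Made-up listing description, not taken from a real auction.
print html_to_text('<p>Vintage camera in <b>mint</b> condition</p>');
```

Stripping the markup first keeps tag and attribute names from being reported as misspelled words.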

Finally, the Lingua::Ispell module [5] uses the external ispell
program to perform a spellcheck on $spellthat (the
cleaned-up version of $spellthis). As errors are
found, suggestions are recorded into the $out1
variable, which is merged with $outall and
displayed when the spellcheck is complete.
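
The reporting logic can be exercised without ispell installed by feeding it hand-written records. In this sketch, the format_results function and the sample data are hypothetical; only the record fields (type, term, misses, guesses) and the three error types come from the script above.

```perl
use strict;
use warnings;

# Build the report text for one auction from a list of result records.
# Each record is a hash reference shaped like the ones the script reads:
# 'type' is 'miss', 'guess', or 'none' for an error; 'term' is the word.
sub format_results {
    my @results = @_;
    my ($report, $count) = ('', 0);
    for my $r (@results) {
        my $type = $r->{type};
        if ($type eq 'miss') {
            $report .= "'$r->{term}' - near misses: @{$r->{misses}}\n";
        } elsif ($type eq 'guess') {
            $report .= "'$r->{term}' - guesses: @{$r->{guesses}}\n";
        } elsif ($type eq 'none') {
            $report .= "'$r->{term}' - no match.\n";
        } else {
            next;    # any other type is not counted as an error
        }
        $count++;
    }
    return ($report, $count);
}

# Hand-written sample records (made-up data, not real ispell output).
my @sample = (
    { type => 'ok',   term => 'camera' },
    { type => 'miss', term => 'vintge', misses => [ 'vintage' ] },
    { type => 'none', term => 'xqzt' },
);
my ($report, $count) = format_results(@sample);
print "$count spelling errors found:\n$report";
```

Returning the text and the count together mirrors how the script updates $out1 and $numfound side by side.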


8.10.2 Hacking the Hack


Here are a few things you might want to do with this script:

Instead of simply printing out the results of the spellcheck, as the
script does on line [6], you can quite
easily have the results emailed to you. See [Hack #93] for an example.

Currently, the script performs a spellcheck on every running auction
started in the last 24 hours. If you run the script every 24 hours,
then this won't pose a problem. But if you choose to
run the script manually and therefore specify a broader range of
dates, you may wish to include error checking to prevent the script
from needlessly checking the same auction twice.

If you're especially daring, you can have the
spellchecker submit the revisions for you, although I would never
trust a spellchecker to know how to spell all the weird names of my
items.
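
One way to avoid checking the same auction twice, as suggested in the second item above, is to leave a marker file behind for each auction once it has been spellchecked. The following is a minimal sketch under that assumption; the function names, the .checked suffix, and the item numbers are all hypothetical, and a temporary directory stands in for $localdir.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# True if a marker file for this auction already exists in $dir.
sub already_checked {
    my ($dir, $id) = @_;
    return (-e "$dir/$id.checked") ? 1 : 0;
}

# Create an empty marker file so that later runs skip this auction.
sub mark_checked {
    my ($dir, $id) = @_;
    open(my $fh, '>', "$dir/$id.checked") or die "Cannot write marker: $!";
    close($fh);
}

# Demonstration with a throwaway directory and made-up item numbers.
my $dir = tempdir(CLEANUP => 1);
for my $id (qw(1111 2222 1111)) {
    if (already_checked($dir, $id)) {
        print "Skipping $id (already checked)\n";
        next;
    }
    print "Spellchecking $id\n";
    mark_checked($dir, $id);
}
```

In the script itself, the marker would be written right after $numchecked is incremented, so an auction counts as checked only once its spellcheck has finished.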


