Hack 17 Create a Search Robot


Create a robot to perform your searches for you.

A collector
in search of a particular item or type of item may repeat the same
search, often several times a week. A serious collector, knowing that
items sometimes sell within hours of being listed (see [Hack #26]), may repeat a search several
times a day for an item. But who has the time?

The Favorites tab of the My eBay page, which allows you to keep track
of up to 100 favorite searches (see [Hack #16]), also has a feature to email you
when new items matching your search criteria appear on the site. Just
check the Preferences link next to the search caption, and then turn
on the "Email me daily whenever there are new
items" option.

Unfortunately, eBay's new-item notification feature
will send you notifications no more than once a day, and in that
time, any number of juicy auctions could've started
and ended. So I created this hack to do my searches for me, and do
them as often as I see fit.
2.10.1 Constructing the Robot
By "scraping" eBay search results with the WWW::Search::Ebay Perl
module (developed by Martin Thurn), a Perl program can retrieve search
results from eBay and manipulate them in any way you want. You can
download the module for free from
search.cpan.org/perldoc?WWW::Search::eBay and
install it on any computer that has Perl. See "Installing Perl Modules" for installation
details.
Installing Perl Modules
(Adapted from Google Hacks by Tara Calishain and Rael Dornfest)

A few hacks in this book make use of add-on Perl modules, useful for
turning dozens of lines of messy code into a couple of concise
commands. If your Perl script resides on a server maintained by
someone else (typically an ISP administrator),
you'll have to request that they install the module
before you can reference it in your scripts. But if
you're the administrator, you'll
have to install it yourself.

Installing on Unix and Mac OS X:

Assuming you have the CPAN module, have root access, and are
connected to the Internet, installation should be no more complicated
than:
% su
% perl -MCPAN -e shell
cpan> install WWW::Search::Ebay
Note that capitalization counts; copy-and-paste the module name for
an exact match. If the install fails, you can try forcing an
installation by typing:
cpan> force install WWW::Search::Ebay
Go grab yourself a cup of coffee, meander the garden, read the paper,
and check back once in a while. Your terminal's sure
to be riddled with incomprehensible gobbledegook that you can, for
the most part, summarily ignore. You may be asked a question or
three; in most cases, simply hitting Return to accept the default
answer will do the trick.

Windows installation via PPM:

If you're running Perl under Windows, chances are
it's ActiveState's ActivePerl
(www.activestate.com/Products/ActivePerl/).
Thankfully, ActivePerl is outfitted with a CPAN-like module
installation utility. The Programmer's Package
Manager (PPM, aspn.activestate.com/ASPN/Downloads/ActivePerl/PPM/)
grabs nicely packaged module bundles from the ActiveState archive and
drops them into place on your Windows system with little need of help
from you. Simply launch PPM from inside a DOS terminal window and
tell it to install the module:
C:\>ppm
PPM> install WWW-Search-eBay
You could use the WWW::Search::Ebay module to create nothing more
than an alternative interface to eBay's own search
tool, but the module's real value is how it can be
used behind the scenes.

A robot is a program that does automatically what
you'd otherwise have to do manually. In this case,
we want a robot that automatically performs an eBay search at a
regular interval, and then emails us any new listings.

Here's the script that does it all:
#!/usr/bin/perl
$searchstring = "railex";                                    # [1]
$email = "dave\@ebayhacks.com";
$localfile = "/usr/localweb/ebayhacks/search.txt";

use WWW::Search;                                             # [2]
$searchobject = new WWW::Search('Ebay');                     # [3]
$query = WWW::Search::escape_query($searchstring);
$searchobject->native_query($query);                         # [4]

# *** put results into two arrays (index 0 is left unused) ***
$count = 0;
while ($resultobject = $searchobject->next_result( )) {      # [5]
    ($number) = ($resultobject->url =~ m!item=(\d+)!);       # [6]
    next unless defined $number;    # skip results with no item number
    $count++;
    $itemnumber[$count] = $number;
    $title[$count] = $resultobject->title;                   # [7]
}

# *** eliminate entries already in file ***
# (the file won't exist on the first run; nothing is filtered then)
open (INFILE,"$localfile");
while ( $line = <INFILE> ) {
    chomp $line;
    # walk backward so that splicing doesn't disturb unvisited entries
    for ($i = $#itemnumber; $i >= 1; $i--) {
        if ($line eq $itemnumber[$i]) {                      # [8]
            splice @itemnumber, $i, 1;
            splice @title, $i, 1;
        }
    }
}
close (INFILE);
$count = @itemnumber - 1;
if ($count <= 0) { exit; }    # nothing new, or no results at all

# *** save any remaining new entries to file ***
open (OUTFILE,">>$localfile") or die "Can't append to $localfile: $!";
for ($i = 1; $i <= $count; $i++) {
    print OUTFILE "$itemnumber[$i]\n";                       # [9]
}
close (OUTFILE);

# *** send email with new entries found ***
open(MAIL,"|/usr/sbin/sendmail -t") or die "Can't run sendmail: $!";  # [10]
print MAIL "To: $email\n";
print MAIL "From: $email\n";
print MAIL "Subject: New $searchstring items found\n\n";
print MAIL "The following new items have been listed on eBay:\n";
for ($i = 1; $i <= $count; $i++) {
    print MAIL "$title[$i]\n";
    print MAIL "http://cgi.ebay.com/ws/eBayISAPI.dll?ViewItem&item=$itemnumber[$i]\n\n";
}
close(MAIL);
2.10.2 How It Works
The text to search ("railex" in this case) and the
email address of the recipient of the notification emails are
specified at the beginning of the script [1]. Naturally, you'll want to
modify these lines, as well as the $localfile
variable, which points to the file in which previous search results
are stored.

Next, the WWW::Search::Ebay module is referenced
[2] and the search is performed [4]. The $resultobject
construct [5] is then used to enumerate the
search results (if any) and retrieve such details as the item number
[6] (taken from the URL) and title [7] for each auction returned.

All search results are then checked against a list of previous search
results [8], which are stored in a text file
($localfile). Once duplicate auctions have been
filtered out, the new auction numbers (if there are any left) are
appended to the file [9].

Finally, a list of new auctions that meet the search criteria is
emailed to the address specified in $email. You may have to adjust line [10] to suit your system, either to specify a
different location for the sendmail executable
or to use a different command-line-based email client.
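
The extraction at [6] and the filtering at [8] can also be written with a hash lookup instead of nested loops. What follows is a minimal sketch rather than the script's own code: the helper names extract_item_number and filter_new are hypothetical, and each search result is assumed to be a plain hash reference with url and title keys, not a WWW::Search result object.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: pull the item number out of a listing URL,
# using the same pattern as line [6] of the script.
sub extract_item_number {
    my ($url) = @_;
    my ($number) = ($url =~ m!item=(\d+)!);
    return $number;    # undef if the URL carries no item number
}

# Hypothetical helper: return only those results whose item numbers
# do not already appear in @$seen_numbers.
sub filter_new {
    my ($results, $seen_numbers) = @_;
    my %seen = map { $_ => 1 } @$seen_numbers;
    my @new;
    for my $result (@$results) {
        my $number = extract_item_number($result->{url});
        next unless defined $number;    # skip URLs we cannot parse
        next if $seen{$number}++;       # skip known or repeated items
        push @new, { number => $number, title => $result->{title} };
    }
    return @new;
}

# Example data, for illustration only
my @results = (
    { url   => 'http://cgi.ebay.com/ws/eBayISAPI.dll?ViewItem&item=3128955953',
      title => 'Railex Snowplow, RARE' },
    { url   => 'http://cgi.ebay.com/ws/eBayISAPI.dll?ViewItem&item=3128013702',
      title => 'Railex Glaskasten, Green & Black, NEW NR' },
);
my @previous = ('3128955953');    # item numbers already in the local file

print "$_->{number} $_->{title}\n" for filter_new(\@results, \@previous);
```

With the example data, only the second listing is printed, because the first item number is already in the list of previous results. Comparing whole item numbers through a hash also avoids partial matches, where one number happens to appear inside another.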
2.10.3 Running the Hack
The
search criteria you choose are entirely up to you, but narrow
searches make more sense for this hack than broad searches. For
instance, my example script targets Railex, a small German
manufacturer of handmade brass model trains known for being very
difficult to find. At any given time, there may be only a handful of
these items for sale on eBay, which means that I may receive a single
notification per month, if that. Conversely, a search yielding
hundreds of results would quickly fill up your mailbox with dozens of
emails with erroneous results. Use some of the other hacks in this
chapter to narrow your searches, if necessary.

The best way to run this script is automatically at regular
intervals, unless you enjoy waking up at 3 A.M. and typing commands
into a terminal. How frequently you run the script is up to you, but
it wouldn't make sense to run it more often than you
check your email. In most cases, it's sufficient to
activate the search robot 3-4 times a day, but given that new
auctions can show up on eBay less than a minute after being listed,
you can run it once an hour if you like.
If you're using Unix or Mac OS X, type crontab -u username -e
to set up a cron job, where username is, not surprisingly, your
username. In the editor that appears, add the following four lines:
0 0 * * * /home/mydirectory/scripts/search.pl
0 6 * * * /home/mydirectory/scripts/search.pl
0 12 * * * /home/mydirectory/scripts/search.pl
0 18 * * * /home/mydirectory/scripts/search.pl
where /home/mydirectory/scripts/search.pl
is the full path and filename of the script. Save the file when
you're done. This will instruct the server to run
the script every six hours: at midnight, 6:00 A.M., noon, and 6:00
P.M. See www.superscripts.com/tutorial/crontabl
for more information on crontab.

If you're using Windows, open the Scheduled Tasks
tool, right-click on an empty area of the window, and select New.
(This bypasses the cumbersome wizard and goes directly to the
so-called "advanced" properties
sheet.) Type the full path and filename of the script in the Run
field, and then choose the Schedule tab. Turn on the
"Show multiple schedules" option,
and click New three times. Set up each of the four schedules to run
as follows: Daily at 12:00 A.M., Daily at 6:00 A.M., Daily at 12:00
P.M., and Daily at 6:00 P.M. Click OK when you're
done.

Assuming all goes well, you should eventually get an email that looks
something like this:
To: dave@ebayhacks.com
From: dave@ebayhacks.com
Subject: New railex items found
The following new items have been listed on eBay:
Railex Snowplow, RARE
http://cgi.ebay.com/ws/eBayISAPI.dll?ViewItem&item=3128955953
Railex Glaskasten, Green & Black, NEW NR
http://cgi.ebay.com/ws/eBayISAPI.dll?ViewItem&item=3128013702
You should continue getting emails as new auctions matching your
criteria are listed on eBay; just click the links in the emails to
view the auctions.
2.10.4 Hacking the Hack
By default, the WWW::Search::Ebay module searches
only titles. To search descriptions as well, change line [4] to the following:
$searchobject->native_query($query, {srchdesc => 'y'});
The search results are sorted by listing date, with newly listed
items shown first. You can, of course, sort the results manually, or
you can use the WWW::Search::Ebay::ByEndDate
module (part of the WWW::Search::Ebay
distribution) to sort by end date by replacing line [3] with the following:
$searchobject = new WWW::Search('Ebay::ByEndDate');
The WWW::Search::Ebay module is only for searching
the U.S. eBay site (www.ebay.com). To search
non-U.S. eBay sites, use the
WWW::Search::EBayGlobal or
WWW::Search::EBayGlobal::ByEndDate modules.

One of the drawbacks to eBay's built-in email
notification is that each search generates its own email; have 20
favorite searches, and you'll get up to 20 separate
emails every day. In this hack, you can accommodate multiple searches
by modifying lines [1] to [7] so that the script retrieves a list of
individual keywords from a separate file and then compiles a single
array from the results of all the searches. That way,
you'll only get a single email, regardless of the
number of different searches the robot performs.

Once you've been notified of newly listed auctions,
you'll most likely want to keep track of their
progress, as described in [Hack #24].
If you want to be a little adventurous, you can modify the search
robot script to automatically write new entries to the
track.txt file used by the
track.pl script in [Hack #24]. That way, new auctions will
automatically show up in your watching list!
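
The multiple-search variation described earlier in this section might be sketched as follows. This is an illustration under stated assumptions, not a drop-in replacement for the script: the helper names read_keywords and merge_searches are hypothetical, the keywords are assumed to be stored one per line, and the eBay lookup itself is passed in as a code reference (here a stub that returns canned data) so that the merging logic can be shown on its own. In a working robot, that code reference would wrap the WWW::Search calls at lines [3] to [7].

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: read one search keyword per line from a
# filehandle, ignoring blank lines.
sub read_keywords {
    my ($fh) = @_;
    my @keywords;
    while (my $line = <$fh>) {
        chomp $line;
        $line =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
        push @keywords, $line if length $line;
    }
    return @keywords;
}

# Hypothetical helper: run $search for every keyword and merge the
# results into one list, keeping each item number only once.
# $search takes a keyword and returns a list of [itemnumber, title] pairs.
sub merge_searches {
    my ($keywords, $search) = @_;
    my (%seen, @merged);
    for my $keyword (@$keywords) {
        for my $pair ($search->($keyword)) {
            my ($number, $title) = @$pair;
            next if $seen{$number}++;    # already found by another search
            push @merged, { number => $number, title => $title };
        }
    }
    return @merged;
}

# Stub standing in for the real eBay search (canned data, assumed).
my %canned = (
    railex     => [ [ '3128955953', 'Railex Snowplow, RARE' ] ],
    snowplow   => [ [ '3128955953', 'Railex Snowplow, RARE' ] ],
    glaskasten => [ [ '3128013702', 'Railex Glaskasten, Green & Black, NEW NR' ] ],
);
my $stub_search = sub { @{ $canned{ $_[0] } || [] } };

# Keywords would normally come from a separate file; an in-memory
# filehandle keeps this sketch self-contained.
my $keyword_text = "railex\nsnowplow\n\nglaskasten\n";
open(my $fh, '<', \$keyword_text) or die "Cannot open keyword list: $!";
my @keywords = read_keywords($fh);
close($fh);

print "$_->{number} $_->{title}\n" for merge_searches(\@keywords, $stub_search);
```

Because an item that matches more than one keyword is kept only once, the merged list can be fed to the duplicate check, the file update, and the email step exactly as a single search would be, producing one email per run.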